We close with a list of things of interest we have discovered while writing this text.
Abdulkadiroğlu, Atila, Joshua D Angrist, Yusuke Narita, and Parag A Pathak. 2017. “Research Design Meets Market Design: Using Centralized Assignment for Impact Evaluation.” Econometrica 85 (5): 1373–1432.
Bloom, Howard S., Stephen W. Raudenbush, Michael J. Weiss, and Kristin Porter. 2016.
“Using Multisite Experiments to Study Cross-Site Variation in Treatment Effects: A Hybrid Approach With Fixed Intercepts and a Random Treatment Coefficient.” Journal of Research on Educational Effectiveness 10 (4): 817–42.
https://doi.org/10.1080/19345747.2016.1264518.
Brown, Morton B., and Alan B. Forsythe. 1974.
“The Small Sample Behavior of Some Statistics Which Test the Equality of Several Means.” Technometrics 16 (1): 129–32.
https://doi.org/10.1080/00401706.1974.10489158.
Davison, A. C., and D. V. Hinkley. 1997. Bootstrap Methods and Their Application. Cambridge: Cambridge University Press.
Dong, Nianbo, and Rebecca Maynard. 2013.
“PowerUp!: A Tool for Calculating Minimum Detectable Effect Sizes and Minimum Required Sample Sizes for Experimental and Quasi-Experimental Design Studies.” Journal of Research on Educational Effectiveness 6 (1): 24–67.
https://doi.org/10.1080/19345747.2012.673143.
Efron, Bradley. 2000.
“The Bootstrap and Modern Statistics.” Journal of the American Statistical Association 95 (452): 1293–96.
https://doi.org/10.2307/2669773.
Faul, Franz, Edgar Erdfelder, Axel Buchner, and Albert-Georg Lang. 2009.
“Statistical Power Analyses Using G*Power 3.1: Tests for Correlation and Regression Analyses.” Behavior Research Methods 41 (4): 1149–60.
https://doi.org/10.3758/BRM.41.4.1149.
Fryda, Tomas, Erin LeDell, Navdeep Gill, Spencer Aiello, Anqi Fu, Arno Candel, Cliff Click, et al. 2014.
“H2o: R Interface for the ’H2O’ Scalable Machine Learning Platform.” Comprehensive R Archive Network.
https://doi.org/10.32614/CRAN.package.h2o.
Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013.
Bayesian Data Analysis. 3rd ed.
Chapman and Hall/CRC.
https://doi.org/10.1201/b16018.
Good, Phillip. 2013. Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses. Springer Science & Business Media.
Hunter, Kristen B., Luke Miratrix, and Kristin Porter. 2024.
“PUMP: Estimating Power, Minimum Detectable Effect Size, and Sample Size When Adjusting for Multiple Outcomes in Multi-Level Experiments.” Journal of Statistical Software 108 (6): 1–43.
https://doi.org/10.18637/jss.v108.i06.
James, G. S. 1951.
“The Comparison of Several Groups of Observations When the Ratios of the Population Variances Are Unknown.” Biometrika 38 (3/4): 324.
https://doi.org/10.2307/2332578.
Jones, Owen, Robert Maillardet, and Andrew Robinson. 2012.
Introduction to Scientific Programming and Simulation Using R. New York:
Chapman and Hall/CRC.
https://doi.org/10.1201/9781420068740.
Kern, Holger L., Elizabeth A. Stuart, Jennifer Hill, and Donald P. Green. 2014.
“Assessing Methods for Generalizing Experimental Impact Estimates to Target Populations.” Journal of Research on Educational Effectiveness 9 (1): 103–27.
https://doi.org/10.1080/19345747.2015.1060282.
Lehmann, Erich L. 1975. Nonparametrics: Statistical Methods Based on Ranks. San Francisco, CA: Holden-Day.
Long, J. Scott, and Laurie H. Ervin. 2000.
“Using Heteroscedasticity Consistent Standard Errors in the Linear Regression Model.” The American Statistician 54 (3): 217–24.
https://doi.org/10.1080/00031305.2000.10474549.
Mehrotra, Devan V. 1997.
“Improving the Brown-Forsythe Solution to the Generalized Behrens-Fisher Problem.” Communications in Statistics - Simulation and Computation 26 (3): 1139–45.
https://doi.org/10.1080/03610919708813431.
Miratrix, Luke W, Michael J Weiss, and Brit Henderson. 2021. “An Applied Researcher’s Guide to Estimating Effects from Multisite Individually Randomized Trials: Estimands, Estimators, and Estimates.” Journal of Research on Educational Effectiveness 14 (1): 270–308.
Robert, Christian, and George Casella. 2010.
Introducing Monte Carlo Methods with R. New York, NY: Springer.
https://doi.org/10.1007/978-1-4419-1576-4.
Staiger, Douglas O, and Jonah E Rockoff. 2010. “Searching for Effective Teachers with Imperfect Information.” Journal of Economic Perspectives 24 (3): 97–118.
Sundberg, Rolf. 2003. “Conditional Statistical Inference and Quantification of Relevance.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 65 (1): 299–315.
Tipton, Elizabeth. 2013.
“Stratified Sampling Using Cluster Analysis: A Sample Selection Strategy for Improved Generalizations from Experiments.” Evaluation Review 37 (2): 109–39.
https://doi.org/10.1177/0193841X13516324.
Welch, B. L. 1951.
“On the Comparison of Several Mean Values: An Alternative Approach.” Biometrika 38 (3/4): 330.
https://doi.org/10.2307/2332579.
Westfall, Peter H, and Kevin SS Henning. 2013. Understanding Advanced Statistical Methods. Vol. 543. Boca Raton, FL: CRC Press.
White, Halbert. 1980. “A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity.” Econometrica 48 (4): 817–38.
Wickham, Hadley. 2014.
“Tidy Data.” Journal of Statistical Software 59 (10): 1–23.
https://doi.org/10.18637/jss.v059.i10.