B Further readings and resources

We close with a list of resources that we came across while writing this text.

  • Morris, White, & Crowther (2019). Using simulation studies to evaluate statistical methods.
      ◦ High-level simulation design considerations.
      ◦ Details about performance criteria calculations.
      ◦ Stata-centric.

  • SimDesign R package (Chalmers, 2019)
      ◦ Tools for building generic simulation workflows.
      ◦ Chalmers & Adkins (2020). Writing effective and reliable Monte Carlo simulations with the SimDesign package.

  • DeclareDesign (Blair, Cooper, Coppock, & Humphreys)
      ◦ Specialized suite of R packages for simulating research designs.
      ◦ Design philosophy is very similar to the “tidy” simulation approach.

  • SimHelpers R package (Joshi & Pustejovsky, 2020)
      ◦ Helper functions for calculating performance criteria.
      ◦ Includes Monte Carlo standard errors.
