Here are some of the resources that have been used in creating this website, plus others that are worth reading or consulting. Also see references in the footnotes of individual pages on this site.

Agresti, Alan (2010) Analysis of Ordinal Categorical Data, Wiley

American Statistical Association, Ethical Guidelines for Statistical Practice
Some of the items mentioned in this website (e.g. cautions regarding multiple inference) are considered matters of ethical practice.

Biemer P and L Lyberg (2003), Introduction to Survey Quality, Wiley.
An introduction to sources of errors in surveys.

Berk, Richard (2004). Regression Analysis: A Constructive Critique, Sage. Preview available on Google Books; you can read the preface (by Jan De Leeuw) online at
The title aptly describes the spirit of the book.

Bethlehem, Jelke (2009). Applied Survey Methods: A Statistical Perspective, Wiley.
Includes an overview of the survey process; questionnaire design;  sampling designs; sources of error; nonresponse; online surveys; guidelines on use of graphs. Unfortunately, the coverage of confidence intervals is weak, and there is no discussion of multiple inference.

F. Bretz, T. Hothorn, P. Westfall (2010). Multiple Comparisons Using R, CRC Press
A concise yet quite comprehensive account of multiple testing, covering a variety of methodologies, with a unifying theme of maximum statistics. Includes descriptions of software implementations available in the R package.

K. P. Burnham and D. R. Anderson (2002), Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd ed., Springer
A thorough discussion of Akaike's Information Criterion and related methods, plus methods of taking model-selection uncertainty into account when estimating parameters. Definitely recommended, especially if you are working with observational data.

Chance News,
Quoting from the home page: "Chance News reviews current issues in the news that use probability or statistical concepts. It uses Wikipedia software to allow readers to add articles or change existing articles using the edit option." Most entries include discussion questions for use in class.

Cook, R. Dennis and Sanford Weisberg (1999). Applied Regression Including Computing and Graphics, Wiley.
Stronger on model checking, diagnostics, and cautions about common misapplications than most regression textbooks. Also serves as a user's manual for the regression software arc, which has user-friendly features for transforming toward multivariate normality and for various regression diagnostic techniques.  (Unfortunately, the Unix and Macintosh versions of arc are no longer well supported.) I used it for a number of years as a textbook. You can find my lecture notes at (However, I think that I might do some things differently were I to teach the course again -- e.g., place more emphasis on cautions regarding multiple inference and importance of model validation.)

Cressie, Noel and Christopher K. Wikle (2011), Statistics for Spatio-Temporal Data, Wiley.
Probably the premier reference for analyzing spatial and temporal data. Chapter 1 is an easy-to-read discussion of the importance of the subject and the problems it may involve; Chapter 2 requires considerable background in mathematics and mathematical statistics.

Dean, Angela and Daniel Voss (1999). Design and Analysis of Experiments, Springer.
Stronger than most Analysis of Variance textbooks in linking design and analysis, and in emphasizing the problem of multiple inference. I have used it for several years as a textbook. You can find my lecture notes at

Doshi P., M Jones and T. Jefferson (2012). Rethinking credible evidence synthesis, British Medical Journal 344, Article Number: d7898 DOI: 10.1136/bmj.d7898.
Points out how published reports of clinical trials may omit important information that is in the clinical trial reports.

S. Dudoit and M. J. van der Laan (2008), Multiple Testing Procedures with Application to Genomics, Springer

Edgington, Eugene S. (1995), Randomization Tests, Marcel Dekker

B. Efron (2010), Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction, Cambridge.
Probably the most up-to-date reference on methods for dealing with multiple inference, especially for large data sets. Also includes a good summary of older developments.

Freedman, David A. (2005). Statistical Models: Theory and Practice. Cambridge University Press
A text for a second course in statistics, focusing mainly on applications to the social and health sciences and on regression and related topics; full of cautions.

Freedman, David A. (2010), ed. by David Collier, Jasjeet S. Sekhon, and Philip B. Stark, Statistical Models and Causal Inference: A Dialogue with the Social Sciences, Cambridge
Freedman passed away in 2008, but several of his writings were collected posthumously in this book. Definitely worth reading.
Also, Philip Stark maintains a website at, where many of Freedman's preprints and other notes may be downloaded. "On types of Scientific Enquiry" and "Oasis or Mirage?" are particularly recommended.

Gigerenzer, Gerd et al (2007). "Helping doctors and patients make sense of health statistics," Psychological Science in the Public Interest, vol. 8, No. 2, pp. 53 - 96. Download from
Discusses a number of confusions that affect medical care. Also discusses ways to explain the topics that can help improve understanding.  A somewhat shortened variation has appeared as "Knowing your chances: What health stats really mean," Scientific American Mind, April/May/June 2009, pp. 44 - 51.

Good, Phillip I. and James W. Hardin (2006), Common Errors in Statistics (and How to Avoid Them), Wiley (Third edition 2009)
Recommended reading. A notable quote (p. ix): "...access to statistical software will no more make one a statistician, than access to a  chainsaw will make one a lumberjack. Allowing these tools to do our thinking for us is a sure recipe for disaster -- just ask any emergency room physician." Includes discussion of formulating hypotheses,  experimental design, choice of estimator and test statistic, model assumptions, strengths and limitations of various statistical procedures, reporting results, interpreting results, graphics, model selection, validation. Extensive references for further reading. One weakness is lack of discussion of assumptions of statistical procedures.

Good, P. (2005) Introduction to Statistics Through Resampling Methods and Microsoft Office Excel. Wiley

Harris, A. H. S., R. Reeder and J. K. Hyun (2009), Common statistical and research design problems in manuscripts submitted to high-impact psychiatry journals: What editors and reviewers want authors to know, Journal of Psychiatric Research, vol. 43, no. 15, pp. 1231 - 1234
Discussion of common serious statistical and design problems in manuscripts submitted to major psychiatry journals, based on a survey of editors and reviewers of those journals. Intended to help researchers and authors improve the quality of research and manuscripts submitted to journals, and to forestall the waste of time and resources that occurs when papers are rejected because of poor quality and then resubmitted to other journals in the hope that another journal will accept them.

Hochberg, Y. and Tamhane, A. (1987) Multiple Comparison Procedures, Wiley

Ioannidis JPA (2005) Why Most Published Research Findings Are False. PLoS Med 2(8): e124. doi:10.1371/journal.pmed.0020124, available at
See also commentary: Steven Goodman and Sander Greenland. "Why Most Published Research Findings Are False: Problems in the Analysis". PLoS Medicine 4 (4): e168. doi:10.1371/journal.pmed.0040168; Pauker SG (2005) The Clinical Interpretation of Research. PLoS Med 2(11): e395. doi:10.1371/journal.pmed.0020395; Wren JD (2005) Truth, Probability, and Frameworks. PLoS Med 2(11): e361. doi:10.1371/journal.pmed.0020361; The PLoS Medicine Editors (2005) Minimizing Mistakes and Embracing Uncertainty. PLoS Med 2(8): e272. doi:10.1371/journal.pmed.0020272
Popular press accounts include: David H. Freedman, Lies, Damned Lies, and Medical Science, The Atlantic, November 2010,
Jonah Lehrer, The Truth Wears Off: Is there something wrong with the scientific method?, The New Yorker, December 13, 2010,

Ioannidis, John P. A., An Epidemic of False Claims, Scientific American, June, 2011,

Koenker, Roger (2005). Quantile Regression, Econometric Society Monographs, Cambridge.
Extensive exposition of the subject. See also for more elementary introductions, software possibilities, and errata.

Liu, Wei (2011) Simultaneous Inference in Regression, CRC Press.
Liu also has Matlab® programs for calculating the confidence bands available from his website. (Click on the link to the book.)

Marshall, E. (2011). Unseen world of clinical trials emerges from US database, Science 333:145.
An interview with the director of, pointing out some potential problems with the design and analysis of clinical trials.

Meng, Xiao-Li (2009) "Desired and Feared -- What Do We Do Now and Over the Next 50 Years?", The American Statistician, vol. 63 No. 3, pp. 202 - 210. Download pdf from Andrew Gelman's website.
A discussion, by the chair of the Harvard Statistics Department, of some of the challenges and opportunities facing the profession. Sections 6 - 8 (pp. 205 - 208) are particularly relevant to the topic of this web site.

Moore, David S., together with various co-authors, has written various introductory statistics texts  (e.g., The Basic Practice of Statistics, and Introduction to the Practice of Statistics, with George P. McCabe), published by Freeman,  that are among the best for pointing out many of the common errors in using statistics.

Moore, Thomas (2010), Using baboon “mothering” behavior to teach permutation tests, Cause Webinar,
Video and PowerPoint slides. A gentle introduction to permutation tests.

Rice Virtual Lab in Statistics, Simulations/Demonstrations,
Several simulations that can help illustrate various concepts and potential pitfalls in using statistics.

Robbins, N. (2004), Creating More Effective Graphs, Wiley
Many examples of poor graphs and better alternatives.

Ryan, Thomas P. (2009), Modern Regression Methods, Wiley.
A good resource for those teaching, using or interpreting regression. Points out many common misunderstandings, misapplications, and misinterpretations. An extensive chapter on diagnostics and remedial measures. Discussion of many controversies. Extensive references are included.

Seber, George A. F. and Mohammad M. Salehi (2013), Adaptive Sampling Designs: Inference for Sparse and Clustered Populations, Springer

S. Senn and S. Julious (2009), Measurements in clinical trials: A neglected issue for statisticians? Statistics in Medicine 28: 3189-3209
Discussion of some statistical issues involved in choosing predictor and outcome variables.

Strasak, A. M. et al (2007a). Statistical errors in medical research - a review of common pitfalls, Swiss Medical Weekly 2007; 137, 44 - 49, available at
Discussion (as well as presentation in table form) of 47 common statistical pitfalls in medical research. Although aimed at medical researchers, the article can serve as a guideline for researchers in many other fields. See also the companion article by Young.

Strasak, A. M. et al (2007b), The Use of Statistics in Medical Research, The American Statistician. February 1, 2007, 61(1): 47-55
A survey of articles in the 2004 volumes of  The New England Journal of Medicine and Nature Medicine examining use of statistics and errors in using or reporting statistical techniques.

Utts, Jessica (2005) Seeing Through Statistics, Brooks/Cole (Thomson)
An introduction to statistics aimed at the consumer rather than the producer. Each chapter starts with several "thought questions." Includes sections on reading a news report of a study and on wording of questions; several "Cautions", "Warnings" and "Difficulties and Disasters" sections; and lots of case studies.

Van Belle, Gerald (2008). Statistical Rules of Thumb, 2nd ed., Wiley.
Lots of suggestions that can help forestall mistakes, but not all-inclusive. Includes a section on Evidence Based Medicine.

Wainer, Howard (1997) Visual Revelations, Copernicus (Springer-Verlag).
Chapter 1 ("How to Display Data Badly," pp. 11 - 46) gives many examples of poor graphical displays, as well as better alternatives. Chapters 8 - 10 (pp. 87 - 102) point out shortcomings of three commonly used graphical formats (pie charts, double Y-axis graphs, and tabular presentations), with suggestions on improvements, alternatives, and/or when to use and when not to use these formats. The whole book has lots of interesting examples.

Wainer, Howard  (2009) Picturing the Uncertain World, Princeton University Press
Full of case studies focusing on how the way data are presented can influence what we see or don't see.

P. H. Westfall and S. S. Young (1993), Resampling-based Multiple Testing: Examples and Methods for p-Value Adjustment, Wiley
Uses an "adjusted p-value" approach to multiple testing, based on resampling methods.

Woloshin, Steven, Schwartz, Lisa, and Welch, H. Gilbert (2008). Know Your Chances, University of California Press.
A primer on health risks at a very basic level of quantitative literacy.

Young, James (2007), Statistical errors in medical research - a chronic disease? Swiss Medical  Weekly 2007; 137, 41 - 43, available at
A commentary and elaboration on Strasak et al (2007a) above. (Young is Statistical Advisor for the Swiss Medical Weekly.)

Last updated June 4, 2013