John A List, Azeem M Shaikh, Yang Xu
Cited by*: 33 Downloads*: 278

Empiricism in the sciences allows us to test theories, formulate optimal policies, and learn how the world works. It is therefore critical that our empirical work provides accurate conclusions about underlying data patterns. False positives represent an especially important problem, as vast public and private resources can be misdirected if we base decisions on false discoveries. This study explores one especially pernicious influence on false positives: multiple hypothesis testing (MHT). While MHT potentially affects all types of empirical work, we consider three common scenarios where MHT influences inference within experimental economics: jointly identifying treatment effects for a set of outcomes, estimating heterogeneous treatment effects through subgroup analysis, and conducting hypothesis testing for multiple treatment conditions. Building upon the work of Romano and Wolf (2010), we present a correction procedure that incorporates the three scenarios, and illustrate the improvement in power by comparing our results with those obtained by the classic procedures due to Bonferroni (1935) and Holm (1979). Importantly, under weak assumptions, our testing procedure asymptotically controls the familywise error rate (the probability of at least one false rejection) and is asymptotically balanced. We showcase our approach by revisiting the data reported in Karlan and List (2007), to deepen our understanding of why people give to charitable causes.
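
As a point of reference for the two classic benchmarks named in the abstract, the following is a minimal Python sketch of the Bonferroni and Holm corrections. It is not the procedure proposed in the paper, which builds on Romano and Wolf (2010); it only illustrates the baselines the abstract compares against. The function names and the p-values are hypothetical and chosen for illustration; they are not taken from Karlan and List (2007).

```python
from typing import List


def bonferroni_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Reject hypothesis i when p_i <= alpha / m, with m the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Holm step-down: compare the k-th smallest p-value (k = 0, 1, ...)
    with alpha / (m - k) and stop at the first comparison that fails."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are retained as well
    return reject


# Hypothetical p-values for four hypotheses (illustration only).
p_vals = [0.01, 0.04, 0.02, 0.005]
bonf = bonferroni_reject(p_vals)
holm = holm_reject(p_vals)
print("Bonferroni:", bonf)
print("Holm:      ", holm)
```

Both corrections control the familywise error rate at level alpha, but Holm rejects every hypothesis that Bonferroni rejects and possibly more, because its threshold loosens after each rejection. Neither uses the dependence between test statistics, which is the source of the additional power that resampling-based step-down procedures can offer.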
Uri Gneezy, John A List, Jeffrey A Livingston, Xiangdong Qin, Sally Sadoff, Yang Xu
Cited by*: 1 Downloads*: 124

Tests measuring and comparing educational achievement are an important policy tool. We experimentally show that offering students extrinsic incentives to put forth effort on such achievement tests has differential effects across cultures. Offering incentives to U.S. students, who generally perform poorly on assessments, improved performance substantially. In contrast, Shanghai students, who are top performers on assessments, were not affected by incentives. Our findings suggest that in the absence of extrinsic incentives, ranking countries based on low-stakes assessments is problematic because test scores reflect differences in intrinsic motivation to perform well on the test itself, and not just differences in ability.