Maria De Paola, Francesca Gioia, Vincenzo Scoppa
Cited by*: None Downloads*: None

We ran a field experiment to investigate whether competing in rank-order tournaments with different prize spreads affects individual performance. Our experiment involved students from an Italian university who took an exam that was partly evaluated on the basis of relative performance. Students were matched in pairs on the basis of their high school grades and each pair was randomly assigned to one of three different tournaments. Random assignment neutralizes selection effects and allows us to investigate whether larger prize spreads increase individual effort. We do not find any positive effect of larger prizes on performance. Furthermore, we show that the effect of prize spreads on students' performance depends on their degree of risk-aversion: competing in tournaments with large spreads negatively affects the performance of risk-averse students, while it does not produce any effect on students who are more prone to taking risks.
Bharat Chandar, Ali Hortacsu, John A List, Ian Muir, Jeffrey M Wooldridge
Cited by*: None Downloads*: None

Field experiments conducted with the village, city, state, region, or even country as the unit of randomization are becoming commonplace in the social sciences. While convenient, subsequent data analysis may be complicated by the constraint on the number of clusters in treatment and control. Through a battery of Monte Carlo simulations, we examine best practices for estimating unit-level treatment effects in cluster-randomized field experiments, particularly in settings that generate short panel data. In most settings we consider, unit-level estimation with unit fixed effects and cluster-level estimation weighted by the number of units per cluster tend to be robust to potentially problematic features in the data while giving greater statistical power. Using insights from our analysis, we evaluate the effect of a unique field experiment: a nationwide tipping field experiment across markets on the Uber app. Beyond the import of showing how tipping affects aggregate outcomes, we provide several insights on aspects of generating and analyzing cluster-randomized experimental data when there are constraints on the number of experimental units in treatment and control.
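As a rough illustration of the cluster-level estimator mentioned above, the sketch below collapses unit outcomes to cluster means and then compares treatment and control using the number of units per cluster as weights. It is a minimal sketch, not the authors' code: the function name, the tuple layout, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict

def cluster_weighted_effect(units):
    """Difference in size-weighted cluster means between treated and control.

    `units` is an iterable of (cluster_id, treated, outcome) tuples, where
    `treated` is constant within a cluster (hypothetical layout).
    """
    totals = defaultdict(lambda: [0.0, 0])  # cluster -> [sum of outcomes, unit count]
    arm = {}                                # cluster -> treatment flag
    for cluster_id, treated, outcome in units:
        totals[cluster_id][0] += outcome
        totals[cluster_id][1] += 1
        arm[cluster_id] = treated

    def weighted_mean(flag):
        # Each cluster mean (s / n) is weighted by its number of units n.
        num = sum(n * (s / n) for c, (s, n) in totals.items() if arm[c] == flag)
        den = sum(n for c, (s, n) in totals.items() if arm[c] == flag)
        return num / den

    return weighted_mean(True) - weighted_mean(False)

# Toy data: two treated clusters (a, b) and two control clusters (c, d).
data = [("a", True, 5.0), ("a", True, 7.0), ("b", True, 9.0),
        ("c", False, 4.0), ("c", False, 6.0), ("d", False, 2.0)]
effect = cluster_weighted_effect(data)
```

With these weights the point estimate coincides with the plain unit-level difference in means; the sketch does not address standard errors, which are the harder problem when clusters are few.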
Michal Krawczyk, Ernesto Reuben
Cited by*: None Downloads*: None

This article reports results of a field experiment in which two hundred e-mails were sent to authors of recent articles in economics that had promised to send the interested reader supplementary material, such as alternative econometric specifications, "upon request." The e-mails were sent by a researcher affiliated with either Columbia University, New York, or the University of Warsaw, Poland; furthermore, the sender's position (assistant professor) was specified in only half of the e-mails. Overall, 64% of the approached authors responded to our message, of whom two thirds (44% of the entire sample) delivered the requested materials. The frequency and speed of responding and delivering were very weakly affected by the position and affiliation of the sender. The gender or affiliation of the author, the number of citations, the journal impact factor, and the type of object in question seemed to make no difference. However, authors of published articles were much more likely to share than authors of working papers.
John A List, Ian Muir, Gregory Sun
Cited by*: None Downloads*: None

This study investigates how to use regression adjustment to reduce variance in experimental data. We show that the estimators recommended in the literature satisfy an orthogonality property with respect to the parameters of the adjustment. This observation greatly simplifies the derivation of the asymptotic variance of these estimators and allows us to solve for the efficient regression adjustment in a large class of adjustments. Our efficiency results generalize a number of previous results known in the literature. We then discuss how this efficient regression adjustment can be feasibly implemented. We show the practical relevance of our theory in two ways. First, we use our efficiency results to improve common practices currently employed in field experiments. Second, we show how our theory allows researchers to robustly incorporate machine learning techniques into their experimental estimators to minimize variance.
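To make the idea of regression adjustment concrete, the following sketch compares a simple difference in means with a textbook covariate-adjusted estimate (treatment fully interacted with a centered covariate). This is a generic adjustment chosen for illustration, not the efficient adjustment derived in the paper; the function names and the simulated data-generating process are assumptions.

```python
import numpy as np

def diff_in_means(y, t):
    """Unadjusted estimate: mean outcome of treated minus mean of control."""
    return y[t == 1].mean() - y[t == 0].mean()

def regression_adjusted_effect(y, t, x):
    """OLS of y on treatment, a centered covariate, and their interaction.

    The coefficient on the treatment indicator is the adjusted estimate
    of the average treatment effect.
    """
    xc = x - x.mean()
    design = np.column_stack([np.ones_like(y), t, xc, t * xc])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

# Simulated experiment (assumed setup): true effect 2.0, covariate x predicts y.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
t = rng.integers(0, 2, size=n).astype(float)
y = 2.0 * t + 3.0 * x + rng.normal(scale=0.5, size=n)

unadjusted = diff_in_means(y, t)
adjusted = regression_adjusted_effect(y, t, x)
```

Because the covariate explains much of the outcome variance in this simulated setup, the adjusted estimate is typically much closer to the true effect than the unadjusted one across repeated draws.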
John A List
Cited by*: None Downloads*: None

In 2019, I put together a summary of data from my field experiments website that pertained to natural field experiments. Several people have asked me if I have an update. In this document I update all figures and numbers to show the details for 2022. I also include the description from the 2019 paper below.
Maria De Paola, Francesca Gioia, Vincenzo Scoppa
Cited by*: None Downloads*: None

We investigate whether and how social ties affect performance in teams by implementing a field experiment in which a sample of undergraduate students is randomly assigned either to teams composed of friends or to teams composed of individuals not linked by friendship. Students take an intermediate exam divided into two parts: one graded on the basis of individual performance and the other graded on the basis of team performance. We find that students assigned to socially connected teams perform significantly better than control students in both the team part and the individual part of the exam, suggesting that social ties are relevant both for solving free-riding problems and for inducing knowledge spillovers among teammates. The positive effect of friendship persists over time: treated students also obtain better grades after the conclusion of the experiment.
Omar Al-Ubaydli, John A List, Dana L Suskind
Cited by*: None Downloads*: None

Policymakers are increasingly turning to insights gained from the experimental method as a means of informing public policies. Whether, and to what extent, insights from a research study scale to the level of the broader public is, in many situations, based on blind faith. This scale-up problem can lead to a vast waste of resources, a missed opportunity to improve people's lives, and a diminution in the public's trust in the scientific method's ability to contribute to policymaking. This study provides a theoretical lens to deepen our understanding of the science of how to use science. Through a simple model, we highlight three elements of the scale-up problem: (1) when does evidence become actionable (appropriate statistical inference); (2) properties of the population; and (3) properties of the situation. We argue that until these three areas are fully understood and recognized by researchers and policymakers, the threats to scalability will render any scaling exercise particularly vulnerable. In this way, our work represents a challenge to empiricists to estimate the nature and extent of how important the various threats to scalability are in practice, and to implement those in their original research.
Alec Brandon, John A List, Robert D Metcalfe, Michael K Price, Florian Rundhammer
Cited by*: None Downloads*: None

This study considers the response of household electricity consumption to social nudges during peak load events. Our investigation considers two social nudges. The first targets conservation during peak load events, while the second promotes aggregate conservation. Using data from a natural field experiment with 42,100 households, we find that both social nudges reduce peak load electricity consumption by 2 to 4% when implemented in isolation and by nearly 7% when implemented in combination. These findings suggest an important role for social nudges in the regulation of electricity markets and a limited role for crowd-out effects.
John A List
Cited by*: None Downloads*: None

This review summarizes the historical place of the seminal contribution of Kahneman, Knetsch, and Thaler (1990; hereafter KKT), from origins to theory to catalyst of an entire area of scholarship. This new literature has produced evidence in concert with the original KKT conclusions as well as evidence refuting certain insights from KKT. The general theme of my summary is that even imperfect papers can have deep impact, both within and outside the academy; a lesson that today's critics should consider as young experimentalists continue to fight the tyranny of the top 5.
Syon Bhanot
Cited by*: None Downloads*: None

Social norms messaging campaigns are increasingly used to influence human behavior, with social science research generally finding that they have modest but meaningful effects. One aspect of these campaigns in practice has been the inclusion of injunctive norms messaging, designed to convey a social judgement about one's behaviors (often in the form of encouraging or discouraging language, or a visual smiley or frowny face). While some prominent research has provided support for the use of such messaging as a tool for positive behavior change, causal evidence on the effect of injunctive norms messaging as a motivator (as opposed to just one part of a multifaceted messaging campaign) is limited. This paper presents a field experiment on water conservation behavior conducted by an organization in California, involving over 40,000 households, which provides some of the most precise evidence to date regarding the effect of injunctive norms on decision making. I find that not only do injunctive norms encourage conservation behavior, there is also no evidence that they discourage individuals from further attending to norms messaging, regardless of whether the social judgement conveyed is negative or positive. Taken together, this suggests that injunctive norms are a useful tool in "nudge"-style campaigns tackling behavior change.
Greg Allenby, Russell Belk, Catherine Eckel, Robert Fisher, Ernan Haruvy, John A List, Yu Ma, Peter Popkowski Leszczyc, Yu Wang, Sherry Xin Li
Cited by*: None Downloads*: None

We offer a unified conceptual, behavioral, and econometric framework for optimal fundraising that deals with both synergies and discrepancies between approaches from economics, consumer behavior, and sociology. The purpose is to offer a framework that can bridge differences and open a dialogue between disciplines in order to facilitate optimal fundraising design. The literature is extensive, and our purpose is to offer a brief background and perspective on each of the approaches, provide an integrated framework leading to new insights, and discuss areas of future research.
Matilde Giaccherini, David H Herberich, David Jimenez-Gomez, John A List, Giovanni Ponti, Michael K Price
Cited by*: None Downloads*: None

This paper uses a field experiment to estimate the effects of prices and social norms on the decision to adopt an efficient technology. We find that prices and social norms influence the adoption decision along different margins: while prices operate on both the extensive and intensive margins, social norms operate mostly through the extensive margin. This has both positive and normative implications, and suggests that economics and psychology may be strong complements in the diffusion process. To complement the reduced-form results, we estimate a structural model that points to important household heterogeneity: whereas some consumers welcome the opportunity to purchase and learn about the new technology, for others the inconvenience and social pressure of the ask result in negative welfare. As a whole, our findings highlight that the design of optimal technological diffusion policies will require multiple instruments and a recognition of household heterogeneity.
Ariel Goldszmidt, John A List, Robert D Metcalfe, Ian Muir, Jenny Wang
Cited by*: None Downloads*: None

The value of time determines relative prices of goods and services, investments, productivity, economic growth, and measures of income inequality. Economists in the 1960s began to focus on the value of non-work time, pioneering a deep literature exploring the optimal allocation and value of time. By leveraging key features of these classic time allocation theories, we use a novel approach to estimate the value of time (VOT) via two large-scale natural field experiments with the ridesharing company Lyft. We use random variation in both wait times and prices to estimate a consumer's VOT with a data set of more than 14 million observations across consumers in US cities. We find that the VOT is roughly $19 per hour (or 75% (100%) of the after-tax mean (median) wage rate) and varies predictably with choice circumstances correlated with the opportunity cost of wait time. Our VOT estimate is larger than what is currently used by the US Government, suggesting that society is under-valuing time improvements and subsequently under-investing public resources in time-saving infrastructure projects and technologies.
Daniel J Benjamin, James O Berger, Magnus Johannesson, Brian A Nosek, E. J Wagenmakers, Richard Berk, Kenneth A Bollen, Bjorn Brembs, Lawrence Brown, Colin F Camerer, David Cesarini, Christopher D. Chambers, Merlise Clyde, Thomas D Cook, Paul De Boeck, Zoltan Dienes, Anna Dreber, Kenny Easwaran, Charles Efferson, Ernst Fehr, Fiona Fidler, Andy P. Field, Malcom Forster, Edward I. George, Tarun Ramadorai, Richard Gonzalez, Steven Goodman, Edwin Green, Donald P Green, Anthony Greenwald, Jarrod D. Hadfield, Larry V. Hedges, Leonhard Held, Teck Hau Ho, Herbert Hoijtink, James Holland Jones, Daniel J Hruschka, Kosuke Imai, Guido Imbens, John P.A. Ioannidis, Minjeong Jeon, Michael Kirchler, David Laibson, John A List, Roderick Little, Arthur Lupia, Edouard Machery, Scott E. Maxwell, Michael McCarthy, Don Moore, Stephen L. Morgan, Marcus Munafo, Shinichi Nakagawa, Brendan Nyhan, Timothy H Parker, Luis Pericchi, Marco Perugini, Jeff Rouder, Judith Rousseau, Victoria Savalei, Felix D. Schonbrodt, Thomas Sellke, Betsy Sinclair, Dustin Tingley, Trisha Van Zandt, Simine Vazire, Duncan J. Watts, Christopher Winship, Robert L. Wolpert, Yu Xie, Cristobal Young, Jonathan Zinman, Valen E. Johnson
Cited by*: 1 Downloads*: 965

We propose to change the default P-value threshold for statistical significance for claims of new discoveries from 0.05 to 0.005.
Michael Fix, Raymond J Struyk
Cited by*: 32 Downloads*: 955

Auditing is a technique used to test for discrimination. The concept is straightforward: Two individuals are matched on all relevant characteristics except the one presumed to lead to discrimination. Each person then applies for the same job, housing, mortgage loan, or credit card. The differential treatment they receive provides a measure of discrimination. The authors argue that the value of auditing has grown in the current legal and political environment because it can detect subtle forms of discrimination.
John A List, Azeem M Shaikh, Yang Xu
Cited by*: 33 Downloads*: 278

Empiricism in the sciences allows us to test theories, formulate optimal policies, and learn how the world works. In this manner, it is critical that our empirical work provides accurate conclusions about underlying data patterns. False positives represent an especially important problem, as vast public and private resources can be misguided if we base decisions on false discovery. This study explores one especially pernicious influence on false positives: multiple hypothesis testing (MHT). While MHT potentially affects all types of empirical work, we consider three common scenarios where MHT influences inference within experimental economics: jointly identifying treatment effects for a set of outcomes, estimating heterogeneous treatment effects through subgroup analysis, and conducting hypothesis testing for multiple treatment conditions. Building upon the work of Romano and Wolf (2010), we present a correction procedure that incorporates the three scenarios, and illustrate the improvement in power by comparing our results with those obtained by the classic studies due to Bonferroni (1935) and Holm (1979). Importantly, under weak assumptions, our testing procedure asymptotically controls the familywise error rate (the probability of at least one false rejection) and is asymptotically balanced. We showcase our approach by revisiting the data reported in Karlan and List (2007), to deepen our understanding of why people give to charitable causes.
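For readers unfamiliar with the classic baselines named above, the sketch below implements the Bonferroni and Holm corrections, both of which control the familywise error rate. It covers only these two textbook procedures, not the bootstrap-based procedure the paper proposes; the function names and example p-values are assumptions for illustration.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject hypothesis i when p_i <= alpha / m, with m the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def holm_reject(p_values, alpha=0.05):
    """Holm step-down: compare the k-th smallest p-value (k = 0, 1, ...)
    with alpha / (m - k) and stop at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # every larger p-value is also not rejected
    return reject

# Hypothetical p-values from four tests.
p = [0.010, 0.040, 0.020, 0.005]
bonf = bonferroni_reject(p)  # [True, False, False, True]
holm = holm_reject(p)        # [True, True, True, True]
```

Holm rejects everything Bonferroni rejects and sometimes more, which is why it is the natural benchmark when discussing gains in power.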
John A List, Anya Samek, Dana L Suskind
Cited by*: 0 Downloads*: 258

Behavioral economics and field experiments within the social sciences have advanced well beyond academic curiosum. Governments around the globe as well as the most powerful firms in modern economies employ staffs of behavioralists and experimentalists to advance and test best practices. In this study, we combine behavioral economics with field experiments to reimagine a new model of early childhood education. Our approach has three distinct features. First, by focusing public policy dollars on prevention rather than remediation, we call for much earlier educational programs than currently conceived. Second, our approach has parents at the center of the education production function rather than at its periphery. Third, we advocate attacking the macro education problem using a public health methodology, rather than focusing on piecemeal advances.
Ufuk Akcigit, Fernando Alvarez, Stephane Bonhomme, George M Constantinides, Douglas W Diamond, Eugene F Fama, David W Galenson, Michael Greenstone, Lars Peter Hansen, Harald Uhlig, James J Heckman, Ali Hortacsu, Emir Kamenica, Greg Kaplan, Anil K Kashyap, Steven D Levitt, John A List, Robert E Lucas Jr., Magne Mogstad, Roger Myerson, Derek Neal, Canice Prendergast, Raghuram G Rajan, Philip J Reny, Azeem M Shaikh, Robert Shimer, Hugo F Sonnenschein, Nancy L Stokey, Richard H Thaler, Robert H Topel, Robert Vishny, Luigi Zingales
Cited by*: 0 Downloads*: 207

No abstract available
John A List, Robert D Metcalfe, Michael H Taylor, Ivo Vlaev
Cited by*: 44 Downloads*: 193

Tax collection problems date back to the earliest recorded history of mankind. This paper begins with a simple theoretical construct of paying (rather than declaring) taxes, which we argue has been an overlooked aspect of tax compliance. This construct is then tested in two large natural field experiments. Using administrative data from more than 200,000 individuals in the UK, we show that including social norms and public goods messages in standard tax payment reminder letters considerably enhances tax compliance. The field experiments increased taxes collected by the Government in the sample period and were cost-free to implement, demonstrating the potential importance of such interventions in increasing tax compliance.
Jie Bai
Cited by*: 0 Downloads*: 157

There is often a lack of reliable high-quality provision in many markets in developing countries. I designed an experiment to understand this phenomenon in a setting that features typical market conditions in a developing country: the retail watermelon market in a major Chinese city. I begin by demonstrating empirically that there is substantial asymmetric information between sellers and buyers on sweetness, the key indicator of quality for watermelons, yet sellers do not sort and price watermelons by quality. I then randomly introduce one of two branding technologies into 40 out of 60 markets: one sticker label that is widely used and often counterfeited, and one novel laser-cut label. I track sellers' quality, pricing and sales over an entire season and collect household panel purchasing data to examine the demand side's response. I find that laser branding induced sellers to provide higher quality and led to higher sales profits, establishing that reputational incentives are present and can be made to pay. However, after the intervention was withdrawn, all markets reverted to baseline. To rationalize the experimental findings, I build an empirical model of consumer learning and seller reputation. The structural estimates suggest that consumers are hesitant to upgrade their perception of quality under the existing branding technology, which makes reputation building a low-return investment. While the new technology enhances consumer learning, the resulting increase in profits is not sufficient to cover the fixed cost of the technology for small individual sellers. Counterfactual analysis shows that information frictions and fragmented markets lead to significant under-provision of quality. Third-party interventions that subsidize initial reputation building for sellers could improve welfare.