John A. List

A discussion on how ChatGPT can be used to help design experiments that can be scaled.
John A. List

In 2019, I put together a summary of data from my field experiments website that pertained to natural field experiments (Harrison and List, 2004). Several people have asked me for updates. In this document I update all figures and numbers to show the details for 2024. I also include the description from the original paper below.
Anouk L. Schippers, Adriaan R Soetevent

Informal peer-to-peer services to share or barter goods often succumb to free riding because they lack the tools to enforce compliance and reciprocity. We collect unique quantitative data on a form of unregulated peer-to-peer in-kind exchange that appears internationally viable: the free exchange of books via privately owned public bookcases, also known as little free libraries. Unlike previously studied honor-based exchanges, little free libraries use a non-monetary one-to-one book exchange rate. We find surprisingly limited free riding in this market. Users return 9 books for every 10 taken. An incentivized survey points to strong social norms and preferences for cooperation among owners and users as key behavioral primitives that can explain the observed high and stable level of reciprocal exchange.
Toke R. Fosgaard, Adriaan R Soetevent

Given the replacement of cash with cell phone payments, people who are asked to donate to charity can easily promise a donation but delay the transfer until a later date. This may be a way to get out of the ask-situation with a positive image while maintaining the flexibility not to donate. This study explores whether charities can make people keep their promises by making such promises more explicit and more formal. In a door-to-door fund-raising field experiment, we vary the strength of the promise that donors make. Besides a control group where people can promise to donate, we apply two treatment groups. In the first treatment, donors are asked to verbally pledge a precise amount. In the second treatment, this amount is additionally put on paper with the solicitor's signature added. Both treatments are aimed at making it morally more expensive not to keep promises. Our results show that: (1) the majority of people do not follow through on their promise to donate; (2) donors who pledge an explicit amount keep their promise more often, and the more formal the commitment, the closer the amount donated is to the amount promised; (3) many participants refuse to pledge a donation amount when asked, and those who refuse donate significantly less.
Gert-Jan Romensen, Adriaan R Soetevent

An often-voiced concern with relative performance feedback is that it may not improve workplace productivity if workers become demotivated and see no way to improve. Targeting feedback at specific productivity measures over which workers have direct control may in such cases prevent demotivation and focus attention. Does targeting improve worker productivity? We partner with a large bus company and experimentally vary the nature and number of peer-comparison messages which 409 bus drivers receive in their monthly feedback report. Messages are targeted at concrete driving behaviors and aimed at improving comfort and fuel efficiency. Using over 800,000 trip-level observations, we find that these targeted peer-comparison messages do not improve aggregate (fuel economy) or disaggregate measures (such as acceleration) of driving behavior. Further analyses also reveal no temporal or heterogeneous effects of the targeted messages.
Sander Onderstal, Arthur J.H.C. Schram, Adriaan R Soetevent

In a door-to-door fundraising field experiment, we study the impact of fundraising mechanisms on charitable giving. We approached about 4500 households, each participating in an all-pay auction, a lottery, a non-anonymous voluntary contribution mechanism (VCM), or an anonymous VCM. In contrast to the VCMs, households in the all-pay auction and the lottery competed for a prize. Although the all-pay auction is the superior fundraising mechanism both in theory and in the laboratory, it did not raise the highest revenue per household in the field and even raised significantly less than the anonymous VCM. Our experiment reveals that this can be attributed to substantially lower participation in the all-pay auction than in the other mechanisms while the average donation for those who contribute is only slightly (and statistically insignificantly) higher. We explore various explanations for this lower participation and favor one that argues that competition in the all-pay mechanism crowds out intrinsic motivations to contribute.
Adriaan R Soetevent

This paper examines the impact of payment choice on charitable giving with a door-to-door fund-raising field experiment. Respondents can donate cash only, use debit only, or have both options. Cash donations have lower visibility vis-a-vis solicitors than debit card donations. When debit replaces cash, participation drops by 87 percent. Conditional on participation, donors in the Debit-only treatment give more than donors in Cash-only. In Cash&Debit, almost all donors prefer cash; participation decreases compared to Cash-only. Physical attractiveness of both female and male solicitors increases contributions. Solicitor self-confidence has a negative impact.
Michael G. Cuna, Lenka Fiala, Min Sok Lee, John A. List, Sutanuka Roy

This study examines how mothers' risk and ambiguity preferences affect early childhood investments and outcomes by assessing over 6,000 mothers in Rajasthan, India. Results show that more risk and ambiguity averse mothers make greater investments in their children's nutrition between ages 0-6. These investments correlate with superior cognitive and non-cognitive skills in children, even after controlling for socioeconomic factors. Notably, higher maternal risk and ambiguity aversion can mitigate negative impacts of socioeconomic disadvantages (maternal illiteracy, belonging to historically discriminated groups, limited media access) on all measures of early-life skills, highlighting the importance of understanding preferences in addressing inequities.
John A. List

Contingent valuation is a widely used method for estimating the value of nonmarket commodities. Yet, a persistent issue is whether responses to Contingent Valuation Method (CVM) questions accurately reflect true values. Recent studies indicate that hypothetical bias is a significant factor that creates a gap between intentions and actions. I use a novel approach within non-market valuation - a List Experiment in the field - to test whether it can attenuate the hypothetical bias observed within CVM surveys. Using data from 400 subjects in a field experiment, I find initial promising results.
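The abstract does not spell out the estimator, so the following is only a minimal sketch of the standard list-experiment (item-count) logic, not the paper's actual analysis. It assumes a control group that reports how many items on a baseline list apply to them and a treatment group that sees the same list plus one sensitive item (here, hypothetically, a stated willingness to pay). The function name and the counts are made up for illustration.

```python
from statistics import mean


def list_experiment_estimate(control_counts, treatment_counts):
    """Estimate the share endorsing the sensitive item.

    Under random assignment, the difference in mean item counts between
    the treatment list (baseline items + sensitive item) and the control
    list (baseline items only) estimates the prevalence of the sensitive item.
    """
    return mean(treatment_counts) - mean(control_counts)


# Hypothetical data: number of items each respondent says apply to them.
control = [1, 2, 2, 3, 1, 2]    # 3-item baseline list
treatment = [2, 2, 3, 3, 2, 3]  # same list plus the sensitive item

estimate = list_experiment_estimate(control, treatment)
print(f"Estimated share endorsing the sensitive item: {estimate:.3f}")
```

Because respondents report only a count, no individual answer to the sensitive item is revealed, which is the feature that may reduce misreporting.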
John A. List

In 2019 I put together a summary of data from my field experiments website that pertained to framed field experiments (see List 2024; 2025). Several people have asked me if I have an update. In this document I update all figures and numbers to show the details for 2024. I also include the description from the 2019 paper below.
Uditi Karna, Min Sok Lee, John A. List, Andrew Simon, Haruka Uchida

Educational disparities remain a key contributor to increasing social and wealth inequalities. To address this, researchers and policymakers have focused on average differences between racial groups or differences among students who are falling behind. This focus potentially leads to educational triage, diverting resources away from high-achieving students, including those from racial minorities. Here we focus on the "racial excellence gap" - the difference in the likelihood that students from racial minorities (Black and Hispanic) reach the highest levels of academic achievement compared with their non-minority (white and Asian) peers. There is a shortage of evidence that systematically measures the magnitude of the excellence gap and how it evolves. Using longitudinal, statewide, administrative data, we document eight facts regarding the excellence gap from third grade (typically ages 8-9) to high school (typically ages 14-18), link the stability of excellence gaps and student backgrounds, and assess the efficacy of public policies. We show that excellence gaps in maths and reading are evident by the third grade and grow slightly over time, especially for female students. About one third of the gap is explained by a student's socioeconomic status, and about one tenth is explained by the school environment. Top-achieving racial minority students are also less likely to persist in excellence as they progress through school. Moreover, state accountability policies that direct additional resources to reduce non-race-based inequality had minimal effects on the racial excellence gaps. Documenting these patterns is an important step towards eliminating excellence gaps and removing the "racial glass ceiling".
John A. List, Haruka Uchida

Excellence gaps - disparities in advanced academic achievement - between racial groups appear by age 8 or 9 and persist throughout secondary school in the United States. About one-third of the gap is due to socio-economic status and one-tenth to school factors, indicating that policies should address both educational and local environments.
Michael G. Cuna, Musharraf Cyan, M. Taha Kasim, John A. List, Michael K Price

From newborns to the elderly, exposure to violence and conflict has been found to have deleterious effects. In this study, we explore a unique type of violence: exposure to the Taliban. Pairing a field experiment with a field survey among citizens in Khyber Pakhtunkhwa (KP), Pakistan, we examine how exposure to violence affects general trust, subjective well-being, and confidence in institutions. In our field experiment, we observe that exposure to conflict significantly alters the relative valuation of monetary rewards for oneself compared to those for a comparable peer. Specifically, individuals subjected to violence demonstrate a marked tendency to prioritize their own financial gain over that of a similar other. In the survey, we find that exposure to violence is associated with reduced general trust, trust in informal institutions, and subjective well-being. Interestingly, being exposed to violence increases trust in formal institutions. Our combined results highlight that the interplay between violence and trust dynamics is complex and highly consequential. In turn, the policy implications highlight the need for a multifaceted strategy to support individuals and communities affected by violence, ensuring both immediate relief and long-term resilience.
John A. List

The traditional approach in experimental economics is to use a between-subject design: the analyst places each unit in treatment or control simultaneously and recovers outcome differences via differencing conditional expectations. Within-subject designs represent a significant departure from this method, as the same unit is observed in both treatment and control conditions sequentially. While some might consider the design choice straightforward (always opt for a between-subject design), I contend that researchers should meticulously weigh the advantages and disadvantages of each design. In doing so, I propose a categorization for within-subject designs based on the plausibility of recovering an internally valid estimate. In one instance, which I denote as stealth designs, the analyst should unequivocally choose a within-subject design rather than a between-subject design.
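The contrast between the two designs can be illustrated with a small sketch. It is an illustration under simplifying assumptions, not the paper's framework: the function names and outcome values are hypothetical, and the within-subject estimate is only internally valid if exposure to one condition does not carry over into the other.

```python
from statistics import mean


def between_subject_estimate(treated_outcomes, control_outcomes):
    """Difference in mean outcomes across two separate groups of units."""
    return mean(treated_outcomes) - mean(control_outcomes)


def within_subject_estimate(paired_outcomes):
    """Mean of unit-level differences.

    Each pair is (outcome under control, outcome under treatment) for the
    same unit, observed sequentially. Assumes no carryover or order effects.
    """
    return mean([treated - control for control, treated in paired_outcomes])


# Hypothetical outcomes for illustration only.
between = between_subject_estimate(
    treated_outcomes=[12, 15, 11, 14],
    control_outcomes=[10, 9, 12, 11],
)
within = within_subject_estimate([(10, 12), (9, 12), (12, 14), (11, 14)])

print(f"Between-subject estimate: {between}")
print(f"Within-subject estimate:  {within}")
```

The within-subject version differences out stable unit-level heterogeneity, which is why it can be more precise when the no-carryover assumption is plausible.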
John A. List

In 2019, I put together a summary of data from my field experiments website that pertained to artefactual field experiments. Several people have asked me if I have an update. In this document I update all figures and numbers to show the details for the year 2024. I also include the description from the 2019 paper below. The definition of artefactual field experiments comes originally from Harrison and List (2004) and is advanced in List (2006; 2024; 2025).
Majid Ahmadi, Gwen-Jiro Clochard, Jeff Lachman, John A. List

When multiple forces potentially underlie discriminatory behavior, pinning down the precise sources becomes a challenge, making proposed policy solutions speculative. This study introduces an empirical approach, tightly linked to theory, to dissect two specific channels of discrimination: customer bias and managerial bias. To illustrate our framework, we integrate proprietary data with several publicly available datasets to uncover channels of discrimination within the Major League Baseball draft. Our analysis reveals that customer preferences significantly influence the drafting of players at the top end of the draft - those likely to gain immediate public attention and eventually play for the club. Conversely, we observe managerial homophily in the latter parts of the draft, where players who attract little attention and have minimal chances of playing for the club are selected. The observed preferential bias at both ends of the draft incurs a substantial opportunity cost. However, bias at the top end unduly affects competitiveness. Our findings provide significant implications for future research on measuring discrimination and addressing the challenge of multiple channels.
John A. List

To identify effective policies that can scale, a third option that accounts for the realities of a programme implemented at scale should be added to traditional A/B tests. By flipping the traditional research and policy-development model, researchers can generate policy-based evidence to help policymakers scale the best policies.
John A. List

AEA Presentation 2025.
Guglielmo Briscese, John A. List

Field experiments provide the clearest window into the true impact of many policies, allowing us to understand what works, what does not, and why. Yet, their widespread use has not been accompanied by a deep understanding of the political economy of their adoption in policy circles. This study begins with a large-scale natural field experiment that demonstrates the ineffectiveness of a widely implemented intervention. We leverage this result to understand how policymakers and a representative sample of the U.S. population update their beliefs about not only the policy itself but also the use of science and their trust in government. Policymakers, initially overly optimistic about the program's effectiveness, adjust their views based on evidence but show reduced demand for experimentation, suggesting experiment aversion when results defy expectations. Among the U.S. public, support for policy experiments is high and remains robust despite disappointing results, though trust in the implementing institutions declines, particularly in terms of perceptions of competence and integrity. Providing additional information on the value of learning from unexpected findings partially mitigates this trust loss. These insights, from both the demand and supply side, reveal the complexities of managing policymakers' expectations and underscore the need to educate the public on the value of open-mindedness in policy experimentation.
Stefano Carattini, Robert Dur, John A. List

Many policies that are generally considered socially desirable by the scientific community, based on modeling and causal empirical analyses, are not very widespread. The main driver is often lack of public support at baseline ("ex ante"). Yet, there is evidence that when voters hold biased beliefs ex ante about a given policy, experiencing the policy firsthand may lead them to correct their beliefs and increase public support. If it were widely documented that opposition to sound policies in part dissipates when voters experience a given policy, then more policy-makers might be inclined to experiment with policies that scientists recommend but that are unpopular ex ante. Systematically combining policy evaluation with causal analysis of public support would allow scholars to create a body of knowledge on the conditions under which policies become more (or less) popular after implementation and on what drives changes in beliefs and public support.