For a full list of publications and patents, see below or go to Google Scholar.
Courts around the world are putting their data online, making information about caseloads, parties, and decisions available to the public. Yet, this data is far from complete, and often only reflects a portion of courts’ dockets. We offer and validate a set of tools for leveraging serialized bureaucratic data from courts to estimate the proportion of cases available to the public and the time courts take to make decisions. Using data from more than 3,000 courts in China, our methods allow us to assess patterns of missingness in court data across provinces and cities by type of case and to conduct the largest quantitative analysis to date on court delay in China. By providing an extensive validation of both new and existing tools for estimating missingness and delay, we provide a set of recommendations for researchers looking to augment incomplete bureaucratic data around the world.
Luke Sanford, Co-authors
Sociological Methods & Research, Year
This article shows that over the last three decades, competitive elections were associated with increased deforestation. Protection of forested areas provides long-term public goods, while their destruction provides short-term private goods for particular voters. Politicians facing a competitive election offer voters access to forested areas, mainly for small-scale farming or commercial use of timber, in exchange for electoral support. I test this theory of political deforestation using satellite-generated global forest cover data and the results of over 1,000 national-level elections between 1982 and 2016. I find that countries that undergo a democratic transition lose an additional 0.8 percentage points of their forest cover each year, that years with close elections have over 1 percentage point per year higher forest cover loss compared to nonelection years, and that as the margin of victory in an election decreases by 10 points, the amount of deforestation increases by 0.7 percentage points per year. These increases are on the order of 5–10 times the average rate of forest loss globally. This suggests democratization is associated with underprovision of environmental public goods and contested elections are partially responsible for this underprovision.
Sanford, Luke
American Journal of Political Science (2021)
Minority communities in the United States often experience higher-than-average exposures to air pollution. However, the relative contribution of institutional biases to these disparities can be difficult to disentangle from other factors. Here, we use the economic shutdown associated with the 2020 COVID-19 shelter-in-place orders to causally estimate pollution exposure disparities caused by the in-person economy in California. Using public and citizen-science ground-based monitor networks for respirable particulate matter, along with satellite records of nitrogen dioxide, we show that sheltering in place produced disproportionate air pollution reductions for non-White (especially Hispanic and Asian) and low-income communities. We demonstrate that these racial and ethnic effects cannot be explained by weather patterns, geography, income or local economic activity as measured by local changes in mobility. They are instead driven by regional economic activity, which produces local harms for diffuse economic benefits. This study thus provides indirect, yet substantial, evidence of systemic racial and ethnic bias in the generation and control of pollution from the portion of the economy most impacted in the early pandemic period.
Richard Bluhm, Pascal Polonik, Kyle S. Hemes, Luke C. Sanford, Susanne A. Benz, Morgan C. Levy, Katharine L. Ricke, Jennifer A. Burney
News coverage by Newsweek
Experimental methods for estimating the impacts of text on human evaluation have been widely used in the social sciences. However, researchers in experimental settings are usually limited to testing a small number of pre-specified text treatments. While efforts to mine unstructured texts for features that causally affect outcomes have been ongoing in recent years, these models have primarily focused on the topics or specific words of text, which may not always be the mechanism of the effect. We connect these efforts with NLP interpretability techniques and present a method for flexibly discovering clusters of similar text phrases that are predictive of human reactions to texts using convolutional neural networks. When used in an experimental setting, this method can identify text treatments and their effects under certain assumptions. We apply the method to two data sets. The first enables direct validation of the model’s ability to detect phrases known to cause the outcome. The second demonstrates its ability to flexibly discover text treatments with varying textual structures. In both cases, the model learns a greater variety of text treatments compared to benchmark methods, and these text features quantitatively meet or exceed the ability of benchmark methods to predict the outcome.
Megan Ayers, Luke Sanford, Margaret Roberts, Eddie Yang
Findings of the Association for Computational Linguistics: ACL 2024
Estimating Missingness and Delay in Court Data
Luke Sanford, Co-authors
Sociological Methods & Research, Year
Democratization, Elections, and Public Goods: The Evidence from Deforestation
Sanford, Luke
American Journal of Political Science (2021)
Disparate air pollution reductions during California’s COVID-19 economic shutdown
Richard Bluhm, Pascal Polonik, Kyle S. Hemes, Luke C. Sanford, Susanne A. Benz, Morgan C. Levy, Katharine L. Ricke, Jennifer A. Burney
Nature Sustainability (2022)
Discovering influential text using convolutional neural networks
Megan Ayers, Luke Sanford, Margaret Roberts, Eddie Yang
Findings of the Association for Computational Linguistics: ACL 2024