Health By Wealth: Optical character recognition and LLMs with population-level probate and administrative data uncover substantial inequalities in health over the very long run

Charles Rahal, University of Oxford
Naomi Muggleton, Warwick University
Aaron Reeves, London School of Economics and Political Science
Paul Moore, University of Oxford
Linda Li, London School of Economics and Political Science
Alexandra Rottenkolber, Linköping University

Inequalities in life expectancy between the rich and poor remain a persistent challenge. Yet their long-run evolution has been poorly understood because historical data combining socio-economic status and mortality are limited. Existing evidence is often fragmented across time and place, leaving unresolved when and why disparities changed. We address this by constructing a novel dataset spanning 130 years, linking 16 million digitized probate records to 66 million death registrations, providing unprecedented detail on wealth and mortality in England and Wales. This enables macro-level analysis from individual microdata. Between 1860 and 1990, life expectancy inequalities narrowed markedly, particularly after 1940. The gap at age 20 between the wealthiest 1% and poorest groups declined from 17.3 years for women and 15.8 years for men to just 2–3 years in recent decades, coinciding with the emergence of the welfare state. We further link genealogical data to the same probate records to assess intergenerational effects. Contrary to earlier debate, we find clear evidence that individuals born into wealthier families lived longer before 1900. Our results show that health inequalities, though enduring, can be reduced. The remarkable rise in life expectancy after 1850 was driven largely by gains among the least wealthy, reflecting collective societal efforts (public health initiatives, poverty reduction, and universal healthcare) to extend healthy lives to all.


Presented in Session 70: Flash Session New Data, Methods and Comparative Perspectives in Historical Demography