Producing Consistent Long-Term Time-Series of Cause-of-Death Statistics with Verbal Autopsies: Lessons from the Harmonization of 25 Years of Data in the Africa Health Research Institute

Ariane Sessego, INED
Siyabonga Nxumalo, Africa Health Research Institute
Dickman Gareta, Africa Health Research Institute
Alison Castle, Massachusetts General Hospital
Kobus Herbst, Africa Health Research Institute

Producing consistent cause-of-death (CoD) statistics over time remains a central challenge in demography, because classifications and data collection methods evolve constantly. The issue is particularly acute when CoD estimates rely on verbal autopsies (VAs), structured interviews with caregivers of the deceased. While automated algorithms provide a unified tool for CoD assignment, their application to data collected with different instruments raises questions about comparability. This study presents the experience of the Africa Health Research Institute’s Health and Demographic Surveillance System in rural KwaZulu-Natal, South Africa, in harmonizing 25 years (2000–2025) of VA data (27,622 deaths, three VA questionnaires) and evaluating the consistency of algorithmic CoD assignment across these instruments. For data harmonization, we developed VAConvert, an open-source framework designed to ensure transparency and adaptability. It uses correspondence tables to map historical VA data to the input formats required by the algorithms (InterVA5, InSilicoVA), enabling unified CoD estimation. Harmonization quality varied: 33% of the indicators expected by the algorithms could not be linked to any question in the earliest questionnaire (2000–2016), versus 8% for the most recent questionnaire (2022–present), while all indicators were mapped for the WHO 2016 instrument (2017–2021). To assess the impact of missing indicators, 5,335 WHO 2016 VAs will be re-estimated with selected symptoms omitted. We expect missing indicators to affect the earliest data most, and aim to provide guidelines for the analysis of long-term CoD trends. By developing transparent harmonization tools and evaluating their limits, we aim to contribute to robust analysis of long-term mortality trends in low- and middle-income settings.
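
The correspondence-table idea described in the abstract can be illustrated with a minimal Python sketch. It is not VAConvert's actual code or interface: the indicator names, question identifiers, answer codes, and the missing-value placeholder below are all hypothetical, and the real input formats of InterVA5 and InSilicoVA may differ. The sketch shows three steps: mapping a historical record to algorithm indicators, computing the share of indicators with no matching question, and blanking selected indicators as one might do in an omission experiment.

```python
from typing import Dict, Optional, Set

# Hypothetical missing-value placeholder; the code expected by a given
# algorithm may be different.
MISSING = "."

# Hypothetical correspondence table for one questionnaire:
# algorithm indicator -> source question (None = no matching question).
CORRESPONDENCE: Dict[str, Optional[str]] = {
    "fever": "q12_fever",
    "cough": "q15_cough",
    "night_sweats": None,
}


def convert_record(record: Dict[str, str],
                   table: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Map one historical VA record to the indicators an algorithm expects.

    Indicators without a source question, or whose question was not
    answered in this record, are set to MISSING.
    """
    converted = {}
    for indicator, question in table.items():
        if question is None or question not in record:
            converted[indicator] = MISSING
        else:
            converted[indicator] = record[question]
    return converted


def unmapped_share(table: Dict[str, Optional[str]]) -> float:
    """Share of expected indicators with no matching source question."""
    unmapped = sum(1 for question in table.values() if question is None)
    return unmapped / len(table)


def omit_indicators(converted: Dict[str, str],
                    omitted: Set[str]) -> Dict[str, str]:
    """Return a copy of a converted record with selected indicators blanked."""
    return {
        indicator: (MISSING if indicator in omitted else value)
        for indicator, value in converted.items()
    }


if __name__ == "__main__":
    record = {"q12_fever": "y", "q15_cough": "n"}
    converted = convert_record(record, CORRESPONDENCE)
    print(converted)
    print(f"Unmapped share: {unmapped_share(CORRESPONDENCE):.0%}")
    print(omit_indicators(converted, {"cough"}))
```

Keeping the mapping in a plain table rather than in code is what makes this kind of harmonization auditable: each questionnaire gets its own table, and the unmapped share can be reported per instrument.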

See extended abstract

Presented in Session 78. Assessing and Improving the Quality of Mortality Data