New CDC mortality data release from the Documenting COVID-19 project

Many readers may know that, since last fall, I’ve been working part-time at the Documenting COVID-19 project: a public records, data, and investigative project at Columbia University’s Brown Institute for Media Innovation and the public records site MuckRock.

One major focus at Documenting COVID-19 is our Uncounted investigation, an effort to understand how COVID-19 deaths, and other deaths indirectly caused by the pandemic, have gone under-reported in the last two years. The CDC has reported nearly one million official COVID-19 deaths, but that figure doesn’t include more than 300,000 deaths from natural causes beyond what researchers expected in 2020 and 2021.

These natural causes logged on Americans’ death certificates—such as diabetes, heart disease, and respiratory conditions—may have been linked to COVID-19. In fact, about 158,000 deaths during the pandemic were specifically linked to natural causes that the CDC considers potentially COVID-related. But the official records make it hard to say for sure.

In a story with USA TODAY published late last year, Documenting COVID-19 found massive gaps and inconsistencies in the U.S.’s death investigation system, which likely contributed to these undercounts. These gaps include a lack of standardization across medical examiners’ and coroners’ offices, workers in these positions becoming overwhelmed during the pandemic, and failures in some cases to order COVID-19 tests for patients or to push back when families insisted a death wasn’t COVID-related.

Documenting COVID-19 is working on further follow-up stories in this investigation. But we also want to empower other reporters—especially local reporters—and researchers to investigate pandemic deaths. To that end, our team recently released a GitHub data repository that provides county-level CDC mortality data from 2020 and 2021.

The data come from the CDC’s provisional mortality database; our team signed a data-use agreement with the agency so that we can use its API to gather data more quickly and efficiently than is possible with the CDC’s WONDER portal.

Here’s a brief summary of what’s in the repository, taken from a write-up by my colleague Dillon Bergin:

  • Leading external causes of death in the 113 CDC code list, by underlying cause of death;
  • Natural causes of death associated with COVID-19, using the CDC’s categories for excess deaths associated with COVID-19, by underlying cause of death;
  • All deaths by race and ethnicity, with age-adjusted rate, regardless of underlying cause of death;
  • Information to help contextualize the CDC data, including excess mortality numbers modeled by demographers at Boston University, vaccination rates, and a Department of Justice survey released in December of all medical examiner and coroner offices in the country.
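For anyone planning to work with these files, here is a minimal Python sketch of the basic excess-death arithmetic described above: observed deaths minus expected deaths, and the portion of that excess not covered by officially reported COVID-19 deaths. The field names and numbers below are hypothetical, made up for illustration; they are not the repository's actual column names or real CDC figures, and this is not the modeling approach used by our team or by the Boston University demographers.

```python
# Illustrative records only: field names and numbers are hypothetical,
# not taken from the repository or from CDC data.
counties = [
    {"county": "Example A", "observed": 1200, "expected": 1000, "covid": 150},
    {"county": "Example B", "observed": 800, "expected": 790, "covid": 30},
]


def excess_deaths(observed, expected):
    """Deaths above the modeled expectation (negative if below it)."""
    return observed - expected


def excess_not_attributed(observed, expected, covid):
    """Excess deaths beyond officially reported COVID-19 deaths, floored at zero."""
    return max(excess_deaths(observed, expected) - covid, 0)


summary = [
    {
        "county": row["county"],
        "excess": excess_deaths(row["observed"], row["expected"]),
        "not_attributed": excess_not_attributed(
            row["observed"], row["expected"], row["covid"]
        ),
    }
    for row in counties
]

for row in summary:
    print(row)
```

A county where excess deaths far exceed reported COVID-19 deaths, like the first made-up example here, is the kind of place where a closer look at death certificates may be worthwhile.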

And here are some other links related to Uncounted and the CDC’s mortality data:

If you’re a journalist who wants to use these data, the Documenting COVID-19 team is happy to help! If you have questions or want support, feel free to reach out to the team or to me directly.
