Tag: mortality data

  • How official death data underestimate COVID-19’s inequities

    In the last week of December, I had a major story published at MuckRock, USA TODAY, and local newsrooms in Arizona, Oregon, and Texas. The story explains that official COVID-19 statistics underestimate the pandemic’s true toll—particularly on people of color, who are more likely to have their deaths inaccurately represented in mortality data.

    This story was part of Uncounted, MuckRock’s broader project to investigate death certificate errors and other death reporting issues uncovered by looking at all excess deaths during the pandemic, not just those deaths officially marked as COVID-19. It relies on data from the CDC’s provisional mortality statistics and excess death estimates by a team of demography researchers at Boston University led by Andrew Stokes.

    I’ve copied the introductory section of the story here, because I don’t think anything else I write would do a better job at summarizing it. I encourage you to read the full piece; it is the biggest (and likely most important) story that I wrote in 2022.

    It’s not always easy to identify a COVID-19 death.

    If someone dies at home, if they have symptoms not typically associated with the disease or if they die when local health systems are overwhelmed, their death certificate might say “heart disease” or “natural causes” when COVID-19 is, in fact, at fault.

    New research shows such inaccuracies also are more likely for Americans who are Black, Hispanic, Asian or Native.

    The true toll of the COVID-19 pandemic on many communities of color – from Portland, Oregon, to Navajo Nation tribal lands in Arizona, New Mexico and Utah, to sparsely populated rural Texas towns – is worse than previously known.

    Incorrect death certificates add to the racial and ethnic health disparities exacerbated by the pandemic, which stem from long-entrenched barriers to medical care, employment, education, housing and other factors. Mortality data from the Centers for Disease Control and Prevention point to COVID-19’s disastrous impacts, in a new analysis by the Documenting COVID-19 Project at Columbia University’s Brown Institute for Media Innovation and MuckRock, in collaboration with Boston University’s School of Global Public Health; the USA TODAY Network; the Arizona Center for Investigative Reporting; Willamette Week in Portland; and the Texas Observer.

    The data shows that deaths from causes the CDC and physicians routinely link to COVID – including heart disease, respiratory illnesses, diabetes and hypertension – have soared and remained high for certain racial and ethnic groups.

    In Arizona’s Navajo and Apache counties, which share territory with Navajo Nation, COVID deaths among Native Americans drove nation-leading excess death rates in 2020 and 2021. While COVID death rates among Natives dropped during the second year of the pandemic thanks to local health efforts, other causes of death such as car accidents and alcohol poisoning increased significantly from 2020 to 2021.

    In Portland, deaths from causes indirectly related to the pandemic went up in 2021 even as official COVID deaths remained relatively constant. Black residents were disproportionately impacted by some of these causes, such as heart disease and overdose deaths – despite a county-wide commitment to addressing racism as a public health threat.

    In Texas, smaller, rural counties served by Justices of the Peace were more likely to report potential undercounting of COVID deaths than larger, urban counties served by medical examiners. Justices of the Peace receive limited training in filling out death certificates and often do not have sufficient access to postmortem COVID testing, local experts say.

    Experts point to several reasons for increased inaccurate death certificates among non-white Americans. These include resources available for death investigations, the use of general or unknown causes on death certificates, and how the race and ethnicity fields of these certificates are filled out.

    Such barriers to accurate death reporting add on to existing health disparities that made non-white Americans more susceptible to COVID in 2021, despite widespread vaccination campaigns and health equity efforts.

    “Even if you try to level the playing field, from the jump, certain populations are dealing with things that put them at greater risk,” said Enrique Neblett, a health equity expert at the University of Michigan’s School of Public Health. These issues include higher exposure to COVID, as people of color are overrepresented among essential workers, as well as higher rates of chronic conditions that confer risk for severe disease. “Those things aren’t eliminated just by increasing access to a vaccine,” Neblett said.

    It is critical to improve data collection and reporting for deaths beyond those officially labeled as COVID because data is a “major political determinant of health,” said Daniel Dawes, executive director of the Satcher Health Leadership Institute at the Morehouse School of Medicine. Information on how people are dying in a particular community can shape priorities for local public health departments and funding for health initiatives.

    “If there is no data, there is no problem,” Dawes said.

  • New CDC report vastly underestimates deaths with Long COVID

    The 3,500 Long COVID-related deaths identified by the CDC’s review of death certificates are likely a significant undercount of mortality caused by this condition, experts say. Chart by Karen Wang; see the interactive version on MuckRock.

    On Wednesday, the CDC’s National Center for Health Statistics (NCHS) released a major report on deaths from Long COVID. To identify a small (but significant) number of deaths, NCHS researchers searched through the text of death certificates for Long COVID-related terms. Their study demonstrates how bad our current health data systems are at capturing the results of chronic disease.

    My colleagues and I at MuckRock did a similar analysis to the CDC’s, searching death certificate data that we received through public records requests and partnerships in Minnesota, New Mexico, and counties in California and Illinois. You can read our full story here and explore the death certificate data we analyzed on GitHub.

    Here are the main findings from both analyses:

    • The CDC study is an important milestone in recognizing the reality of Long COVID: this is a serious, chronic disease that can lead to death for some patients. It’s not just an outcome of acute COVID-19.
    • From its national death certificate search, NCHS identified 3,544 deaths with Long COVID as a cause or contributing factor. This is almost certainly a major undercount, experts told me (and told other reporters who wrote about the study).
    • This number is an undercount because we’re essentially seeing two poor-quality data systems intersect. Long COVID is undercounted in clinical settings because we lack standard diagnostic tools and widespread medical education about it—most doctors wouldn’t think to put it on a death certificate as a result. And the U.S.’s death investigation system is uneven and under-resourced, leading to inconsistencies in tracking even well-known medical conditions.
    • On top of these problems, when Long COVID is diagnosed, it tends to be among people who had severe cases of acute COVID-19 followed by difficulty recovering, experts told me. David Putrino and Ziyad Al-Aly, two leading Long COVID researchers, both pointed to the NCHS’s trend towards identifying Long COVID deaths among older adults (over age 75) as an example of this pattern in action, since this group is at higher risk for more severe acute symptoms.
    • The NCHS count of deaths thus misses Long COVID patients with symptoms similar to myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), which often arises after a milder initial case. It also misses people who have vascular impacts from a COVID-19 case, like a premature heart attack or stroke months after infection—something Al-Aly and his team have studied in depth. And, crucially, the NCHS count misses people who died from suicide, after suffering from severe mental health consequences of Long COVID.
    • While the NCHS count of Long COVID deaths is far too low to be accurate, the researchers did find more deaths as the pandemic went on—with the highest number in February 2022, following the first Omicron surge. This pattern could suggest increased recognition of Long COVID among the medical community.
    • The NCHS primarily identified Long COVID deaths among white people, even though acute COVID-19 has disproportionately impacted people of color in the U.S. Experts say this mismatch could reflect gaps in access to a diagnosis and care for Long COVID: if white people are more likely to be seen by a doctor who can accurately diagnose them, they will be overrepresented in Long COVID datasets. Putrino called this “a health disparity on top of a health disparity.”
    • MuckRock’s analysis of death certificate data in select states similarly found that most deaths labeled as Long COVID were among seniors and white people. The trends varied by state, though, reflecting differences in populations and in local death reporting systems. For example, in New Mexico, which has a statewide medical examiner’s office (rather than a looser system of county coroners), three-fourths of the Long COVID deaths were among Hispanic or Indigenous Americans.
    • Our story also includes details about the RECOVER initiative’s autopsy study, which aims to use extensive postmortem testing on people who might have died from acute COVID-19 or Long COVID to identify biological patterns. Like the rest of RECOVER, this study is moving slowly and facing logistical challenges: about 85 patients have been enrolled so far, an investigator at New York University said.

    Overall, the NCHS study suggests an urgent need for more medical education about Long COVID, especially as the CDC works to implement a new death code specific to this chronic condition. We also need broader outreach about the consequences of Long COVID. To quote from the story:

    Institutions like the CDC should do more to educate people about the long-term problems that could follow a COVID-19 case, said Hannah Davis, the patient researcher. “We need public warnings about risks of heart attack, stroke and other clotting conditions, especially in the first few months after COVID-19 infection,” she said, along with warnings about potential links to conditions like diabetes, Alzheimer’s and cancer.

    And we need other methods of studying Long COVID outcomes that don’t rely on a deeply flawed death investigation system. These could include studies of excess mortality following COVID-19 cases, Long COVID patient registries that monitor people long-term, and collaborations with patient groups to track suicides.

    For any reporters and editors who may be interested, MuckRock’s story is free for other outlets to republish.

  • Sources and updates, November 20

    • CDC update on COVID-19 mortality trends: This week, the CDC published a detailed report about how deaths from COVID-19 have changed in 2022. Overall, between 2,000 and 4,500 COVID-19 deaths were reported each week between April and September 2022, the CDC researchers found; this is lower than at earlier points in the pandemic, but still represents a loss of more than 100,000 Americans over the course of a year. Older adults and those who were un- or under-vaccinated had a higher risk of death from COVID-19, the researchers found; racial and ethnic disparities have “decreased, but persisted.”
    • Moderna reports new data on its bivalent booster: Several studies in the last couple of weeks have indicated that the new, Omicron-specific boosters from Pfizer and Moderna are more effective against new variants than the older vaccines. Moderna provided additional data this week, reporting that its new booster led to five times more antibodies that neutralize Omicron BA.4 and BA.5 compared to earlier booster shots. While Moderna’s study hasn’t yet been peer-reviewed, the results are promising in following a trend from past studies, STAT’s Matthew Herper reports.
    • Booster shots could keep kids from missing school: Speaking of the new boosters: a new report from the Commonwealth Fund provides analysis of the boosters’ potential impact on school-aged children, as all kids older than five are eligible for the shots. If 80% of eligible Americans receive their bivalent boosters by the end of 2022, the report suggests, this could save over 46 million days of isolation and over 50,000 hospitalizations for school-aged children, along with other benefits. Even getting kids boosted at the level of flu vaccination in 2020-2021 would prevent millions of days of school from being lost.
    • Test to treat is inaccessible to rural Americans: A new study, published this week in JAMA Network Open, examined equity issues with the Biden administration’s Test to Treat initiative. The initiative was designed to provide locations where Americans could get a COVID-19 test and then, if they received a positive result, quickly receive a free antiviral drug. But many people don’t live near available locations, the researchers found: “approximately 15% of the overall US population, 30% of American Indian or Alaskan Native people, and 59% of the rural population lived more than 60 minutes from the nearest site,” they write.
    • Perception of local COVID-19 levels: A lot of people are acting with incorrect knowledge of their local COVID-19 risk, a new study in the CDC’s Morbidity and Mortality Weekly Report suggests. Researchers from several medical and public health institutions surveyed people who had recently tested positive for COVID-19 in Detroit, Michigan and DuPage County, Illinois, during June and July 2022. About half of the 5,000 people surveyed said that they thought local COVID-19 transmission was “low or moderate,” even though it was actually at high levels in both places.

  • Sources and updates, July 10

    • CDC adds (limited) Long COVID data to its dashboard: This week, the CDC’s COVID Data Tracker added a new page, reporting data from a study of “post-COVID conditions” (more colloquially known as Long COVID). The study, called Innovative Support for Patients with SARS-CoV-2 Infections (INSPIRE), follows patients who test positive for up to 18 months and tracks their continued symptoms. Among about 4,100 COVID-positive patients in the study, over 10% still had symptoms at three months after their infections, and over 1% still had symptoms at 12 months. This is just one study among many tracking Long COVID, but it is an important step for the CDC to add these data to their dashboard.
    • Air change guidance by state: In recognition of the role ventilation can play in reducing COVID-19 spread, some states have put out recommendations for minimum air changes per hour (ACH), a metric for tracking indoor air quality. Researcher Devabhaktuni Srikrishna has compiled the recommendations on his website, Patient Knowhow, with a map showing ACH guidance by state. (I recently interviewed Srikrishna for an upcoming story about ventilation.)
    • COVID-19 is a leading cause of death in the U.S.: A new study from researchers at the National Institutes of Health’s National Cancer Institute confirms that COVID-19 was the third-leading cause of death in the U.S., in both 2020 and 2021. The researchers utilized death records from the CDC in their analysis, comparing COVID-19 to common causes such as cancer and heart disease. COVID-19 was a top cause of death for every age group over age 15, the study found.
    • COVID-19 disparities in Louisiana: Another notable study this week: researchers at the University of Maryland, College Park examined the roles of social, economic, and environmental factors in COVID-19 deaths in Louisiana, focusing on Black residents. “We find that Black communities in parishes with both higher and lower population densities experience higher levels of stressors, leading to greater COVID-19 mortality rate,” the researchers wrote. The study’s examination of environmental racism in relation to COVID-19 seems particularly novel to me; I hope to see more research in this area.
    • Tracking coronavirus variants in wastewater: And one more new study: a large consortium of researchers, led by scientists at the University of California San Diego, explores the use of wastewater surveillance to track new variants. Variants can show up in wastewater up to two weeks earlier than they show up in samples from clinical (PCR) testing, the researchers found. In addition, some variants identified in wastewater are “not captured by clinical genomic surveillance.”
    • Global COVID-19 vaccine and treatment initiative ending: The ACT-Accelerator, a collaboration between the World Health Organization and other health entities and governments, has run out of funding. This is bad news for low- and middle-income countries that relied on the program for COVID-19 vaccines and treatments—many of which are still largely unvaccinated, more than a year after vaccines became widely available in high-income countries. Global health equity initiatives will likely continue in another form, but funding will be a continued challenge.

  • Nine areas of data we need to manage the pandemic

    PCR testing has greatly declined in recent months; we need new data sources to help replace the information we got from it. Chart via the CDC.

    Last week, I received a question from my grandmother. She had just read my TIME story about BA.4 and BA.5, and was feeling pessimistic about the future. “Do you think we’ll ever get control of this pandemic?” she asked.

    This is a complicated question. And it’s one that I’ve been reflecting on as well, as I approach the two-year anniversary of the COVID-19 Data Dispatch and consider how this publication might shift to meet the current phase of the pandemic. I am not an infectious disease or public health expert, but I wanted to share a few thoughts on this; to stay in my data lane, I’m focusing on data that could help the U.S. better manage COVID-19.

    The coronavirus is going to continue mutating, evolving past immune system defenses built by prior infection and vaccination. Scientists will need to continue updating vaccines and treatments to match the virus, or we’ll need a next-generation vaccine that can protect against all coronavirus variants.

    Candidates for such a vaccine, called a “pan-coronavirus vaccine,” are under development by the U.S. Army and at several other academic labs and pharmaceutical companies. But until a pan-coronavirus vaccine becomes available, we’ll need to continue tracking new variants and the surges they produce. We also need to better track Long COVID, a condition that our current vaccines do not protect well against.

    Eventually, COVID-19 will likely be just another respiratory virus that we watch out for during colder months and large indoor gatherings, broadly considered “endemic” by scientists. But it’s important to note—as Dr. Ellie Murray did in her excellent Twitter thread about how pandemics end—that endemicity does not mean we stop tracking COVID-19. In fact, thousands of people work to monitor and respond to another endemic virus, the flu.

    With that in mind, here are nine categories of data that could help manage the pandemic:

    • More comprehensive wastewater surveillance: As I’ve written here and at FiveThirtyEight, sewers can offer a lot of COVID-19 information through a pipeline that’s unbiased and does not depend on testing access. But wastewater monitoring continues to be spotty across the country, as the surveillance can be challenging to set up, and more challenging for public health officials to act on. Also, current monitoring methods exclude the 21 million households that are not connected to public sewers. As wastewater surveillance expands, we will be better able to pinpoint new surges right as they’re starting.
    • Variant surveillance from wastewater: Most of the U.S.’s data on circulating variants currently comes from a selection of PCR test samples that are run through genomic sequencing tests. But this process is expensive, and the pool of samples is dwindling as more people use at-home rapid tests rather than PCR. It could be cheaper and more comprehensive to sequence samples from wastewater instead, Marc Johnson explained to me recently. This is another important aspect of expanding our wastewater monitoring.
    • Testing random samples: Another way to make up for the data lost by less popular PCR testing is conducting surveillance tests on random samples of people, either in the U.S. overall or in specific cities and states. This type of testing would provide us with more information on who is getting sick, allowing public health departments to respond accordingly. The U.K.’s Office for National Statistics conducts regular surveys like this, which could serve as a model for the U.S.
    • More demographic data: Related to random sample testing: the U.S. COVID-19 response still needs more information on who is most impacted by the pandemic, as well as who needs better access to vaccines and treatments. Random sampling and surveys, as well as demographic data connected to distributions of treatments like Paxlovid, could help address this need.
    • Vaccine effectiveness data: I have written a lot about how the U.S. does not have good data on how well our COVID-19 vaccines work, thanks to our fractured public health system. This lack of data makes it difficult for us to identify when vaccines need to be updated, or who needs another round of booster shots. Connecting more vaccination databases to data recording cases, hospitalizations, and Long COVID would better inform decision-making about boosters.
    • Air quality monitoring: Another type of data collection to better inform decision-making is tracking carbon dioxide and other pollutants in the air. These metrics can show how well-ventilated (or poorly-ventilated) a space is, providing information about whether further upgrades or layers of safety measures are needed. For example, I’ve seen experts bring air monitors on planes, citing poor-quality air as a reason to continue wearing a mask. Similarly, the Boston public school district has installed air monitors throughout its buildings and publishes the data on a public dashboard.
    • Tracking animal reservoirs: New coronavirus variants can emerge when the virus jumps from humans into animals, mutates in an animal population, and then jumps back into humans. This has happened in the U.S. at least once: a strain from minks infected people in Michigan last year. But the U.S. does not require testing or tracking of COVID-19 cases in animals that we know are susceptible to the virus. Better surveillance in this area could help us catch variants.
    • Better Long COVID surveillance: For me personally, knowledge of Long COVID is a big reason why I remain as cautious about COVID-19 as I am. Long COVID patients and advocates often say that if more people understood the ramifications of this long-term condition, they might be more motivated to take precautions; I think better prevalence data would help a lot with this. (The Census and CDC just made great strides in this area; more on that later in the issue.) Similarly, better data on how the condition impacts people would help in developing treatments—which will be crucial for getting the pandemic under control.
    • More accurate death certificates: The true toll of the pandemic goes beyond official COVID-19 deaths, as the Documenting COVID-19 project has discussed at length in our Uncounted investigation. If we had a better accounting of everyone whose deaths were tied to COVID-19, directly or indirectly, that could be another motivator for people to continue taking safety precautions and protecting their communities.

    If you are working to improve data collection in any of these areas—or if you know a project that is—please reach out! These are all topics that I would love to report on further in the coming months.

  • The “one million deaths” milestone fails to capture the pandemic’s true toll

    This week, many headlines declared that the U.S. has reached one million COVID-19 deaths. While a major milestone, this number is actually far below the full impact of the pandemic; looking at excess deaths and demographic breakdowns allows us to get closer.

    NBC News was the first outlet to make this declaration, announcing that its internal COVID-19 tracker had hit the one million mark. Other trackers, including the CDC’s own, have yet to formally reach this number, but major publications still jumped on the news cycle in anticipation of this milestone. (Various trackers tend to have close-but-differing COVID-19 counts due to differences in their methodologies; Sara Simon wrote about this on the COVID Tracking Project blog back when the official death toll was 200,000.)

    But the recent articles about “one million deaths” fail to mention that the U.S. actually reached this milestone a long time ago. This is because the official count only includes the deaths formally logged as COVID-19, in which the disease was listed on a death certificate or diagnosed before a patient passed. Such a count fails to include deaths that were tied to COVID-19, but never proven with a positive test result, or deaths that were indirectly linked to the pandemic for a myriad of reasons.

    To get closer to the pandemic’s true toll, demographers use a metric called excess deaths: the number of deaths that occurred in a given region and time period above what would be expected for that region and time period. Experts calculate that “expected death” number with statistical models based on patterns from previous years.

    In total, the U.S. has reported 1,118,540 excess deaths between early 2020 and last month. Of those, 221,026 have not been formally tied to COVID-19. According to a new World Health Organization report, the U.S. was already close to one million COVID-related deaths by December 2021.

    To give a more specific example: in the U.S., in the week ending January 22, 2022, CDC analysts estimated that 61,303 deaths would have occurred if there were no COVID-19 pandemic. But actually, a total of 85,179 deaths occurred in the country that week. The difference between the observed and expected values, 23,876, is the excess deaths for this week.

    I selected the week ending January 22 as an example here because it has one of the highest excess death tolls of any week in the last two years. This week marked the peak of the surge driven by Omicron, a variant that many U.S. leaders called “mild” and dismissed without instituting further safety measures.

    During this week, the CDC reports 21,130 official COVID-19 deaths. That suggests most of the excess deaths in this week, the deaths which occurred over pre-pandemic expectations, were directly caused by the virus.

    But what about the 2,746 deaths that weren’t? How many of these deaths were also caused by COVID-19, but in patients who were never able to access a PCR test? How many occurred in counties like Cape Girardeau, Missouri, where coroner Wavis Jordan claimed his office “doesn’t do COVID deaths” and refuses to put the disease on a death certificate without specific proof?

    And how many deaths resulted from people being unable to access the healthcare they needed because hospitals were full of COVID-19 patients, or people dying in car accidents during an era of less road safety, or people dying of opioid overdoses brought on by increased stress and financial instability?
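    The arithmetic behind this example week can be sketched in a few lines of Python. This is only an illustration using the three figures quoted above; the function and variable names are my own assumptions, and the CDC’s actual expected-death estimate comes from statistical models that are not reproduced here.

```python
def excess_deaths(observed: int, expected: int) -> int:
    """Deaths above the modeled baseline for one region and time period."""
    return observed - expected


# Figures quoted in the text for the U.S., week ending January 22, 2022.
expected = 61_303        # CDC estimate of deaths had there been no pandemic
observed = 85_179        # deaths that actually occurred
official_covid = 21_130  # deaths officially reported as COVID-19

excess = excess_deaths(observed, expected)
unattributed = excess - official_covid    # excess deaths not logged as COVID-19
official_share = official_covid / excess  # fraction of the excess logged as COVID-19

print(f"Excess deaths: {excess:,}")
print(f"Excess deaths not logged as COVID-19: {unattributed:,}")
print(f"Share of excess deaths logged as COVID-19: {official_share:.1%}")
```

    The subtraction only sizes the gap. It cannot say which of the unattributed deaths were missed COVID-19 cases and which were indirect effects of the pandemic.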

    Answering these questions takes a lot of in-depth reporting, which I know well because the Documenting COVID-19 team has been doing our best to answer them through our (award-winning!) Uncounted investigation.

    As we’ve found, every state—and in some cases, every county—has a unique system for investigating and reporting deaths, especially those linked to the pandemic. In some places, coroners or medical examiners are elected officials who face political pressure to report COVID-19 deaths in a particular way. In others, they face chronic underfunding and a lack of training, leaving them to work long hours in an attempt to produce accurate numbers.

    You can see the resource difference when comparing officially-reported COVID-19 deaths to excess deaths by state or county. Some states, like those in New England, have COVID-19 death numbers that closely match or even exceed their excess death numbers; medical examiners in these states have centralized death reporting systems and a lot of resources for this process, reporting by my colleague Dillon Bergin showed.

    Other states, like Alaska, Oregon, and West Virginia, have officially logged fewer than three in four excess deaths as COVID-19 deaths. Such a number may signal that a state is failing to properly identify all of its COVID-19 fatalities.
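    One way to run this kind of comparison is to divide official COVID-19 deaths by excess deaths and flag places where the ratio falls below a cutoff. The sketch below is a rough illustration, not the Uncounted team’s method: the helper names are hypothetical, the 0.75 cutoff simply mirrors the “three in four” phrasing above, and because no state-level figures are quoted in this post, it is demonstrated on the national totals mentioned earlier.

```python
def official_share(excess: int, official_covid: int) -> float:
    """Fraction of excess deaths that were officially logged as COVID-19."""
    if excess <= 0:
        raise ValueError("excess deaths must be positive for a meaningful ratio")
    return official_covid / excess


def possible_undercount(excess: int, official_covid: int, threshold: float = 0.75) -> bool:
    """True when less than `threshold` of excess deaths were logged as COVID-19."""
    return official_share(excess, official_covid) < threshold


# National totals quoted earlier in this post.
us_excess = 1_118_540
us_not_tied_to_covid = 221_026
us_official = us_excess - us_not_tied_to_covid  # excess deaths formally tied to COVID-19

print(f"U.S. share logged as COVID-19: {official_share(us_excess, us_official):.1%}")
print(f"Below the three-in-four cutoff: {possible_undercount(us_excess, us_official)}")
```

    The ratio can exceed 1 where official COVID-19 deaths outnumber excess deaths, as in some New England states, so it is a screening signal rather than a measurement of undercounting.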

    For more granular data on this topic, I recommend reading the work of Andrew Stokes and his team at Boston University. Andrew is the Documenting COVID-19 project’s main academic collaborator on Uncounted; his team just shared their latest county-level excess death estimates in a preprint. (County-level data are also available in the Uncounted project’s GitHub repository.)

    Excess deaths can also show how the pandemic continues to hit disadvantaged Americans harder. In 2020, COVID-19 death rates (i.e. deaths per 100,000 people) for Black, Indigenous, and Hispanic Americans were higher than the rates for white Americans; in 2021, some of these disparities actually got worse despite the broad availability of vaccines and other mitigation measures. Non-white groups also saw all-cause mortality (not just COVID-19 deaths) increase more from 2019 in both 2020 and 2021, compared to white Americans.

    Please note, the chart below shows crude death rates, which don’t account for differences in age breakdowns between race and ethnicity groups. For example, crude death rates for white Americans tend to be higher because white people generally live longer than people of color in the U.S., and more seniors have died of COVID-19. You can see the difference that age-adjustment makes in the CDC charts here.
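    To make the crude versus age-adjusted distinction concrete, here is a minimal sketch of direct age standardization, one common way to age-adjust a rate. Every number in it is invented for illustration and is not CDC data; the two groups, the two age bands, and the standard population weights are all assumptions chosen so that the effect is easy to see.

```python
# All numbers below are invented for illustration; they are not CDC data.
# Each group maps an age band to (deaths, population).
group_a = {"under 65": (100, 800_000), "65 and over": (900, 200_000)}  # older population
group_b = {"under 65": (180, 950_000), "65 and over": (340, 50_000)}   # younger population

# Hypothetical standard population: share of people in each age band.
standard_weights = {"under 65": 0.85, "65 and over": 0.15}


def crude_rate(group):
    """Total deaths per 100,000 people, ignoring age structure."""
    deaths = sum(d for d, _ in group.values())
    population = sum(p for _, p in group.values())
    return deaths / population * 100_000


def age_adjusted_rate(group, weights):
    """Direct standardization: weight each age-specific rate by a shared standard population."""
    return sum(
        weights[band] * (deaths / population * 100_000)
        for band, (deaths, population) in group.items()
    )


for name, group in [("Group A", group_a), ("Group B", group_b)]:
    print(
        f"{name}: crude {crude_rate(group):.1f}, "
        f"age-adjusted {age_adjusted_rate(group, standard_weights):.1f} per 100,000"
    )
```

    In this toy example, Group A has the higher crude rate only because more of its members are seniors. Once both groups are weighted to the same age mix, Group B’s rate is higher, which is the kind of reversal age-adjustment is meant to expose.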

    Why is it important to acknowledge and investigate these excess deaths, going beyond the reported COVID-19 numbers? At an individual level, family members who lost loved ones to COVID-19 find that diagnosis important; they can access FEMA aid for funerals, and can receive acknowledgment of how this one death fits into the broader pandemic.

    And at the county, state, and national levels, looking at excess deaths allows us to see a full picture of how COVID-19 has affected us. Experts say that inaccurate COVID-19 death numbers can create a negative feedback loop: if your community has a too-low toll, you may not realize the disease’s impact, and so you may be less likely to wear a mask or practice other safety precautions—contributing to more deaths going forward.

    As a data journalist, sharing these statistics and charts is my way of acknowledging the one million deaths milestone, and all of the uncounted deaths that are not included in it. But this pales in comparison to actual stories shared by family members and friends of those who have died in the last two years.

    To read these stories, I often turn to memorial projects like Missing Them (from THE CITY), which captures names and stories of over 2,000 New Yorkers who died from COVID-19. Social media accounts like FacesOfCOVID also share these stories. And if any COVID-19 Data Dispatch readers would like to share a story of someone they lost to this disease, please email me at betsy@coviddatadispatch.com; I would be honored to share your words in next week’s issue.


    More federal data

  • New CDC mortality data release from the Documenting COVID-19 project

    Many readers may know that, since last fall, I’ve been working part-time at the Documenting COVID-19 project: a public records, data, and investigative project at Columbia University’s Brown Institute for Media Innovation and the public records site MuckRock.

    One major focus at Documenting COVID-19 is our Uncounted investigation, an effort to understand how COVID-19 deaths—and other deaths indirectly caused by the pandemic—have gone under-reported in the last two years. The CDC has reported nearly one million official COVID-19 deaths; but that figure doesn’t include over 300,000 deaths of natural causes that occurred over what researchers expected in 2020 and 2021.

    These natural causes logged on Americans’ death certificates—such as diabetes, heart disease, and respiratory conditions—may have been linked to COVID-19. In fact, about 158,000 deaths during the pandemic were specifically linked to natural causes that the CDC considers potentially COVID-related. But the official records make it hard to say for sure.

    In a story with USA TODAY published late last year, Documenting COVID-19 found massive gaps and inconsistencies in the U.S.’s death system, which likely contributed to these undercounts. These include: a lack of standardization for medical examiners and coroners’ offices, workers in these positions becoming overwhelmed during the pandemic, and failures in some cases to order COVID-19 tests for patients or push back when families insisted a death wasn’t COVID-related.

    Documenting COVID-19 is working on further follow-up stories in this investigation. But we also want to empower other reporters—especially local reporters—and researchers to investigate pandemic deaths. To that end, our team recently released a GitHub data repository that provides county-level CDC mortality data from 2020 and 2021.

    The data come from the CDC’s provisional mortality database; our team signed a data-use agreement with the agency so that we can use their API to gather data more quickly and efficiently than what’s possible with the CDC’s WONDER portal.

    Here’s a brief summary of what’s in the repository, taken from a write-up by my colleague Dillon Bergin:

    • Leading external causes of death in the 113 CDC code list, by underlying cause of death;
    • Natural causes of death associated with COVID-19, using the CDC’s categories for excess deaths associated with COVID-19, by underlying cause of death;
    • All deaths by race and ethnicity, with age-adjusted rate, regardless of underlying cause of death;
    • Information to help contextualize the CDC data, including excess mortality numbers modeled by demographers at Boston University, vaccination rates, and a Department of Justice survey released in December of all medical examiner and coroner offices in the country.

    And here are some other links related to Uncounted and the CDC’s mortality data:

    If you’re a journalist who wants to use these data, the Documenting COVID-19 team is happy to help! If you have questions or want support, feel free to reach out to the team at covid@muckrock.com, or to me specifically at betsy@muckrock.com.

    More federal data

  • COVID source callout: COVID-19 deaths in U.S. hospitals

    Readers active on COVID-19 Data Twitter may have seen this alarmist Tweet going around earlier this weekend. In this post, a writer (notably, one with no science, health, or data background) posted a screenshot showing that the Department of Health and Human Services (HHS) is no longer requiring hospitals to include COVID-19 deaths that occur at their facilities in their daily reports to the agency.

    This is not the end of U.S. COVID-19 death reporting, as the Tweet’s author insinuated, primarily because hospitals are not the main source of COVID-19 death numbers. These statistics come from death certificates, which are processed by local health departments, coroners, and medical examiners; death certificate statistics are sent to state health departments, which in turn send the numbers to the CDC. The CDC is still reporting COVID-19 deaths with no disruptions, and, in fact, released a highly detailed new dataset on these deaths last month.

    For more explanation, see this thread by Erin Kissane (COVID Tracking Project co-founder) and this one from epidemiologist Justin Feldman. It’s particularly important to note here that, as Feldman points out, plenty of COVID-19 deaths don’t occur in hospitals! About one-third of COVID-19 deaths occurred outside these facilities in 2020.

    (Note: The Documenting COVID-19 project has written, in great detail, about how COVID-19 deaths are reported in our Uncounted series. See: this article at USA Today and this reporting recipe.)

    It is certainly worth asking why the HHS took in-hospital COVID-19 deaths off the list of required metrics for hospitals. This data field had some utility for researchers looking to identify COVID-19 mortality rates within these facilities—though, from what I could tell, nobody was looking at it very much before this weekend.

    But, again, this is not the end of COVID-19 death reporting! This is the HHS making one small change to a massive hospitalization dataset—which was primarily used for looking at other metrics—while the CDC’s death reporting continues as usual.

  • Six more things, January 9

    If you test positive for COVID-19, here’s what the CDC says you should do. Graphic via the Maine health agency.

    Here are six other COVID-19 news items from the past week that didn’t quite warrant full posts:

    • The CDC made its COVID-19 isolation guidance even more confusing, somehow. On Tuesday, the CDC updated its isolation guidance again—and the new guidance is kind of a “dumpster fire,” as the headline on this article by The Atlantic’s Katherine J. Wu aptly puts it. The agency still isn’t requiring rapid tests to get out of isolation early, but it says you can test if you have one available. Also, wear a mask if you leave isolation after five days and avoid travel, restaurants, and other high-exposure activities. Wu’s article provides a good summary of the guidance (and criticism of that guidance), as does this Your Local Epidemiologist post from Dr. Katelyn Jetelina.
    • New reporting recipe explains how to explore “uncounted” COVID-19 deaths with CDC data. Last week, I shared a new investigative story from my team at the Documenting COVID-19 project that dives into unreported COVID-19 deaths in the U.S. Up to 200,000 deaths may have gone unrecorded thanks to a lack of training, standardization, tests, and other issues with death reporting. This week, the team published a reporting recipe aimed at helping other journalists do similar stories in their states, cities, and regions. If you have questions about the project or recipe, you can reach out to us at info@documentingcovid19.io.
    • B.1.640.2, or the “IHU variant” from France, is not currently cause for concern. In the past few days, you might have seen headlines about a new variant called B.1.640.2 that was identified in France last November. The variant has a number of mutations, including some mutations that have also been identified in other highly-contagious variants, according to a recent preprint from French researchers. But it’s not currently a concern, say experts at the World Health Organization and elsewhere. This variant actually predates Omicron, and only 20 cases had been reported between early November and early January (compared to well over 100,000 Omicron cases in the same timespan). Omicron is the main variant we should be worrying about right now.
    • “Flurona” means getting the flu and COVID-19 at the same time; it’s not a new mutant disease. Another buzzword you might’ve seen in headlines this week: “flurona,” a portmanteau of coronavirus and flu. Los Angeles and other places have recently reported cases in which a patient tests positive for both the flu and COVID-19 at the same time. While having two respiratory diseases at once is certainly unpleasant—and might lead to increased risk of severe symptoms—it’s not necessarily worth freaking out over. Roxanne Khamsi covered these potential coinfections in The Atlantic back in November 2021, writing: “Recent screening studies have found that 14 to 70 percent of those hospitalized with flu-like illness test positive for more than one viral pathogen.”
    • Senators call for HHS to answer key questions about COVID-19 testing. This week, Senators Roy Blunt (Missouri) and Richard Burr (North Carolina) wrote to Health and Human Services (HHS) Secretary Xavier Becerra requesting information on COVID-19 test spending. The Senators note that over $82.6 billion has been “specifically appropriated for testing,” yet the U.S. continues to experience dire shortages and delays for both PCR and rapid tests. The letter includes questions about Biden’s initiative to distribute 500 million rapid tests for free; little information has been shared about the initiative so far.
    • New meta-analysis estimates one in three COVID-19 patients have persistent symptoms for 12 weeks or more. In a meta-analysis, scientists compile results from a number of studies on the same topic in order to provide overall estimates for an important metric, like the risk of developing a particular condition. A new analysis from researchers at a Toronto hospital network and other co-authors examined the risk of Long COVID symptoms following a COVID-19 diagnosis, combining results from 81 studies. Their main findings: about 32% of patients had fatigue 12 weeks after their diagnosis, while 22% had cognitive impairment at 12 weeks; and the majority of those patients still had these symptoms at six months. (H/t Hannah Davis.)

    Note: this title and format are inspired by Rob Meyer’s Weekly Planet newsletter.

  • New CDC mortality data: “Real-time public health surveillance at a highly granular level”

    The CDC’s new data release allows researchers to search through mortality data from 2020 and 2021 in great detail. Screenshot of the CDC’s search tool retrieved December 12.

    This past Monday, the CDC put out a major data release: mortality data for 2020 and 2021, encompassing the pandemic’s impact on deaths from all causes in the U.S.

    The new data allow researchers and reporters to investigate excess deaths, a measure of the pandemic’s true toll—comparing the number of deaths that occurred in a particular region, during a particular year, to deaths that would’ve been expected had COVID-19 not occurred. At the same time, the new data allow for investigations into COVID-19 disparities and increased deaths of non-COVID causes during the pandemic.
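
    The arithmetic behind excess deaths is simple once an expected count is available. The Python sketch below uses invented numbers for a hypothetical county and treats the expected count as a given input; in practice that figure comes from a statistical model, which this example does not attempt to reproduce.

```python
# Minimal sketch of an excess-deaths calculation with invented numbers.


def excess_deaths(observed, expected):
    """Deaths above what was expected had the pandemic not occurred."""
    return observed - expected


def unattributed_excess(observed, expected, reported_covid):
    """Excess deaths not explained by officially reported COVID-19 deaths."""
    return excess_deaths(observed, expected) - reported_covid


# Hypothetical county: 1,300 deaths observed, 1,000 expected, 220 recorded as COVID-19.
observed, expected, reported_covid = 1_300, 1_000, 220
print("Excess deaths:", excess_deaths(observed, expected))
print("Excess deaths not attributed to COVID-19:",
      unattributed_excess(observed, expected, reported_covid))
```

    The second figure is the gap worth investigating: deaths above the expected level that official COVID-19 counts do not account for.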

    To give you a sense of the scale here: As of Saturday, the U.S. has reported almost 800,000 COVID-19 deaths. But experts say the true COVID-19 death toll may be 20% higher, meaning that close to one million Americans may have died from the virus. And that’s not counting deaths tied to isolation, drug overdoses, missed healthcare, and other pandemic-related causes.

    The CDC’s new data release is unique because, in a typical year, the CDC reports mortality data with a huge lag. Deaths from 2019 were reported in early 2021, for example. But now, the CDC has adapted its reporting system to provide the same level of detail that we’d typically get with that huge lag—now with a lag of just a few weeks. The CDC has also improved its WONDER query system, allowing researchers to search the data with more detail than before.

    “I would describe this new release as more real-time surveillance at more specific detail than any journalists, or epidemiologists, or any other kind of researcher even knows what to do with,” said Dillon Bergin, an investigative reporter and my colleague at the Documenting COVID-19 project, at the Brown Institute for Media Innovation and MuckRock.

    Along with Dillon and other Documenting COVID-19 reporters, I worked on a story explaining why these CDC data are such a big deal—along with what we’re seeing in the numbers so far. The story was published this week at USA Today and at MuckRock. Our team also compiled a data repository with state-level information from the new CDC release, combined with death data from 2019 and excess deaths.

    If you’re a reporter who’d like to learn more about the new CDC data, you can sign up for a webinar with the Documenting COVID-19 team—taking place next Wednesday, December 15, at 12 PM Eastern time. It’s free and will go for about an hour, with lots of time for questions. Sign up here!

    Editor’s note, December 27: This webinar was recorded; you can watch the recording here.

    Also, as our initial story is part of a larger investigation (in collaboration with USA Today), the team has put together a callout form for people to share their stories around COVID-19 deaths in their communities. If you have a story to share, you can fill out the form here.

    To provide some more information on why this new CDC release is so exciting—and what you can do with the data—I asked Dillon a few questions about it. As the lead reporter on our team’s excess deaths investigation, he’s spent more time with these data than anyone else. This interview has been lightly edited and condensed for clarity.


    Betsy Ladyzhets: How would you summarize this new release? What is it?

    Dillon Bergin: I would describe this new release as more real-time surveillance at more specific detail than any journalists, or epidemiologists, or any other kind of researcher even knows what to do with. It’s unfathomably detailed, and the fact that we’re going to be able to see updates in almost real time is really critical at this stage of the pandemic, or at any stage in a public health crisis. I think it’s a huge, huge step forward.

    BL: Specifically in the realm of COVID deaths, but also, all deaths during the pandemic.

    DB: Exactly, yes. In the realm of COVID deaths, we do know that there is a large gap between the total amount of excess deaths and the excess deaths that COVID accounts for. So it’s interesting from that angle, understanding what COVID [deaths] might have been misclassified. But the data can also be used for a broad range of other types of deaths that have happened during the pandemic or possibly increased during the pandemic.

    BL: So why are researchers excited about this data release?

    DB: Previously, for something to go up on the WONDER website, or to become WONDER data, [it] has to be finalized in the year after. So, data from 2020 would just be finalized now. Typically, we might not see that data until, probably, early in the new year [2022].

    But with the new tool, we’re getting that 2020 and 2021 WONDER data now. And the CDC does a great job of providing a lot of granular details about causes of death, and racial demographics… Those are things that general CDC [mortality] data gives you, but the WONDER data is even more detailed. So, the fact that researchers don’t have to wait anymore for that data to be finalized, that the CDC is providing provisional data at such a detailed level—that’s what researchers are excited about.

    BL: It’s the provisional data that’s being released, like, a year earlier than you would normally expect it to be published, right?

    DB: Yeah, a year earlier than you would expect it to be published. Which means it’s almost real-time, because it has, I think, a three- or four-week lag. This data is real-time public health surveillance at a highly granular level—which is what people have been asking for. It’s what epidemiologists have been asking for, researchers, advocates of all kinds, journalists, lots of people have been saying, “We need this type of surveillance.”

    BL: When you say a three- or four-week lag—the CDC is going to update it every couple of weeks, right?

    DB: Yes, that’s correct.

    BL: Do you have a sense of what the update schedule is going to be, or is the CDC not sure yet?

    DB: I’m not sure. I know it was a big haul for them to just get this out, I’m not sure what the next update will be…

    BL: Yeah, well, I’m sure we [Documenting COVID-19] will keep an eye on it. And we’ll tell everybody when it updates. (Editor’s note: As of December 12, it has already been updated! Data now go through November 20, 2021.) So, what are some of the things that you’ve seen in the data from the preliminary analysis that you’ve done so far?

    DB: One of the specific things that I’ve seen, that’s been really important for the work that I’m doing right now, is increases of different types of deaths at home. When people die, they don’t always die in a hospital—they could die in an outpatient clinic, or in an ER, or they could come to the hospital dead on arrival, they could die in hospice, or a nursing home, or at home.

    And one of the awesome things about the CDC data is that you can see, actually, where people have died, and what specific causes of death that those people had when they died. Or, to be precise, you can’t see specific people—but you can see, say, 50 people died of heart attacks in a specific county at home. You would be able to see [in the data] that those people not only died of a heart attack, but they died at home. 

    The takeaway for me has been that respiratory and cardiovascular deaths have increased at home in specific states and counties. Louisiana is one example: it looks like Louisiana has the highest increase of deaths at home from [the CDC designation] “other forms of heart disease,” of any state, at like a 60% increase from previous years. So then we have to ask ourselves, what could lead to that increase? Are people really dying more of heart disease at home, by that much higher of a rate? Or is something else going on here?

    BL: If you were talking to local reporters about this, what would you recommend that they do with the data?

    DB: I would recommend that they take a look at the most recent data, the data from 2020 and 2021, for their area. And also pull some previous years, probably five years [of data], and start looking at causes of death, ages of the people who died, racial and demographic makeup, and place of death. I think different combinations of those data will start to provide some interesting avenues that can lead you to do actual human reporting—asking, what was happening? And why was that happening at this scale?

    The new WONDER data, you can kind-of stretch it and bend it in so many different ways, it can be a little bit intimidating at first. So maybe, it would also be useful to start with a more specific question. If you’re wondering about, let’s say, certain types of deaths in a very specific county. Say you’re wondering if that’s from unintentional drug overdoses, or deaths from respiratory diseases in your county. Then you can start looking at the more granular level of details within those types of deaths—whether it’s racial and demographic makeup, or whether or not the body was autopsied. You can even see the day of the week [that people died]. There’s a lot of different places you can zoom in.

    My overall advice would be: Start with a general question and then explore, then reform that question and explore, then reform that question. The data is both so extensive and so granular that you can get lost in it very quickly.
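
    One way to act on the advice above is to compare a pandemic year against an average of the five years before it. The Python sketch below does that with invented counts for a single cause and place of death in a hypothetical county. A simple five-year mean is an assumption made here for illustration; published excess-death estimates rely on models that account for trends and population change.

```python
from statistics import mean

# Invented annual death counts for one cause and place of death in one hypothetical county.
deaths_by_year = {2015: 98, 2016: 102, 2017: 100, 2018: 97, 2019: 103, 2020: 160, 2021: 150}


def percent_change_from_baseline(counts, year, baseline_years):
    """Percent change in `year` relative to the mean of `baseline_years`."""
    baseline = mean(counts[y] for y in baseline_years)
    return (counts[year] - baseline) / baseline * 100


baseline_years = range(2015, 2020)  # 2015 through 2019
for year in (2020, 2021):
    change = percent_change_from_baseline(deaths_by_year, year, baseline_years)
    print(f"{year}: {change:+.0f}% vs. 2015-2019 average")
```

    A large jump in a category like this is a prompt for reporting, not a finding on its own.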

    BL: You mentioned that it’s very intimidating, which I would second. The first time I looked at the WONDER data, I was like, “What is going on here?” So, what would be your recommendations for working with that data tool? Or any major caveats that you think people should know before they dive into this?

    DB: That’s a great question, because with WONDER, you have to use their querying tool through their website. You can’t really easily and quickly export things or work with an API, though you can export data once you do a query.

    My first caveat would be, keep in mind the suppression of any values under 10. So, that means you can zoom in on certain things, but then you may also have to zoom out. For example, if you wanted to know the leading causes of death for someone, when a body is dead on arrival—if you do that search at a state level, you’ll probably be able to see the first five or so causes before you reach causes that have only happened between one and 10 times, and then that value is suppressed and you can’t see the information. But if you were to do the same search on a national level, you would have a lot more causes for those types of deaths.

    So, I would keep in mind the suppression, when zooming in and out. And also keep in mind, if, say, you’re looking at “dead on arrival” deaths for every county in a specific state, so many causes of death for those [county-level searches] will be suppressed, that your totals from the counties would not match the actual totals [at the state level]. Because you may not be aware that the CDC is not showing you the values that were suppressed if you didn’t click a specific button—or if you’re quickly adding things.
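
    The mismatch between county sums and state totals can be checked explicitly. This Python sketch uses invented counts and assumes that hidden cells appear in an export as the text "Suppressed"; confirm how your own file marks them before relying on it.

```python
# Sketch of how suppressed cells make county sums fall short of a state total.
# Counts are invented; "Suppressed" is an assumed marker for hidden cells.

SUPPRESSED = "Suppressed"

county_rows = [
    {"county": "A", "deaths": "42"},
    {"county": "B", "deaths": SUPPRESSED},
    {"county": "C", "deaths": "17"},
    {"county": "D", "deaths": SUPPRESSED},
]
state_total = 70  # hypothetical figure from a separate state-level query


def summarize(rows):
    """Return (sum of visible counts, number of suppressed cells)."""
    visible = [int(r["deaths"]) for r in rows if r["deaths"] != SUPPRESSED]
    hidden = sum(1 for r in rows if r["deaths"] == SUPPRESSED)
    return sum(visible), hidden


visible_sum, hidden_cells = summarize(county_rows)
gap = state_total - visible_sum
print(f"Counties sum to {visible_sum}; state total is {state_total}; gap of {gap}")
print(f"{hidden_cells} suppressed cells, each hiding a small count")
```

    Counting the suppressed cells alongside the sum makes it obvious when a county-level total is a floor rather than the real number.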

    BL: Another thing that [our team ran into] is occurrence versus residence—that’s something people need to know about. “Residence” means sorting by where people lived, “occurrence” means sorting by where they died. Those don’t always match up.

    DB: Yes, I would say residence versus occurrence is very important to keep in mind, especially because, when you’re redoing a search and scrolling very fast, you can accidentally fill out a state for occurrence instead of residence. Which actually did happen to me, and then I was confused by my own numbers. Then I noticed that there were a bunch of states coming up that I hadn’t meant to search for, because I, like, filtered by residence and then searched by occurrence.

    So yeah, keeping in mind the difference between residence and occurrence is definitely important. Though if you go back in the historical data [before 2018], it’s just residence—just a single state for each death.

    Also, just clear some extra time to get used to working with the WONDER interface. Because, unlike the CDC data updates that are just on the data.cdc.gov website, that you can just quickly download and open up in your technical tool of choice—for WONDER, you do have to use the WONDER query site, and it can be difficult to get used to searching and importing.
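
    The residence versus occurrence distinction discussed above is easy to see with a handful of records. The Python sketch below uses invented records and hypothetical field names (`residence`, `occurrence`); it shows how filtering on one field while grouping by the other brings up states you did not mean to search for.

```python
from collections import Counter

# Invented death records: where each person lived vs. where the death took place.
records = [
    {"residence": "NJ", "occurrence": "NY"},
    {"residence": "NY", "occurrence": "NY"},
    {"residence": "NY", "occurrence": "NY"},
    {"residence": "NJ", "occurrence": "NJ"},
    {"residence": "CT", "occurrence": "NY"},
]

by_residence = Counter(r["residence"] for r in records)
by_occurrence = Counter(r["occurrence"] for r in records)

# Filtering on one field and grouping by the other surfaces "unexpected" states.
ny_deaths_by_home_state = Counter(r["residence"] for r in records if r["occurrence"] == "NY")

print("By residence:", dict(by_residence))
print("By occurrence:", dict(by_occurrence))
print("Deaths occurring in NY, grouped by residence:", dict(ny_deaths_by_home_state))
```

    The same five records give New York two deaths by residence and four by occurrence, so it matters which definition a query uses.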

    BL: I will say one more thing, while we’re on this topic, that I’ve been doing and that might be helpful for other people: make sure that, if you export data from WONDER, that you always save that notes section it gives you at the bottom [of the exported file]. Because that will tell you exactly what you searched for. So, if you want to replicate something later, you can just go back and look at the notes. I feel like my instinct, often, when I’m looking at a dataset, is to delete all the notes and anything I don’t need—so I have to remind myself, like, “No, you should keep this.”

    DB: That’s actually a really good tip, because I do that… I import the data [to my computer] and then I delete all the notes. That’s a great point.
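
    For anyone who would rather keep the notes than delete them, here is a minimal Python sketch that separates the data rows from the notes in an exported file. It assumes a tab-delimited text export in which a line containing only "---" divides the data from the trailing notes; the sample text, dataset name, and `split_export` helper are all hypothetical, so check the layout of your own export first.

```python
# Sketch: keep the notes from an exported query file alongside the data.
# Assumption: the export is tab-delimited text in which a line containing only
# "---" separates the data rows from the trailing notes. Check your own file.
import csv
import io

SAMPLE_EXPORT = (
    '"State"\t"Deaths"\n'
    '"Louisiana"\t120\n'
    '"Mississippi"\t95\n'
    '"---"\n'
    '"Dataset: (hypothetical dataset name)"\n'
    '"Group By: State"\n'
)


def split_export(text, separator="---"):
    """Return (data_rows, notes) from an export with a separator line."""
    data_lines, note_lines = [], []
    in_notes = False
    for line in text.splitlines():
        if not in_notes and line.strip().strip('"') == separator:
            in_notes = True
            continue
        (note_lines if in_notes else data_lines).append(line)
    rows = list(csv.DictReader(io.StringIO("\n".join(data_lines)), delimiter="\t"))
    notes = [line.strip().strip('"') for line in note_lines if line.strip()]
    return rows, notes


rows, notes = split_export(SAMPLE_EXPORT)
print(rows)
print(notes)
```

    Saving the notes list next to the cleaned data keeps a record of exactly what was searched, which makes the query repeatable later.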

    BL: Also, what recommendations do you have if people are looking for, like, experts to interview about these data? Say a local reporter wants to search for experts in their area, what should they do?

    DB: I can speak about that, because that’s been really useful for me in my reporting. Once you have this data, or once you’ve researched excess deaths in your area, you should talk with an epidemiologist or a social epidemiologist—someone who would know your state, or maybe even your more local area—about the broader mortality trends in your community. That will really give you a deep understanding of, what were the reasons that people were dying before the pandemic? And what has this expert thought about during the pandemic? And what have they heard, or read, or researched about why deaths are increasing? For example, I talked to two epidemiologists in Mississippi while working on our investigation, and they really helped me understand what I was looking at and looking for.

    BL: Awesome. And then, my last, kind-of big picture question is, why does this matter for people who aren’t epidemiologists or COVID reporters?

    DB: That is also a good question. I think the thing that I have been thinking about over and over again—and it’s something that an epidemiologist told me—is that, if we understand how people die, then we might know what’s making them sick. And if we know what’s making them sick, then we have a shot at stopping that from happening.

    This data is a very important step in that process, which is learning, in real-time, why people are dying. If we know that, we know what’s making them sick, whether it’s unintentional drug overdoses, or an increase of deaths because of lung cancer or heart disease. Any of those things are important to know, especially in a public health crisis like the one we’re in right now.

    BL: I know we’ve talked before about this sort-of cycle of, what happens when COVID deaths are maybe undercounted in a certain community, and then that contributes to people maybe being less aware of COVID in their community. And then [that lack of awareness] contributes back to the same process.

    DB: Yeah, exactly. I think that’s an important thing as well. Throughout this process—reporting on this topic, and working with this data, and thinking more about death certificates and the information on them—I’ve been increasingly… Not evangelized, exactly, but I’ve seen the light on the importance of that final piece of information of people’s lives. And what it means not only to their families and to the local area and communities, but also what it means when we start pulling that data up to larger and larger groups, and trying to understand: what does this person’s death mean at the level of the county, or the state, or in their racial demographic, or in their age demographic, or by gender?

    All of this is critically important. And it sounds kind-of corny, but in a way, [the death certificate] is like, one really last piece of information that you leave behind for humans after you.


    More national data