Tag: Federal data

  • New CDC mortality data: “Real-time public health surveillance at a highly granular level”

    The CDC’s new data release allows researchers to search through mortality data from 2020 and 2021 in great detail. Screenshot of the CDC’s search tool retrieved December 12.

    This past Monday, the CDC put out a major data release: mortality data for 2020 and 2021, encompassing the pandemic’s impact on deaths from all causes in the U.S.

    The new data allow researchers and reporters to investigate excess deaths, a measure of the pandemic’s true toll—comparing the number of deaths that occurred in a particular region, during a particular year, to deaths that would’ve been expected had COVID-19 not occurred. At the same time, the new data allow for investigations into COVID-19 disparities and increased deaths of non-COVID causes during the pandemic.
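
    As a rough illustration of the excess deaths idea described above, here is a minimal Python sketch. It assumes the simplest possible baseline, the average of deaths in prior years, which is not necessarily how the CDC or any research team models expected deaths. The function name and the counts are hypothetical.

```python
from statistics import mean


def excess_deaths(observed: int, prior_years: list[int]) -> float:
    """Observed deaths minus a simple expected baseline.

    Assumption: the baseline is the plain average of prior-year deaths.
    Real analyses typically adjust for trends and population changes.
    """
    if not prior_years:
        raise ValueError("need at least one prior year to form a baseline")
    expected = mean(prior_years)
    return float(observed - expected)


# Hypothetical counts for one region; these are not real CDC figures.
baseline_years = [1000, 1020, 980, 1010, 990]
print(excess_deaths(1250, baseline_years))
```

    A positive result means more deaths occurred than the baseline would predict; how much of that gap is attributed to COVID-19 depends on what is written on death certificates.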

    To give you a sense of the scale here: As of Saturday, the U.S. has reported almost 800,000 COVID-19 deaths. But experts say the true COVID-19 death toll may be 20% higher, meaning that close to one million Americans may have died from the virus. And that’s not counting deaths tied to isolation, drug overdoses, missed healthcare, and other pandemic-related causes.

    The CDC’s new data release is unique because, in a typical year, the CDC reports mortality data with a huge lag. Deaths from 2019 were reported in early 2021, for example. But now, the CDC has adapted its reporting system to provide the same level of detail that we’d typically get with that huge lag—now with a lag of just a few weeks. The CDC has also improved its WONDER query system, allowing researchers to search the data with more detail than before.

    “I would describe this new release as more real-time surveillance at more specific detail than any journalists, or epidemiologists, or any other kind of researcher even knows what to do with,” said Dillon Bergin, an investigative reporter and my colleague at the Documenting COVID-19 project, at the Brown Institute for Media Innovation and MuckRock.

    Along with Dillon and other Documenting COVID-19 reporters, I worked on a story explaining why these CDC data are such a big deal—along with what we’re seeing in the numbers so far. The story was published this week at USA Today and at MuckRock. Our team also compiled a data repository with state-level information from the new CDC release, combined with death data from 2019 and excess deaths.

    If you’re a reporter who’d like to learn more about the new CDC data, you can sign up for a webinar with the Documenting COVID-19 team—taking place next Wednesday, December 15, at 12 PM Eastern time. It’s free and will go for about an hour, with lots of time for questions. Sign up here!

    Editor’s note, December 27: This webinar was recorded; you can watch the recording here.

    Also, as our initial story is part of a larger investigation (in collaboration with USA Today), the team has put together a callout form for people to share their stories around COVID-19 deaths in their communities. If you have a story to share, you can fill out the form here.

    To provide some more information on why this new CDC release is so exciting—and what you can do with the data—I asked Dillon a few questions about it. As the lead reporter on our team’s excess deaths investigation, he’s spent more time with these data than anyone else. This interview has been lightly edited and condensed for clarity.


    Betsy Ladyzhets: How would you summarize this new release? What is it?

    Dillon Bergin: I would describe this new release as more real-time surveillance at more specific detail than any journalists, or epidemiologists, or any other kind of researcher even knows what to do with. It’s unfathomably detailed, and the fact that we’re going to be able to see updates in almost real time is really critical at this stage of the pandemic, or at any stage in a public health crisis. I think it’s a huge, huge step forward.

    BL: Specifically in the realm of COVID deaths, but also, all deaths during the pandemic.

    DB: Exactly, yes. In the realm of COVID deaths, we do know that there is a large gap between the total amount of excess deaths and the excess deaths that COVID accounts for. So it’s interesting from that angle, understanding what COVID [deaths] might have been misclassified. But the data can also be used for a broad range of other types of deaths that have happened during the pandemic or possibly increased during the pandemic.

    BL: So why are researchers excited about this data release?

    DB: Previously, for something to go up on the WONDER website, or to become WONDER data, [it] has to be finalized in the year after. So, data from 2020 would just be finalized now. Typically, we might not see that data until, probably, early in the new year [2022].

    But with the new tool, we’re getting that 2020 and 2021 WONDER data now. And the CDC does a great job of providing a lot of granular details about causes of death, and racial demographics… Those are things that general CDC [mortality] data gives you, but the WONDER data is even more detailed. So, the fact that researchers don’t have to wait anymore for that data to be finalized, that the CDC is providing provisional data at such a detailed level—that’s what researchers are excited about.

    BL: It’s the provisional data that’s being released, like, a year earlier than you would normally expect it to be published, right?

    DB: Yeah, a year earlier than you would expect it to be published. Which means it’s almost real-time, because it has, I think, a three- or four-week lag. This data is real-time public health surveillance at a highly granular level, which is what people have been asking for. It’s what epidemiologists have been asking for, researchers, advocates of all kinds, journalists, lots of people have been saying, “We need this type of surveillance.”

    BL: When you say a three- or four-week lag—the CDC is going to update it every couple of weeks, right?

    DB: Yes, that’s correct.

    BL: Do you have a sense of what the update schedule is going to be, or is the CDC not sure yet?

    DB: I’m not sure. I know it was a big haul for them to just get this out, I’m not sure what the next update will be…

    BL: Yeah, well, I’m sure we [Documenting COVID-19] will keep an eye on it. And we’ll tell everybody when it updates. (Editor’s note: As of December 12, it has already been updated! Data now go through November 20, 2021.) So, what are some of the things that you’ve seen in the data from the preliminary analysis that you’ve done so far?

    DB: One of the specific things that I’ve seen, that’s been really important for the work that I’m doing right now, is increases of different types of deaths at home. When people die, they don’t always die in a hospital—they could die in an outpatient clinic, or in an ER, or they could come to the hospital dead on arrival, they could die in hospice, or a nursing home, or at home.

    And one of the awesome things about the CDC data is that you can see, actually, where people have died, and what specific causes of death that those people had when they died. Or, to be precise, you can’t see specific people—but you can see, say, 50 people died of heart attacks in a specific county at home. You would be able to see [in the data] that those people not only died of a heart attack, but they died at home. 

    The takeaway for me has been that respiratory and cardiovascular deaths have increased at home in specific states and counties. Louisiana is one example: it looks like Louisiana has the highest increase of deaths at home from [the CDC designation] “other forms of heart disease,” of any state, at like a 60% increase from previous years. So then we have to ask ourselves, what could lead to that increase? Are people really dying more of heart disease at home, by that much higher of a rate? Or is something else going on here?

    BL: If you were talking to local reporters about this, what would you recommend that they do with the data?

    DB: I would recommend that they take a look at the most recent data, the data from 2020 and 2021, for their area. And also pull some previous years, probably five years [of data], and start looking at causes of death, ages of the people who died, racial and demographic makeup, and place of death. I think different combinations of those data will start to provide some interesting avenues that can lead you to do actual human reporting—asking, what was happening? And why was that happening at this scale?

    The new WONDER data, you can kind-of stretch it and bend it in so many different ways, it can be a little bit intimidating at first. So maybe, it would also be useful to start with a more specific question. If you’re wondering about, let’s say, certain types of deaths in a very specific county. Say you’re wondering if that’s from unintentional drug overdoses, or deaths from respiratory diseases in your county. Then you can start looking at the more granular level of details within those types of deaths—whether it’s racial and demographic makeup, or whether or not the body was autopsied. You can even see the day of the week [that people died]. There’s a lot of different places you can zoom in.

    My overall advice would be: Start with a general question and then explore, then reform that question and explore, then reform that question. The data is both so extensive and so granular that you can get lost in it very quickly.

    BL: You mentioned that it’s very intimidating, which I would second. The first time I looked at the WONDER data, I was like, “What is going on here?” So, what would be your recommendations for working with that data tool? Or any major caveats that you think people should know before they dive into this?

    DB: That’s a great question, because with WONDER, you have to use their querying tool through their website. You can’t really easily and quickly export things or work with an API, though you can export data once you do a query.

    My first caveat would be, keep in mind the suppression of any values under 10. So, that means you can zoom in on certain things, but then you may also have to zoom out. For example, say you wanted to know the leading causes of death when a body is dead on arrival. If you do that search at a state level, you’ll probably be able to see the first five or so causes before you reach causes that have happened fewer than 10 times, and then that value is suppressed and you can’t see the information. But if you were to do the same search on a national level, you would have a lot more causes for those types of deaths.

    So, I would keep in mind the suppression, when zooming in and out. And also keep in mind, if, say, you’re looking at “dead on arrival” deaths for every county in a specific state, so many causes of death for those [county-level searches] will be suppressed, that your totals from the counties would not match the actual totals [at the state level]. Because you may not be aware that the CDC is not showing you the values that were suppressed if you didn’t click a specific button—or if you’re quickly adding things.
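
    To make the suppression caveat concrete, here is a small Python sketch of why county-level sums can fall short of the state total. It assumes suppressed cells appear as the text "Suppressed" and that each one hides a count from 1 to 9 (the "under 10" rule described above); the label, the function name, and the counts are illustrative assumptions.

```python
SUPPRESSED = "Suppressed"  # assumed label for a hidden cell in an export


def total_bounds(cells: list[str]) -> tuple[int, int]:
    """Return the lowest and highest possible totals for a column.

    Assumption: every suppressed cell hides a value between 1 and 9.
    """
    low = high = 0
    for cell in cells:
        if cell == SUPPRESSED:
            low += 1
            high += 9
        else:
            low += int(cell)
            high += int(cell)
    return low, high


# Hypothetical county-level counts for one cause of death.
county_counts = ["50", "Suppressed", "12", "Suppressed"]
low, high = total_bounds(county_counts)
print(f"True total is somewhere between {low} and {high}")
```

    Summing only the visible cells here would give 62, which is below even the lowest possible true total. Reporting a range, or pulling the total from a higher-level query, avoids that undercount.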

    BL: Another thing that [our team ran into] is occurrence versus residence—that’s something people need to know about. “Residence” means sorting by where people lived, “occurrence” means sorting by where they died. Those don’t always match up.

    DB: Yes, I would say residence versus occurrence is very important to keep in mind, especially because, when you’re redoing a search and scrolling very fast, you can accidentally fill out a state for occurrence instead of residence. Which actually did happen to me, and then I was confused by my own numbers. Then I noticed that there were a bunch of states coming up that I hadn’t meant to search for, because I, like, filtered by residence and then searched by occurrence.

    So yeah, keeping in mind the difference between residence and occurrence is definitely important. Though if you go back in the historical data [before 2018], it’s just residence—just a single state for each death.

    Also, just clear some extra time to get used to working with the WONDER interface. Because, unlike the CDC data updates that are just on the data.cdc.gov website, which you can quickly download and open up in your technical tool of choice, for WONDER you do have to use the WONDER query site, and it can be difficult to get used to searching and importing.

    BL: I will say one more thing, while we’re on this topic, that I’ve been doing and that might be helpful for other people: make sure that, if you export data from WONDER, that you always save that notes section it gives you at the bottom [of the exported file]. Because that will tell you exactly what you searched for. So, if you want to replicate something later, you can just go back and look at the notes. I feel like my instinct, often, when I’m looking at a dataset, is to delete all the notes and anything I don’t need—so I have to remind myself, like, “No, you should keep this.”

    DB: That’s actually a really good tip, because I do that… I import the data [to my computer] and then I delete all the notes. That’s a great point.
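
    One way to follow this tip in code is to separate the notes from the data instead of deleting them. The Python sketch below assumes the exported file is tab-delimited and that a line containing only "---" marks where the notes begin; both are assumptions about the export layout, so check them against your own file. The function name and sample text are hypothetical.

```python
import csv
import io


def split_export(text: str) -> tuple[list[list[str]], list[str]]:
    """Split an exported file into data rows and trailing notes lines."""
    lines = text.splitlines()
    # Assumption: a line containing only "---" separates data from notes.
    for i, line in enumerate(lines):
        if line.strip().strip('"') == "---":
            data_lines, notes = lines[:i], lines[i + 1:]
            break
    else:
        # No separator found: treat everything as data.
        data_lines, notes = lines, []
    rows = list(csv.reader(io.StringIO("\n".join(data_lines)), delimiter="\t"))
    return rows, [n.strip('"') for n in notes if n.strip()]


# Hypothetical export text, not real query output.
sample = 'State\tDeaths\nLouisiana\t50\n"---"\n"Query Date: (hypothetical note)"\n'
rows, notes = split_export(sample)
print(rows)
print(notes)
```

    Saving the notes to a separate text file next to the cleaned data keeps a record of exactly what was searched for.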

    BL: Also, what recommendations do you have if people are looking for, like, experts to interview about these data? Say a local reporter wants to search for experts in their area, what should they do?

    DB: I can speak about that, because that’s been really useful for me in my reporting. Once you have this data, or once you’ve researched excess deaths in your area, you should talk with an epidemiologist or a social epidemiologist—someone who would know your state, or maybe even your more local area—about the broader mortality trends in your community. That will really give you a deep understanding of, what were the reasons that people were dying before the pandemic? And what has this expert thought about during the pandemic? And what have they heard, or read, or researched about why deaths are increasing? For example, I talked to two epidemiologists in Mississippi while working on our investigation, and they really helped me understand what I was looking at and looking for.

    BL: Awesome. And then, my last, kind-of big picture question is, why does this matter for people who aren’t epidemiologists or COVID reporters?

    DB: That is also a good question. I think the thing that I have been thinking about over and over again, and it’s something that an epidemiologist told me, is that if we understand how people die, then we might know what’s making them sick. And if we know what’s making them sick, then we have a shot at stopping that from happening.

    This data is a very important step in that process, which is learning, in real-time, why people are dying. If we know that, we know what’s making them sick, whether it’s unintentional drug overdoses, or an increase of deaths because of lung cancer or heart disease. Any of those things are important to know, especially in a public health crisis like the one we’re in right now.

    BL: I know we’ve talked before about this sort-of cycle of, what happens when COVID deaths are maybe undercounted in a certain community, and then that contributes to people maybe being less aware of COVID in their community. And then [that lack of awareness] contributes back to the same process.

    DB: Yeah, exactly. I think that’s an important thing as well. Throughout this process—reporting on this topic, and working with this data, and thinking more about death certificates and the information on them—I’ve been increasingly… Not evangelized, exactly, but I’ve seen the light on the importance of that final piece of information of people’s lives. And what it means not only to their families and to the local area and communities, but also what it means when we start pulling that data up to larger and larger groups, and trying to understand: what does this person’s death mean at the level of the county, or the state, or in their racial demographic, or in their age demographic, or by gender?

    All of this is critically important. And it sounds kind-of corny, but in a way, [the death certificate] is like, one really last piece of information that you leave behind for humans after you.


    More national data

  • COVID source shout-out: Community Profile Reports

    We’re now approaching a year since the Department of Health and Human Services (HHS) first started publicly releasing Community Profile Reports, massive documents containing COVID-19 data at the state, county, and metro area levels.

    These reports were originally compiled internally, starting in spring 2020, for meetings of Trump’s White House Coronavirus Task Force. Reporters such as Liz Essley Whyte at the Center for Public Integrity were able to obtain some of the documents, but they remained a mostly-secret trove of data until the HHS started publishing them publicly in late December.

    At the time, I wrote that I was excited about the public release because these reports contain a wealth of information in one place—including contextual data (such as population-adjusted case numbers and demographic information) and rankings for policy-makers built right into the Excel spreadsheets.

    Since then, I’ve relied on Community Profile Reports for weekly data updates in this newsletter, along with numerous other stories. While their update schedule has not remained regular, the reports continue to be a one-stop shop for everything from vaccination rates to hospitalization metrics.

    So, this Thanksgiving weekend, I’m thankful for the Community Profile Reports. According to the HHS site, they’ve been downloaded almost 100,000 times, and probably a solid 300 of those are me.

  • Cases are rising on Thanksgiving again, but we’re better protected this year

    Before the Omicron news hit on Thursday, I was planning to write a big post about how the state of the pandemic in the U.S. at Thanksgiving this year compares to the state of the pandemic at Thanksgiving last year. But, well, Omicron happened—so here’s a small post about Thanksgiving, instead.

    Remember: last year, Thanksgiving was a turning point in the winter 2020 surge. While cases had already been going up prior to the holiday, the convergence of travel, indoor gatherings, and cold weather helped the coronavirus spread further. Christmas did the same thing, one month later.

    This year, we saw cases increase once again in the weeks prior to Thanksgiving. But we’re better protected this time, thanks to vaccines and better knowledge of the virus.

    Let’s look at the national metrics:

    • On November 23, 2021 (two days before Thanksgiving), the seven-day average of new COVID-19 cases was about 94,000 new cases a day. That’s about 45% lower than last year’s number, 170,000 new cases a day (on November 26, 2020, Thanksgiving itself).
    • On November 23, 2021, the seven-day average of new COVID-19 deaths was about 1,000 new deaths a day. That’s about 45% lower than last year’s number, 1,800 new deaths a day.
    • On November 25, 2021 (Thanksgiving day), 43,000 people were hospitalized with COVID-19 in facilities across the U.S. That’s roughly half as many as the number of patients hospitalized last year, 84,000 people.
    • As of November 24, 2021, 196 million Americans are fully vaccinated against COVID-19—and an additional 35 million have received at least one dose, while more than 37 million have received booster shots.
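
    The year-over-year comparisons in the list above are simple percent changes. A minimal Python sketch, using the rounded figures quoted in the bullets (the function name is hypothetical):

```python
def percent_lower(this_year: float, last_year: float) -> float:
    """How much lower this year's value is than last year's, in percent."""
    return (last_year - this_year) / last_year * 100


# Rounded figures quoted in the bullets above.
comparisons = {
    "cases": (94_000, 170_000),
    "deaths": (1_000, 1_800),
    "hospitalized": (43_000, 84_000),
}
for metric, (now, before) in comparisons.items():
    print(metric, round(percent_lower(now, before)))
```

    Because the inputs are already rounded, the results are approximate as well.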

    Clearly, while the trajectory of cases (and other metrics) may be the same as it was last year, the numbers are way lower. But the national metrics obscure local patterns. In some parts of the country, particularly some northern states, case numbers are actually higher at Thanksgiving this year than they were last year.

    You can see how your county is faring on this map, which I put together for a DailyMail.com story on this topic. Use the drop-down menu at the top to click between Thanksgiving 2020 and Thanksgiving 2021.

    For that DailyMail.com story, I asked several COVID-19 experts for their thoughts on this winter’s oncoming surge, as well as their advice for staying safe while gathering for the holidays. Key pieces of advice included:

    • Get vaccinated, including a booster shot if you’re eligible.
    • Get tested prior to travel or large gatherings.
    • Use high-quality masks (especially N95s and KN95s) while traveling.
    • Be aware of case rates at both your point of origin and your destination.
    • If you’re gathering indoors with others, make sure everyone is on the same page about safety.

  • Biden’s new COVID-19 plan excludes data

    No mention of data reporting or infrastructure here. Screenshot taken from whitehouse.gov on September 12.

    On Thursday, President Joe Biden unveiled a major new plan to bring the U.S. out of the pandemic. If you missed the speech, you can read through the plan’s details online.

    Key points include vaccination requirements for large employers, federal workers, and federal contractors; booster shots (if the FDA and CDC approve them); and making rapid tests more accessible for the average American. Much of the plan aligns with safety strategies that COVID-19 experts have been recommending for months—or, in the case of rapid testing access, over a year.

    But I and other data nerd friends were quick to notice that one major topic is missing: data collection. Numerous reports and investigations have demonstrated how the U.S.’s underfunded state and local public health agencies were completely unprepared to collect and report COVID-19 metrics, hindering our response to the pandemic. (This POLITICO investigation is one recent example of such a story.) Local data collection has gotten even worse during the latest surge, as many states cut back on their COVID-19 reporting and the federal government has failed to comprehensively track breakthrough cases.

    As a result, one might expect Biden’s plan to take steps towards improving COVID-19 data collection in the U.S. Perhaps the plan could have provided funding to local public health agencies, tied to a requirement that they report certain COVID-19 metrics on a daily basis. Perhaps it could have included increased tracking for breakthrough cases, or increased genomic sequencing to identify the next variant that inevitably becomes a concern after Delta.

    Instead, the plan’s only mention of “data” is a line about how well the vaccines work: “recent data indicates there is only 1 confirmed positive case per 5,000 fully vaccinated Americans per week.”

    Without prioritizing data, the Biden administration is failing to prepare the U.S.—both for future phases of this pandemic and for future public health crises.

  • A new tracker highlights the racial disparities—and the missing data—in America’s COVID-19 outbreaks

    Screenshot of the Health Equity Tracker showing which states are missing race and ethnicity data for COVID-19 cases.

    Two weeks ago, a major new COVID-19 data source came on the scene: the Health Equity Tracker, developed by the Satcher Health Leadership Institute at Morehouse School of Medicine.

    This tracker incorporates data from the CDC, the Census, and other sources to provide comprehensive information on which communities have been hit hardest by COVID-19—and why they are more vulnerable. Notably, it is currently the only place where you can find COVID-19 race/ethnicity case data at the county level.

    I featured this tracker in the CDD the week it launched, but I wanted to dig more into this unique, highly valuable resource. A couple of days ago, I got to do that by talking to Josh Zarrabi, senior software engineer at the Satcher Health Leadership Institute—and a fellow former volunteer with yours truly at the COVID Tracking Project.

    Zarrabi has only been working on the Health Equity Tracker for a couple of months, but he was able to share many insights into how the tracker was designed and how journalists and researchers might use it to look for stories. We talked about the challenges of obtaining good health data broken out by race/ethnicity, communicating data gaps, and more.

    The interview below has been lightly edited and condensed for clarity.


    Betsy Ladyzhets: Give me the backstory on the Health Equity Tracker, like how it got started, how the different stakeholders got involved.

    Josh Zarrabi: At the beginning of the pandemic, the Satcher Health Leadership Institute at Morehouse School of Medicine saw the lack of good COVID data in the country, and especially the lack of racial data. The COVID Tracking Project kind-of tried to solve that as well with the Racial Data Tracker.

    Morehouse wanted to do something similar. And so they applied for a Google.org grant… After about nine months, the tracker just got released. It went through a couple of different iterations, but what it is now is, it’s a general health equity tracker, so it tracks a couple of different determinants of health. And it really has a focus on equity between races and amplifying marginalized races as much as possible.

    Probably the most innovative thing it does is, it shows COVID rates by race down to the county level. We think that’s relatively hard to find anywhere else. (Editor’s note: It is basically impossible to find anywhere else.)  So that’s probably like the main feature that it has that people care about, but it does track other health metrics. We also have poverty, health insurance, and we try to track diabetes and COPD, but there’s not great data on that, unfortunately, in the United States. We’re planning to add more metrics in the future.

    BL: How does this project build on the COVID Racial Data Tracker? And I know, like APM has a tracker for COVID deaths by race. And there are a couple other similar projects. So what is this one doing that is taking it to the next level?

    JZ: A couple of things. We’re using the CDC restricted dataset. Basically what the dataset looks like is, it’s like a very large CSV file where every single line is an individual COVID case. So we’re able to break it down basically however we want. So we were able to break that down to the county level, state level and national level.

    And what we do is we allow you to compare that [COVID rates] to rates of poverty, and rates of health insurance in different counties. We think that’s pretty innovative, and we’re gonna allow you to compare it to other things in the future. So that’s one thing that we do. And the second thing, which I would say probably makes us stand out the most, is our real focus on racial equity, and showing where the data gaps are and how that affects health equity. So what you’ll notice, if you go to our website, we very prominently display the amount of unknown…

    BL: Yeah, I was gonna ask you about that, because I know the COVID Racial Data Tracker had similar unknown displays. Why is it so important to be highlighting those unknowns? And what do you want people to really be taking away from those red flag notes?

    JZ: We really try to do our best to display the data in context as much as possible. First of all, the most important thing, I think, is just showing the high percentage of unknown race and ethnicity of COVID cases in the United States. For something like 40% of cases, we don’t know the race and ethnicity of the person who had COVID.

    We want people to really think about that when they look at, for example, you’ll notice that it looks like Black Americans are affected to the exact level of their population. Black Americans look like 12% of the population and 11% of cases. But we don’t know the race of 40% of people who have COVID. And so we really wanted people to think about that when they look at these numbers. And it’s the same for American Indian/Alaskan Native populations. It doesn’t look like they’re that heavily affected in the United States. But that’s why we allow you to break down into the county level, where race is not being reported. And so we really want people to look and say, like, oh, wow, like in Atlanta, 60% of cases are not being counted for race and ethnicity.

    We’re not doing any extrapolation. We’re not multiplying, we’re not like trying to guess the races of unknowns, or anything like that. We really want people to think about that, when they’re saying like, oh, wow, it looks like Native American people are not really heavily affected by COVID. It’s like, no, we just don’t know. We don’t know their races, or those people are just not being reported properly by the health agencies.
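
    A short Python sketch can show why the unknowns matter so much without guessing anyone's race. It assumes the quoted 11% is a share of all reported cases and that 40% of cases have unknown race (the interview does not say which denominator the tracker uses, so this is an assumption). The function name is hypothetical, and this is not the tracker's method, which does no extrapolation.

```python
def share_bounds(known_share_pct: int, unknown_pct: int) -> tuple[int, int]:
    """Range a group's true share of cases could fall in, in percent.

    Assumption: known_share_pct is measured against all cases, so the
    unknown cases could belong entirely, partly, or not at all to the group.
    """
    if known_share_pct + unknown_pct > 100:
        raise ValueError("shares cannot add up to more than 100 percent")
    return known_share_pct, known_share_pct + unknown_pct


# Figures quoted in the interview: 11% of cases, 40% unknown race.
low, high = share_bounds(11, 40)
print(f"True share could be anywhere from {low}% to {high}%")
```

    The width of that range is the point: with this much missing data, a share that looks proportionate to population cannot be taken at face value.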

    And if you look at places that have high percentages of Black Americans and high percentages of American Indian/Alaskan Natives, you’ll see that those places are the same places that are not reporting the race and ethnicity of the people who had COVID.

    We had a team of about 20 health equity experts advising us throughout the entire project. That’s where those red flags that you see come from. It’s explaining, for example, if you look into deaths for American Indian/Alaskan Native people, there’s an article about how a lot of those who died are improperly categorized racially, and they’re often categorized as white. And so we have that kind of stuff to really try to put the numbers in context.

    We were only able to do that, because we had this large team of racial equity experts and health equity experts advising us throughout the entire time. And so we really had diverse representation on the project as we were building it, and people who really knew what they were talking about.

    BL: What can public health agencies and also researchers and journalists do to push for better data in this area?

    JZ: The good thing is we are seeing [data completion] get better over time. And so we’ve seen, for example, the percentage of cases with [known] race and ethnicity improved from about 50% to about 60% over the last couple of months.

    And, I mean, really, all you can do is—it’s really a thing that goes down to the county level. So, everybody’s just got to call their county representatives. I’d be like, hey, could you please report the race and ethnicity of the county’s COVID cases to the CDC? Unfortunately, a lot of that work might be too late, because [data were submitted months ago]. But we have seen it get better. And so we’re hoping that, you know, these health agencies are able to do the work and really, like, properly report these cases to the CDC… 

    BL: ‘Cause a lot of it comes from the case identification point, where if you’re not asking on your testing form, what race are you, then you just might not have that information. Or you might be, like, guessing and getting it wrong or something, right?

    JZ: Yeah, there’s guessing. There’s two different categories of unknown cases: there’s unknown and there’s missing. The vast majority of these cases have filled out unknown [in the line file], which means that the person who’s filling out the data form literally puts “unknown” as the race. We don’t really know exactly what that means in every case. But it could be they didn’t ask, it could be the person didn’t feel comfortable saying it, just said, “I don’t want to tell you my race.” Or it could just be that they just didn’t make an effort to figure out what their race is.

    (Editor’s note: For more on the difficulties of collecting COVID-19 race data, I recommend this article by Caroline Chen at ProPublica.)

    BL: Do you have a sense of how that 60% known cases compares to what the COVID Racial Data Tracker had in compiling from the states?

    JZ: Yeah, I think the COVID Racial Data Tracker was a bit higher [in how many cases had known race/ethnicity]. But the thing is, as far as I understand, the COVID Racial Data Tracker was using aggregate numbers.

    BL: We were looking at the states and then kind-of like, synthesizing their data to the best of our ability, which was pretty challenging because every state had slightly different race and ethnicity categories. There were some states that had almost no unknown cases, but there were some where almost all cases or almost all deaths were unknown. New York, I don’t know if they ever started reporting COVID cases by race.

    JZ: They do to the CDC, I don’t think they report—

    BL: They don’t report it on their own, state public health site.

    JZ: Let me actually check that… Yeah, so New York is not great. They have a 60% unknown rate. [Race and ethnicity is only reported to the CDC for 40% of cases.] Not great. Actually, New York City is pretty good. But the rest of New York State is not doing a good job reporting the race and ethnicity of cases.

    BL: Because I’ve gotten tested here, I know that New York City is good about collecting that [race and ethnicity] from everybody.

    JZ: I was one of those cases in New York City, actually. When [I got called by a contact tracer], I was kind of chatting with them about this. They asked me about my race—I actually became a probable case for COVID, like, the day after I started this job. And [NYC Health] called me, they were like, “What’s your race?” I was like, “Oh, that’s kind of funny, I just started working on this racial data project.” And—this is totally anecdotal. But she told me, most people just refuse to report their race. 

    And then for deaths… 40% of COVID deaths in New York state, they don’t know the race, which is not great. New York is not good compared to the rest of the states. It’s one of the worst states for unknowns.

    BL: Could you tell me more about the process of getting the [restricted] case surveillance data from the CDC and how you’ve been using that?

    JZ: The process of getting it’s not that hard. You just apply, and then they give you access to a GitHub repo, and then you can just use it. Using the data itself is pretty hard because the data files are so large. We were lucky enough to have a team of Google engineers working on this project, they wrote a bunch of Python scripts that analyze the data and aggregate it in a way that the CDC isn’t doing.

    The reason why they restrict the use is because it’s line-by-line data. [Each line is a case.] And the CDC does suppress some of the data because they think it would make those cases identifiable. Still, you’re not allowed to just, like, release the data into the wild, because they want to know who else has track of it. So, we wrote some Python to aggregate the data, in exactly the way you see on the website. We aggregate it to the amount of cases, deaths and hospitalizations per county, per race, essentially. 
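
    (Editor’s note: The team’s actual scripts aren’t reproduced here, but the kind of aggregation JZ describes can be sketched in a few lines of Python. This is a rough illustration under my own assumptions: the column names `county_fips`, `race`, `death_yn`, and `hosp_yn`, the “Yes” coding, and the sample rows are all hypothetical. Rows are read one at a time, so a very large file never has to fit in memory.)

```python
import csv
import io
from collections import defaultdict


def aggregate_cases(rows):
    """Collapse line-level case rows into counts per (county, race)."""
    totals = defaultdict(lambda: {"cases": 0, "deaths": 0, "hospitalizations": 0})
    for row in rows:  # one row at a time, so the full file never sits in memory
        key = (row["county_fips"], row["race"])  # hypothetical column names
        totals[key]["cases"] += 1
        if row["death_yn"] == "Yes":  # assumed coding for a death
            totals[key]["deaths"] += 1
        if row["hosp_yn"] == "Yes":  # assumed coding for a hospitalization
            totals[key]["hospitalizations"] += 1
    return dict(totals)


# Made-up rows for illustration only; not real case data.
SAMPLE_CSV = """county_fips,race,death_yn,hosp_yn
36061,Black,No,Yes
36061,Black,Yes,Yes
36061,Unknown,No,No
06037,Hispanic/Latino,No,No
06037,Hispanic/Latino,No,Yes
"""

result = aggregate_cases(csv.DictReader(io.StringIO(SAMPLE_CSV)))
```

    With a real file, the same function could be fed `csv.DictReader(open(path, newline=""))` instead of the in-memory sample. Only the aggregated counts, not the individual lines, would then be published.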

    The CDC has been extremely helpful, like, we’ve had a couple of meetings with them. We think we were one of the heaviest users of the data at the beginning, because we pointed out a couple of problems with the data that they actually fixed. So, that’s cool.

    BL: That’s good to hear that they were responsive.

    JZ: Yeah, definitely. We meet with them every couple of weeks. They’re really good partners in this.

    BL: And they update that [case surveillance] dataset once a month?

    JZ: They started doing it every two weeks now. Every other Monday, they update the dataset.

    BL: Could you talk more about the feature of the tracker that lets you compare COVID to other health conditions and insurance rates? I thought that was really unique and worth highlighting.

    JZ: We wanted to really provide the [COVID] numbers in context. And so that’s one way that we thought that we could do that and really show how… These numbers don’t happen, like a high rate of COVID for race doesn’t happen in a vacuum. There are political determinants of health.

    For example, you’ll see everywhere that Hispanic Americans are just by far the most impacted by COVID case-wise. In California especially. And we provide those numbers in context—Hispanic Americans are also much less likely to be insured than white Americans, for example, and much more likely to be in poverty. And, you know, it’s not a crazy surprise that they would also be more likely to have contracted COVID at some point.

    [The comparison feature] was a way that we thought, we would just allow people to really view numbers in context and get a better understanding of what the political situation is on the ground with where these high numbers are happening.

    BL: What are the next conditions that you want to add to the tracker?

    JZ: I want to be careful, because we can’t make any promises. But we’re talking about adding smoking rates, maybe. [The challenge is] where we can find data that we can aggregate correctly.

    BL: Right. Are you looking specifically for data that’s county level as opposed to state level?

    JZ: Hopefully… It depends. I was pretty surprised by the lack of quality in, for example, COPD and diabetes data, where like, if you look at [the dataset], like it’s state level—but in most states, there’s not a statistical significance for most races.

    BL: Wow.

    JZ: For example, we use the BRFSS survey. [The Behavioral Risk Factor Surveillance System.] It’s a CDC survey. And as far as we can tell, it’s the gold standard for diabetes [data] in the country.

    And if you look at, say, diabetes, for most states… There’s only, like, four states where Asian people are statistically significant in the survey to make any sort of guess about how many people have diabetes, which is pretty atrocious. But that [data source] is the best we could do, you know. Ideally, we would like to find places that do go down to the county level, but it’s hard.

    For as paltry as the COVID data is, it’s much better than—as far as I’ve seen, like, the fact that there’s like a line-by-line database that the CDC provides, that you can really make all these breakdowns of, is a huge step ahead [compared to other health data]. I’m not like a data expert on this kind of stuff, I’ve just been working on this project for two and a half months. But as far as I’ve seen, that’s what the situation is.

    BL: Yeah, I mean, that kind of lines up with what I have seen as well. And I bet a lot of it is a case where, like, a journalist could FOIA [the data] from a county or from a state. But that’s not the same as getting something that is comprehensive, line-by-line, from the CDC.

    JZ: And we [the Satcher Institute] don’t want to be a data collection agency, like the COVID Tracking Project or the New York Times is. I mean, we want this to be a sustainable project. And the COVID Tracking Project was not a sustainable project.

    BL: Yeah, totally. I was there doing the [data entry] shifts twice a week, that’s not something we could have done forever.

    JZ: Yeah, I was there, too. I always think, like, the COVID Tracking Project could only exist when there’s an army of unemployed people who are too afraid to leave their house.

    BL: And volunteers who were like, yeah, sure, I’ll do this on my evenings and weekends.

    JZ: Who, you know, you don’t want to leave, you’re too afraid to go, like talk to people. You want to stay home in front of your computer all day, and feel useful.

    I’m sure you could find all the diabetes data by going to county and state health department websites, but it’s too much work. So we really want everything to come from federal sources, basically, that’s our goal.

    BL: How are you finding that people have used the tracker so far? Like, do you know of any research projects that folks are doing?

    JZ: We released it a couple weeks ago, and we haven’t really heard of any yet… But we hope people are looking at it. And we have a couple of meetings lined up with some interesting research groups and stuff like that. So hopefully, they’ll like it.

    BL: Are there any specific statistics or comparisons or anything else you found in working on it that you would want to see explored further? Are there any stories that you want to see come out of it?

    JZ: The high rates of unknown data in a lot of places, that really needs to be looked into. Because it’s just hard to make any conclusions about what’s going on if… I mean, in some states like New York, over 50% of cases are unknown. That’s a huge problem. And that’s definitely something that needs to be looked into, like, why that’s happening. And if there’s anything that can be done to change that [unknown rate]. The reason why I do think that it can get better is because the COVID Tracking Project racial data had higher completeness rates. And so they [the states] probably do know the races of people who got sick, but they’re just not reporting it for whatever reason.

    And for me, something that’s really stuck out was the extremely high rates of COVID for Hispanic and Latino people, especially in California. If you look at them and compare them to white rates, it’s, like, the exact opposite pattern. So it kind of does look like Hispanic and Latino people were kind-of shielding white people from getting COVID, if you compare the numbers. That’s something I would look into, too, like, why that happened.

    (Editor’s note: This story from The Mercury News goes into how the Bay Area’s COVID-19 response heightened disparities for the region’s Hispanic/Latino population.)

    BL: And another question along the same lines, is there a specific function or aspect of the tracker that you would encourage people to check out?

    JZ: The unknowns. Just, like, look into your county and see what percentage of cases in your county have reported race and ethnicity at all. I think you can really see how good of a job your county has done at reporting that data. I know I was kind-of shocked by that rate for the county like I grew up in, like, I know that they have the resources to [report more data], but they’re just not doing a very good job.

    BL: How would you say this experience with tracking COVID cases might impact the world of public health data going forward, specifically health equity data, and how do you see the tracker project playing a role in that?

    JZ: We really want this project to show the importance of tracking racial health data down to the county level or even lower than that. County is the best we can do right now, but we’d love to see city level or something like that. And again, I kind-of said this before—as much as was missing for the COVID data, it’s still better than the data that there is for most other diseases and other determinants of health. So we would like to see, like, more things able to be filled out on the tracker. We would like to be able to get more granular on more different determinants of health, so that we can see, for example, how poverty impacts health, or a lack of health insurance, or how diabetes and COVID are related down to the county level. You can’t really do that right now… 

    We want people to see that, A, there’s a lot of data missing. But B, even with the data that we have, we can see that there’s like a huge problem. And so we would like to be able to fill out the data more to really get a better picture of what’s going on. If we can see there’s a problem, we can make better policy to help and make these disparities not as stark.

  • Hey CDC, when dashboard?

    As dedicated CDD readers may remember, one of President Biden’s big COVID-19 promises was the creation of a “Nationwide Pandemic Dashboard” that would be a central hub for all the information Americans needed to see how the pandemic was progressing in their communities.

    The Biden administration sees the CDC’s COVID Data Tracker as that dashboard and plans to continue improving it as time goes on, White House COVID-19 Data Director Cyrus Shahpar said in an interview with The Center for Public Integrity last month. But a new report from the Government Accountability Office suggests that the CDC’s tracker has a long way to go before it becomes the centralized system that Americans need.

    The Government Accountability Office, or GAO, is a federal watchdog agency that evaluates other federal agencies on behalf of Congress. Its full report, released last Wednesday, is over 500 pages of problems and recommendations, ranging from the Emergency Use Authorization process to health care for veterans.

    But, as COVID Tracking Project leader Erin Kissane pointed out on Twitter, there are some real data bangers starting in the appendix:

    Here are a few of those data bangers:

    • Recommending that the federal government provide more comprehensive data on who gets a COVID-19 vaccine. The GAO specifically wants to see more data on race and ethnicity, so that the public can gauge how well vaccination efforts are reaching more vulnerable demographic groups. The agency also notes the challenge of finding occupational data on vaccinations, something we’ve bemoaned before at the CDD.
    • Calling out the lack of public awareness for federal data. Some experts the GAO interviewed noted that “the public may be more aware of non-federal sources of data on COVID-19 indicators (e.g., the COVID Tracking Project, Johns Hopkins) than sources from the federal government,” in part because those non-federal sources started providing public data earlier in 2020. The federal agencies need to step up their communications game.
    • Stating the need for central access to federal data. The GAO describes how the HHS lacks a central, public-facing COVID-19 data website, while the CDC’s COVID Data Tracker fails to provide access to the full suite of information available from the HHS. Specific missing data pages include COVID-19 health indicators and vaccine adverse events.

    Overall, the GAO recommends that “HHS make its different sources of publicly available COVID-19 data accessible from a centralized location on the internet.” One would think this is a pretty straightforward recommendation to follow, but HHS reportedly “neither agreed nor disagreed” with the assessment.

    While there’s a lot more to dig into from this report, it is only part of a long evaluation process to improve federal data collection and reporting. The new report is part of a GAO effort that started last March, reports POLITICO’s Sarah Owermohle:

    The latest report is part of nearly yearlong effort by GAO to track the federal pandemic response after a directive in the March 2020 CARES Act. The watchdog first called on CDC to “completely and consistently collect demographic data” including comprehensive results on long-term health outcomes across race and ethnicity, in September. It later criticized the government’s lack of “consistent and complete COVID-19 data” in a January report.

    I, for one, am excited to see what the GAO does next—and how the federal public health agencies respond.

  • New, more local data from the CDC

    The CDC made two major updates to its COVID-19 data this week.

    First: On Tuesday, the agency published a new dataset with more granular information on COVID-19 cases. Like previous case surveillance datasets, this new source compiles cases shared with the CDC, along with anonymized information on their symptoms, underlying medical conditions, race/ethnicity, and other variables. The new dataset is notable because it includes detailed geographic data, going down to the county level.

    After months of no state-by-state demographic data from the federal government, we now have county-by-county demographic data. This is a pretty big deal! It’s also a pretty big dataset; it includes about 22 million cases (out of a total 30 million U.S. cases to date).

    Of those 22 million cases, race is available for about 13 million cases (58%) and ethnicity is available for about 10 million cases (47%). The dataset will be updated monthly, so we may see better completion with further updates. I haven’t had time to do much detailed analysis of the new dataset yet (hell, I haven’t even managed to get it to load on my computer), but I’m excited to dive into it for future issues.

    Second: Vaccination data at the county level are now available on the CDC’s COVID Data Tracker, as of Friday. No, not in the vaccinations section—you need to go to the County View section, then select “Vaccinations” in the dropdown menu. Click on a specific county (or select it using dropdown menus), and you’ll be able to see data for that county.

    County-level vaccination data from the CDC. Screenshot taken on March 27.

    At the moment, only three data points are available: total fully vaccinated population, fully vaccinated population over age 18, and fully vaccinated population over age 65. Also, data are missing for Texas, New Mexico, and select other counties. Still, this is a great start for more standardized vaccination data at the national level. (Can we get more demographic data next?)

    These county-level vaccination data aren’t downloadable directly from the CDC’s tracker, but the COVID Tracking Project is archiving the data at the Project’s public GitHub. The New York Times has also built an interactive map with the data, which you can find on their vaccine tracker.

    It’s worth noting that I found out about both of these updates via tweets from the White House COVID-19 Data Director, Cyrus Shahpar. I’m on both the CDC’s press list and the White House press list, and I watch nearly every White House COVID-19 press briefing, so it seems a little odd that I’m getting the news from Twitter.

    (Not that I don’t love Cyrus’ daily tweets! I just wonder about the PR strategy here. Also, Cyrus, if you’re reading this, that interview request I sent back in January still stands.)

  • Goodnight, COVID Tracking Project

    The COVID Tracking Project’s homepage on March 7, 2021.

    A couple of hours after I send today’s newsletter, I will do my final shift of data entry work on the COVID Tracking Project’s Testing and Outcomes dataset. Then, later in the evening, I will do my final shift on the COVID Racial Data Tracker. And then I will probably spend another hour or two bothering my fellow volunteers on Slack because I don’t want it to be over quite yet.

    In case you aren’t fully embroiled in the COVID-19 data world, here’s some context. Last spring, a few journalists and other data-watchers realized that the U.S.’s national public health agencies weren’t doing a very good job of reporting COVID-19 tests. Alexis Madrigal and Rob Meyer (of The Atlantic) compiled their own count from state public health agencies. Jeff Hammerbacher (of Related Sciences) had independently compiled his own count, also from state agencies. And, as the About page on the website goes: “The two efforts came together March 7 and made a call for volunteers, our managing editor, Erin Kissane joined up, and the COVID Tracking Project was born.”

    Now, one year after that formal beginning of the Project’s test-counting efforts, the team is ending data collection work. Erin Kissane and Alexis Madrigal provided some background for that decision in a blog post published on February 1. I recommend reading the piece in full, if you haven’t yet, but the TL;DR is that a. this data collection work should be done by federal public health agencies, not a motley group of researchers and volunteers, and b. the federal agencies have greatly improved their own data collection and reporting efforts in recent months.

    The Project’s core Testing and Outcomes dataset formally ceases updates today, along with the Racial Data Tracker and Long-Term Care Data Tracker. But the Project has provided a lot of documentation and guidance for data users who want to keep tracking the pandemic, along with analysis that will be useful for months (if not years) to come. The rest of this post shares the highlights from those resources, along with a few personal reflections.

    Where to find your COVID-19 data now

    So, you’re a journalist who’s relied on the COVID Tracking Project’s tweets to illuminate pandemic trends for the past year. Or you’re a researcher who’s linked the Project’s API to your own tracking dashboard. Or you’re a concerned reader who’s checked up on your state regularly, watching the time series charts and annotations. Where do you go for your data now?

    Through a series of analysis posts and webinars over the past few weeks, Project staff have made their recommendation clear: go to the federal government. In recent months, the CDC and the HHS have built up data collection practices and public dashboards that make these data easier to work with.

    Here are a few highlights:

    • For daily updates at all geographic levels, use the Community Profile Reports. After months of private updates sent from the White House COVID-19 task force to governors, the data behind these in-depth reports were made public in December. The PDF reports themselves were made public in January, after Biden took office. The reports include detailed data on cases, deaths, tests, and hospitalizations for states, counties, and metropolitan areas. I’ve written more about the reports here.
    • For weekly updates, use the COVID Data Tracker Weekly Review. As I mentioned in a National Numbers post two weeks ago: the CDC is doing weekly updates now! These updates include reports on the national trends for cases, deaths, hospitalizations, vaccinations, and SARS-CoV-2 variants. They may be drier than CTP blog posts, but they’re full of data. You can also sign up to receive the updates as a newsletter, sent every Friday afternoon—the CDC has really moved into the 21st-century media landscape.
    • For state-specific updates, use the State Profile Reports. Similarly to the Community Profile Reports, these documents provide many major state-level metrics in one place, along with local data and color-coding to show areas of concern. They’re released weekly, and can be downloaded either state-by-state or in one massive federal doc.
    • For case and deaths data, use the CDC’s state-by-state dataset. This dataset compiles figures reported by states, territories, and other jurisdictions. It matches up pretty closely to CTP’s data, though there are some differences due to definitions that don’t match and other discrepancies; here’s an analysis post on cases, and here’s a post on deaths. You can also see these data in the CDC’s COVID Data Tracker and reports.
    • For testing data, use the HHS PCR testing time series. This dataset includes results of PCR tests from over 1,000 labs, hospitals, and other testing locations. Unlike CTP, the federal government can mandate how states report their tests, so this dataset is standardized in a way that the Project’s couldn’t be. Kara Schechtman has written more about where federal testing data come from and how to use them here. The HHS isn’t (yet) publishing comprehensive data on antibody or antigen tests, as these test types are even more difficult to standardize.
    • For hospitalization data, use the HHS hospitalization dataset. I’ve reported extensively on this dataset, as has CTP. After a rocky start in the summer, the HHS has shown that it can compile a lot of data points from a lot of hospitals, get them standardized, and make them public. HHS data for current hospitalizations are “usually within a few percentage points” of corresponding data reported by states themselves, says a recent CTP post on the subject. Find the state-level time series here and the facility-level dataset here.
    • For long-term care data, use the CMS nursing home dataset. The Centers for Medicare & Medicaid Services are responsible for overseeing all federally-funded nursing homes. Since last spring, this responsibility has included tracking COVID-19 in those nursing homes—including cases and deaths among residents and staff, along with equipment, testing availability, and other information. The CMS dataset accounts for fewer overall cases than CTP’s long-term care dataset because nursing homes only account for one type of long-term care facility. But, like any federal dataset, it’s more standardized and more detailed. Here’s an analysis post with more info.
    • For race and ethnicity data, there are a couple of options. The CDC’s COVID Data Tracker includes national figures on total cases and deaths by race and ethnicity (at least, for the 52% of cases and 74% of deaths where demographic information is available). More detailed information (such as state-by-state data) is available on deaths by race and ethnicity via the CDC’s National Center for Health Statistics. A blog post with more information on substitutes for the COVID Racial Data Tracker is forthcoming.

    The COVID Tracking Project’s federal data webinars concluded this past Thursday with a session on race and ethnicity and long-term care facilities. Slides and recordings from these sessions haven’t been publicly posted yet, but you can look out for them on the Project’s website.

    Also, for the more technical data nerds among you: COVID Act Now has written up a Covid Tracking Migration Guide for users of the CTP API, and the Johns Hopkins Coronavirus Resource Center announced that it will begin providing state testing data.

    Analysis and update posts to re-read

    It took a lot of self-control for me to not just link every single CTP article in here. But I’ll give you just a few of my favorites, listed in no particular order.

    What the COVID Tracking Project gave me

    I joined the COVID Tracking Project as a volunteer in early April, 2020. I actually searched back through my calendar to find exactly when I did a data entry training—it was Thursday, April 2.

    At the time, I wanted to better understand the numbers I kept seeing, in tweets and news stories and Cuomo’s powerpoints. But more than that, I wanted to do something. I sat, cooped up in my little Brooklyn apartment, listening to the endless sirens screaming by. I ran to the park and wanted to yell at every person I saw walking without a mask. I donated to mutual aid funds, but even that felt empty, almost impersonal.

    The Project put out a call for volunteers, and I thought, okay, data entry. I can do data entry. I can do spreadsheets. I know spreadsheets.

    Well, I know spreadsheets much better now, almost a year later. I know how to navigate through a state dashboard, find all its data definitions, and puzzle through its update time. But beyond all the technical stuff, volunteering for CTP gave me a sense of purpose and community. No matter how tired or angry the world made me feel, I knew that, for a few hours a week, I’d be contributing to something bigger than myself. My work played a small part in making data accessible, bringing information to a wider audience.

    Much ink has been spilled about how mutual aid groups have helped neighbors find each other, especially during that period of spring 2020 when everything seemed so bleak. I have seen the Project as another form of mutual aid. I’ve given countless hours to CTP over the past year in the form of data entry shifts, analysis, writing, and custom emojis—but those hours have also been given back to me, in everything from Tableau tricks to playlist recommendations. My fellow volunteers, the vast majority of whom I’ve never met in person, are my neighbors. We live in the same spreadsheets and Slack channels; we see the world in the same way. 

    I am beginning to understand how journalism, or something like journalism, can work when it is led by a community. By community, I mean: a group of people united in one mission. And by mission, I mean: bringing information to the public. Accessibility and accountability are common buzzwords right now, I think, but CTP approaches the truth of these principles, whether it’s by doing shifts through Christmas or by writing out detailed process notes on how to navigate Wyoming’s dashboard(s).

    I know why the Project’s data collection efforts are ending. The federal government can compile—and is compiling—data on a far more detailed and standardized level than a group of researchers and volunteers ever could. But I am grateful to have been part of this beautiful thing, so much bigger than myself. It is the bar by which I will measure every organization I join from here on out.

    If you’ve ever read the About page on the COVID-19 Data Dispatch website, you may have noticed a disclaimer stating that, while I volunteer for CTP, this publication is an entirely separate project that reflects my own reporting and explanations. This is true; I’m careful to keep this project distinct. But of course, the COVID-19 Data Dispatch has been influenced by what I’ve learned volunteering for CTP. I have attempted to carry forward those values, accessibility and accountability. I’ll keep carrying them forward. Feedback is always welcome.

    To all my neighbors in the CTP Slack: thank you. And to everyone who has followed the data: there is work still to be done.

    More federal data posts

    • The federal government starts acting like a federal government

      A slide from the January 27 White House COVID-19 briefing, featuring the Biden team’s new commitment to provide states with three weeks’ lead time into their vaccine supply.

      Good afternoon only to the reporters on last Wednesday’s White House COVID-19 press call who told Dr. Anthony Fauci that he was on mute.

      And yes, you read that right: the White House is doing regular COVID-19 press calls again! With Dr. Fauci! Who is now President Biden’s Chief Medical Advisor on COVID-19! And CDC Director Dr. Rochelle Walensky! And chair of Biden’s health equity task force Dr. Marcella Nunez-Smith!

      Okay, that’s enough exclamation points. The briefings, which will be held three times a week, provide data-driven updates on the state of the pandemic and allow journalists to ask hard questions of the Biden administration’s response. In addition to the scientific experts, briefings so far have featured White House advisors/COVID-19 coordinators Jeff Zients and Andy Slavitt, who can speak to the more logistical aspects of the administration’s actions.

      This is, essentially, what a responsible federal government should have been doing since January 2020. But after a year of the Trump administration’s confusion, lack of coordination, and outright lies, it’s refreshing to watch a White House COVID-19 briefing in which every statement doesn’t need to be rigorously fact-checked in real-time.

      Besides the press briefings, here are a couple of moves the Biden team made this week that underscore the new administration’s commitment to better (and more transparent) COVID-19 data:

      • Publicly releasing the COVID-19 State Profile Reports: Since last spring, the White House COVID-19 Task Force has regularly compiled detailed reports to help national and state leaders respond to the pandemic. The reports include COVID-19 data for states, counties, and cities, along with specific assessments on where governors and state public health officials should focus their efforts in order to control the virus’ spread. In late December, the data behind these reports were released to the public; here’s a CDD post with more info on that release. Biden’s COVID-19 Task Force has kept the data releases going, and this week, they also shared the PDF reports themselves. What’s more, new White House COVID-19 Data Director Cyrus Shahpar made this release his first Tweet on his new official account, and he thanked public advocates for these data, such as the Center for Public Integrity’s Liz Essley Whyte and COVID Exit Strategy’s Ryan Panchadsaram. The release indicates a new commitment to data transparency that we did not see from Trump’s White House for the majority of his tenure.
      • Updating the CDC’s COVID-19 dashboard: The CDC has been building out a COVID-19 tracker since the spring, featuring data on cases, testing, vulnerable populations, and (since December) vaccination. But it got a major upgrade this week: the dashboard now has a curated landing page and a sidebar menu that makes it much easier for users to see all the available data. This dashboard also now includes those State Profile Reports I mentioned above, making it easy for users to find information about their regions. And, under the “Your Community” label, you’ll also find an interactive COVID-19 vulnerability index: select your county, and the map will show you how susceptible you are to the pandemic based on your community’s current infection rate, testing, population demographics, health disparities, and more.
      • More lead time for vaccine distribution: Last week, I discussed how unpredictable vaccine shipments from the federal government were making it difficult for states—and by extension, local public health departments and individual providers—to coordinate their dose administration. Biden’s team improved the situation this week by giving states their shipment numbers three weeks in advance. The extended lead time will allow vaccine providers to plan out appointments and coordinate other logistics in order to ensure all doses are used. Both the CDC’s Pfizer and Moderna distribution datasets were most recently updated on January 26, with allocation numbers for January 25 and February 1.
      • Stepping up the genomic surveillance: In both of this week’s White House COVID-19 briefings, CDC Director Rochelle Walensky announced that the agency is actively looking for new SARS-CoV-2 variants by working with local and international partners. She gave some specifics in Friday’s briefing: “We are now asking for surveillance from every single state,” she said, requiring states to sequence 750 strains each week. Collaborations with both commercial labs and research universities will take the surveillance to thousands of strains per week. As Sarah Braner wrote earlier in January, such surveillance is key to understanding how prevalent the new—and more contagious—coronavirus strains are in the U.S., as well as to detecting future strains that may become a threat in the coming months.

      It looks like the CDC may be on its way to adapting its current dashboard into the Nationwide Pandemic Dashboard that Biden promised in his transition plan. But I, for one, am trying not to get too comfortable. The statements still need to be fact-checked, and the hard questions need to be asked. Biden’s team is making the bare minimum look nice—albeit with a few Zoom glitches.

      As I look forward into my coverage of the Biden administration’s COVID-19 response, and its healthcare policies more broadly, I’m thinking about this quote from Chris La Tray in his most recent newsletter issue, “Same as it Ever Was”:

      “I’m already sick of all the white liberal people humping each other’s legs every time Biden does something that is simply his damn job. “It’s so nice to have a president that….” Blech. Puke. There is copious lingering accountability to be addressed and Joe goddamn Biden is neck deep in it. We are not going back to anything that resembles the last 40 years of his political career, our only way is forward.”

      Our only way is forward. To end this pandemic, to prepare for the next one.

      Related posts

      • Can Biden clean up America’s COVID-19 data?

        President Biden signing executive orders related to COVID-19 on January 21. Screenshot via the White House’s livestream.

        Shortly after President Joe Biden’s inauguration, the official White House website got a makeover. It now hosts the president’s priorities and COVID-19 plan—including a promise to create a “Nationwide Pandemic Dashboard.”

        I wrote about this promise in November, when it first appeared on Biden’s transition plan website. The promise hasn’t changed since then:

        Create the Nationwide Pandemic Dashboard that Americans can check in real-time to help them gauge whether local transmission is actively occurring in their zip codes. This information is critical to helping all individuals, but especially older Americans and others at high risk, understand what level of precaution to take.

        We don’t have a clear timeline for this dashboard yet, of course, much less details on what it will include. But the foundation was laid this week: Biden released a detailed national COVID-19 plan and signed 30 executive orders—three of which are directly related to tracking the pandemic.

        In the coming weeks, I’ll be closely watching to see how the Biden administration follows through on these plans. Will the new administration build on the strengths of existing federal and state data systems, or will it tear down old systems and sow unnecessary confusion?

        What Biden is promising:

        • A Nationwide Pandemic Dashboard: We covered this one already. Biden’s national strategy document specifies that the federal government will track cases, testing, vaccinations, and hospital admissions—and will “make real-time information available.” The “real-time” promise here is worth highlighting, as real-time pandemic data do not actually exist; every metric from cases to vaccinations has its own lag based on reporting and data-sharing technologies. (COVID-19 deaths, in particular, may be reported weeks after they occur.) Still, the federal government is already tracking all of these metrics. The Biden team’s goal, then, is to consolidate them into an easily accessible dashboard that is widely used by everyone from county public health leaders to elementary school teachers.
        • Coordinated federal data collection: One of Biden’s executive orders, signed on January 21, requires several federal agencies to “designate a senior official” who will lead that agency’s COVID-19 data collection. The officials must both coordinate with each other and make data public. Meanwhile, the Department of Health and Human Services secretary will review the national public health data systems and figure out how to increase their efficiency and accuracy. (Xavier Becerra, Biden’s pick for HHS secretary, hasn’t been confirmed by the Senate yet; will this review need to wait until he officially starts the position?)
        • A focus on equity: Another Biden executive order promises to address the disproportionate impact that COVID-19 has had on people of color and other minority communities. The executive order specifically calls out a lack of standardized COVID-19 data on these communities, saying this data gap has “hampered efforts to ensure an equitable pandemic response.” Biden’s COVID-19 Health Equity Task Force will be required to address this data gap by coordinating with federal agencies—both expanding data collection for underserved populations right now and making recommendations to prevent this issue in future public health crises. This task is easier said than done, though; a recent STAT News article called using data to ensure vaccination equity one of the biggest challenges Biden faces as he takes office.
        • School data collection: Last week, I wrote that there was no mention of data-gathering in Biden’s K-12 COVID-19 plan. Well, maybe someone from his team reads the COVID-19 Data Dispatch, because his executive order on supporting school reopening requires data collection in two areas: data to inform safe reopening of K-12 schools, and data to understand the pandemic’s impact on students and educators. I would have liked to see a more specific promise to track COVID-19 cases, tests, and student enrollment in public schools, but this is a good start.
        • Data-based briefings: Jen Psaki, the new White House press secretary, said on Wednesday that the administration would hold regular briefings with health officials, “with data.” Ideally, such briefings should explain trends in COVID-19 data and put numbers into context for the Americans watching at home.

        The promises are, well, promising. And I’m rooting for President Biden! Seriously! My job would be way easier if I could just give you all updates using one centralized dashboard each week. But I’ve spent enough time hacking through the weeds of this country’s highly confusing, irregular data systems to know that the new president can’t just flip a switch and make a nationwide pandemic dashboard magically appear on whitehouse.gov.

        If anyone from the Biden administration is reading this, hello! Please put me on all your press lists! And here’s what this data reporter would, personally, like to see you focus on.

        What I want to see:

        • Don’t break what we already have: Or, build on the existing federal data systems (and dashboards) rather than creating something entirely new. Last week, Alexis Madrigal published a feature in The Atlantic advocating for the new administration to keep COVID-19 hospitalization data under its current HHS control rather than transferring this responsibility back to the CDC. I’ve covered the HHS’s hospitalization data extensively in the CDD, but this feature really paints a cohesive picture of the dataset, from its turbulent, politically charged beginnings to its current, comprehensive, trustworthy format. The story is worth a read. And on a similar note, I’ve been glad to see federal data sources like the CDC’s dashboard and the Community Profile Reports continue to update on their usual schedules. Biden’s team should seek to improve upon these systems and make them easier to access, not start from scratch.
        • More public metadata: When the federal government has put out large data releases in recent months, responsibility has largely fallen on journalists and other outside communicators to make those releases accessible. I’ve done some of that work in this publication and at the COVID Tracking Project. But it shouldn’t really be my job—the federal agencies that put out these datasets should be releasing FAQ documents, holding press calls, and generally making themselves available to help out researchers and communicators who want to use their data.  
        • Count the rapid tests: Since August, I’ve called on the federal public health agencies to release national data on antigen tests and other types of rapid tests. A recent article in The Atlantic by Whet Moser makes clear that data for these tests are still widely unavailable. Moser writes that antigen test numbers are not reported at the federal level, and at the state level, such reporting is highly fractured and inconsistent; as a result, about three-quarters of the antigen tests that the federal government has distributed are unaccounted for in public data. The HHS should focus on tracking these tests as comprehensively as it has tracked PCR tests, and it should make the numbers publicly available.
        • Survey the genomes: Another massive challenge that the U.S. faces right now is keeping track of the SARS-CoV-2 variants that are circulating through the population, some of which may be more contagious or more life-threatening. As Sarah Braner reported two weeks ago, the majority of COVID-19 cases aren’t genomically sequenced, making it difficult for us to know how many of those cases are new strains as opposed to the regular coronavirus that we’ve all come to know and hate over the past year. Biden’s health and science leadership should make it a priority to step up the nation’s genetic sequencing game, and all of those data should be publicly shared.
        • Support the local public health agencies: Nationwide data coordination is obviously important, and is something that’s been desperately needed since last spring. But most of the COVID-19 data work—logging test results, standardizing those test results, sending them to a central location—is done by state and local public health officials. Local public health agencies, in particular, have been under-funded and threatened by partisan policies since before the pandemic started. To truly improve COVID-19 data collection, the Biden administration must provide support to these local agencies in the form of funding, personnel, technology, and truly anything else they need right now.

        When Biden’s nationwide pandemic dashboard does drop, you’d better believe I’ll be giving it a comprehensive review. For now, if you want to see how well Biden’s doing at keeping his campaign trail promises, I recommend checking out Politifact’s Biden Promise Tracker.

        Related posts