I like to answer questions. I’m pretty good at explaining complicated topics, and when I don’t know the answer to something, I can help someone find it. These days, that tendency manifests in everyday conversations, whether it’s with my friend from high school or a Brooklyn dad whose campsite shares a firepit with my Airbnb. I make sure the person I’m talking to knows that I’m a science journalist, and I invite them to ask me their COVID-19 questions. I do my best to be clear about where I have expertise and where I don’t, and I try to point them to sources that will fill in my gaps.
I want this newsletter to feel like one of those conversations. I started it when hospitalization data switched from the auspices of the Centers for Disease Control and Prevention (CDC) to the Department of Health and Human Services (HHS), and I realized how intensely political agendas were twisting public understanding of data in this pandemic. I wanted to answer my friends’ and family members’ questions, and I wanted to do it in a way that could also become a resource for other journalists.
This is the newsletter’s tenth week. As I took a couple of days off to unplug, it seemed a fitting time to reflect on the project’s goals and on how I’d like to move forward.
What should data reporting look like in a pandemic?
This is a question I got over the weekend. How, exactly, have the CDC and the HHS failed in their data reporting since the novel coronavirus hit America back in January?
The most important quality for a data source is transparency. Any figure will only be a one-dimensional reflection of reality; it’s impossible for figures to be fully accurate. But it is possible for sources to make public all of the decisions leading to those figures. Where did you get the data? Whom did you survey? Whom didn’t you survey? What program did you use to compile the data, to clean it, to analyze it? How did you decide which numbers to make public? What equations did you use to arrive at your averages, your trendlines, your predictions? And so on and so forth. Reliable data sources make information public, they make representatives of the analysis team available for questions, and they make announcements when a mistake has been identified.
Transparency is especially important for COVID-19 data, as infection numbers drive everything from which states’ residents are required to quarantine for two weeks when they travel, to how many ICU beds at a local hospital must be ready for patients. Journalists like me need to know what data the government is using to make decisions and where those numbers are coming from so that we can hold the government accountable; but beyond that, readers like you need to know exactly what is happening in your communities and how you can mitigate your own personal risk levels.
In my ideal data reporting scenario, representatives from the CDC or another HHS agency would be extremely public about all the COVID-19 data they’re collecting. It would publish these data in a public portal, yes, but this would be the bare minimum. This agency would publish a detailed methodology explaining how data are collected from labs, hospitals, and other clinical sites, and it would publish a detailed data dictionary written in easily accessible language.
And, most importantly, the agency would hold regular public briefings. I’m envisioning something like Governor Cuomo’s PowerPoints, but led by the actual public health experts, and with substantial time for Q&A. Agency staff should also be available to answer questions from the public and direct them to resources, such as the CDC’s pages on childcare during COVID-19 or their local registry of test sites. Finally, it should go without saying that, in my ideal scenario, every state and local government would follow the same definitions and methodology for reporting data.
Why am I doing this newsletter?
The CDC now publishes a national dataset of COVID-19 cases and deaths, and the HHS publishes a national dataset of PCR tests. Did you know about them? Have you seen any public briefings led by health experts about these data? Even as I wrote up this description, I realized how deeply our federal government has failed at even the basics of data transparency.
Neither the CDC nor HHS even published any testing data until May. Meanwhile, state and local public health agencies are largely left to their own devices, with some common definitions but few widely enforced standards. Florida publishes massive PDF reports, which fail to include the details of their calculations. Texas dropped a significant number of tests in August without clear explanation. Many states fail to report antigen test counts, leaving us with a black hole in national testing data.
Research efforts and volunteer projects, such as Johns Hopkins’ COVID-19 Tracker and the COVID Tracking Project, have stepped in to fill the gap left by federal public health agencies. The COVID Tracking Project, for example, puts out daily tweets and weekly blog posts reporting on the state of COVID-19 in the U.S. I’m proud to be a small part of this vital communication effort, but I have to acknowledge that the Project does a tiny fraction of the work that an agency like the CDC would be able to mount.
Personally, I feel a responsibility to learn everything I can about COVID-19 data, and share it with an audience that can help hold me accountable to my work. So, there it is: this newsletter exists to fill a communication gap. I want to tell you what state and federal agencies are doing—or aren’t doing—to provide data on how COVID-19 is impacting Americans. And I want to help you attain some data literacy along the way. I don’t have fancy PowerPoints like Cuomo or fancy graphics like the COVID Tracking Project (though my Tableau skills are improving!). But I can ask questions, and I can answer them. I hope you’re reading this because you find that useful, and I hope this project can become more useful as it grows.
America is moving into what may be a long winter, with schools open and the seasonal flu incoming. (If you haven’t yet, this is your reminder: get your flu shot!) I’m in no position to hypothesize about second waves or vaccine deployment, but I do believe this pandemic will not go away any time soon.
With that in mind, I’d like to settle in this newsletter for the long haul. And I can’t do it alone. In the coming months, I want this project to become more reader-focused. Here are a couple of ideas I have about how to make that happen; please reach out if you have others!
- Reader-driven topics: Thus far, the subjects of this newsletter have been driven by whatever I am excited and/or angry about in a given week. I would like to broaden this to also include news items, data sources, and other topics that come from you.
- Answering your questions: Is there a COVID-19 metric that you’ve seen in news articles, but aren’t sure you understand? Is there a data collection process that you’d like to know more about? Is there a seemingly-simple thing about the virus that you’ve been afraid to ask anywhere else? Send me your COVID-19 questions, data or otherwise, and I will do my best to answer.
- Collecting data sources: In the first nine weeks of this project, I’ve featured a lot of data sources, and the number will only grow as I continue. It might be helpful if I put all those sources together into one public spreadsheet to make a master resource, huh? (I am a little embarrassed that I didn’t think of this one sooner.) I’ll work on this spreadsheet, and share it with you all next week.
- Events?? One of my goals with this project is data literacy, and I’d like to make that work a little more hands-on. I’m thinking about potential online workshops and collaborations with other organizations. I’m also looking into potential funding options for such events; there will hopefully be more news to come on this front in the coming weeks.