The 20 best COVID-19 data stories of 2020

Here are 20 stories that have uncovered significant patterns of the pandemic, demonstrated a mastery of craft, and inspired me to be a better data journalist.

(Disclaimer: I primarily read U.S. coverage from national and New York City-specific publications, so this list is not as diverse as I’d like; still, I did my best to include a variety of outlets and topics, featuring data viz-heavy stories as well as more traditional articles which explain COVID-19 numbers.)

  • Edward Holmes’ tweet announcing that the novel coronavirus genome has been posted (Jan. 10): Okay, so this isn’t technically a work of data journalism. But it seemed crucial to me that I include the most important tweet of the year. When Holmes publicly shared the genome of SARS-CoV-2 (sequenced by Shanghai professor Yong-Zhen Zhang), scientists around the world immediately sprang into action, developing tests and therapeutics for the novel virus. “Please feel free to download, share, use, and analyze this data,” a note on the Virological.org posting reads. And scientists did: the first vaccines were designed within days.
  • Limited data may be skewing assumptions about severity of coronavirus outbreak, experts say (STAT News, Jan. 30): Helen Branswell’s diligent record on covering COVID-19 speaks for itself: I had to go eight pages back in her archive to find stories from January. (Her first story on the virus was published on January 4.) This January 30 piece points out how a limited case definition hindered Chinese scientists attempting to determine how far the virus had spread through the country. Throughout the pandemic, Branswell has been an experienced voice who can clearly spell out the implications of medical data, as she does here: she explains why the severe COVID-19 cases that had been reported so far were the “tip of the iceberg.”
  • The Strongest Evidence Yet That America Is Botching Coronavirus Testing (The Atlantic, March 6): I wish I could include every single one of Alexis Madrigal and Rob Meyer’s COVID-19 data stories in this list; throughout the pandemic, these reporters have used data from the COVID Tracking Project (which they cofounded) to explain major COVID-19 trends and draw attention to issues in the U.S. Their work shows how journalists can benefit from truly getting inside of a dataset and spending months watching the same metrics. I chose these reporters’ first story, however, because it was the basis for the COVID Tracking Project itself. “How many people have actually been tested for the coronavirus?” Madrigal and Meyer ask. The answer, it turns out, took hundreds of volunteers, intensive infrastructure, and endless partnerships that spanned far beyond March.
  • Why It’s So Freaking Hard To Make A Good COVID-19 Model (538, March 31): At a time when it seemed like every other Twitter account suddenly belonged to an armchair epidemiologist, 538’s Maggie Koerth, Laura Bronner, and Jasmine Mithani swept in to expound upon the complexities of infectious disease modeling. The article uses simple graphics—flowcharts of color-coded boxes—to show all the factors that can go into calculations of how many people might get sick and die during the COVID-19 pandemic. Rereading it this week, I was struck by how relevant the story still is in articulating fundamental uncertainties about this virus.
  • Mapping Covid-19 outbreaks in the food system (Food & Environment Reporting Network, April 22/ongoing): Meatpacking plants and other food processing facilities have been some of the biggest outbreak sites in the U.S., but most government sources do not report specifically on these outbreaks. Reporter Leah Douglas has singlehandedly filled this gap by synthesizing reports from local news outlets, health agencies, and food production companies. She has updated the data visualizations in this story regularly since April. As of December 18, the most recent update, at least 1,257 meatpacking and food processing plants have seen COVID-19 cases. Tyson Foods has seen the most cases, at over 11,000.
  • How to Understand COVID-19 Numbers (ProPublica, July 21): Caroline Chen is a veteran infectious disease reporter who lived through Hong Kong’s SARS outbreak and reported on Ebola. With the help of designer Ash Ngu, she walks readers through a couple of key principles in understanding—and reporting—COVID-19 data. The story explains why to use seven-day averages over raw case numbers, how to understand test positivity rates, and more. I covered it in my first newsletter issue back in July and was inspired to write my own “how to understand COVID-19 numbers” story for Stacker in the fall.
  • To Navigate Risk In a Pandemic, You Need a Color-Coded Chart (WIRED, July 21): In this delightfully meta story, Maryn McKenna unpacks the design choices that go into those green-to-red risk charts that were widely shared across social media when states began reopening in the summer. She explains the challenge of taking risk—something that is inherently impossible to fully quantify—and putting it into one-size-fits-all guidance. True COVID-19 risk, the story explains, must incorporate one’s location, environment, behavior, and many more factors.
  • Which Cities Have The Biggest Racial Gaps In COVID-19 Testing Access? (538, July 22): A lot of journalists have tried to explain how systemic racism in America led to disproportionately high COVID-19 cases and deaths for the Black community. But this story, by a team of six 538 researchers and designers, is particularly effective. The graphics demonstrate a clear disparity: “testing sites in and near predominantly Black and Hispanic neighborhoods are likely to serve far more patients than those near predominantly white areas.” In South Texas, for example, a single testing site may have served 600,000 people—leading to extensive test wait times and other barriers to healthcare for COVID-19 patients.
  • Thousands of Texans are getting rapid-result COVID tests. The state isn’t counting them. (Houston Chronicle, Aug. 2): Fun story about this one: back in August, when I was working on my antigen testing issue, I needed to cite this piece on the disconnect between how antigen tests were being reported by Texas’ state public health agency and how they were being reported by several Texas counties. I paid for a subscription to the Houston Chronicle to get around the site’s paywall. And then, probably because I am a Millennial/Gen Z cusp who hates unnecessary phone calls… I never canceled my subscription. I have no regrets, though—the Houston Chronicle does good work. This particular story provided a clear explanation of antigen test reporting issues long before many other news outlets became aware of the test type.
  • Why the United States is having a coronavirus data crisis (Nature, Aug. 25): This story, by Nature’s Amy Maxmen, uses global context to explain why it is so damn hard for the U.S. to collect and share COVID-19 data. While South Korea has coordinated case reporting and contact tracing from 250 regional public health agencies, local agencies in the U.S. are overworked, underpaid, and relying on outdated technology. The article also discusses how a lack of federal leadership and data standards trickles down to make data collection, analysis, and transparency harder for epidemiologists.
  • A long time to wait (Spotlight PA, Sept. 24): There was a period in summer 2020 during which Sara Simon tweeted every day about delays in Pennsylvania’s COVID-19 reporting. The state often reported COVID-19 deaths months later than they had occurred, due to an antiquated data system that was not updated in time for Pennsylvania’s outbreaks—and caused additional confusion for public health workers and state data watchers alike. Simon and her colleagues’ story explores these reporting issues, while a data visualization of the death reporting lag in every state provides context.
  • Data Journalists’ Roundtable: Visualizing the Pandemic (The Open Notebook, Sept. 29): This roundtable interview brings together four data journalists to share the design choices behind COVID-19 graphics they produced. It includes both discussions of the journalists’ biggest challenges and behind-the-scenes notes on specific charts, ranging from a visualization of cell phone data to one of high-risk health conditions in minority communities. (One of the graphics featured is, in fact, a chart from the 538 article on COVID-19 modeling that I highlighted earlier in this list.)
  • This Overlooked Variable Is the Key to the Pandemic (The Atlantic, Sept. 30): Never has a science writer elaborated upon a single variable so expertly as Zeynep Tufekci does in this story. She uses k, a measure of how a virus disperses, to explain why some COVID-19 patients are able to infect many other people—in what epidemiologists call superspreading events—while other patients do not infect anyone else at all. The story walks readers through an immense amount of scientific evidence while clarifying basic principles with easy-to-grasp analogies.
  • Covid-19’s stunningly unequal death toll in America, in one chart (Vox, Oct. 2): This story lives up to its headline’s promise. The chart in question, by Vox’s Christina Animashaun, visualizes COVID-19 death rates with small human icons: each “person” represents one in 100,000 Americans who have died from the disease. As of early October, 98 of every 100,000 Black Americans had died from COVID-19, compared to 47 of every 100,000 white Americans. As of December 26, those figures stand at 126 of every 100,000 Black Americans and 74 of every 100,000 white Americans.
  • Test Positivity in the US Is a Mess (The COVID Tracking Project, Oct. 8): Out of the many informative blog posts produced by the COVID Tracking Project since last spring, this is the one I’ve shared most widely. Project Lead Erin Kissane and Science Communication Lead Jessica Malaty Rivera clearly explain how COVID-19 test positivity—what should be a simple metric, the share of tests conducted in a given region that return a positive result—can be calculated in several different ways. Graphics by Júlia Ledur illustrate the different options, with the help of a cartoon COVID-19 patient called Bob. The post both highlights a major issue in COVID-19 data reporting and explains why the Project does not report test positivity on its own site.
  • We Don’t Really Know if COVID is Spreading in Lincoln Schools (Seeing Red Nebraska, Oct. 13): This local news story takes a deep dive into reporting issues in the Lincoln Public Schools district. Reporter Trish Wonch Hill explains why the school district’s data dashboard is “close to useless,” unpacks a flaw in the district’s contact tracing protocol that discounts in-school disease spread, and highlights a group of parents who have been tracking school cases on their own crowd-sourced dashboard. Data on COVID-19 in schools have been severely lacking throughout the pandemic—every local news outlet should be conducting this type of investigation.
  • A room, a bar and a classroom: how the coronavirus is spread through the air (El País, Oct. 28): This set of data visualizations by Madrid-based newspaper El País was shared far and wide after its publication in the fall, and for good reason. As a reader scrolls through the charts, they clearly see how the novel coronavirus may travel through aerosols, or small air particles, in an indoor space. The charts effectively dispel widespread beliefs that sitting six feet apart or keeping masks on throughout a long conversation will protect everyone in the room from getting infected.
  • Pandemic Backlash Jeopardizes Public Health Powers, Leaders (KHN, Dec. 15/ongoing): Since the summer, reporters at KHN and The Associated Press have produced stories in the publications’ joint “Underfunded and Under Threat” series, highlighting how public health departments across the nation were ill-prepared for the pandemic. (The dataset behind this series was a featured source in one of my early issues back in August.) This story focuses on the leaders of local public health agencies who have faced pressure to leave their jobs during the pandemic, putting faces to the impacts of budget cuts and anti-mask threats.
  • 1 in 5 Prisoners in the U.S. Has Had COVID-19 (The Marshall Project, Dec. 18/ongoing): Similarly to the KHN story above, this article by criminal justice-focused outlet The Marshall Project is part of a broader reporting project. Since March, the Project has been compiling data on COVID-19 cases and deaths in prisons around the country, in partnership with The Associated Press. (Dataset available here.) This December article visualizes the full brunt of the pandemic in each state’s prisons—in South Dakota, three out of five prisoners have been infected—while also telling several individual stories about the people who have gotten sick in prison and the advocates who are fighting for them.
  • Remembering the New Yorkers We’ve Lost to COVID-19 (THE CITY, ongoing): Nonprofit local newsroom THE CITY is building an online memorial of the New Yorkers who have died due to COVID-19. As of December 18, the memorial includes 1,946 names, about 8% of the more than 24,000 New Yorkers who have been lost. Earlier in December, THE CITY hosted a two-day event series to honor the dead, including readings of poetry and the obituaries written by the publication’s staff. I also participated in a protest last summer during which hundreds of these names were read aloud; it was a sobering reminder of the people behind the COVID-19 data I use in my work every day.
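
For readers who want to try two of the ideas from the ProPublica and COVID Tracking Project entries above, here is a minimal Python sketch of a trailing seven-day average and of test positivity computed with two different units. All numbers, function names, and the choice of units (tests versus people) are hypothetical assumptions for illustration, not figures or methods taken from either story.

```python
def seven_day_average(daily_counts):
    """Trailing seven-day mean; None until seven days of data exist."""
    averages = []
    for i in range(len(daily_counts)):
        if i < 6:
            averages.append(None)  # not enough history yet
        else:
            window = daily_counts[i - 6 : i + 1]  # today plus the six prior days
            averages.append(sum(window) / 7)
    return averages


def positivity(positives, denominator):
    """Share of results that were positive; None if nothing was counted."""
    return positives / denominator if denominator else None


# Hypothetical daily case counts with a weekend dip and a Monday backlog.
daily_cases = [140, 90, 60, 210, 190, 170, 190, 140, 160]
smoothed = seven_day_average(daily_cases)

# Hypothetical testing totals for one region, counted in two different units.
by_tests = positivity(500, 10_000)   # positive tests / all tests
by_people = positivity(400, 4_000)   # positive people / all people tested

print(smoothed[6:])         # averages for days 7 through 9
print(by_tests, by_people)  # same region, two different "positivity rates"
```

In this made-up example, the raw counts swing between 60 and 210 while the seven-day average stays between 150 and 160, and the same region reports either 5% or 10% positivity depending only on whether tests or people are counted.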
