Tag: Featured sources

  • Featured sources, Jan. 10

    This week’s featured sources are all about hospitalizations and treatments. See the full CDD source list here.

    • Hospital facilities visualization by the COVID Tracking Project: Last month, the Department of Health and Human Services (HHS) released an extensive dataset showing how COVID-19 patients are impacting hospitals at the individual facility level. (See my Dec. 13 post for more information on this dataset.) The COVID Tracking Project has produced an interactive visualization from this dataset, allowing users to zoom in to individual facilities or search for hospitals in a particular city or ZIP code. I contributed some copy to this page.
    • Therapeutics distribution (from HHS): HHS is posting a list of locations that have received monoclonal antibody therapies to treat COVID-19. Bamlanivimab, one such therapy, received an Emergency Use Authorization (EUA) from the FDA in early November. The HHS page notes that this is not a complete list: “Although monoclonal antibody therapeutic treatments have been shipped nationwide, shipment locations are displayed for those States that have opted to have their locations displayed on this public website.”
    • Hospital discharge summaries (from the Healthcare Cost and Utilization Project): This project, under the HHS umbrella, posts time series data on U.S. hospital patients. The site recently posted summaries on patients from April to June 2020, including datasets specific to COVID-19, flu, and other viral respiratory infections. As epidemiologist Jason Salemi explains in a summary Twitter thread, the data doesn’t provide new information but may be useful for a researcher looking to dig into spring and summer hospitalization trends.
  • Your guide to choosing a COVID-19 data source

    In preparing for this re-launch, I asked a few of my readers what they liked about the COVID-19 Data Dispatch and how it could better serve them. One common answer was that the publication has helped readers navigate the landscape of COVID-19 data sources, and pick the best source for a given story.

    The first two resource pages I’ve produced take this service to the next level.

    First: The Featured Source List is an upgraded version of the Google spreadsheet I’ve been using to keep track of data sources featured in the newsletter since July. You can use the table to search, sort, and filter all 82 featured sources by their names and categories. The little green plus icons toggle expanded views, with more details on every source. Much friendlier than a spreadsheet! (Though, if you want to see the raw spreadsheet, it’s still accessible here.)

    Second: The Data Source Finder tool tells you exactly where to find the data you need for a given story. (Or for a Facebook post, or an argument with your friend, and so forth.) The tool includes detailed annotations on 16 data sources which I consider the primary COVID-19 sources in the U.S.

    Here’s how to use it. You start out by selecting the geographic scale on which you’d like to see data (global, U.S. states, counties, or cities), then choose the type of metric you’re looking for. The tool will return your options, including each dataset’s available metrics, methodologies, update schedule, download links, and more.

    It’s essentially an interactive flowchart, designed to make it easy for reporters on deadline and students engaged in Twitter debates alike to compare and contrast sources. You can also find the full set of annotations linked on the page.

    While I compiled the annotations, the interactive tool was coded in Twine by my girlfriend, Laura Berry. Your membership fees will help me buy Laura a nice dinner to thank her for her work.

  • The 20 best COVID-19 data stories of 2020

    Here are 20 stories that have uncovered significant patterns of the pandemic, demonstrated a mastery of craft, and inspired me to be a better data journalist.

    (Disclaimer: I primarily read U.S. coverage from national and New York City-specific publications, so this list is not as diverse as I’d like; still, I did my best to include a variety of outlets and topics, featuring data viz-heavy stories as well as more traditional articles which explain COVID-19 numbers.)

    • Edward Holmes’ tweet announcing that the novel coronavirus genome has been posted (Jan. 10): Okay, so this isn’t technically a work of data journalism. But it seemed crucial to me that I include the most important tweet of the year. When Holmes publicly shared the genome of SARS-CoV-2—sequenced by Shanghai professor Yong-Zhen Zhang—scientists around the world immediately sprang into action, developing tests and therapeutics for the novel virus. “Please feel free to download, share, use, and analyze this data,” a note on the Virological.org posting reads. And scientists did: the first vaccines were designed within days.
    • Limited data may be skewing assumptions about severity of coronavirus outbreak, experts say (STAT News, Jan. 30): Helen Branswell’s diligent record on covering COVID-19 speaks for itself—I had to go eight pages back in her archive to find stories from January. (Her first story on the virus was published on January 4.) This January 30 piece points out how a limited case definition hindered Chinese scientists attempting to determine how far the virus had spread through the country. Throughout the pandemic, Branswell has been an experienced voice who can clearly spell out the implications of medical data, as she does here: she explains why the severe COVID-19 cases that had been reported so far were the “tip of the iceberg.”
    • The Strongest Evidence Yet That America Is Botching Coronavirus Testing (The Atlantic, March 6): I wish I could include every single one of Alexis Madrigal and Rob Meyer’s COVID-19 data stories in this list; throughout the pandemic, these reporters have used data from the COVID Tracking Project (which they cofounded) to explain major COVID-19 trends and draw attention to issues in the U.S. Their work shows how journalists can benefit from truly getting inside of a dataset and spending months watching the same metrics. I chose these reporters’ first story, however, because it was the basis for the COVID Tracking Project itself. “How many people have actually been tested for the coronavirus?” Madrigal and Meyer ask. The answer, it turns out, took hundreds of volunteers, intensive infrastructure, and endless partnerships that spanned far beyond March.
    • Why It’s So Freaking Hard To Make A Good COVID-19 Model (538, March 31): At a time when it seemed like every other Twitter account suddenly belonged to an armchair epidemiologist, 538’s Maggie Koerth, Laura Bronner, and Jasmine Mithani swept in to expound upon the complexities of infectious disease modeling. The article uses simple graphics—flowcharts of color-coded boxes—to show all the factors that can go into calculations of how many people might get sick and die during the COVID-19 pandemic. Rereading it this week, I was struck by how relevant the story still is in articulating fundamental uncertainties about this virus.
    • Mapping Covid-19 outbreaks in the food system (Food & Environment Reporting Network, April 22/ongoing): Meatpacking plants and other food processing facilities have been some of the biggest outbreak sites in the U.S., but most government sources do not report specifically on these outbreaks. Reporter Leah Douglas has singlehandedly filled this gap by synthesizing reports from local news outlets, health agencies, and food production companies. She has updated the data visualizations in this story regularly since April. As of December 18, the most recent update, at least 1,257 meatpacking and food processing plants have seen COVID-19 cases. Tyson Foods has seen the most cases, at over 11,000.
    • How to Understand COVID-19 Numbers (ProPublica, July 21): Caroline Chen is a veteran infectious disease reporter who lived through Hong Kong’s SARS outbreak and reported on Ebola. With the help of designer Ash Ngu, she walks readers through a couple of key principles in understanding—and reporting—COVID-19 data. The story explains why to use seven-day averages over raw case numbers, how to understand test positivity rates, and more. I covered it in my first newsletter issue back in July and was inspired to write my own “how to understand COVID-19 numbers” story for Stacker in the fall.
    • To Navigate Risk In a Pandemic, You Need a Color-Coded Chart (WIRED, July 21): In this delightfully meta story, Maryn McKenna unpacks the design choices that go into those green-to-red risk charts that were widely shared across social media when states began reopening in the summer. She explains the challenge of taking risk—something that is inherently impossible to fully quantify—and putting it into one-size-fits-all guidance. True COVID-19 risk, the story explains, must incorporate one’s location, environment, behavior, and many more factors.
    • Which Cities Have The Biggest Racial Gaps In COVID-19 Testing Access? (538, July 22): A lot of journalists have tried to explain how systemic racism in America led to disproportionately high COVID-19 cases and deaths for the Black community. But this story, by a team of six 538 researchers and designers, is particularly effective. The graphics demonstrate a clear disparity: “testing sites in and near predominantly Black and Hispanic neighborhoods are likely to serve far more patients than those near predominantly white areas.” In South Texas, for example, a single testing site may have served 600,000 people—leading to extensive test wait times and other barriers to healthcare for COVID-19 patients.
    • Thousands of Texans are getting rapid-result COVID tests. The state isn’t counting them. (Houston Chronicle, Aug. 2): Fun story about this one: back in August, when I was working on my antigen testing issue, I needed to cite this piece on the disconnect between how antigen tests were being reported by Texas’ state public health agency and how they were being reported by several Texas counties. I paid for a subscription to the Houston Chronicle to get around the site’s paywall. And then, probably because I am a Millennial/Gen Z cusp who hates unnecessary phone calls… I never canceled my subscription. I have no regrets, though—the Houston Chronicle does good work. This particular story provided a clear explanation of antigen test reporting issues long before many other news outlets became aware of the test type.
    • Why the United States is having a coronavirus data crisis (Nature, Aug. 25): This story, by Nature’s Amy Maxmen, uses global context to explain why it is so damn hard for the U.S. to collect and share COVID-19 data. While South Korea has coordinated case reporting and contact tracing from 250 regional public health agencies, local agencies in the U.S. are overworked, underpaid, and relying on outdated technology. The article also discusses how a lack of federal leadership and data standards trickles down to make data collection, analysis, and transparency harder for epidemiologists.
    • A long time to wait (Spotlight PA, Sept. 24): There was a period in summer 2020 during which Sara Simon tweeted every day about delays in Pennsylvania’s COVID-19 reporting. The state often reported COVID-19 deaths months later than they had occurred, due to an antiquated data system that was not updated in time for Pennsylvania’s outbreaks—and caused additional confusion for public health workers and state data watchers alike. Simon and her colleagues’ story explores these reporting issues, while a data visualization of the death reporting lag in every state provides context.
    • Data Journalists’ Roundtable: Visualizing the Pandemic (The Open Notebook, Sept. 29): This roundtable interview brings together four data journalists to share the design choices behind COVID-19 graphics they produced. It includes both discussions of the journalists’ biggest challenges and behind-the-scenes notes on specific charts, ranging from a visualization of cell phone data to one of high-risk health conditions in minority communities. (One of the graphics featured is, in fact, a chart from the 538 article on COVID-19 modeling that I highlighted earlier in this list.)
    • This Overlooked Variable Is the Key to the Pandemic (The Atlantic, Sept. 30): Never has a science writer elaborated upon a single variable so expertly as Zeynep Tufekci does in this story. She uses k, a measure of how a virus disperses, to explain why some COVID-19 patients are able to infect many other people—in what epidemiologists call superspreading events—while other patients do not infect anyone else at all. The story walks readers through an immense amount of scientific evidence while clarifying basic principles with easy-to-grasp analogies.
    • Covid-19’s stunningly unequal death toll in America, in one chart (Vox, Oct. 2): This story lives up to its headline’s promise. The chart in question, by Vox’s Christina Animashaun, visualizes COVID-19 death rates with small human icons: each “person” represents one in 100,000 Americans who have died from the disease. As of early October, 98 of every 100,000 Black Americans had died from COVID-19, compared to 47 of every 100,000 white Americans. As of December 26, 126 out of every 100,000 Black Americans and 74 out of every 100,000 white Americans have now died of this disease.
    • Test Positivity in the US Is a Mess (The COVID Tracking Project, Oct. 8): Out of the many informative blog posts produced by the COVID Tracking Project since last spring, this is the one I’ve shared most widely. Project Lead Erin Kissane and Science Communication Lead Jessica Malaty Rivera clearly explain how COVID-19 test positivity—what should be a simple metric, the share of tests conducted in a given region that return a positive result—can be calculated in several different ways. Graphics by Júlia Ledur illustrate the different options, with the help of a cartoon COVID-19 patient called Bob. The post both highlights a major issue in COVID-19 data reporting and explains why the Project does not report test positivity on its own site.
    • We Don’t Really Know if COVID is Spreading in Lincoln Schools (Seeing Red Nebraska, Oct. 13): This local news story takes a deep dive into reporting issues in the Lincoln Public Schools district. Reporter Trish Wonch Hill explains why the school district’s data dashboard is “close to useless,” unpacks a flaw in the district’s contact tracing protocol that discounts in-school disease spread, and highlights a group of parents who have been tracking school cases on their own crowd-sourced dashboard. Data on COVID-19 in schools have been severely lacking throughout the pandemic—every local news outlet should be conducting this type of investigation.
    • A room, a bar and a classroom: how the coronavirus is spread through the air (El País, Oct. 28): This set of data visualizations by Madrid-based newspaper El País was shared far and wide after its publication in the fall—for good reason. As a reader scrolls through the charts, they clearly see how the novel coronavirus may travel through aerosols, or small air particles, in an indoor space. The charts effectively dispel widespread beliefs that sitting six feet apart or keeping masks on throughout a long conversation will protect everyone in the room from getting infected.
    • Pandemic Backlash Jeopardizes Public Health Powers, Leaders (KHN, Dec. 15/ongoing): Since the summer, reporters at KHN and The Associated Press have produced stories in the publications’ joint “Underfunded and Under Threat” series, highlighting how public health departments across the nation were ill-prepared for the pandemic. (The dataset behind this series was a featured source in one of my early issues back in August.) This story focuses on the leaders of local public health agencies who have faced pressure to leave their jobs during the pandemic, putting faces to the impacts of budget cuts and anti-mask threats.
    • 1 in 5 Prisoners in the U.S. Has Had COVID-19 (The Marshall Project, Dec. 18/ongoing): Like the KHN story above, this article by criminal justice-focused outlet The Marshall Project is part of a broader reporting project. Since March, the Project has been compiling data on COVID-19 cases and deaths in prisons around the country, in partnership with The Associated Press. (Dataset available here.) This December article visualizes the full brunt of the pandemic in each state’s prisons—in South Dakota, three out of five prisoners have been infected—while also telling several individual stories about the people who have gotten sick in prison and the advocates who are fighting for them.
    • Remembering the New Yorkers We’ve Lost to COVID-19 (THE CITY, ongoing): Nonprofit local newsroom THE CITY is building an online memorial of the New Yorkers who have died due to COVID-19. As of December 18, the memorial includes 1,946 names—remembering about 8% of the over 24,000 New Yorkers who have been lost. Earlier in December, THE CITY hosted a two-day event series to honor the dead, including readings of poetry and the obituaries written by the publication’s staff. I also participated in a protest last summer during which hundreds of these names were read aloud; it was a sobering reminder of the people behind the COVID-19 data I use in my work every day.
  • Featured sources, Dec. 20

    These sources, along with all others featured in previous weeks, are included in the COVID-19 Data Dispatch resource list.

    • Mass Incarceration, COVID-19, and Community Spread: The nonprofit Prison Policy Initiative has published a new report showing how prisons impacted COVID-19 case rates in 2020. One major finding: rural counties with more incarcerated people per square mile had more COVID-19 cases, especially at higher percentiles.
    • COVID Border Accountability Project: This interactive map documents travel and immigration bans that countries have introduced in response to COVID-19. It’s compiled by a team of academic researchers, engineers, and other non-academic volunteers, and updated weekly on Wednesdays.
    • The Buffalo News’ trackers of COVID-19 cases in college athletics: CDD reader Rachel Lenzi, who covers college athletics for The Buffalo News, has kindly allowed me to share her spreadsheets compiling reports of COVID-19 cases in NCAA football and basketball programs. Football spreadsheet; basketball spreadsheet.
  • Featured sources, Dec. 13

    These sources, along with all others featured in previous weeks, are included in the COVID-19 Data Dispatch resource list.

    • National report from the White House Coronavirus Task Force: The Center for Public Integrity, a nonprofit newsroom focused on investigations of democracy, has been periodically releasing reports of COVID-19 statistics intended for internal use by the White House Coronavirus Task Force and state governors. Reporters at the Center are often only able to obtain state-level reports, but last week, they released a national report including summary data and recommendations for all 50 states. The report is dated November 29.
    • Searchable database of PPP loans: On December 1, the Small Business Administration released extensive data on loans issued through the Paycheck Protection Program (PPP), including specific loan amounts and company names. Accountable.US, a nonpartisan watchdog group, has made this information available in an easy-to-navigate database. You can search for a specific business or filter by different geographic regions and industries.
    • Searchable database of federal COVID-19 purchases: Since March, ProPublica has tracked where federal government spending on the pandemic is going. The database represents $28 billion, 14,209 government contracts, and 6,832 individual vendors. Data can be sorted by spending categories, vendor types, and contract sizes.
    • COVID-19 Global Travel Restrictions and Airline Information: The Humanitarian Data Exchange is an international repository run by the United Nations Office for the Coordination of Humanitarian Affairs. One of the repository’s COVID-19 datasets displays travel restrictions and airline restrictions for nearly 300 jurisdictions, updated every day.
  • Featured sources, Dec. 6

    These sources, along with all others featured in previous weeks, are included in the COVID-19 Data Dispatch resource list. Please note that I took state school data sources out of this list because my COVID-19 state school data survey provides a more comprehensive view of these data.

    • Allocating Regeneron’s treatment: On November 21, Regeneron’s monoclonal antibody treatment received Emergency Use Authorization from the FDA. A new dataset from the HHS shows how this drug is being allocated to states and territories. For more information on the dataset, see HHS’s November 23 press release.
    • COVID-19 relief tracker: The Project on Government Oversight (POGO) has a new tracker which shows where COVID-19 relief funds from the federal government have been spent. The dashboard visualizes data from USAspending.gov, and is searchable by state, county, and ZIP code.
    • Census COVID-19 Demographic and Economic Resources: My coworker Diana Shishkina recently alerted me to a Census page which compiles and visualizes a great deal of data on how COVID-19 has impacted Americans. It includes data from weekly small business surveys, the Household Pulse Survey, and a wealth of other information.
  • Featured sources, Nov. 29

    These sources, along with all others featured in previous weeks, are included in the COVID-19 Data Dispatch resource list.

    • Leading in Crisis briefs: A series of briefs from the Consortium for Policy Research in Education document how 120 principals in 19 states responded to COVID-19 in the spring. The briefs compile analyses, summaries, and recommendations on topics ranging from accountability during school closures to calm during a crisis.
    • COVID-19 in Congress: GovTrack.us, a project which normally documents bills and resolutions in the U.S. Congress, is currently tracking how COVID-19 has spread through the national legislature. The tracker currently includes 87 legislators who have entered quarantine, tested positive, or come into contact with someone who had been diagnosed with the disease.
    • COVID-19 Community Vulnerability Index: In the first vaccine section above, I discussed the CDC’s Social Vulnerability Index, which charts populations that are more vulnerable to health disasters. The Surgo Foundation’s COVID-19 Community Vulnerability Index builds on the CDC’s research with additional, COVID-specific metrics based on epidemiological and healthcare-related factors. I’ve produced two Stacker stories using this source: States with the populations most vulnerable to COVID-19 and Counties most vulnerable to COVID-19 in every state.
  • Featured sources, Nov. 22

    These sources, along with all others featured in previous weeks, are included in the COVID-19 Data Dispatch resource list.

    • State COVID-19 vaccine plans: A new report from the Kaiser Family Foundation explores how state public health departments are planning to distribute COVID-19 vaccines once they become available. The report includes common themes and concerns across all 50 state plans, as well as links to the plans themselves. One insight that stuck out to me: “Just over half (25 of 47, or 53%) of state plans report having immunization registries/database systems in place that are described as being (at least fairly) comprehensive and reliable; in the other state plans that information is unclear.”
    • COVID-19 Testing Communications Toolkit: The Brown School of Public Health has compiled a resource to help public health communicators encourage COVID-19 testing. The toolkit includes evidence-based tutorials, handouts, and an image library, all of which are free for public use.
    • COVID-19 and Impacted Communities: A Media Communications Guide: This is another communications tool from the New York COVID-19 Working Group. The guide includes best practices for explaining key terms, advice on framing stories, and how to avoid stereotypical narratives about minority communities.
    • SARS-CoV-2 and COVID-19 Data Hub: Erin Sanders, a nurse practitioner and contact tracer, has compiled a list of data sources on the novel coronavirus. The list includes clinical data, transmission data, and genomic data, among other medical and epidemiological topics.
  • Featured sources, Nov. 15

    These sources, along with all others featured in previous weeks, are included in the COVID-19 Data Dispatch resource list.

  • Featured source, Nov. 8

    This source, along with all others featured in previous weeks, is included in the COVID-19 Data Dispatch resource list.

    • Household Pulse Survey by the U.S. Census: I featured this source—a survey program run by the U.S. Census to determine how COVID-19 impacted the lives of American residents—back in August. The Census did an initial round of surveys from April through July. But the dataset was so widely used that the Census expanded it to a second round of surveys, from August through October. New data are now being released in two-week intervals.