Lies, Damned Lies, and “Official” Statistics

Maria Gargiulo and Megan Price

Collecting data in a pandemic is difficult and can be dangerous. Even in the best settings, where health records are routinely and accurately maintained, it can be hard to justify maintaining that level of precision when the health system is overwhelmed in a pandemic. In other settings, which lack the infrastructure or are coping with armed conflict or other crises, public health data collection was likely already deficient even before the pandemic struck.

Information about confirmed cases, causes of death, and infection rates, even if imperfect, is crucial to understanding the pandemic, making policy recommendations, and allocating resources. People need to trust the information provided by public health and other government officials in order to comply with policy and health protection recommendations such as physical distancing and mask wearing. This trust has been absent in many countries throughout the pandemic.

In many ways, the challenges inherent in data relating to the COVID-19 pandemic parallel our experiences working with conflict data. In the context of documenting conflict-related deaths, data collection might be limited by access to particular regions, administrative capacity, and community trust in data collection mechanisms. Surviving witnesses may not file reports due to stigmatization or fear for their safety, and bodies might never be identified. As a result, data documenting conflict-related deaths are almost certainly partial and (statistically) biased. In both conflict and pandemic contexts, governments may intentionally hide or obfuscate official counts, researchers may be threatened, and civil society may attempt to fill in data gaps. However, in both contexts the fact that data is incomplete doesn’t undermine its value, but does heighten the need for transparent explanations of the limitations of the data and appropriate statistical analyses to account for these limitations (whenever possible).

Intentional obfuscation

Despite the importance of accurate and transparent data collection in times of crisis, the past year has provided many examples of government officials actively undermining these crucial processes.

Some governments have used selective definitions to limit what “counts” as a death due to COVID-19. In New York, deaths of nursing home residents were undercounted by as much as 50% because deaths of residents that occurred outside the nursing homes (that is, in hospitals) were not included in the official counts.[1] In Gujarat, India, official statistics only include deaths due to viral pneumonia. These counts exclude deaths by other causes known to occur as a result of COVID-19, such as cardiac arrest, stroke, or organ failure. As of 16 April 2021, the official state death toll was 78, but data from hospitals, burial sites, and crematoriums diverge from the state’s narrative. Data from seven of Gujarat’s cities showed that almost nine times as many bodies were buried or cremated following protocols established for COVID-19 as appeared in the official statistics for the entire state.[2]

Nicaraguan media described the situation there in October of 2020: “…it is impossible in Nicaragua to obtain an estimate on the deadliness of the pandemic. The real number of infected is also unknown, which MINSA [the Nicaraguan Ministry of Health] hides through confusing figures of additions and subtractions of contagions, those who supposedly recovered and people in ‘responsible and careful monitoring.’ The latter being a term invented by the Ortega government.”[3] Further, “[t]he Government controls access to COVID-19 tests and does not reveal the number taken or their results.”[4]

Data, scientific progress, and human rights

Elsewhere, scientists and journalists have been threatened for investigating the pandemic or publishing work that diverged from state narratives. In Venezuela, journalists have been arbitrarily detained, threatened, and subjected to smear campaigns, lawsuits, or equipment confiscation to prevent them from reporting on COVID-19.[5] In Brazil, scientists studying COVID-19 have come under attack, facing threats to their employment, interrogation by federal prosecutors, and death threats.[6] These are part of the broader trend of attacks on science and scientists that have occurred since Bolsonaro became president in 2019.

We connect these direct threats to violations of fundamental rights to life, liberty, and personal security, to freedom of opinion and expression, or to freedom from arbitrary arrest or detention, as established in the Universal Declaration of Human Rights (UDHR).[7] Our collective right to contribute to scientific progress and to share its benefits is also specified in the UDHR.[8] Data is one of the benefits of scientific progress, so when governments intentionally conceal, obscure, or manipulate official data—especially during a global public health crisis—they not only erode public trust in science and hinder policy responses, they also violate this right.

Civil society filling in the gaps

In many instances where governments have failed to collect or share data that is accurate, trustworthy, and timely, civil society groups, academics, and individuals have stepped in to fill the data void.

In Nicaragua, an independent network of doctors and volunteers called the COVID-19 Citizens Observatory recorded more than twice the official government count of cases in the fall of 2020. In March 2021, the Citizens Observatory documented more than 17 times the number of deaths officially reported by the Health Ministry.[9] In Mexico, an economist and a programmer used publicly available information about sequentially numbered death certificates to estimate how many more COVID-19-related deaths had likely occurred in Mexico City than the officially reported statistics showed.[10]

In the United States, the Covid Tracking Project was an entirely volunteer-run data aggregation and contextualization project created in direct response to the continued publication of “patchy and often ill-defined data” by the federal government.[11] The team stopped their data collection efforts on 7 March 2021—a year after they started—because they observed “persuasive evidence that the [Centers for Disease Control] and [the Department of Health and Human Services] are now both able and willing to take on the country’s massive deficits in public health data infrastructure, and to offer the best available data and science communication in the interim.”[12]


More than a year since the official declaration of the pandemic, many governments and organizations continue to update and refine death tolls and infection rates, and calls are being made for various investigative commissions to uncover the true toll of the pandemic.[13] From our work as statisticians researching human rights questions, we know that the process of clarification can be long and arduous, and often relies on a combination of data collected by official and civil society sources. It might be many years before the full burden of COVID-19 can be known, due in no small part to the difficulty inherent in collecting reliable data throughout a pandemic.

We applaud these ongoing efforts, and hope that alongside careful study of the public health and economic burdens of the pandemic there will be accountability for people in positions of power who intentionally obstructed scientific progress or the sharing of this information with the public. COVID-19 is not the first crisis where governments have attempted to lie with statistics, and it may not be the last, but it provides an opportunity to establish precedents to ensure that there are consequences when science is jeopardized by the state.

Maria Gargiulo, BS, is a statistician at the Human Rights Data Analysis Group, San Francisco, CA, USA.

Megan Price, PhD, is Executive Director of the Human Rights Data Analysis Group, San Francisco, CA, USA.

