Garbage and Non-Garbage Estimates: Puerto Rico Edition

Since the Puerto Rican government ceased publishing mortality data in February, there has been a debate over the death toll arising from Hurricane Maria. The official death toll, focusing on direct deaths, remains at 64. However, starting in November, a number of scholars attempted to gain further insight into the extent of the human disaster in the Commonwealth. One commentator has labeled another study “garbage”. What is the import of these competing analyses?

First consider the estimates over time of the death toll, both direct and indirect.


Figure 1: Estimates from Santos-Lozada and Jeffrey Howard (Nov. 2017) for September and October (calculated as difference of midpoint estimates) and Nishant Kishore et al. (May 2018) for December 2017 (blue triangles); Roberto Rivera and Wolfgang Rolke (Feb. 2018) (red square); excess deaths calculated using average deaths for 2015 and 2016 compared to 2017 and 2018, using administrative data released 6/1 (black square); and the Santos-Lozada estimate based on administrative data released 6/1 (large dark blue triangle). All are end-of-month figures, on a log scale. + marks indicate upper and lower bounds of 95% confidence intervals. Orange + denotes the Steven Kopits 5/31 estimate for the range of excess deaths through September 2018; the orange triangle is the Steven Kopits estimate for year-end as of June 4. The cumulative October figure for Santos-Lozada and Howard is the author’s calculation based on reported monthly figures.

The analysis led by the Harvard School of Public Health was a survey-based study which, because of the small number of households surveyed relative to the population, resulted in fairly large confidence intervals.
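To see why a small sample produces a wide interval, here is a minimal sketch. It assumes simple random sampling and a normal approximation, which is much cruder than what an actual survey study would use (it ignores cluster design and any comparison against a baseline year). The function name and all the numbers are hypothetical, chosen only for illustration.

```python
import math

def survey_death_estimate(deaths_in_sample, people_in_sample, population, z=1.96):
    """Scale a sampled death rate up to the population.

    Returns (point, lower, upper) using a normal-approximation interval.
    Assumes simple random sampling, which is a simplification.
    """
    rate = deaths_in_sample / people_in_sample
    # Standard error of a sample proportion shrinks with sqrt(sample size).
    se = math.sqrt(rate * (1 - rate) / people_in_sample)
    point = rate * population
    half_width = z * se * population
    return point, point - half_width, point + half_width

# Hypothetical inputs: same death rate, but the second survey is four times larger.
small = survey_death_estimate(40, 10_000, 3_000_000)
large = survey_death_estimate(160, 40_000, 3_000_000)

for label, (point, lower, upper) in [("small survey", small), ("large survey", large)]:
    print(f"{label}: {point:,.0f} (95% CI {lower:,.0f} to {upper:,.0f})")
```

Under these assumptions the point estimate is the same in both cases, but quadrupling the sample size only halves the width of the interval, which is why a survey covering a small share of the population cannot pin the toll down tightly.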

The calculations based on administrative data (the Santos-Lozada and Howard study and the Rivera and Rolke study) both had confidence intervals, but much narrower ones. That’s because the uncertainty came from the sampling error involved in estimating the average death rate per month under normal circumstances; the observations for the 2017-18 post-hurricane period were treated as known and free of measurement error. As was highlighted by Mr. Kopits’ assertion as of 5/31 that the mortality data through December was well measured, that is not necessarily a good assumption.

In other words, confidence intervals are important, and it’s important to know what assumptions underpin their construction. Assuming away measurement error in, say, administrative data (whether that error is unbiased, the best case, or biased) leads to overconfidence in the precision of one’s estimates. Unfortunately, in the absence of more information, there’s not much one can do. (Santos-Lozada and Howard calculate differences of 2017 reported data from the upper bound of the 95% confidence interval for pre-Maria averages, in order to be conservative.)
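The point about assumptions can be made concrete with a small sketch. It is not any of the cited authors’ code: the function names, the z-based interval, the baseline figures, and the `obs_sd` parameter (a hypothetical standard deviation for unbiased measurement error in the post-hurricane count) are all assumptions for illustration.

```python
import math
import statistics

def baseline_mean_and_se(monthly_deaths):
    """Mean of the baseline months and the standard error of that mean."""
    mean = statistics.mean(monthly_deaths)
    se = statistics.stdev(monthly_deaths) / math.sqrt(len(monthly_deaths))
    return mean, se

def excess_deaths(observed, baseline, z=1.96, obs_sd=0.0):
    """Excess deaths = observed count minus the baseline average.

    obs_sd=0.0 treats the observed count as known exactly, so the
    interval reflects only sampling error in the baseline average.
    obs_sd>0 adds (hypothetical, unbiased) measurement error in the
    observed count, which widens the interval.
    """
    mean, se_base = baseline_mean_and_se(baseline)
    se_total = math.sqrt(se_base ** 2 + obs_sd ** 2)
    point = observed - mean
    return {
        "point": point,
        "lower": point - z * se_total,
        "upper": point + z * se_total,
        # Conservative figure: observed minus the UPPER bound of the baseline CI.
        "conservative": observed - (mean + z * se_base),
    }

# Hypothetical numbers, not actual Puerto Rico data.
baseline = [2400, 2350, 2450, 2380, 2420]
as_known = excess_deaths(2900, baseline)
with_error = excess_deaths(2900, baseline, obs_sd=50.0)
print(as_known)
print(with_error)
```

Two things follow in this sketch. First, when the observed count is treated as known, the conservative figure is simply the lower bound of the interval. Second, allowing for even unbiased measurement error leaves the point estimate unchanged but widens the interval; biased measurement error, such as deaths not yet registered, would shift the point estimate as well, and the sketch does not model that.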

Interestingly, Mr. Kopits’ current (as of 6/4) estimate of cumulative excess deaths by end-December is well within the 95% confidence interval cited by the Harvard School of Public Health-led research team. (Journalistic accounts were often misleading, saying at least 4600 excess fatalities occurred; but that is not the fault of the research team, and in fact the NY Times reported the confidence interval correctly in their article.)

An interesting account of the difficulties confronted by those relying on administrative data is to be found in FiveThirtyEight’s 2015 piece on the post-Katrina count:

By its own admission, Louisiana never finished counting the dead. Its last news release on the topic, from February 2006, put the statewide toll at 1,103. Three months later, it added hundreds of state residents who’d died in other states. Three months after that, in August 2006, Louisiana counted 1,464 victims, with 135 people still missing. Today, when asked about the Louisiana death total, the health department cites a 2008 study that reviewed death certificates and concluded that there were 986 victims. But that study said the total could be nearly 50 percent higher if deaths possibly linked to the storm were included.

One year after Katrina, the state’s medical examiner pledged to keep working until every victim was identified. Four years after that, he told the Houston Chronicle that he didn’t get the time or resources to finish the job.

Among federal agencies, the National Oceanic and Atmospheric Administration has been the primary one focused on determining how many people died because of Katrina regionwide. It reported 1,833 deaths in 2006 but has continually revised the number downward, to 1,100 at last count. Yet the 1,833 number has made it into news articles and the congressional record in the past month. The agency’s count remains as uncertain as it was in 2005, when NOAA researchers wrote that “the true number will probably not ever be known.”

John Mutter, a geophysicist at Columbia University, was more familiar with earthquakes than hurricanes before Katrina. After the levees failed and the official death counts kept rising, Mutter began looking into it. There had not been a storm with a comparable death toll since 1928, when a hurricane pushed the waters of Florida’s Lake Okeechobee up and over the levees at its southern end, drowning thousands. Mutter was interested in what standards existed for counting how many people were killed by a hurricane. He discovered that there aren’t any. “They made their own,” he said in a telephone interview.

Standards are extremely important for the grim task of counting the dead. They settle questions with no obvious right answer — for example, whether to include deaths that occurred immediately before a storm hit (such as someone who died in a fall while cutting down tree branches to mitigate anticipated damage).

What are known as indirect deaths are the most confounding to the count. Direct deaths are those that occur from drowning or an injury sustained during the storm or post-storm flooding, while indirect deaths occur from some other cause that might be linked to the storm, such as an inability to access medical care to treat an illness.

After Katrina, government counters in Louisiana chose to include indirect deaths based on an arbitrary time cutoff — people who were evacuated from New Orleans and died after Oct. 1 were not included, while those who died before were. The authors of the 2008 study that counted 986 Louisiana deaths took a different approach, counting only deaths that could be directly attributed to the storm. “I do think we’re likely an underestimate,” said Joan Brunkard, an epidemiologist with the Centers for Disease Control and Prevention and the study’s lead author. NOAA, meanwhile, has reviewed death reports and removed indirect deaths from its count, a major reason that its total went down.

Hence, my view is that contentions that a “firm” count has been achieved within a year of a major and ongoing disaster are unwarranted. I also think that people should learn the nature of the data sources they are using, what survey design involves, what sampling error is, and what measurement error is, before undertaking the difficult task of policy analysis.

Additional discussion here, here, and here. Gelman’s commentary on the study led by the Harvard School of Public Health, here. Useful thread on the study by way of Alexis Santos-Lozada here.

For additional conversation with some of those “on the ground”, see NPR’s 1A show, which aired yesterday.
