Why Pandemic-Era Data Deficits Matter

Updated: Jan 15, 2021

The COVID-19 period presents unique challenges for scenario analysis at the tail of the distribution, where investments are exposed to public policy risks in addition to traditional financial risks. This series of posts explores the implications for risk management and alternative data as attention turns to forecasting priorities for the second half of 2020.

Yesterday’s blogpost (“Two Data Deficits”) focused on how the COVID-19 pandemic disrupts the collection and delivery of data necessary for accurate forecasting, from weather forecasting to financial risk management. Today we focus on the second level of data challenges: analytics.

Specifically, we focus on the challenges created by breaks in data series and the related challenge of extrapolation when life at the tail of the distribution continues for an extended period. Spoiler alert: the economic impact of the pandemic ensures that 2020 will continue to deliver tail event data long after the health situation has been brought under control.

Breaks in the Data Series

It is ironic to talk about data deficits during the data revolution. We are awash in alternative data drawn from connected devices (the Internet of Things), accelerated access to new combinations of traditional data, and entirely new data generated by new technological processes. As we discussed HERE in a blogpost for Interactive Brokers last year on the role of alternative data, the flood of new data drawn from processes that convert unstructured verbal data into structured numerical data creates new frontiers for risk analysis.

While alternative data is both exciting and interesting, the reality is that scenario analysis (particularly in the financial arena) still requires access to classic data concerning economic aggregates like GDP, sales turnover, etc.

We will discuss in a later post what happens when new data capture methods replace old ones. For today, our focus is on what happens when breaks occur in the data series.

COVID-19 delivers significant disruption to traditional data flows. As noted yesterday, delays in regulatory reporting generate gaps in monthly data that wreak havoc with automated risk calculations. Even when data reporting returns to regular submissions, significant gaps will remain, and some of the data may be incomplete for a period of time.

This is not just a temporary problem of delayed data delivery that can be normalized after the fact.

Most economic data will show dramatic variations in economic activity. Moreover, the massive acceleration of digitalization in the economy during the pandemic creates the real possibility that the standards and delivery mechanisms for defining and observing economic data will change.

Any one of these factors alone could create a break in a time series, requiring risk managers and scenario analysis designers to reconsider how they reconcile preexisting data with the new data. Abrupt changes in definitions and collection methods, as well as in the values themselves, can shift the mean of the distribution or imply changes in the other parameters used to generate the time series.

Structural breaks in a data series create real challenges for risk forecasting. Among other things, decisions must be made (with sparse data) regarding whether -- or when -- a mean reversion will occur. It is an open question whether national or global economic activity will revert to a mean after 2020-21 or whether permanent shifts have occurred.
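To make the idea of a structural break concrete, here is a minimal, illustrative sketch (not part of any platform described in this post) of how an analyst might flag a level shift in a monthly indicator. It uses a simple two-sample comparison of means as a rough stand-in for a formal test such as the Chow test; the data, the break point, and the function name are all hypothetical.

```python
import numpy as np

def mean_shift_stat(series, split):
    """Rough t-style statistic for a shift in the mean at index `split`.
    A large absolute value suggests a structural break; a formal test
    (e.g., a Chow test) would be used in practice."""
    before, after = series[:split], series[split:]
    pooled_se = np.sqrt(before.var(ddof=1) / len(before)
                        + after.var(ddof=1) / len(after))
    return (after.mean() - before.mean()) / pooled_se

rng = np.random.default_rng(0)
# Simulated monthly indicator: a stable regime, then a pandemic-era
# level shift with both a lower mean and higher volatility.
pre = rng.normal(100.0, 2.0, size=36)
post = rng.normal(85.0, 6.0, size=12)
series = np.concatenate([pre, post])

t = mean_shift_stat(series, split=36)
print(f"mean-shift statistic: {t:.1f}")  # strongly negative here
```

The point of the sketch is the decision it forces: once the statistic flags a break, the modeler must choose whether to treat the pre- and post-break samples as draws from the same distribution (betting on mean reversion) or as two different regimes.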

Simply inputting data when it becomes available or substituting alternative data for traditional data cannot be undertaken without considering carefully how the new elements will impact the forecasting model. Which brings us to tail events.

Tail Events vs. Normal Distributions

Strategists and risk managers use data to define risk in relation to how the data is distributed.

  • A “normal” distribution is bell-shaped, with the majority of data points clustered in the middle.

  • Extreme or outlier events occur at the “tail” of the distribution, which means the data occurs at a low frequency.

  • Typically, the left tail represents negligible risks because those low frequency occurrences also have low values.

Risk managers worry about the right hand side of a distribution, which represents low frequency occurrences with high values. They lose sleep over skewed distributions in which the center of gravity shifts to the right, creating a “fat tail.”
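The distinction between a normal distribution and a fat-tailed one can be illustrated with a short, self-contained sketch (purely pedagogical; the distributions chosen are assumptions, not data from this post). Excess kurtosis is near zero for a normal distribution and clearly positive for a fat-tailed one, which is why outlier events occur far more often than a bell-curve model predicts.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis: roughly 0 for a normal distribution,
    positive when the tails are fat."""
    z = (x - x.mean()) / x.std()
    return (z ** 4).mean() - 3.0

rng = np.random.default_rng(1)
normal = rng.normal(size=100_000)
fat_tailed = rng.standard_t(df=3, size=100_000)  # Student's t: heavy tails

k_normal = excess_kurtosis(normal)
k_fat = excess_kurtosis(fat_tailed)
print(f"normal excess kurtosis:     {k_normal:.2f}")  # near 0
print(f"fat-tailed excess kurtosis: {k_fat:.2f}")     # clearly positive
```

Under a fat-tailed regime, extreme observations stop being rare curiosities and start dominating the risk calculation, which is precisely the situation the next paragraph describes.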

Welcome to 2020. In the language of mathematics, we are living in a skewed, fat-tailed distribution, probably for the rest of the year at least.

The economic consequences of a simultaneous, sudden global demand and supply shock, as countries shuttered substantial segments of their economies, have been massive. The full scope of the economic damage will not be known until we know (i) how economic activity will adjust going forward and (ii) whether additional waves of infection will require economies to shutter again.

It seems unlikely that companies will return to “business as usual” for quite some time. Many firms in the personal services, retail, travel, transportation/delivery and tourism sectors will likely operate at less-than-full capacity for the foreseeable future. Whether demand recovers will depend critically on the elasticity of the labor supply. Substitution effects associated with accelerated adoption of digital alternatives may permanently alter long-standing supply and demand functions.

Moreover, as discussed in yesterday’s Data Deficits post, firms are now more exposed than ever to risks arising from shifts in public policy. But those risks arise from unstructured data which until recently have not been subjected to rigorous, objective quantification.

It’s not just that the COVID-19 crisis measures which placed the economy on life support remain elevated and have plateaued. On any given day, policy initiatives fluctuate.

This sets off a reaction function that makes public policy data noisier than economic data. Because the data sets are new (we have only been collecting this data since 2019 for most issues, and since February 2020 for COVID-19), the full distribution may not yet be known. However, the outlines of a relatively symmetrical parabolic wave have already begun to appear.

Forecasting risk and economic trajectories has become incredibly difficult, as data scientists must grapple with how to adjust parameterization and assumption-setting in a low-data environment during an extended fat-tail period.

Scenario Analysis and Nowcasting

Many quants view scenario analysis with disdain, as a nearly random walk through probability distributions. Scenario analysis involves changing one or more parameter settings to determine how the shape of the distribution shifts in response. Extreme changes earn the term “stress test.” While not exactly hypothetical, positing shifts in specific parameters regarding possible geopolitical or domestic political events can seem as much an exercise in writing fiction as in rigorous data-driven risk management.
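The mechanics described above can be sketched in a few lines (an illustrative toy, not a description of any proprietary methodology; the parameter values and the normal loss model are assumptions). A baseline loss distribution is simulated, one or more parameters are stressed, and a tail measure such as the 99th-percentile loss is compared across the two runs.

```python
import numpy as np

rng = np.random.default_rng(2)

def tail_loss(mu, sigma, q=0.99, n=200_000):
    """Simulate a simple normal loss distribution and return the
    q-th quantile -- a VaR-style tail measure."""
    losses = rng.normal(mu, sigma, size=n)
    return np.quantile(losses, q)

baseline = tail_loss(mu=0.0, sigma=1.0)
# "Stress test": shift the mean and double the volatility.
stressed = tail_loss(mu=0.5, sigma=2.0)

print(f"baseline 99% loss: {baseline:.2f}")
print(f"stressed 99% loss: {stressed:.2f}")
```

The hard part, of course, is not running the simulation but justifying the stressed parameter values, which is exactly where scenario design can drift from data-driven analysis toward fiction.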

Consequently, at BCMstrategy, Inc. we favor a nowcasting approach (another meteorological methodology!) for scenario analysis when seeking to assess risk exposures regarding shifts in public policy. We take the daily data captured by our patented process and forecast outcomes based on existing facts on the ground.

We do not speculate on what might be. We focus on what is real today before extrapolating to tomorrow. And our platform makes it possible to capture, in minute detail and on a global basis, even the smallest tactical policy shifts that can impact outcomes half a world away. In a nutshell, we have automated the “superforecasting” process … and made it available to analysts and strategists everywhere. As our animated video explainer makes clear, we are making superforecasting easy using our PolicyScope platform.

Our patented process does not, however, eliminate the need for humans to read. Nor does it automate the scenario analysis process itself. To use scenario analysis effectively, particularly in the policy risk context, it is crucial to debunk three key misconceptions about scenario analysis.

Tomorrow’s post will begin the debunking process.


PolicyScope data is available through the Bloomberg Enterprise Access Point.

Customized widgets and dashboards are available via API HERE.

Analytical scenario analysis (The Scenarios -- twice monthly) and daily global macro analysis of platform data (The PolicyScope Risk Monitor) are also available.