Words Count, But Context Counts More




Of course words count, especially in the policy context. In a Distributed Age turbo-charged by social media, it is no longer true that "words can never hurt me." When public policy officials use words, those words take on additional, action-oriented attributes that are not present when private sector entities utter the same words.


In other words (pardon the pun), policy language is profoundly more predictive of future decisions than most people appreciate because it articulates action. Superforecasters know this, which is why they focus on minute shifts in position. The question is whether the superforecasting process can be automated and scaled. We believe, based on experience, that the answer is yes. We recently had some fun with this idea in this blogpost and YouTube video.


The trick is to use the right tools to convert the unstructured policy language into structured data. Natural Language Processing (NLP) and Natural Language Understanding (NLU) are the obvious candidates for the task. Regrettably, the very context that drives meaning in political language creates massive computational challenges for even the most advanced NLP/NLU utilities. This post explains why.
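To make "structured data" concrete, here is a minimal sketch in Python. It is not our production method. The cue list, the `StatementRecord` type, and the `structure_statement` helper are all hypothetical names invented for this illustration, and the sample statement is made up.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical cue list, chosen for illustration only; it is not a vetted lexicon.
ACTION_CUES = ("will", "must", "shall", "intend to", "plan to")


@dataclass
class StatementRecord:
    speaker: str
    sentence: str
    action_cues: List[str]


def structure_statement(speaker: str, text: str) -> List[StatementRecord]:
    """Split text into sentences and tag each with the action cues it contains."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    records = []
    for sentence in sentences:
        lowered = sentence.lower()
        # Whole-word match so that "will" does not fire inside "willing".
        cues = [
            cue for cue in ACTION_CUES
            if re.search(rf"\b{re.escape(cue)}\b", lowered)
        ]
        records.append(StatementRecord(speaker, sentence, cues))
    return records


# Made-up sample statement from a made-up speaker.
records = structure_statement(
    "Hypothetical Official",
    "Inflation remains elevated. We will act as appropriate. Markets plan to adjust.",
)
for record in records:
    print(record.action_cues, "|", record.sentence)
```

Note what the sketch gets wrong. The third sentence is flagged as action-oriented even though the speaker is describing what markets may do, not committing to anything. A keyword match cannot tell who is acting or on whose behalf, which is exactly the kind of context problem the rest of this post describes.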


Free Speech 101 -- Basic Concepts


Let's start with a counterintuitive and deliberately provocative statement: there is no such thing as 100% free speech in a public setting. In fact, there is no such thing as unrestricted freedom in a public setting.


Enlightenment thinkers sparked revolution by maintaining that the natural state for humans (OK, at the time, white men who owned land) was absolute -- but NOT unlimited -- freedom. Maximum freedom was to be found on one's own land. To this day, in the United States, this concept underpins self-defense laws in one's home. A good high-level historical overview of the "castle doctrine" can be found in this Wikipedia entry.


We will let others continue the centuries-old polemical debate on whether the consequent enshrinement of private property created a "tragedy of the commons." This post does not pass value judgements; it merely explains how we got to where we are. The history matters because it explains the technology challenges that arise when we ask bots and algorithms to parse policy language.


John Locke and others made clear that people need to coexist. Your freedom does not entitle you to disrupt someone else's ability to enjoy freedom. When your actions disturb others' rights, agreement must be reached on how competing rights should be shared along the overlap. This concept continues to underpin modern laws on pollution emanating from private property as well as traffic safety and other laws governing shared spaces.


Similar standards apply to speech.

The United States Constitution, in the First Amendment, famously establishes a right of free speech, and its text states no explicit limitation on that right. Among other things, the idea was that if someone has the right to say what they think, this serves as a release valve that makes violent action less likely. Of course, eighteenth-century society did not have Twitter or Facebook or Reddit.


Throughout the twentieth century (long before the rise of social media), the Supreme Court interpreted the First Amendment to include boundaries on free speech that are conceptually similar to the limits on freedom that operate on private property. The key is the potential for physical harm. The Supreme Court has made clear that the protection of free speech for private citizens ceases to operate if the speech seeks to incite illegal/dangerous action and is likely to generate actual illegal/dangerous action. Hate speech legislation extends this concept by defining certain categories of speech that are automatically considered to be so powerful that they are likely to incite odious actions.


We will leave aside for another day the question of whether banning speech increases or decreases the likelihood of odious or dangerous activity. We bypass altogether the amazing and difficult debate underway about whether and how certain speech should be suppressed on social media platforms by democratically elected governments or private parties.

The key point for this blogpost is that all these standards were created with the private citizen in mind.

Why Policy Language is Different


Policymakers enjoy far more freedom of speech than private citizens.


Article I, Section 6 of the Constitution insulates from legal action "any Speech or Debate" by Members of Congress. We will let constitutional scholars debate whether the punctuation in the text applies to Congressional speech the same exceptions that apply to arrest ("except treason, felony and breach of the peace") while in transit to or from the Congress. The insulation of Congressional speech, among other things, is the foundation for immunity from libel or slander actions over any statements made by Members of Congress on the floor of the House or Senate.


Why do Members of Congress enjoy these immunities? Because an elected official is deemed to be speaking not just for himself or herself. The official is deemed to be speaking on behalf of constituents. The official serves as a megaphone; the size of the megaphone can be considered to be the size of the constituency that gets to determine at specific intervals whether the individual continues to have the honor of articulating opinions on behalf of the constituency.

This makes policy language different from private speech in rather profound ways.

Among other things, a policy official is constantly aware that his or her words will generate ripple effects and have concrete consequences for a large number of people, if not whole industries and economies. Policymakers weigh their words deliberately, even when it seems they are speaking impromptu.


What policymakers say matters. A lot. Every word implies action and holds the potential for generating a reaction.

Elected policymakers predominantly speak in simple language laced with normative values that resonate with voters. Even when discussing technical, arcane issues of monetary policy, regulatory capital, cybersecurity, blockchains, etc., they prioritize language that can be understood by voters who hold the power to turn them out of office in the next election if they say and do the wrong thing.


The same is true regarding speech by appointed officials (e.g., regulators, central banks), but with a twist. These officials cater to a technical constituency more than to voters. Insulated from ballot box pressures, these officials effectively speak their own dialect to a well-informed constituency that speaks the same language. For example, bond market traders are fluent in "Fedspeak." Banking regulators use "Basel-speak."


Every word uttered by appointed policymakers also matters and also implies action. But the language must be decoded to be understood by non-experts. For an in-depth description of how to distinguish rhetoric from action within the official sector,