Why (Language)policy + (Data)structured = Predictive Analytics [Part 1 of 4]

Updated: Jun 19, 2019

Predictive analytics, powered by machine learning and artificial intelligence, is all the rage. The race is on to build data lakes and train algorithms to find patterns in those lakes, in the belief that faster computation times will generate new insights. The capabilities (real and potential) drive parallel concerns that machines will replace people when conducting a range of cognitive and analytical activities.

Nowhere is this more evident than in the unstructured data context, particularly Natural Language Processing (NLP). NLP automates the reading and listening process, enabling technology to understand lexicons and syntax. In certain contexts, writing original stories is also possible.

Understanding meaning and nuance remains a work in progress. Pioneers have aimed first at retail consumers, creating website chatbots, automated telephone operators, and interactive devices like Siri and Alexa that conduct rudimentary conversations.

These systems mainly enable technology to anticipate the next word and, thus, likely consumer intent. Related sentiment analysis purports to discern emotions from language, with immediate applications in brand marketing and extensions to the political realm. However, these models are prone to embedded bias when normative judgments are incorporated into the algorithm.

Over the course of four posts, we present the case that the language of policy is different from the natural language of ordinary people/consumers. The differences permit the language of policy to be converted into objective structured data without embedded bias. The outcome delivers impressive predictive analytics, particularly when turbo-charged by artificial intelligence. Our patented metadata tagging process automates this conversion.

Part II will address Structured Data.

Part III will address Predictive Analytics.

Part IV will address the Artificial Intelligence Advantage.

Let's start with language.


The policy context greatly amplifies the communication value of policymakers' language.

Policy language is highly stylized, using technical lexicons laced with normative meaning. Historical context imbues phrases with specific meaning often known only to domain experts.

Policy language serves three functions simultaneously.

  1. Values: Policy language articulates a perspective on what constitutes a good, wise, or preferred policy.

  2. Intention: Policy language articulates an intention to act. Even when policymakers stall, delay, or try to ignore an issue, they are still making a decision to take a specific action (e.g., stall, delay, etc.).

  3. Direction: Policy language identifies the direction of action.

While these generalizations may be true for all language, the important distinguishing factor in the policy context is that policymakers are required to act. Policymakers were elected or appointed to make a difference. Not only are they required to act, they also must communicate their intention in order to build support for a policy choice.

These requirements are not limited to Western liberal democracies. Even policymakers in autocratic or authoritarian governments articulate their intentions to act, both to build alliances with policymakers in other countries and to notify domestic constituencies of the direction policy is taking. Policymakers in all types of government also communicate their intentions publicly (through statements as well as leaks), often to attract foreign direct investment and/or investment in their sovereign bonds.


The business of policymaking is to find a way to gain the maximum support for a particular policy through word choice. As noted above, support need not be limited to electoral prospects, particularly in jurisdictions that do not hold free and fair elections.

Language matters in law making, not just in speeches or on Twitter. Having served as Congressional Committee staff and having negotiated various public statements globally, I saw firsthand the importance policymakers in multiple jurisdictions and functional areas place on finding precisely the right words and the right combination of words. In the U.S. Congress, nothing can become law before a conference committee between the House and the Senate generates identical text (including punctuation), which both houses must then approve again before it proceeds to the White House for signature.

At the international level, treaties, communiques and declarations communicate the substance of any agreement. The words are painstakingly negotiated. The same is true for technical regulations.

Policymakers know this and act accordingly. Every tweet, nearly every "hot mic" moment these days, every social media post, every official sector leak to the media, is designed to communicate some action to some audience. Every one of those utterances communicates a concrete intention to act in a particular way.

You don't have to like what is said, but it is important to understand what is said in order to measure the risk and react accordingly.

Limited Lexicon

Not only is policy language highly stylized, it is also far more constrained than vernacular language. Specific documents (laws, regulations, treaties, constitutions) create the boundaries for how policymakers articulate actionable ideas. These source documents constrain the ambit of potential consensus-building.

Not all language has the same value, of course. The speaker matters, as does the timing and context of the utterance. Policy language before a decision provides clues about the direction of a decision. Policy language articulating a decision sets the foundation for future discussion, including complaints about the decision.

Policy language is also more fluid than the vernacular. Policymakers impact the lexicon by "changing the terms of the debate" daily. Some do this with technocratic language. Others import value-laden terms to influence the debate. Others insist on using vernacular language that will resonate with constituents in order to build grass roots support for policies in complicated areas. Not all vernacular language is value-laden or inappropriate.

Consequently, the language of policy is beautifully positioned for automated analysis using computational linguistics tools, so long as one knows which specific language to track and tag.
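To make "track and tag" concrete, here is a minimal sketch of lexicon-based tagging: policy text is matched against a small, curated lexicon and each hit is rolled up by policy category. The terms and categories below are invented for illustration; a real deployment would rest on a domain-expert taxonomy, not this toy list.

```python
import re
from collections import Counter

# Hypothetical lexicon mapping policy terms to categories.
# Illustrative only -- not an actual production taxonomy.
POLICY_LEXICON = {
    "tariff": "trade",
    "sanction": "trade",
    "rulemaking": "regulation",
    "comment period": "regulation",
    "ratify": "treaty",
    "communique": "treaty",
}

def tag_policy_terms(text: str) -> Counter:
    """Count lexicon-term occurrences in text, grouped by policy category."""
    counts = Counter()
    lowered = text.lower()
    for term, category in POLICY_LEXICON.items():
        # Whole-word/phrase matches only, case-insensitive.
        hits = len(re.findall(r"\b" + re.escape(term) + r"\b", lowered))
        counts[category] += hits
    return counts

statement = (
    "The committee will open a comment period before any rulemaking, "
    "and may impose a tariff if talks fail."
)
print(tag_policy_terms(statement))  # regulation: 2, trade: 1
```

The key design point is that the lexicon, not the algorithm, carries the domain knowledge: the code is dispassionate, while experts decide which language is worth tracking.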

Why This Is Hard To Implement, Why NLP Technology Helps

Policy language presents multiple challenges to technology professionals without policy experience. Because they are not subject matter experts, technology professionals attempt to find relevant language by deploying computational power across too broad a universe of language.

In addition, policy language is designed to elicit an emotional response from the reader/listener that can distract from objective analysis. Policy language can include a range of normative intentions many might find inefficient, ineffective, or even abhorrent. It can be hard to set aside one's personal emotions and beliefs when reading certain statements. How many times have you reacted to policy language with any of the following statements:

  • They did what?

  • I can't believe they did [x].

  • This is a bad idea.

  • This won't end well.

  • This is wrong.

This is emotion talking and it interferes with analysis.

Technology is not burdened by emotion. NLP can count words and actions dispassionately. It does not have to "get its head around" the implications of a statement in order to collect the information, categorize it, and visualize it.

NLP and related analytical technologies deliver enhanced cognition, making it possible for human beings to see and analyze trends in policy language faster and better, because these tools create a buffer between the emotional reaction to policy language and its information content.

This empowers advocates, NGOs, and portfolio managers to accelerate their ability to connect the dots, put the puzzle pieces together, and act more strategically when they spot policy language adverse to their interests. These kinds of consumers of policy language can use NLP-powered policy risk platforms to increase the effectiveness of their reaction function.

The technology may be value-neutral, but the use of the technology can enhance a human's ability to engage at the normative as well as the objective analytical level.

NLP provides a powerful tool to automate language acquisition, data lake assembly, and preliminary analysis. When those tools are programmed by subject matter experts with experience in making policy decisions, as at BCMstrategy, Inc., what emerges is a highly refined and objective set of training data for use in a range of artificial intelligence deployment contexts.

We will address policy trend projection and sentiment analysis in detail in a separate series of posts. For now, the point is that automated policy trend projection requires as a condition precedent a data lake of all language from policymakers regarding the policy in question.
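As a hint of why the full corpus matters, even a rudimentary trend projection needs a time series drawn from all relevant policymaker language. The sketch below fits a least-squares slope to daily counts of action-oriented language; the daily counts are invented for illustration, and a production system would compute them from the data lake described above.

```python
# Invented daily counts of action-oriented policy language over one week.
# A real system would derive these from a comprehensive data lake.
daily_action_counts = [2, 3, 3, 5, 6, 8, 9]

def linear_trend(counts):
    """Least-squares slope of counts over time (units: mentions per day)."""
    n = len(counts)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(counts) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, counts))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

slope = linear_trend(daily_action_counts)
print(f"Trend: {slope:+.2f} mentions/day")  # positive slope = rising attention
```

A positive slope suggests policymaker attention to the issue is intensifying; gaps in the underlying corpus would distort the slope, which is why comprehensiveness is a condition precedent.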


At BCMstrategy, Inc., we are methodically building a data lake of policy language in a few areas first, before covering the waterfront. We are only at the front end of innovation regarding use of NLP tools in the policy language context. Every day brings new insights as the technology grows.

We look forward to contributing to the conversation at the intersection of NLP/ML/AI technology and the language of policy going forward so that people can put the puzzle pieces together better and make more effective decisions based on cold, hard facts.

#NaturalLanguageProcessing #AI #artificialintelligence #predictiveanalytics #enhancedcognition