Insurance is a risk-management tool that allows the transfer of risk to a third party. Risk can include any adverse future event whose occurrence is possible but not certain, and from whose devastating effects one wants to protect oneself. This transfer of risk from one party to another is the essence of any insurance policy. The precise conditions under which such a transfer happens differ depending on many circumstances. Traditionally, conditions are captured in policy wording expressed in natural language, which often contains lengthy and difficult-to-understand terms and clauses. Even though technological advancements are driving transformation in the industry, the core product - the insurance policy - remains relatively unchanged. It is the move from the analog to the digital contract that can truly bring a revolution to the industry and unlock many new opportunities for consumers and insurance companies alike.
Digitizing the contract in the initial stage means making it human- and machine-readable by design. Such computable contracts are often expressed in the form of logic programming and may or may not encompass the nuanced expression found in natural language. Nevertheless, the move from analog to computable contracts could create more transparent and accessible products, allowing insurance companies to tap into new markets and lower costs throughout the value chain.
As mentioned above, this paper aims to analyze current practices in insurance/reinsurance and propose a methodology for analyzing policy wording in the context of translating natural language to code. The goal is to develop a diagnostic tool that can help measure the level of clarity and objectivity in any insurance clause. In the short term, this tool could help analyze current levels of contract certainty. In the long term, it could enable the move from natural language to computable contracts. To truly test the limits of translation from natural language to code, the focus should then be on more complex insurance products such as commercial or reinsurance contracts.
Most personal lines insurance products are fairly standardized, relatively short documents aiming to protect individuals against probable events involving relatively low sums. By contrast, commercial insurance and reinsurance contracts are very sophisticated agreements, providing meticulous specification of the conditions for payout. Commercial insurance and reinsurance are business-to-business (B2B) sectors where all parties are skilled negotiators. For instance, corporations seeking coverage for their properties hope to purchase insurance policies covering as many risk scenarios as possible at the lowest possible price. Inevitably, the precise conditions of each insurance contract are very nuanced. Moreover, the negotiations between commercial insurance buyers and commercial insurers, as well as insurance companies and reinsurers, are often facilitated by specialized brokers. Intermediaries at every stage of negotiation often use variations in legal definitions to get their client the best deal in terms of coverage and price. Risks are often insured by several different insurers, and then, typically, each insurer will get reinsurance from one or more carriers. Lastly, reinsurers may cede some of their aggregate risks to other capital providers (known as “retrocession”). Each subsequent transaction is a separate legal arrangement, and as the reinsurance agreement is a separate contract from the original insurance contract, the original insured is not a party to the reinsurance. These factors lead to significant variation in clause wording, even for the same risk, product, and/or jurisdiction within the risk value chain.1
Macroeconomic conditions have a significant impact on the quality of policy contract wording. The reinsurance market follows well-established supply and demand dynamics - that is, the more capital available, the more likely the insured is to get a better price and more advantageous terms and conditions (T&Cs), which specify the rights and obligations of the insured. Depending on market conditions, clause position favors either the insured or the insurer. Historically, the insurance industry has faced a continuous influx of capital from alternative capital providers. Low interest rates motivated many investors to look for new strategies, which in turn drove up the market for Insurance-Linked Securities (ILS).2 This created a so-called “soft market” in which (re)insurance buyers optimize for broader and broader coverages, often achieved by introducing less conclusive T&C language.3 Over time, this has increased the complexity of insurance clauses. For consumer contracts, there has been rising pressure from regulators to resolve any uncertainty in favour of consumers.4 Furthermore, other trends, such as the overall professionalization of the (re)insurance industry, have resulted in arduous and costly court and arbitration disputes.5
Policy wording often drives billions of dollars in claims. The immense variation in contractual clauses, coupled with increasingly broad T&Cs, has created a significant industry challenge. There exists a gap between the words of a clause and the quality of its content, largely owing to a lack of performance benchmarks around the certainty of contractual clauses,6 making it exceptionally difficult for the (re)insurer to review policy wording. Often, legal advisors are pulled in at the last moment, creating bottlenecks for the business, often further constricted by a single, tight policy renewal window. Consequently, underwriters, who evaluate and analyze the risk, often find it overwhelming to manage the variation in contractual wording and its implications. Moreover, once contracts are signed and stored, analysis at the post-bind stage remains, for the time being, very manual and time-consuming, as even the best information retrieval technologies struggle with the diversity of (re)insurance forms and structures.
In recent years, the industry has introduced efforts to standardize clause wording and accounting forms, most prominently the Ruschlikon Initiative,7 which has resulted in some progress on claims automation. However, heterogeneous policy wording remains a challenge. Companies such as Swiss Re and AXA focus on using the latest technology to improve the contract review process. The goal is to enable underwriters (UWs) and legal advisors to understand and analyze policy wording more efficiently and accurately. Current technology scans the policy and, through Optical Character Recognition (OCR) or data extraction technology, can make it machine readable. Depending on the need, the human reviewer can easily evaluate the policy, either by analyzing it against a prior version or against the given company’s “underwriting governance framework” - its gold standard, or playbook of best practices. The underwriting governance framework goes beyond what is demanded by a regulator and is typically unique to each insurer, driven by its overall business strategy. These frameworks regulate which risks can be accepted and under what conditions. For example, some automatic checks flag discrepancies between contract versions, missing critical clauses, and potential clause risk.
Importantly, the intricacies of (re)insurance wording are difficult to capture for both humans and machines alike. However, advances in machine learning (ML) and natural language processing (NLP) constantly push the boundaries of what is possible. The current technologies aim to augment human capabilities by managing the diversity and inconsistencies of the world of (re)insurance contracts, characteristic of what Oliver Goodenough describes as Legal Technology 1.0.8 The next stage, Legal Technology 2.0, would lead to further augmentation of current human-machine touchpoints. Progressively, more expert systems would replace manual processes and allow for completely new insights at the portfolio level. However, the current focus is on empowering, rather than replacing, humans and, as Goodenough explains, Legal Technology 2.0 remains "embedded within the existing system."9
Next, we consider whether, technologically, the (re)insurance contract could be simplified and automated at the design stage such that the industry could leapfrog to Legal Technology 3.0. By leveraging the power of computational approaches, the current system could be radically redesigned. We will attempt to do so through a combined linguistic and deep-learning analysis. The remainder of the paper unfolds as follows. A linguistic background sets the stage and provides the theoretical framework for the paper’s methodology. We then consider the significance of context applied to code. Finally, we outline the strategy for our proposed project and conclude with some next steps.
As law has language at its core, contract interpretation is, in effect, a linguistic exercise. This has led to a heavy reliance on translation when reconciling human readability with machine readability. Core linguistics suggests that natural language is composed of three underlying components: syntax, semantics, and pragmatics. Curiously, the enduring focus on syntax and semantics in computational models has led to a neglect of pragmatics, an arguably essential pillar in meaning-making. Consequently, this impedes the understanding and contextualizing of legal concepts.
Pragmatics is the field of linguistics that reflects on intention, using tools of implicature and inference. Consider the phrase: “There is an elephant in the tree.” Semantics is helpful to the extent that it can evoke a prototypical example of an elephant. That elephants are not typically found in trees signals other possible meanings. Could this be an idiom (i.e., “elephant in the room”) - or perhaps the elephant in question is a paper elephant? Additionally, pragmatics raises the issue of reference. Consider: “Jane is speaking with Joanne. She is a renowned legal scholar.”10 The referent of “she” is not clear. Without context, semantics alone is insufficient to ascertain meaning.
Computational systems that use propositional logic reflect the limitations of semantics: propositional logic can enable the validation of some statements but cannot in itself establish the truth of all statements.11 So, why must we consider pragmatics in computational law, and specifically, computable contracts?
Contrary to the rhetoric on clarity and precision, ambiguity is revered as an inherent property of legal drafting. This is because legal documents are not independent artifacts and instead belong to a broader ecosystem. The aforementioned issues of pragmatics in natural language are integrated into the fabric of law and legal text and powered by literary tools of metaphor and analogy that outline context. This has specific implications for contracts.
Jeffrey M. Lipshaw considered the persistence of “dumb” contracts,12 or, more simply, contracts drafted in natural language as opposed to code. Lipshaw clarifies that the intuition to restate contractual ‘logic’ into code is misleading. In his paper, Lipshaw experiments with translating Article 2 of the Uniform Commercial Code (UCC) to formal logic. Interestingly, he was able to formally prove that a buyer can be compensated for damages.13 Moreover, Lipshaw notes that Article 2 includes fuzzy standards (e.g., “to sell goods that are fit for ordinary purpose”).14 Still, fuzzy logic was able to account for seemingly subjective criteria. This suggests that legal documents that involve complex future contingencies, albeit written in natural language, are already reducible to simpler, more logical structures.15 However, Lipshaw argues that imminency leads to risk-hedging behavior. In effect, vagueness, or ‘elasticity,’16 is a pragmatic function of natural language that creates the strategic space for mitigation. Formal logic, on the other hand, is complete and unambiguous. There is no elasticity available.
Consider Relevance Theory17 in linguistics, which identifies three levels of meaning: (1) logical form, (2) explicature, and (3) implicature. Meaning is derived from accessing all three levels. For example:18
“You are not going to die.”
Logical form: You are immortal.
Explicature: You are not going to die from this paper cut.
Implicature: You are being dramatic and should stop making a fuss.
Notably, explicature and implicature are both pragmatic developments of the sentence’s logical form.19 Explicature provides further detail that contextualizes the original sentence. This suggests that what is said cannot solely be derived from lexical meaning and syntactic combinations. Returning to Lipshaw, the assumption is that code, unlike natural language, is unable to ‘enrich’ propositions expressed, since formal logic has no pragmatic dimension. As a result, legal documents drafted in natural language will persist. Though logic is evidently a core component of legal structure, it lacks the elasticity that is currently only available in the natural language realm. While logic is present, natural language text must persist to clarify meaning. This suggests that logic should be considered an entry point where the groundwork is laid, but that the drafting process does not stop there. A reflection on context – and, in effect, pragmatics – and how it can be represented computationally is then the next step.
Interestingly, code is not quite as transparent or reducible as often assumed. Mark C. Marino argues that code, like other systems of signification, cannot be removed from context. Code is not the result of mathematical certainty but “of collected cultural knowledge and convention”.20 While code appears to be ‘solving’ the woes of imprecision and lack of clarity in legal drafting, the use of code is, in fact, capturing meaning from a different paradigm. Rather, code is “frequently recontextualized” and meaning is “contingent upon and subject to the rhetorical triad of the speaker, audience (both human and machine), and message.”21 It follows that code is not a context-independent form of writing. It relies on existing methods and structures of constructing concepts. It has its own ontology. The questions become whether there could be a pragmatics of code, and, if so, how could code effectively communicate legal concepts?
Currently, few scholars have addressed code beyond its functional competence, reflecting the focus on syntax and semantics as primary drivers of using code for legal drafting. Yet, learning how meaning is signified in code enables a deeper analysis of how the relationships, contexts, and requirements of law may be rightfully represented.
This paper, therefore, proposes a combined approach to drafting reinsurance contracts, inspired by emerging literature on the application of network analysis and graph theory to analyze legal complexity. In a recent article on the growth of the law, legislative materials were modelled using methods from network science and natural language processing.22 Modelling legislative corpora as dynamic document networks, Katz et al. argue that quantifying law in a static manner fails to represent the diverse relationships and interconnectivity of rules. They suggest that legal texts should instead be represented using multi-dimensional, time-evolving document networks. As legal documents are interlinked, networks better reflect the dynamics of their language and the “deliberate design decisions made.”23 Moreover, networks enable “circumvent[ing] some of the ambiguity problems that natural language-based approaches inherently face.”24 Most fascinating is the authors’ isolation of, through graph clustering techniques, legal topics that have fostered the most “complex bodies of legal rules.”25 This enabled a deeper understanding of the evolution of legal concepts and specific inflection points representing shifting perceptions.26
What is additionally striking about the paper by Katz et al. is its introduction of quantitative approaches that stress content representation as opposed to structural mimicry. Importantly, this model considers the context that shapes legal documents. How, then, could machine-readability be reconciled with graphical representation of legal documents?
Legal information must be understood at a systemic level to consider the interaction of legal documents with one another across a temporally sensitive frame. Therefore, legal texts should be perceived as objects, with code as the semiotic vessel. How these objects interact, how references are made, and how their histories interrelate must be accounted for. It appears, then, that a dual-pronged method of semiotic analysis coupled with pragmatics contributes to a more fruitful engagement with legal knowledge representation. As opposed to applying a purely arithmetic lens in the name of clarity and precision, language design for machine-readability requires a multi-layered approach that extends beyond syntactic structure and ensures temporal management and formal ontological reference. This suggests that rather than an emphasis on semantic translation to computable reinsurance contracts, consideration from an information extraction perspective may be a promising alternative.
We recommend a multi-step approach to exploring computable reinsurance contracts. This proposed method uses both prior research and existing tools to advance the next stages of testing contract computability. First, we seek to define “elasticity” in contractual clauses, referring to the Principles of Reinsurance Contract Law (PRICL),27 developed in 2019 by the International Institute for the Unification of Private Law (UNIDROIT) Working Group. In a significant undertaking that resulted in a 234-page report, the PRICL principles seek to clarify the language of reinsurance clauses. PRICL provides reinsurance-specific rules on contract law, particularly in areas where practitioners felt a demand for legal certainty. We consider the PRICL principles a verifiable benchmark of the quality of a well-structured clause. Following these guidelines, we can then develop a preliminary definition of “elasticity.” That is, in the first phase of our work, we plan to use NLP tools to identify elasticity patterns in selected clauses.28 We consider that greater language variability is more likely to trigger negotiation, dispute resolution, and contract dissolution.
Signs of variability will be cross-referenced with the linguistic fingerprints of contractual clauses. Grace Q. Zhang, a pioneering scholar of elastic language, noted that elasticity accounts for fluidity and strategic uses of language that challenge the limits of propositional validity.29 As opposed to the binary assumptions of statements as either true or false, she introduces the spectrum – the degrees of truth in propositions. In her text, she identifies four types of linguistic ‘stretchers,’ with the specific role of adding degrees of elasticity to the claims presented:30 approximate, general, scalar, and epistemic. Examples of the first three types are frequently found in contractual clauses (e.g., the use of “some” as an approximate stretcher, the use of “anything” as a general stretcher, and the use of “often” as a scalar stretcher). These types of stretchers are also understood in linguistics as “scope ambiguity.”
Of particular interest are epistemic stretchers, or “epistemic stance markers,” that reveal speakers’ “commitment to their claim.”31 The use of prescriptive assertions in contractual drafting may be considered an example of this type of stretcher. Likewise in linguistics, epistemic stretchers reveal positionality and carry out a “subjective function of language.”32 That is, these types of stretchers behave as linguistic clues that signal preferences of contractual parties. For instance, modal verbs, such as “could,” “may,” or “should,” act as indicators of varying degrees of certainty. Similarly, they signpost risk-hedging behavior.33
Moreover, epistemic markers can be implicit or explicit. In contracts, they are most frequently implicit, since the speaker’s position is typically less direct, suggesting that implicit epistemic stretchers lean towards intentionality and strategic uses of vagueness. As a result, an NLP-based analysis focused on linguistic stretchers will be developed to serve as the initial litmus test of elasticity in contractual drafting. Importantly, identifying elasticity may be helpful in determining areas where contract disambiguation is required and, alternatively, where some flexibility is necessary.
The above example is a sample reinsurance clause, drawn from PRICL, frequently considered complex, vague, and susceptible to legal uncertainty. It features ill-defined terms such as “reasonable” and “prudent.” Moreover, this clause showcases the variation and range of epistemic stretching. Consider the phrase above: “of the type that would.” This phrase is notably elastic, as the conditional verb, would, signals a latent position around what “information [is] material to a risk” that is implicit and not revealed to the reader. We consider this type of clause a useful testing ground for our proposed approach.
In the second phase of our work, we consider how we may be able to compute elasticity. While the intention of the first phase is to develop a basic framework for defining elasticity, the second will focus on identifying and developing an elasticity metric. This quantifier will behave as a building block for an eventual reinsurance taxonomy. Given the complexity of reinsurance contracts, the elasticity metric may help demystify intentional and unintentional vagueness in contractual language. In this phase, we revisit the context surrounding “elastic” clauses, focusing on issues of reference related to: (1) other related agreements (both past and future); (2) original policy conditions; (3) other supplementary documents (e.g., schedules); and (4) legislative or regulatory updates. We emphasize the significance of context here, given the networked ecosystem and interconnectivity of players in the reinsurance industry.
Finally, in the third phase, we aim to apply the results drawn from the first two phases to begin modelling elasticity. We intend to use the elasticity metric as the foundation for a diagnostic tool to assist with contractual risk assessment. Not only would the diagnostic tool act as a mechanism for identifying areas of vagueness, but it would also aim to highlight areas where context may be formalized to facilitate the implementation of computable contracts. Ultimately, our hope is that we may be able to introduce an alternative perspective that will contribute to the ongoing and fruitful endeavors in the space of computable contracts.
The CodeX Insurance Initiative has invited leaders from industry, academia, and the regulatory community to contribute short papers describing the authors’ views on important issues relating to the application of computable contracting in the insurance industry. The development of computable contracting for insurance is still a work in progress, and the sharing of ideas and approaches within the community of interest is a major goal of the Insurance Initiative. As a part of this conversation, these papers present the views of their authors, and do not necessarily reflect the views of CodeX, of the Insurance Initiative, or of any of its participants.
Klaudia Galka is a Fellow at CodeX — The Stanford Center for Legal Informatics and a Contract Solutions Manager at Swiss Re.
Megan Ma is a Residential Fellow at CodeX — The Stanford Center for Legal Informatics.