Abstract—Legal writing is centrally important to the modern practice of law. It is the main conduit through which courts engage with the public. Clear prose is so important that the Federal Judicial Center has explicitly identified the quality of judicial opinion writing as a concern. However, we have little empirical insight into how clearly judges communicate. This Essay helps address this lack of data by examining the readability of over 6000 Supreme Court opinions. It shows that Supreme Court writing styles have become more complex in recent decades, and that there is substantial inter-Justice variation in how clearly opinions are written.
Writing is the main way that courts engage with the public. As such, the quality of writing in judicial opinions is an important element of the legal system, and one about which we know relatively little. Intuition might suggest that judicial writing is sometimes overly complex. Indeed, the Federal Judicial Center finds it necessary to encourage judges to avoid wordiness, pomposity, and overly complex phrasing. However, we have little empirical insight into how well judges heed this advice and whether the quality of judicial writing has changed over time.
This Essay is the first to shed some light into this empirical darkness by analyzing the readability of over 6000 Supreme Court opinions. The data shows that, at the Supreme Court level, legal writing has become more complex in recent decades, and that judicial writing styles tend to become more complicated as the Justices gain experience on the court. We also see substantial variation among opinion writers, with Justice Scalia penning the wordiest opinions.
The Data & Analysis. The data analyzed below comes from a variety of sources. To measure the readability of Supreme Court opinions, I downloaded all Supreme Court decisions issued in 1946 or later that are available on CourtListener.com. CourtListener was chosen over alternative legal publishers because it offers bulk downloads of opinions in machine-readable form. This results in a dataset containing the full text of 6206 Supreme Court opinions.
There are a variety of measures that can be used to determine how easy it is to read a text. The Simple Measure of Gobbledygook (SMOG) is robust and often the preferred method. Like most readability indices, SMOG looks to the length of sentences and the number of long multisyllabic words in each sentence to arrive at a measure that can be roughly interpreted as the number of years of education one would need in order to be able to comfortably read the text. So, a SMOG score of 12 would suggest that the text is at a high school graduate reading level, while a SMOG score of 16 would be more appropriate for a college graduate.
To measure the SMOG scores for the cases within the Supreme Court opinion dataset, I wrote a Python program that parses the opinion text into sentences, measures the number of syllables in each word, and calculates SMOG values. In addition to measuring SMOG, I also noted the year of publication for each decision, which Justice authored the opinion, how long into his or her tenure on the Supreme Court the opinion was written, and the ideological leanings of the authoring Justice.
SMOG Over Time. There are a number of ways we can examine judicial writing styles. Perhaps of most interest is the simple trend over time. Is judicial writing becoming more or less clear? All else being equal, is it easier to read an opinion drafted today compared to one drafted decades ago? To examine this we can look to the overall SMOG score trend over time, shown in Figure 1.
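The steps just described (split the text into sentences, count syllables per word, compute SMOG) can be sketched roughly as follows. This is an illustration under stated assumptions, not the program used for this Essay: it uses a naive punctuation-based sentence splitter and a vowel-group syllable estimate in place of a proper tokenizer and pronunciation dictionary, and it assumes the commonly cited SMOG formula with constants 1.0430 and 3.1291. The function names are hypothetical.

```python
import math
import re


def count_syllables(word):
    """Rough syllable estimate (an assumption): count runs of vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing silent "e" (as in "case") as non-syllabic,
    # but keep the final syllable of "-le" words such as "simple".
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def smog_grade(text):
    """Commonly cited SMOG formula; polysyllables have 3+ syllables."""
    sentences = split_sentences(text)
    if not sentences:
        raise ValueError("text contains no sentences")
    words = re.findall(r"[A-Za-z]+", text)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291
```

A splitter this simple will mishandle the abbreviations and citations that are common in judicial opinions (for example "U.S." or "Id."), so scores from this sketch should be treated as approximate.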
This shows that there is a wide range of SMOG scores every year (the error bars span one standard deviation), but there also appears to be a fairly clear trend upwards. The Pearson correlation coefficient for year and SMOG is 0.31 (p < 0.0001), suggesting that SMOG scores are indeed increasing over time.
The changes in SMOG correspond roughly to similar observations about changes in opinion length. This time period saw two major changes in the way opinions are drafted: an increased role played by clerks in the drafting process, and the introduction of electronic typewriters and subsequently computers and word processing software. More clerk participation in the drafting process might partially explain increasing language complexity, as language is written, edited, and re-written in successive stages. This could lead to complexity bloat as each author adds his or her voice to the opinion. The introduction of word processors offers an even more straightforward potential explanation for increasing SMOG scores. SMOG is a function of sentence length and polysyllabic word usage. The editorial flexibility that word processors allow may encourage authors to draft and re-draft sentences, adding words and clauses to early drafts. This would lead directly to higher SMOG scores.
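For readers unfamiliar with the statistic, a Pearson correlation between two lists of numbers can be computed along the following lines. This is a minimal sketch: the function name is hypothetical, and the year and score values are made up for illustration and are not drawn from the Essay's dataset. It does not compute a p-value.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up (year, SMOG) values for illustration only; not the Essay's data.
years = [1950, 1970, 1990, 2010]
scores = [12.8, 13.1, 13.9, 14.2]
r = pearson_r(years, scores)
```

In practice a statistics library would normally be used instead, since it can also report the associated p-value.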
SMOG and Time on the Court. One might suspect that judicial writing style changes over the course of a judge’s tenure. This could arise both because junior Justices may be systematically assigned different sorts of cases to draft opinions for, and because the act of writing many opinions may alter one’s writing style over time. Figure 2 shows the trend in SMOG scores by the number of years that a Justice has sat on the Court. Because we are interested in seeing changes in individual Justices’ styles in this context, the scores are presented as z-scores, where each decision’s SMOG value is normalized by the SMOG values of all opinions within the dataset written by that Justice. Each yearly bar shows the mean (and standard deviation) of the z-scores for each year of a Justice’s tenure. So, we see that in their first year on the job, Justices tend to write opinions that are somewhat simpler in SMOG terms (z = -0.13) than their average opinion.
Years of tenure correlate with SMOG scores at r = 0.12 (p < 0.0001). This suggests that, although the relationship is relatively weak, the longer a Justice sits on the Court the more difficult to read his or her opinions become. There are a number of potential explanations for this. The number of Justices contributing opinions to each year of tenure decreases as tenure increases (because Justices eventually leave the Court). So, this effect may be driven by selection bias, with those Justices more inclined to use complex writing styles also more likely to sit on the Court for extended periods. Alternatively, it could be that the act of sitting on the Court alters writing styles, leading to an increase in average SMOG over time.
Individual Trends Over Time. While the average SMOG scores across all Justices increase over time, the same is not necessarily true of individual Justice scores. Some remain relatively stable over their careers, while others trend upwards or downwards. The SMOG score trends of each of the 32 analyzed Justices can be seen in Figure 3. These plot z-scores, with each Justice’s opinions normalized over their entire career.
Comparing Justices. Because the above individual scores are normalized by Justice, they do not allow for direct comparison between how Justices write their opinions. Figure 4 below plots each Justice’s SMOG scores in absolute values, enabling comparisons between Justices. While we naturally see a wide range of SMOG scores for each individual Justice, we also see some significant variation in how difficult each Justice’s writing is to read. Justice Scalia has the highest average SMOG score (14.41), while Justice Minton has the lowest (12.97).
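The per-Justice normalization described above might look something like the following sketch. It assumes each opinion is represented as a (justice, tenure_year, smog) tuple and uses the population standard deviation; both choices, along with the function names, are assumptions made for illustration rather than details taken from the analysis itself.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscores_by_justice(opinions):
    """Normalize each SMOG score against its author's own opinions.

    opinions: list of (justice, tenure_year, smog) tuples.
    Returns a list of (justice, tenure_year, z) tuples.
    """
    by_justice = defaultdict(list)
    for justice, _, smog in opinions:
        by_justice[justice].append(smog)
    # Per-Justice mean and (population) standard deviation.
    stats = {j: (mean(v), pstdev(v)) for j, v in by_justice.items()}
    normalized = []
    for justice, year, smog in opinions:
        mu, sigma = stats[justice]
        # A Justice whose scores never vary gets z = 0 by convention here.
        z = (smog - mu) / sigma if sigma > 0 else 0.0
        normalized.append((justice, year, z))
    return normalized


def mean_z_by_tenure_year(normalized):
    """Average the z-scores across Justices for each year of tenure."""
    by_year = defaultdict(list)
    for _, year, z in normalized:
        by_year[year].append(z)
    return {year: mean(zs) for year, zs in sorted(by_year.items())}
```

Normalizing within each author first means that a Justice who habitually writes at a high SMOG level does not dominate the tenure-year averages; only movement relative to that Justice's own baseline is counted.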
Ideology and SMOG. Upon seeing the variation amongst judges in how readable their opinions tend to be, the natural question to ask is: why? What drives the variation between writing styles? Most of the variation is likely due to personal stylistic preferences, education, experience, and training. Some of the variation might be explained by the types of cases that justices are assigned to write opinions for. These factors are almost certain to explain the majority of the inter-Justice variation.
While most of the variation is likely due to individual backgrounds and stylistic preferences—traits that are difficult to objectively measure—we may be able to get at some of the personal attributes that drive complexity in judicial writing by looking to the Justices’ ideological perspectives. There is evidence to show that conservatives have more straightforward cognitive styles, which may affect the way more conservative justices draft their opinions. We can use the data at hand to explore this question.
Figure 5 plots individual opinion SMOG scores by the Martin-Quinn score for the Justice that authored the opinion. Martin-Quinn scores approximate the ideology of each Justice for each year they are on the Court. The scores place the Justices on a continuum from most liberal on the left to most conservative on the right. We see no strong relationship between ideology and SMOG, but do note a slight positive correlation (r = 0.11, p < 0.0001).
The above has briefly explored how judicial writing styles have changed over time and over judicial careers, and how styles vary between Supreme Court Justices. We have seen that Supreme Court opinions have grown more difficult to read in recent decades, and that the longer Justices sit on the court the more complex their writing tends to become. We have also seen that there is substantial inter-Justice variation with Antonin Scalia writing decisions with the highest SMOG scores while Sherman Minton wrote decisions with the lowest.
Communication is central to the law. This brief Essay has demonstrated some trends within Supreme Court communication styles. Future work could adopt these methods to examine the communication styles of other legal practitioners. It would be very interesting to see whether communication clarity correlates with the likelihood that a judge will be promoted, or that a lawyer will win cases or make partner.
It may be that there is an optimal level of readability in legal writing. It is likely the case that high SMOG scores are not necessarily a bad thing. Although we saw above that Justice Scalia had the highest SMOG scores among his peers, suggesting that his opinions were the least easy to read, he is well known for his strong and distinctive writing style and has written extensively on legal communication. Although true gobbledygook is probably best avoided, scoring highly on the Simple Measure of Gobbledygook may be an unavoidable part of the practice of modern law.
 Fed. Judicial Ctr., Judicial Writing Manual: A Pocket Guide for Judges, at vii (2d ed. 2013) (“The link between courts and the public is the written word. With rare exceptions, it is through judicial opinions that courts communicate with litigants, lawyers, other courts, and the community. Whatever the court’s statutory and constitutional status, the written word, in the end, is the source and the measure of the court’s authority.”).
 Id. at 21–25.
 CourtListener, About, https://www.courtlistener.com/about/.
 G. Harry McLaughlin, SMOG Grading-a New Readability Formula, 12 J. of Reading 639 (1969).
 See, e.g., P.R. Fitzsimmons et al., A Readability Assessment of Online Parkinson’s Disease Information, 40 The J. of the Royal C. of Physicians of Edinburgh 292 (2010).
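 The formal definition of SMOG is commonly given as: $\text{SMOG} = 1.0430\sqrt{30 \times \frac{\text{number of polysyllabic words}}{\text{number of sentences}}} + 3.1291$, where polysyllabic words are those with three or more syllables. The SMOG score of the above-the-line text in this Essay is approximately 9.4.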
 SMOG calculation requires parsing text into distinct sentences and measuring the number of syllables in each word. I used the Python Natural Language Toolkit sentence tokenizer to separate opinions into sentence units, see http://www.nltk.org, and subsequently used the Carnegie Mellon University Pronunciation Dictionary to look up the number of syllables in each word, see http://www.speech.cs.cmu.edu/cgi-bin/cmudict. In instances where words were not in the Carnegie Mellon dictionary, I used a simple algorithm to estimate the number of syllables in the word.
To ensure that a peculiarity of the SMOG formula was not driving results, I also measured readability using the Automated Readability Index measure, with similar results. See R.J. Senter & E.A. Smith, Automated Readability Index, AMRL-TR-66–220 (Aerospace Med. Research Laboratories 1967).
 See Ryan C. Black & James F. Spriggs II, An Empirical Analysis of the Length of U.S. Supreme Court Opinions, 45 Hous. L. Rev. 621, 635–40 (2008) (showing a marked increase in the length of Supreme Court opinions in the second half of the 20th century).
 Id. at 639.
 On the impact of computers on legal practice generally, see R.L. Marcus, The Impact of Computers on the Legal Profession: Evolution or Revolution?, 102 Nw. U. L. Rev. 1827 (2008).
 See supra note 6.
 Records of which Justice wrote which opinion were obtained from the Supreme Court Database. Harold J. Spaeth et al., 2014 Supreme Court Database, Version 2014 Release 01, http://Supremecourtdatabase.org.
 These box plots show standard deviations, 95th percentiles, and outliers.
 See Philip E. Tetlock, Cognitive Style and Political Ideology, 45 J. of Personality & Soc. Psychol. 118, 123 (1983).
 See Andrew D. Martin & Kevin M. Quinn, Dynamic Ideal Point Estimation via Markov Chain Monte Carlo for the US Supreme Court, 1953–1999, 10 Pol. Analysis 134 (2002).
 See Andrew D. Martin et al., The Median Justice on the United States Supreme Court, 83 N.C. L. Rev. 1275, 1281 (2004).
 Antonin Scalia & Bryan A. Garner, Making Your Case: The Art of Persuading Judges (Thomson W., 1st ed. 2008); Antonin Scalia & Bryan A. Garner, Reading Law: The Interpretation of Legal Texts (W., 1st ed. 2012).