Stylistic Structure in Historic Legal Text

This will be the project page for Elijah Mayfield and William Y. Wang.


The Background

In this project, we are interested in understanding the stylistic differences among judges in historical legal opinions. We focus specifically on cases regarding slaves as property. Slaves remained the largest source of wealth until the 1840s, and judicial preferences and styles could generate variation in the security of slaves as property.

We are interested in studying how these cases were handled in different regions of the United States with varying views towards slavery. Because this is a longitudinal data set, we are also interested in understanding how styles change over the course of decades.

To do this, we will use a comparable aligned corpus of judicial opinions and overviews of the same cases. Our belief is that by capturing the topical overlap between an opinion and a neutral overview, the non-content word structure of the judge's opinion that remains will be indicative of the style in which that information is presented.
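
As a rough, purely illustrative sketch of this intuition (the example sentences are invented, and our actual approach will be more sophisticated than token masking), one could mask the opinion tokens that also appear in the overview and inspect what remains:

 # A minimal sketch of the topical-overlap intuition: mask opinion tokens
 # that also appear in the neutral overview, leaving the non-content
 # scaffolding behind. Illustrative only.
 import re
 
 def non_content_structure(opinion, overview):
     """Replace opinion tokens shared with the overview by a placeholder."""
     overview_vocab = set(re.findall(r"[a-z']+", overview.lower()))
     tokens = re.findall(r"[a-z']+", opinion.lower())
     return " ".join("_" if t in overview_vocab else t for t in tokens)
 
 print(non_content_structure(
     "It may well be the case that the property was sold unlawfully.",
     "The property was sold. The sale was disputed."))
 # -> "it may well be _ case that _ _ _ _ unlawfully"

The hedging scaffold ("it may well be the case that") survives the masking while the shared content words disappear, which is exactly the kind of residue we want to capture.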

To measure this, we will use local structured prediction tasks to generate a feature representation of a text based on those stylistic cues. We will then compare that representation to a simpler unigram or LDA-based feature space on a classification task (region identification) and a regression task (year identification). Our belief is that our stylistic model will be more accurate quantitatively (as measured by performance on these tasks) and more interesting qualitatively (by leveraging features other than topic-based cue words to make a classification).
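
A sketch of how this quantitative comparison might be run with scikit-learn (the feature matrices and labels are assumed inputs, and the model choices are placeholders, not final decisions):

 # Sketch of the planned comparison, assuming scikit-learn. X is any
 # feature matrix (stylistic or baseline); regions and years are the
 # per-opinion labels from the corpus.
 from sklearn.linear_model import LogisticRegression, Ridge
 from sklearn.model_selection import cross_val_score
 
 def compare_representation(X, regions, years):
     """Return (classification accuracy, regression MSE) for one feature space."""
     acc = cross_val_score(LogisticRegression(max_iter=1000), X, regions,
                           scoring="accuracy", cv=5).mean()
     mse = -cross_val_score(Ridge(), X, years,
                            scoring="neg_mean_squared_error", cv=5).mean()
     return acc, mse
 
 # Usage (hypothetical inputs): compare_representation(stylistic_X, regions, years)
 #                          vs. compare_representation(unigram_X, regions, years)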


The Dataset

We have collected a corpus of slave-related and property-related U.S. state supreme court legal opinions from Lexis Nexis. The dataset includes 6,014 slave-related state supreme court cases from 24 states during the period 1730–1866, along with 14,580 property-related cases from the same period. Most of the cases contain the following data fields: Parties, Court, Date, Judge Opinion, Previous Court and Judges, Disposition, Case Overview, Procedural Posture, Outcome, Core Terms Generated by Lexis, Headnote, Counsel, and Judge(s).
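
For concreteness, one parsed case record might be represented as follows (a sketch; the field names are our own rendering of the Lexis fields, and no parsing code is implied):

 # Sketch of one case record mirroring the Lexis Nexis fields above.
 from dataclasses import dataclass, field
 from typing import List
 
 @dataclass
 class Case:
     parties: str = ""
     court: str = ""
     date: str = ""
     judge_opinion: str = ""
     previous_court_and_judges: str = ""
     disposition: str = ""
     case_overview: str = ""
     procedural_posture: str = ""
     outcome: str = ""
     core_terms: List[str] = field(default_factory=list)  # generated by Lexis
     headnote: str = ""
     counsel: str = ""
     judges: List[str] = field(default_factory=list)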


The Theory

We focus on the issue of author engagement, an attempt to describe the extent to which an author aligns themselves with the content of what they are writing. Examples of low engagement may be signalled by distancing with modal phrases ("it may be the case that...") or by attribution to another source ("the defendant claims that..."). High engagement may be signalled by pronouncement ("Of course it's true that...") or explicit endorsement of a third-party claim ("The defendant has demonstrated that..."). On the other hand, speakers may make statements with no engagement (simply stating a fact), suggesting that they believe that fact will be taken for granted or is entirely obvious to any reader.
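
To make these cues concrete, here is a naive surface-matching sketch (the cue lists are invented examples rather than a validated lexicon, and a real system would need far more than regular expressions):

 # A minimal sketch of lexical engagement-cue spotting.
 # The cue phrases below are illustrative only, not a validated lexicon.
 import re
 
 LOW_ENGAGEMENT = [r"\bit may be the case that\b", r"\bclaims? that\b",
                   r"\baccording to\b", r"\bmight\b", r"\bperhaps\b"]
 HIGH_ENGAGEMENT = [r"\bof course\b", r"\bhas demonstrated that\b",
                    r"\bcertainly\b", r"\bundoubtedly\b"]
 
 def engagement_counts(sentence):
     """Count low- and high-engagement cues in one sentence."""
     s = sentence.lower()
     return {"low": sum(len(re.findall(p, s)) for p in LOW_ENGAGEMENT),
             "high": sum(len(re.findall(p, s)) for p in HIGH_ENGAGEMENT)}
 
 print(engagement_counts("Of course it is true that the defendant claims that he paid."))
 # -> {'low': 1, 'high': 1}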

These levels of engagement with the facts of a case demonstrate alignment with certain facts or sides in a legal case. Our belief is that the way in which facts, entities, and events are referenced by a judge in an opinion will be influenced heavily by other factors surrounding the judgment, such as the location, time period, and outcome of the verdict. Therefore, if we can extract these behaviors in a systematic way, we can then use them as observed features in a generative model. Moreover, these features are likely to be more informative and interesting for social scientists than simpler n-gram features, even if they perform no better at classification, due to their more descriptive nature.
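
As a minimal illustration of engagement counts serving as observed features in a generative model (multinomial Naive Bayes is a stand-in here, not our final model, and the counts and labels are invented):

 # Sketch: engagement-cue counts as observed features in a simple
 # generative classifier. Naive Bayes models P(features | region) and is
 # only a stand-in for whatever model we eventually adopt.
 import numpy as np
 from sklearn.naive_bayes import MultinomialNB
 
 # Hypothetical per-opinion counts of (low-engagement, high-engagement) cues.
 X = np.array([[5, 1], [4, 0], [1, 6], [0, 5]])
 y = ["free", "free", "slave", "slave"]   # invented region labels
 
 model = MultinomialNB().fit(X, y)
 print(model.predict([[3, 1]]))   # -> ['free'] for a low-engagement opinion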


The Approach

Qualitative analysis of our data set immediately showed a major disparity between the two largest text fields in each case - Judge Opinion and Case Overview. The first, written by the judge in delivering a verdict, is littered with examples of author engagement, with markers for opinionated, convincing, judgmental, or attributed facts. This is only natural for a judgment that must collect myriad testimonies and sources of evidence into a single verdict. On the other hand, the Case Overview section of each case lacks author engagement entirely. Facts and testimonies are recorded impassively, with no attempt to persuade the reader - it is a simple summary.

Most intriguingly, these texts are about the same pieces of evidence, the same testimonies, the same series of events. This means that we have, in effect, fairly large pseudo-parallel corpora for engaged and disengaged authors. However, these texts are not the same length - on average, an overview is roughly 10% of the size of the judge's opinion. Therefore, it is not practical to attempt sentence-by-sentence alignment.


Evaluation

Our task is to build structured representations that are informative for describing the stylistic structure of a written text. To test whether we are, in fact, getting any signal from our structured representation, we will attempt a classification task and a regression task. The first will be to predict whether an opinion was written in a slave state, free state, or border state. The second will be to predict the year in which an opinion was written.

We can then measure these results both quantitatively (mean squared error, in years, for regression; accuracy or Cohen's kappa for classification) and qualitatively (by checking that the distribution of features across categories is indeed informative). For the latter interpretation and analysis, we will collaborate with a historian from Columbia University and an economist from American University, from whom we received access to this corpus.
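
These quantitative measurements map directly onto standard scikit-learn metrics (a sketch with invented numbers and labels):

 # Sketch of the quantitative measurements named above, using scikit-learn.
 # All values below are invented model outputs.
 from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                              mean_squared_error)
 
 # Regression: mean squared error, in years.
 print(mean_squared_error([1821, 1850], [1825, 1844]))   # -> 26.0
 
 # Classification: accuracy and Cohen's kappa over region labels.
 y_true = ["slave", "free", "border", "slave"]
 y_pred = ["slave", "free", "slave", "slave"]
 print(accuracy_score(y_true, y_pred))                   # -> 0.75
 print(cohen_kappa_score(y_true, y_pred))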


Baseline

We will attempt two baselines. The first will be a bag-of-words representation of an opinion. The second will be based on LDA topic modeling, using default settings.
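
A sketch of the two baseline feature spaces (scikit-learn shown here with near-default settings; the example texts are invented stand-ins for real opinions):

 # Sketch of the two baseline representations.
 from sklearn.feature_extraction.text import CountVectorizer
 from sklearn.decomposition import LatentDirichletAllocation
 
 opinions = ["the court finds the sale of the property void",
             "it may be the case that the contract was unlawful"]
 
 # Baseline 1: bag-of-words counts.
 bow = CountVectorizer().fit_transform(opinions)
 
 # Baseline 2: LDA topic proportions over the bag-of-words counts
 # (n_components reduced only to suit this toy example).
 lda = LatentDirichletAllocation(n_components=2, random_state=0)
 topics = lda.fit_transform(bow)
 print(bow.shape, topics.shape)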

It is possible these baseline models will perform well. However, we believe that if they do, it will be because of shallow features that are not informative for social scientists.