THE PREDICTIVE CODING CASES

INTRODUCTION
Since the spring of 2012, when Magistrate Judge Peck first approved the use of predictive coding in Monique da Silva Moore, et al., v. Publicis Groupe SA & MSL Group, practitioners have watched closely for each new decision mentioning predictive coding. Because no universal technical or legal standard has yet emerged for the use of predictive coding in eDiscovery, each new order provides potentially valuable insights into what parties are attempting, how they are defending it, and how courts and opposing parties are reacting.

Although the total number of such cases is still small, the number grows each year. Gibson Dunn’s “2014 Year-End E-Discovery Update” (January 20, 2015) counts 6 such decisions in 2012, 9 in 2013, and 17 in 2014. Some of these decisions make only passing mention of predictive coding, but others engage in extensive, educational discussion of the issues.
In this white paper, we will review the nine most significant of these predictive coding cases, starting at the beginning with Moore and working our way through chronologically to the most recent, Rio Tinto.

Some of these cases are significant because they were firsts, others because they are particularly educational examples, and all of them because they have been widely covered in industry press. This white paper reviews:
1. Monique da Silva Moore, et al., v. Publicis Groupe SA & MSL Group;
2. Kleen Products LLC, et al., v. Packaging Corporation of America, et al.;
3. Global Aerospace Inc., et al., v. Landow Aviation, L.P., et al.;
4. In Re: Actos (Pioglitazone) Products Liability Litigation;
5. In Re: Biomet M2a Magnum Hip Implant Products Liability Litigation;
6. Progressive Casualty Insurance Company v. Jackie K. Delaney, et al.;
7. Bridgestone Americas, Inc. v. International Business Machines Corporation;
8. Dynamo Holdings Limited Partnership v. Commissioner of Internal Revenue; and,
9. Rio Tinto PLC v. Vale, S.A., et al.

What will become clear as we move through these cases is that the answer to the question of whether courts will permit predictive coding is likely to be “yes,” but as long as producing parties insist on seeking prior judicial approval, that “yes” may come with a lot of transparency and cooperation strings attached.

MOORE
We start our case law review with Monique da Silva Moore, et al., v. Publicis Groupe SA & MSL Group, No. 11 Civ. 1279 (ALC)(AJP) (S.D.N.Y. Feb. 24, 2012), the first case in which the use of predictive coding was judicially approved. In Moore, the Plaintiffs were members of a putative class allegedly subjected to gender discrimination by the Defendants. The primary discovery issue in question was the appropriate methodology by which to evaluate approximately 3,000,000 emails the Defendants had gathered from their custodians.

The Defendants proposed using a predictive coding tool and a methodology selected in consultation with their eDiscovery vendor. That methodology involved:
 Developing a seed set to train the tool by having senior attorneys:
  • Review a random sample of 2,399 documents (the sample size required for the resulting prevalence estimate to have a confidence level of 95% and a margin of error of +/-2%; see the sample-size sketch following this list)
  • Review the results of some judgmental sampling
  • Review the top 50 results of a variety of keyword searches and Boolean search strings (including searches suggested by the Plaintiffs)
 Engaging in iterative rounds of training review to improve the tool’s results:
  • Engage in 7 rounds of iterative review of at least 500 results per round
  • Test completeness after round 7 by reviewing a random sample of 2,399 documents from the pool of documents deemed not relevant by the predictive coding tool (again, the sample size required for the resulting elusion estimate to have a confidence level of 95% and a margin of error of +/-2%)
  • With low elusion confirmed, review and produce the top 40,000 results
 Providing training and testing documents to the Plaintiffs to maintain transparency with regard to the definition of relevance being employed:
  • Defendants to turn over to Plaintiffs all non-privileged documents reviewed to create the seed set – both relevant and not relevant
  • Defendants to turn over to Plaintiffs all non-privileged documents reviewed during the 7 rounds of iterative review – both relevant and not relevant
  • Defendants to turn over to Plaintiffs all non-privileged documents reviewed during the final elusion measurement after round 7 – both relevant and not relevant
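
As a side note, the 2,399-document sample size cited in the protocol can be reconstructed with the standard formula for estimating a proportion. The sketch below is a plausible reconstruction, not taken from the case record: it computes the sample size for a 95% confidence level and a +/-2% margin of error, applying a finite population correction for a collection of roughly 3,000,000 documents.

```python
import math

def sample_size(z: float, margin: float, population: int, p: float = 0.5) -> int:
    """Simple random sample size needed to estimate a proportion.

    Uses the normal-approximation formula n0 = z^2 * p * (1 - p) / E^2,
    then applies the finite population correction for the given population.
    """
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

# 95% confidence (z ≈ 1.95996), +/-2% margin of error, ~3,000,000 documents
print(sample_size(1.95996, 0.02, 3_000_000))  # -> 2399
```

Using z = 1.96 exactly and no finite population correction yields 2,401 instead, which is why published figures for a 95% / +/-2% sample vary slightly.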

The Plaintiffs did not oppose the use of predictive coding in theory, but they opposed some specifics of the methodology proposed by the Defendants. The Plaintiffs’ primary objection was to the Defendants’ plan to review and produce only the top 40,000 results of the trained predictive coding tool, a cutoff the Defendants based on an assertion that reviewing and producing the top 40,000 results was all that proportionality required. Magistrate Judge Peck agreed that this was problematic, calling it a “pig in a poke.”

Magistrate Judge Peck made clear that determinations about proportionality, and about when proportionality justifies cutting off review, cannot be made before the system is trained and actual review has begun, because that fact-based determination requires real information about prevalence, cost, and relevance. He also emphasized that the planned 7 rounds of iterative review were subject to the same reality check: 7 rounds would not be enough if the results were still not right.

Ultimately, Magistrate Judge Peck approved the Defendants’ proposed use of predictive coding, and that approval was upheld on review by District Judge Carter. In his order, Magistrate Judge Peck writes that his decision was based on his consideration of:
. . . (1) the parties' agreement, (2) the vast amount of ESI to be reviewed (over three million documents), (3) the superiority of computer-assisted review to the available alternatives (i.e., linear manual review or keyword searches), (4) the need for cost effectiveness and proportionality under Rule 26(b)(2)(C), and (5) the transparent process proposed by [the Defendants].

After Magistrate Judge Peck’s ruling at the hearing approving the Defendants’ plan, but before he had issued his written order, the Plaintiffs filed objections to his decision with District Judge Carter, the presiding Judge in the case. Because of that sequence of events, Magistrate Judge Peck was able to offer some useful thoughts on those objections in his written order.

The most important of these concerns the Plaintiffs’ objection that the Defendants’ “method lacks the necessary standards for assessing whether its results are accurate; in other words, there is no way to be certain if [the Defendants’] method is reliable.” The Plaintiffs argued that, by allowing the Defendants to proceed without setting a maximum acceptable level of elusion or other standards, Magistrate Judge Peck was “simply kicking the can down the road.”
He explains eloquently why “down the road” is, in fact, the right place to make such determinations:

In order to determine proportionality, it is necessary to have more information than the parties (or the Court) now has, including how many relevant documents will be produced and at what cost . . . In the final sample of documents deemed irrelevant, are any relevant documents found that are “hot,” “smoking gun” documents (i.e., highly relevant)? Or are the only relevant documents more of the same thing? One hot document may require the software to be re-trained (or some other search method employed), while several documents that really do not add anything to the case might not matter. These types of questions are better decided “down the road,” when real information is available to the parties and the Court.

KLEEN PRODUCTS
Our second important case is Kleen Products LLC, et al., v. Packaging Corporation of America, et al., No. 10 C 5711 (N.D. Ill. Aug. 21, 2012). The Kleen Products case is a consolidation of a series of actions alleging anticompetitive behavior by various participants in the containerboard industry (the material from which corrugated boxes, etc., are made). Discovery in the consolidated case began in January 2011, but as of December 2011, unresolved discovery issues still remained between the parties, including the issue of search methodology.
The Defendants wanted to use a Boolean search methodology and had already engaged outside consultants and begun work. Over several months during 2011, the Defendants and their consultants iteratively tested and refined the search terms to be used, using sampling to measure the results. The Defendants contended that their Boolean search process would be tested and validated to ensure accuracy as good as, or better than, predictive coding or any other option.

The Plaintiffs criticized Boolean search methods as inherently inadequate and flawed and lobbied the Court for the Defendants to be required to use “content-based advanced analytics” – also known as predictive coding. The Plaintiffs argued that the Defendants’ searches would be likely to find less than 25% of the responsive documents, while a predictive coding solution would be likely to find more than 70% of them, at no greater cost than the Defendants’ proposed methodology. The Plaintiffs also criticized specific details of the Defendants’ sampling and validation methodologies.

Hearings were held on February 21st and March 28th, 2012 at which witnesses testified regarding the proposed search tools and methodologies. Rather than directing the parties to adopt a particular resolution to this issue, the Magistrate Judge emphasized her support for Sedona Principle 6, which states:
Responding parties are best situated to evaluate the procedures, methodologies, and technologies appropriate for preserving and producing their own electronically stored information. (The Sedona Principles: Best Practices, Recommendations & Principles for Addressing Electronic Document Production, Second Edition.)

The Magistrate Judge then urged the parties, in light of this principle and the extensive work already completed by the Defendants, to consider whether the Defendants’ Boolean search methodology might be refined or supplemented in some way that would satisfy the Plaintiffs without scrapping all of the work done by the Defendants or forcing her to decide the issue.

For five months after these hearings, the parties continued to meet with each other and with the Magistrate Judge about this and other issues, and in August of 2012, the parties reached an agreement regarding search methodology for the first phase of discovery. For all discovery requests propounded before October 1, 2013, the Plaintiffs dropped their demand for the Defendants to employ predictive coding and agreed to accept the Defendants’ use of their Boolean search methodology. This agreement was memorialized in a “Stipulation and Order Relating to ESI Search” issued by the Magistrate Judge on August 21, 2012, as an attachment to her “Memorandum Opinion and Order” of that date.

The parties’ agreement foreclosed the possibility of an order like Magistrate Judge Peck’s in Moore. There is no judicial decision here explicitly approving predictive coding or any other search methodology. The case is still important, however, for three reasons.

1. First, the transcripts of the hearings include extensive discussion of, and testimony about, predictive coding, Boolean searching, and validation methodologies, all of which is educational and worth reading:
 February 21, 2012 Hearing Transcript
 March 28, 2012 Hearing Transcript (Exhibit A to Joint Status Conference Report No. 3)

2. Second, it’s important for the emphasis the Magistrate Judge places on Sedona Principle 6. Although she did not have to choose a methodology for the parties, she seems to suggest that she would have deferred to the Defendants on the issue, if the parties could not cooperate. At the March 28th hearing she said:
“I am a believer of principle 6 of Sedona, and I’m not just because it’s Sedona, but I think the people who are producing the records, producing the documents, are in a better position to know [emphasis added], since they have to do the work, spend the money, spend the time, they know their people, they know their material, so as a basic premise, I think that’s a pretty fair premise here.”
This idea, that producing parties are in the best position to judge methods for production, is an important one that will come up again in later predictive coding cases.

3. Third, it’s important for the emphasis the Magistrate Judge places on the results and their validation rather than the tool or process used to achieve them. Also at the March 28th hearing, the Magistrate Judge said of the predictive coding topic that she would “. . . almost call it a detour . . . .” She went on to say, to the Plaintiffs:
“I assume . . . what you really are interested in is a search, regardless if it’s Boolean or computer-assisted [emphasis added], that is fair and statistically – and that can be validated statistically because that would be a good word search.”
This idea, that requesting parties’ real interest is in the validity of the resulting production rather than the method by which it is achieved, fits hand-in-glove with the idea that producing parties know best, and it, too, will come up again in later predictive coding cases.

GLOBAL AEROSPACE
We turn our attention next to Global Aerospace Inc. v. Landow Aviation, L.P., No. 61040 (Loudoun County, Va. Cir. Ct. Apr. 23, 2012), which was the first case in which the use of predictive coding was judicially permitted over the requesting parties’ objections.

The Global Aerospace case concerns the collapse of three hangars at the Dulles Jet Center during a snowstorm in 2010, resulting in the destruction of 14 private jets. The Defendants were responsible for the facilities and undertook document preservation and collection activities soon after the incident. Over 10 weeks in February, March and April of 2010, the Defendants collected over 8TB of data from individual employees’ computers and from office servers. After collection, the Defendants’ vendor consolidated, de-duplicated, de-NISTed, and otherwise filtered the collection down to approximately 200GB of reviewable e-mail and office documents. In April 2011, a follow-up collection was completed to gather any subsequently generated materials, which added approximately another 50GB of reviewable e-mail and office documents. The Defendants estimated that this pool of approximately 250GB of reviewable data represented in excess of 2,000,000 unique documents requiring review for potential production.

Due to the size of this dataset, the Defendants proposed using predictive coding rather than either a complete, linear first-pass review or keyword searching. The Plaintiffs did not agree to the Defendants’ proposal, and so the Defendants moved for a protective order allowing them to proceed with predictive coding over the Plaintiffs’ objections.

The Defendants’ “Memorandum in Support of Motion for Protective Order Approving the Use of Predictive Coding” makes a clear, strong case for the use of predictive coding in the matter. In making this case, the Defendants do three things worth reviewing:
1. First, the Defendants discuss at length the relative merits of manual review, keyword searching, and predictive coding, using several prominent studies and the Moore case to demonstrate, with details and examples, the likely lower costs and better results of a predictive coding effort.

o Moreover, in evaluating the merits of keyword searching, the Defendants also include the results of preliminary testing done on proposed search terms to demonstrate concretely the extent of their imprecision and inconsistency, not just in theory, but in this matter, with this data.

2. Second, the Defendants describe with specificity the predictive coding protocol they intend to use and include within that protocol an opportunity for the Plaintiffs to review all non-privileged, non-sensitive documents reviewed to train the predictive coding system before the training is finalized.
o Additionally, the Defendants set forth a validation protocol for sampling the final classifications of the trained predictive coding system to measure overall recall, and unlike the Defendants in Moore, they commit themselves to achieving at least 75% recall at the end of the predictive coding process.
 That number was chosen based on what the reviewed studies revealed to be the average human review recall level (59.3%) and the average predictive coding recall level (76.7%).


3. Finally, the Defendants position their request well within the existing Rules of the Supreme Court of Virginia (which mirror the Federal Rules of Civil Procedure in these areas) and associated case law, in two ways:
o First, they make a proportionality argument based on the relative time and cost of manual, linear review versus predictive coding review.
o Second, they preemptively explain why achieving 75% recall is good enough to satisfy the “reasonable inquiry” requirement that all discovery productions carry, and how this exceeds what has been traditionally expected:
Ironically, what is being proposed in this case to ensure “reasonable inquiry” is far more than has ever been done with traditional discovery [emphasis added]. Never has it been suggested that a producing party would be obligated to sample the documents determined to be irrelevant by first-pass reviewers to demonstrate the adequacy of that first-pass review. Nor is it a typical practice to review the documents left behind by a keyword search, even though upwards of 80% of the relevant documents typically are missed. The ESI protocol proposed by [the Defendants] goes well beyond what would otherwise be available to opposing counsel to ensure that relevant documents are being reviewed for production.
As expected, the Plaintiffs’ principal objection was to the Defendants’ commitment to achieve a minimum recall of 75%. The Plaintiffs argued in their opposition brief that, despite the Defendants’ explanation above, the rules require the production of all responsive materials, not just three-quarters of them. Unlike in Moore, however, the Plaintiffs did not just want the protocol changed; they wanted traditional human review methods used.

Ultimately, the Judge granted the Defendants’ motion and allowed them to proceed with their proposed predictive coding protocol, over the Plaintiffs’ objections. Unfortunately, the order does not go into any detail regarding the parties’ arguments for and against the protocol or regarding the basis for the Judge’s decision approving the proposed protocol.

Although this case does not provide us with an in-depth order, it does provide actual results of a completed predictive coding process:
 After completing the process, the Defendants executed their promised validation steps and determined that they had achieved a recall rate of 81%, exceeding their promised minimum of 75%.
 They also estimated their elusion rate at 2.9%, which, despite being low, still meant that more than 32,000 responsive documents may have been missed by the process (see the arithmetic sketch following this section).

Notwithstanding this, no timely objections to the resulting production were made by the Plaintiffs, so no further debate regarding the sufficiency of this result is available for our review.
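
To make the relationship between these figures concrete, the sketch below works through the arithmetic that connects an elusion rate to an estimated count of missed documents and to overall recall. The null-set size used here is an assumption chosen to reproduce the roughly 32,000-document figure cited above; it is not taken from the case record.

```python
# Illustrative arithmetic only; the null-set size is an assumption, not a case figure.
elusion_rate = 0.029         # reported elusion estimate (share of the "not relevant"
                             # pool that is actually responsive)
null_set_size = 1_100_000    # ASSUMED size of the pool deemed not relevant

estimated_missed = elusion_rate * null_set_size
print(f"Estimated responsive documents left behind: {estimated_missed:,.0f}")  # ~31,900

# Recall ties the missed documents back to the total responsive population:
#   recall = found / (found + missed), so missed = (1 - recall) * total
recall = 0.81
implied_total_responsive = estimated_missed / (1 - recall)
implied_found = implied_total_responsive * recall
print(f"Implied total responsive documents: {implied_total_responsive:,.0f}")  # ~167,900
print(f"Implied responsive documents found: {implied_found:,.0f}")             # ~136,000
```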

IN RE: ACTOS
We turn our attention next to In Re: Actos (Pioglitazone) Products Liability Litigation, No. 6:11-md-2299 (W.D. La. July 27, 2012), which was large-scale product liability litigation alleging that diabetes drug Actos increased the risk of bladder cancer in patients. Following closely on the heels of Moore, Kleen Products, and Global Aerospace, the Actos case was the first example of Defendants and Plaintiffs cooperatively reaching agreement on a predictive coding protocol that was then judicially memorialized.

The detailed protocol in Actos was memorialized in a Case Management Order issued by Judge Rebecca Doherty on July 27, 2012. The protocol is notable for its proof-of-concept nature, its transparency and cooperation, and its validation.

PROOF OF CONCEPT
The first notable aspect of this protocol is that it is designed as a “proof of concept” to see whether or not the predictive coding tool (Equivio Relevance, in this case) is a viable option for use in the case as a whole. To test this out, the parties agreed to first apply their predictive coding protocol to a “sample collection population” comprised of the e-mail of four mutually-chosen key custodians, plus selected regulatory documents. The entire, transparent and collaborative protocol would then be applied to the sample collection population, and afterwards, the parties would meet and confer to assess the results of the process, negotiate any adjustments, and agree on a final protocol for proceeding with discovery for the remainder of the custodians and sources. This proof-of-concept approach allowed both parties a lower-risk way to experiment with a novel approach to large-scale discovery, without having to commit entirely to adopting it untested.

TRANSPARENCY AND COOPERATION
The second notable aspect of this protocol is its emphasis on transparency for the Plaintiffs and on cooperation and collaboration between the parties.
At each point in the process where the control set, the training sets, or the validation sets are reviewed and coded, the Plaintiffs not only get to see all (non-privileged) documents reviewed by the Defendants and how they are coded, they get to participate in that review and coding directly. Moreover, at each phase of the protocol, both the Plaintiffs and the Defendants must agree about relevance classifications and be satisfied with the protocol results before the next phase of activity commences.
For example, the Plaintiffs and the Defendants each get to select three subject matter experts to collaborate on the relevance coding. If the experts cannot agree on relevance, the parties must meet and confer; and, if the parties cannot agree on relevance, the Court can be involved. The Plaintiffs and the Defendants also agreed to meet and confer, after the system is trained, to mutually determine the relevance score below which review and production should be cut off, based in part on the recall and precision estimates calculated by the software.

VALIDATION
The final notable thing about this protocol is its plan for validation of the process and the final protocol results before acceptance.
1. First, the parties agreed to continue their collaborative control set review until enough relevant documents are found to achieve the software’s highest category of statistical validation, and then to go beyond the minimum for that category until enough are found to reach a margin of error of +/-5% (at a 95% confidence level). The software would then use this control set to estimate prevalence and to calculate recall and precision during the subsequent iterative training phase.

2. Second, at the end of iterative training but before human review of the top scoring documents, the parties would collaboratively review a sample of the documents below the mutually-selected relevance score cutoff (i.e., the documents deemed not relevant by the software and excluded from subsequent human review). This is a measurement of elusion to check for missed materials, and its results would be used to determine if prevalence of missed materials is sufficiently low and if proportionality is served by using the previously-agreed relevance score cutoff.

3. Third, the parties would collaboratively review a sample of documents that were above the mutually-selected relevance score cutoff but were coded as not relevant during subsequent human review for production (i.e., the documents for which human reviewers overruled the software’s determination of likely relevance). This is a check of the Defendants’ final human review process to let the Plaintiffs check for issues with their relevance determinations.
Taken as a whole, the discovery plan in the case management order provides a good example of detailed discovery planning and an even better example of how to approach discovery and the use of predictive coding cooperatively. It represents a particularly sharp contrast with the discovery issues in Kleen Products, which were as much about the failure of cooperation as about the use of predictive coding.
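
For readers less familiar with how a control set supports this kind of validation, the sketch below shows, under simplifying assumptions, how recall and precision can be estimated by comparing human coding of a control set against the trained system’s classifications, and roughly how many relevant control-set documents a +/-5% margin of error at a 95% confidence level implies. This is a generic illustration, not the specific calculation performed by the Equivio software used in Actos.

```python
import math

def control_set_metrics(human_relevant: list[bool], machine_relevant: list[bool]):
    """Estimate recall and precision of a trained system against a human-coded
    control set (generic sketch; not the Equivio-specific calculation)."""
    tp = sum(h and m for h, m in zip(human_relevant, machine_relevant))
    fn = sum(h and not m for h, m in zip(human_relevant, machine_relevant))
    fp = sum(m and not h for h, m in zip(human_relevant, machine_relevant))
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return recall, precision

# Rough count of *relevant* control-set documents needed for a +/-5% margin of
# error at 95% confidence on recall (normal approximation, worst case p = 0.5):
print(math.ceil(1.96 ** 2 * 0.25 / 0.05 ** 2))  # -> 385
```

In the Actos protocol itself, of course, the actual counts, cutoffs, and acceptance decisions were all subject to the parties’ collaborative meet-and-confer process described above.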

IN RE: BIOMET
Our next important case to review is In Re: Biomet M2a Magnum Hip Implant Products Liability Litigation, No. 3:12-MD-2391 (N.D. Ind. Aug. 21, 2013), which was large-scale product liability litigation relating to “metal-on-metal” replacement hip systems manufactured by Biomet.

The Biomet litigation was initially dozens of separate lawsuits around the country, but on October 2, 2012, the United States Judicial Panel on Multidistrict Litigation issued an order consolidating the cases for combined pretrial proceedings. Ultimately, the combined litigation was settled in February 2014 for approximately $56 million, but before that settlement was reached, extensive discovery was undertaken, and two noteworthy orders were issued concerning the Defendants’ use of predictive coding.

The Defendants in Biomet initially collected a population of 19.5 million documents that might potentially be relevant. The Defendants then used keyword searches to cull that population down to 3.9 million documents and used de-duplication to further cull that down to about 2.5 million unique search results. The Defendants also used sampling to measure prevalence as they went:
 Sampling of the total 19.5 million document pool:
  • The Defendants estimated, with a 99% confidence level, that between 1.37% and 2.47% of this pool was relevant, i.e., that about 267,000 – 482,000 total relevant documents existed in the complete collection.
 Sampling of the 15.6 million document pool left behind by the keyword searches:
  • The Defendants estimated, with a 99% confidence level, that between 0.55% and 1.33% of this pool was still relevant, i.e., that about 85,000 – 207,000 total relevant documents were missed by their keyword searches and omitted from review and production.
 Sampling of the de-duplicated 2.5 million document pool of keyword search results:
  • The Defendants estimated, with a 95% confidence level, that between 14.41% and 17.91% of this pool was relevant, i.e., that about 360,000 – 448,000 total relevant documents were successfully found by their keyword searches.

 (The Defendants stated that “14.41% and 17.91%” of 2.5 million was equivalent to “between 184,268 and 229,162,” but this is mathematically inconsistent. Their reported sampling results of 273 relevant documents out of 1,689 documents reviewed indicate that it is the percentages claimed that are correct and the document counts that are stated incorrectly.)
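
The consistency check described above can be reproduced with a normal-approximation confidence interval for a proportion. The sketch below uses the 273-of-1,689 sample reported by the Defendants and the 2.5 million document pool size; it is offered as an illustration of the arithmetic, not as a recreation of the Defendants’ exact statistical method.

```python
import math

# 95% confidence interval for the prevalence observed in the Defendants' sample
relevant, sampled = 273, 1_689
pool = 2_500_000

p_hat = relevant / sampled
half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / sampled)
low, high = p_hat - half_width, p_hat + half_width

print(f"Estimated prevalence: {low:.2%} to {high:.2%}")          # ~14.41% to 17.92%
print(f"Estimated relevant documents: {low * pool:,.0f} to {high * pool:,.0f}")  # ~360,000 to 448,000

# The 184,268 - 229,162 counts stated by the Defendants would imply a prevalence
# of only about 7.4% to 9.2% of the 2.5 million document pool, which does not
# match the reported sample results.
```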

The Defendants then applied predictive coding to the pool of 2.5 million de-duplicated search results using Recommind’s Axcelerate and its associated workflow. The Plaintiffs objected to this process, giving rise to the April 2013 Order.
APRIL 2013 ORDER

The Plaintiffs did not object to the use of predictive coding itself. Rather, the Plaintiffs argued that the Defendants had tainted their predictive coding process by applying keywords to the 19.5 million document population initially collected. The Plaintiffs argued that the Defendants should have applied the predictive coding process directly to the entire population, citing the familiar studies showing the superiority of predictive coding to keyword searching.

The Defendants offered to let the Plaintiffs choose additional keywords to be run against the larger population and to make other concessions, but the Plaintiffs wanted the Judge to order the Defendants to start over using the predictive coding process on the full 19.5 million document population so that more responsive materials could be located than had been found by the keyword searches.
The Judge, however, did not accept the Plaintiffs’ framing of the issue, writing:
The issue before me today isn’t whether predictive coding is a better way of doing things than keyword searching prior to predictive coding. I must decide whether [the Defendants’] procedure satisfies [their] discovery obligations and, if so, whether [they] must also do what the [Plaintiffs] seek.

The Judge goes on to reject the Plaintiffs’ request for two reasons:
1. First, the Judge concludes that the Defendants’ discovery process complied fully with the applicable rules and principles of practice, and therefore satisfied their discovery obligations.

  • This is an interesting conclusion in light of the Defendants’ own sampling showing that their keyword searching had likely missed 30-50% of the responsive documents.
  • In the Order, the Judge compares the prevalence of responsive documents in the search results (14.41% to 17.91%) to the prevalence of responsive documents in the null set (0.55% to 1.33%), as evidence of the efficacy of the process, but that comparison is misleading because of the enormous difference in population sizes to which those percentages applied (2.5 million versus 15.6 million).

 A comparison of the corresponding estimated document counts would have been more informative: 85,000 – 207,000 missed and 360,000 – 448,000 found, out of 267,000 – 482,000 total relevant documents in the collection (see the sketch following this list).

2. Second, the Judge concludes that the expense of the Plaintiffs’ proposed course of action – millions of additional dollars – would outweigh the likely benefits of finding additional relevant documents.
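
The count-based comparison suggested in the first point above can be generated directly from the prevalence ranges and pool sizes in the order, as in the short sketch below (figures taken from the order as quoted; the calculation itself is merely illustrative).

```python
# Converting the prevalence percentages in the order into estimated document
# counts, to show why comparing counts is more informative than comparing
# percentages drawn from very differently sized pools.
pools = {
    "found by keywords (2.5M reviewed pool)": (2_500_000, 0.1441, 0.1791),
    "missed by keywords (15.6M null set)": (15_600_000, 0.0055, 0.0133),
}

for label, (size, low, high) in pools.items():
    print(f"{label}: {low * size:,.0f} to {high * size:,.0f} relevant documents")
# found by keywords:  ~360,250 to ~447,750
# missed by keywords: ~85,800 to ~207,500
```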

AUGUST 2013 ORDER
The second noteworthy order in Biomet came four months later and concerned the disclosure of the seed set documents used to train the predictive coding software. The Plaintiffs asked the Defendants to disclose the documents that were part of the seed set, so that the Plaintiffs could review what was included to inform their suggestion of additional search terms to find what was excluded. The Defendants confirmed that all of the seed set documents had been produced already, but declined to identify which of the produced documents they were.

The Plaintiffs then sought to have the Judge compel the identification they sought. The Judge, however, concluded that, because the documents had already been produced, he had no authority to compel anything further:

The [Plaintiffs] want[] to know, not whether a document exists or where it is, but rather how [the Defendants] used certain documents before disclosing them. Rule 26(b)(1) doesn’t make such information disclosable.

Although he concluded that he had no authority to compel the Defendants to identify their seed set (as long as all relevant documents had been produced), the Judge did describe the Defendants’ recalcitrant position as “troubling” and “below what the Sedona Conference endorses.” He also went on to suggest that no authority to compel did not mean no consequences for uncooperative conduct, writing:

“An unexplained lack of cooperation in discovery can lead a court to question why the uncooperative party is hiding something, and such questions can affect the exercise of discretion.”

Thus, disclosure may not be required, de jure, but Judges’ preference for it may make disclosure a de facto requirement for using predictive coding.

PROGRESSIVE
We turn our attention next to Progressive Casualty Insurance Company v. Delaney, No. 2:11-cv-00678 (D. Nev. July 18, 2014), which concerned the failure of Sun West Bank and the Plaintiff’s potential insurer liability for lawsuits against the directors and officers of the bank. In Progressive, the parties negotiated a “Joint Proposed ESI Protocol,” which the Magistrate Judge approved in an order issued October 24, 2013.

According to the Protocol, the Plaintiff would collect materials from agreed upon sources, run agreed upon search terms against them, and then either produce all the non-privileged results or manually review for responsiveness, etc., and then produce. Clawback provisions were included to protect against the inadvertent disclosure of privileged materials. Pursuant to the Protocol, the Plaintiff’s eDiscovery vendor collected approximately 1.8 million documents.
Running the chosen search terms against this collection returned 565,000 documents. The Plaintiff began manual review of the results but determined it would take 6-8 months and “an unacceptably high cost” to complete manual review of that many documents. The Plaintiff began exploring other options instead, including predictive coding. On December 20, 2013, the Plaintiff notified the Defendants that the original review plan was not feasible and that it planned to propose an alternative but would not disclose the details of its new plan.

On December 27, 2013, the Defendants filed a Motion to Compel the Plaintiff to begin its production since the period for discovery was drawing to a close and the Defendants had not received any ESI production yet. Over the next three months, the parties engaged in a series of hearings with the Magistrate Judge and meetings with each other attempting to negotiate a joint revision to the ESI protocol. Ultimately, the parties were not able to reach agreement, and on March 20, 2014, the parties submitted competing proposals to the Magistrate Judge.

THE PLAINTIFF’S PROPOSAL
Faced with the prospect of manually reviewing more than half a million documents, the Plaintiff began working with Equivio’s Relevance software. After reviewing a few thousand documents as a control set and a training set, the software identified just over 90,000 potentially responsive documents for review. Automated privilege screening suggested that 63,000 of these documents were not likely to be privileged and 27,000 of them were likely to be privileged.

Having already performed this work, and preferring the time and expense of reviewing under 100,000 documents to reviewing over 500,000 documents, the Plaintiff proposed that the ESI Protocol be modified to cover the work already performed and allow it to continue with the predictive coding process. Its next step would be to produce the 63,000 predictive coding results deemed not likely to be privileged, without further review (relying upon the clawback provisions as necessary). After that, the Plaintiff would manually review the other 27,000 predictive coding results for privilege and produce the non-privileged documents. The Plaintiff estimated that this process would end up producing over 70% of the total number of responsive documents in the pool of 565,000 search results.


THE DEFENDANTS’ PROPOSAL
The Defendants opposed the Plaintiff’s predictive coding actions and proposal both for their departure from the original, negotiated agreement and for three additional reasons:
1. First, the Defendants argued that predictive coding is complicated and had led and would continue to lead to a host of “satellite disagreements” about the process being employed.
2. Second, the Defendants argued that the Plaintiff had developed its predictive coding process without the transparency and cooperation emphasized in the case law and advocated by the Plaintiff’s own expert.
3. Finally, the Defendants argued that the Plaintiff’s process was unacceptable, because it had failed to follow the software provider’s own best practices for its use by using it only on the keyword search results.
Rather than allowing the Plaintiff to proceed with predictive coding, the Defendants proposed that the Plaintiff be directed to immediately turn over all 565,000 results of the negotiated search terms and to rely on the clawback provisions to get back privileged documents produced within that population. In the alternative, if predictive coding was to be permitted, the Defendants proposed that the Plaintiff be required to start over using the predictive coding software on the full 1.8 million document collection.

THE MAGISTRATE JUDGE’S ORDER
The Magistrate Judge was persuaded by the Defendants and did not accept the Plaintiff’s proposal to switch to predictive coding mid-stream. The Magistrate Judge writes favorably of predictive coding in theory, but views the terms of the negotiated ESI protocol, and the Plaintiff’s unilateral attempt to deviate from them, as controlling in this instance:

In this case, the parties negotiated an ESI protocol which was adopted by the court . . . Had the parties worked with their e-discovery consultants and agreed at the onset of this case to a predictive coding-based ESI protocol, the court would not hesitate to approve a transparent, mutually agreed upon ESI protocol. However, this is not what happened.

The Magistrate Judge then goes on to express significant concerns over the approach to predictive coding that the Plaintiff was trying to take, in terms of its transparency, its cooperation, and its use of the technology:
 “[The Plaintiff] proposes a ‘do-over’ of its own invention that lacks transparency and cooperation regarding the search methodologies applied.”
 “[The Plaintiff] is unwilling to engage in the type of cooperation and transparency that its own e-discovery consultant has so comprehensibly and persuasively explained is needed for a predictive coding protocol to be accepted by the court or opposing counsel as a reasonable method to search for and produce responsive ESI.”
 “[The Plaintiff] is also unwilling to apply the predictive coding method it selected to the universe of ESI collected. The method described does not comply with all of Equivio’s recommended best practices.”

Based on these concerns, the Magistrate Judge concluded that the Defendants were correct in their prediction of complex, ongoing disputes if a transition to predictive coding were allowed. Instead, the Magistrate Judge adopted the Defendants’ proposal and ordered the Plaintiff to produce all 565,000 search results, excepting those identified by the automated privilege screening as likely to be privileged, which could be withheld and logged (an approach contemplated by the original negotiated protocol).

Although predictive coding was not permitted in this case, the decision is not really a blow to the further adoption and use of predictive coding. The Magistrate Judge makes clear that she would have approved its use under different circumstances, particularly had there been more transparency and cooperation.
It is worth noting, however, one small problem with the Magistrate Judge’s commentary on transparency. In reviewing the Moore and Actos cases, the Magistrate Judge characterizes them as having required full transparency (e.g., seed set disclosure) as a condition of predictive coding’s use, when in both cases the producing parties actually offered that transparency voluntarily, with no judicial requirement imposed.

BRIDGESTONE
The next important case for our review is Bridgestone Americas, Inc. v. International Business Machines Corp., No. 3:13-cv-01196 (M.D. Tenn. July 22, 2014), which concerned a multi-year project to develop and implement new integrated computer systems and software solutions for ordering, fulfillment and other back-office functions for Bridgestone. In this case, as in Progressive, a discovery plan had been agreed upon and memorialized in a case management order.

After beginning the discovery process and using search terms provided by the Defendant, the Plaintiff was faced with reviewing approximately 2,000,000 documents. Rather than proceeding with full manual review, the Plaintiff asked to be permitted to use predictive coding to review the documents instead.
The Defendant objected that such allowance would be an “unwarranted change” in the original order and that “it is unfair to use predictive coding after an initial screening has been done with search terms.” Both parties filed extensive pleadings, and the Magistrate Judge conducted a “lengthy telephone conference” with the parties to evaluate the issues and their arguments.
In his July 22 Order, the Magistrate Judge opts not to rehash these issues and arguments, instead referencing the many other writings on predictive coding already available and saying that the decision boils down to a “judgment call” about efficiency and cost-effectiveness:

Predictive coding is a rapidly developing field in which the Sedona Conference has devoted a good deal of time and effort to, and has provided various best practices suggestions. Magistrate Judge Peck has written an excellent article on the subject and has issued opinions concerning predictive coding. In the final analysis, the uses [sic] of predictive coding is a judgment call, hopefully keeping in mind the exhortation of Rule 26 that discovery be tailored by the court to be as efficient and cost-effective as possible. In this case, we are talking about millions of documents to be reviewed with costs likewise in the millions. There is no single, simple, correct solution possible under these circumstances [emphasis added].
In light of the number of documents, the potential costs, and the “exhortation of Rule 26,” the Magistrate Judge then allows the Plaintiff to “switch horses in midstream” and utilize predictive coding rather than manual review to get through the 2,000,000 search result documents. He goes on to emphasize the importance of transparency and openness, including the Plaintiff’s promise to share seed set documents with the Defendant. He finishes by extending the option to switch to predictive coding to the Defendant as well (even though the Defendant was already 1/3 to 1/2 done with its manual review).

This case is ongoing, as is the predictive coding effort, and on February 5, 2015, the Magistrate Judge issued another order that touched, inter alia, on predictive coding. In that order he writes that changes are being made to the seed set, “in view of the fact that on review some of the documents listed as nonresponsive were, in fact, responsive.” He indicates that the parties are in discussion and no issues yet require his resolution, but he emphasizes again that, “to the extent [the Parties] use predictive coding, he expects full transparency in how the predictive coding is established and used.”

DYNAMO
We turn our attention next to Dynamo Holdings Limited Partnership, et al., v. Commissioner of Internal Revenue, 143 T.C. No. 9 (Sept. 17, 2014), which concerned the taxable status of certain transfers made between the Petitioners, and which is the penultimate case in our review.

In this case, the Respondent sought to have the Petitioners produce the ESI stored on two specific back-up tapes or produce copies of the tapes themselves. Petitioners estimated that reviewing the ESI on the two back-up tapes for relevance, privilege and confidential information would “take many months and cost at least $450,000.”

To avoid this time and expense, the Petitioners asked the Judge to permit them to utilize predictive coding instead of manual review. The Respondent objected to this request, arguing that (a) predictive coding is an “unproven technology,” and (b) the Petitioners could simply produce the tapes in their entirety, subject to a clawback agreement, rather than actually reviewing the materials.

The Judge begins his analysis by addressing head-on the question of whether or not a producing party needs to seek permission before using predictive coding. In his opinion, producing parties should not need to involve judges in their electronic production processes any more than they did in their historical paper production processes. Moreover, he explains that the appropriate time for objections to a discovery methodology is still after a production has been made that the receiving party believes to be incomplete:

And although it is a proper role of the Court to supervise the discovery process and intervene when it is abused by the parties, the Court is not normally in the business of dictating to parties the process that they should use when responding to discovery [emphasis added]. If our focus were on paper discovery, we would not (for example) be dictating to a party the manner in which it should review documents for responsiveness or privilege, such as whether that review should be done by a paralegal, a junior attorney, or a senior attorney. Yet that is, in essence, what the parties are asking the Court to consider – whether document review should be done by humans or with the assistance of computers. Respondent fears an incomplete response to his discovery. If respondent believes that the ultimate discovery response is incomplete and can support that belief, he can file another motion to compel at that time [emphasis added].

Despite this view, he goes on to assess the Petitioners’ request and to address the use of computer-assisted review tools, since the United States Tax Court had not yet done so.
After reviewing the testimony of experts for the Petitioners and the Respondent, as well as relevant articles and cases, including Moore, Progressive, and In Re: Actos, the Judge determines that predictive coding is not, in fact, “unproven technology.” Rather, he finds that it is a tool widely accepted in “the technology industry” and in past federal cases, particularly when coupled with transparency and cooperation. Based on this, he concludes that there is no reason not to let the Petitioners use predictive coding:

Where, as here, petitioners reasonably request to use predictive coding to conserve time and expense, and represent to the Court that they will retain electronic discovery experts to meet with respondent’s counsel or his experts to conduct a search acceptable to respondent, we see no reason petitioners should not be allowed to use predictive coding to respond to respondent’s discovery request.

RIO TINTO WRAP-UP
For our final case, we turn our attention to the recent Rio Tinto PLC v. Vale, S.A., et al., No. 1:14-cv-03042-RMB-AJP (S.D.N.Y. Mar. 2, 2015), which brings us back full circle to where we started – with a new decision from Magistrate Judge Peck. The Magistrate Judge’s order in Rio Tinto is his acceptance of a protocol including predictive coding to which both the requesting and producing parties stipulated, so it does not include the resolution of any new, contested issues in this area. He does, however, provide an excellent synthesis of the case law to date and the open issues we still face today.

This final case, then, also serves as a wrap-up of the material reviewed in this white paper, with Magistrate Judge Peck providing us with our four key takeaways:

1. First, he writes about the clear trend of judges accepting predictive coding, saying that, “[i]n the three years since Da Silva Moore, the case law has developed to the point that it is now black letter law that where the producing party wants to utilize TAR for document review, courts will permit it [emphasis added].”

2. Second, he explains that the primary “TAR issue that remains open is how transparent and cooperative the parties need to be with respect to the seed or training set(s),” and he explains that both case law and practitioner opinion are divided on this issue, noting that “where the parties do not agree to transparency, the decisions are split and the debate in the discovery literature is robust.”

3. Third, he makes the excellent-but-under-discussed point that seed set and training set disclosure is not the only avenue for validation by a requesting party (a brief sketch of such post hoc validation follows this list):
In any event, while I generally believe in cooperation [footnote omitted], requesting parties can insure that training and review was done appropriately by other means, such as statistical estimation of recall at the conclusion of the review as well as by whether there are gaps in the production, and quality control review of samples from the documents categorized as non-responsive [emphasis added].

4. Finally, he emphasizes a critical point also made very clearly in Dynamo: that it’s time to stop judging predictive coding more harshly than older review techniques:
One point must be stressed - it is inappropriate to hold TAR to a higher standard than keywords or manual review [emphasis added]. Doing so discourages parties from using TAR for fear of spending more in motion practice than the savings from using TAR for review.
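
The kind of post hoc validation Judge Peck describes in his third point can be as simple as drawing a random sample from the documents categorized as non-responsive and doing a little arithmetic. The sketch below is a minimal illustration with entirely hypothetical counts; real protocols would also attach confidence intervals to these point estimates.

```python
# Minimal sketch of post hoc validation via an elusion sample drawn from the
# documents categorized as non-responsive. All counts are hypothetical.
produced_responsive = 120_000   # responsive documents identified and produced
null_set_size = 1_000_000       # documents categorized as non-responsive
sample_size = 2_399             # random sample drawn from the null set
relevant_in_sample = 24         # responsive documents found in that sample

elusion = relevant_in_sample / sample_size
estimated_missed = elusion * null_set_size
estimated_recall = produced_responsive / (produced_responsive + estimated_missed)

print(f"Elusion estimate: {elusion:.2%}")                       # ~1.00%
print(f"Estimated missed documents: {estimated_missed:,.0f}")   # ~10,004
print(f"Estimated recall: {estimated_recall:.1%}")              # ~92.3%
```

Whether a given recall estimate is “good enough” remains the kind of fact-specific, proportionality-driven question discussed throughout these cases.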

While it may be too soon to call the permissibility of predictive coding “black letter law” with so few jurisdictions having weighed in (and those only at the trial court level), the clear trend is towards acceptance. Even the Magistrate Judge who denied permission for its use in Progressive made clear that she did not do so because of any reservations about the use of the technology in general, but only because of the prior agreement and uncooperative behavior specific to that case.

 If you wish to receive prior judicial approval for the use of predictive coding and you have the agreement of the requesting party to your proposed process, then you should be able to count on receiving that approval.

 If you wish to receive prior judicial approval for the use of predictive coding but the requesting party opposes its use, then you should be able to count on receiving that approval anyway, if your proposal includes adequate transparency and cooperation (e.g., seed/training set disclosure, joint classification review, etc.) and does not run afoul of any previously approved discovery agreements.

 If you wish to receive prior judicial approval for the use of predictive coding but the requesting party opposes its use and you do not wish to offer transparency or cooperation in the execution, you may not receive that approval. The outcome will depend on the judge, the circumstances, and the producing party’s conduct.
The more interesting question – one which is highlighted by the judges in Biomet, Dynamo, and Rio Tinto – is whether producing parties should seek prior judicial approval at all, when they would never do so for any traditional methodology. Our review of the predictive coding cases so far suggests that judges are eager for producing parties to get back to managing their own processes, for requesting parties to get back to focusing on the completeness of the results of those processes, and for all parties to get back to working more of it out between themselves without recourse to judicial resolution.

ABOUT THE AUTHOR
Matthew Verga is an electronic discovery consultant and practitioner proficient at leveraging a combination of legal, technical, and logistical expertise to develop pragmatic solutions for electronic discovery problems.
Matthew has spent the past eight years working in eDiscovery, four years as a practicing attorney with an AmLaw 100 firm and four years as a consultant with eDiscovery service providers. He has personally designed and managed many large-scale eDiscovery efforts, and overseen the design and management of numerous others, as both an attorney and a consultant. He has also provided consultation and training for AmLaw 100 firms and Fortune 100 companies, and has written and spoken widely on eDiscovery issues.
Matthew is currently the Director of Content Marketing and eDiscovery Strategy for Modus eDiscovery. In this dual role, he is responsible for managing assessments of law firms’ and corporations’ eDiscovery readiness, as well as for the creation of articles, white papers, educational programs, and other substantive content in support of Modus’ marketing, branding and thought leadership efforts.
