Research and Data in this Season of Change

September 3, 2020

This year has been one of those that will, for the rest of our lives, divide time into before and after. A global pandemic has altered so much that we assumed was static and unmoving. Unrest has swept the country, brought on by more and more senseless killings of Black men and women by the very people sworn to protect them. We have a duty to ensure that we learn from this season of change and to use the knowledge we have gained to create a better world. To do so, we must be guided by data and evidence. We must also take stock of our data and research practices, acknowledge where we can improve, and do all that is within our power to make those improvements.

COVID-19 and the Urgent Call for Research

COVID-19 has changed so much about our legal system—changes that would have been unthinkable less than a year ago. Courthouses have closed. Court hearings have gone virtual. Many states have cancelled the bar exam and created alternative paths for new law graduates to be admitted to the bar. There has been widespread discussion about whether and how these changes might continue into the future. And there have been urgent calls for research to understand how well these changes work and how they impact the delivery of and access to justice. Of course, the urgency is real. But there are other realities that we must consider as we address the immediate need for data.

One of those realities is that research takes time. Research done well often takes a great deal of time. Questions must be carefully and thoroughly defined, procedures must be developed that are suited to answering those questions, the data must be collected and analyzed, and the findings must be interpreted and understood. Each of these steps has a number of subcomponents, and cutting corners at any stage has serious implications for the integrity of the work. With respect to studying issues that have arisen around COVID-19—or any emergent phenomenon—there is also a need to ensure that enough time has passed to accumulate sufficient data to study. The need to allow time for research to be conducted, and conducted well, often competes directly with the urgent need for data.

Another reality we face is that much of the data we need doesn’t exist, isn’t made available, or is collected in inconsistent ways. For instance, state courts collect information about their cases and litigants in drastically different ways, which makes inter-jurisdictional comparisons difficult at best. Court case management systems are designed to do just that—manage cases—not accommodate research needs. Further, they often don’t capture some of the information most important to researchers, such as whether parties are self-represented. I am certainly not the first to raise these issues. The National Open Court Data Standards project, led by the Conference of State Court Administrators (COSCA) and the National Center for State Courts (NCSC), was an effort to create data standards that courts could adopt to make collection consistent across jurisdictions and to increase transparency and accessibility for research. But few courts have used these standards to inform how they structure their case management systems and data collection processes.
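
To make the comparison problem concrete, here is a minimal sketch of the kind of crosswalk researchers end up building when every jurisdiction labels the same case type differently. The court labels and the "standard" categories below are invented for illustration; they are not drawn from the actual NODS specification.

```python
# Hypothetical case-type labels as they might appear in raw exports
# from three different courts' case management systems.
COURT_A = {"CNTRCT": "contract", "TORT-PI": "tort"}
COURT_B = {"Contract/Commercial": "contract", "Personal Injury": "tort"}
COURT_C = {"CV-01": "contract", "CV-02": "tort"}

CROSSWALKS = {"court_a": COURT_A, "court_b": COURT_B, "court_c": COURT_C}

def to_standard(court: str, raw_case_type: str) -> str:
    """Map a court's local case-type label to a shared category."""
    try:
        return CROSSWALKS[court][raw_case_type]
    except KeyError:
        # Flag unmapped labels rather than guessing -- silent coercion
        # is how inconsistencies creep into research datasets.
        return "unmapped"

print(to_standard("court_a", "CNTRCT"))           # contract
print(to_standard("court_b", "Personal Injury"))  # tort
print(to_standard("court_c", "CV-99"))            # unmapped
```

Multiply this by hundreds of fields and dozens of jurisdictions, and the appeal of a shared standard becomes obvious.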

These challenges are exacerbated by inter-rater reliability issues. Courts generally have many different people responsible for entering information into their case management system, and different people code things in different ways. Take, for example, a case in which the complaint lists two counts: one count for breach of contract and one for fraud. What’s the case type? It depends on who’s entering the data. These inconsistencies are rough terrain for researchers and highlight the inherent challenges in conducting research based on court docket data.
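
This problem can at least be quantified. Inter-rater reliability is commonly measured with statistics such as Cohen's kappa, which corrects raw agreement for the agreement expected by chance. Here is a small, self-contained sketch using two hypothetical clerks coding the same ten complaints; the labels are made up, but the computation is the standard one.

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa: agreement between two coders, corrected for chance."""
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    marg1, marg2 = Counter(rater1), Counter(rater2)
    expected = sum(
        (marg1[label] / n) * (marg2[label] / n)
        for label in set(rater1) | set(rater2)
    )
    return (observed - expected) / (1 - expected)

# Two hypothetical clerks coding the same ten mixed-count complaints:
clerk1 = ["contract", "fraud", "contract", "contract", "fraud",
          "contract", "fraud", "contract", "contract", "fraud"]
clerk2 = ["contract", "contract", "contract", "fraud", "fraud",
          "contract", "fraud", "fraud", "contract", "fraud"]

print(f"kappa = {cohens_kappa(clerk1, clerk2):.2f}")  # kappa = 0.40
```

A kappa of 1 would be perfect agreement; the 0.40 this toy example produces would generally be read as only moderate consistency, precisely the kind of noise docket researchers have to contend with.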

I don’t mean to pick on courts. We have worked with numerous wonderful courts that have been incredibly gracious in sharing their data with us and have gone to great lengths to help us understand the data they provide. And courts certainly aren’t the only place where researchers encounter data issues. Law school data presents its own set of challenges. The ABA requires disclosure of what is actually a large amount of data from each law school, but it is all aggregated—so deeper analyses (say, correlating LSAT scores with graduation versus attrition, or testing for statistically significant relationships between GPA and gender) are impossible. Further, as a recent Law School Transparency report aptly points out, the ABA-required disclosures still leave many questions unanswered—for instance, they do not include gender or race/ethnicity breakdowns for applicants.
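
To make the aggregation point concrete: with student-level records, the LSAT-to-completion question becomes a one-line statistical test. The sketch below runs that test on entirely simulated records (and assumes SciPy is available); nothing here reflects real data. The point is simply that aggregate disclosures take this kind of analysis off the table.

```python
import random
from scipy.stats import pearsonr  # assumes SciPy is installed

random.seed(0)

# Simulated student-level records: (lsat_score, graduated 0/1).
# The relationship is invented purely to have something to test.
students = []
for _ in range(500):
    lsat = random.randint(145, 175)
    p_grad = min(0.95, 0.40 + (lsat - 145) * 0.015)
    students.append((lsat, 1 if random.random() < p_grad else 0))

lsats = [s[0] for s in students]
grads = [s[1] for s in students]

# Point-biserial correlation between LSAT and completion -- trivial
# with record-level data, impossible with school-level aggregates.
r, p_value = pearsonr(lsats, grads)
print(f"r = {r:.3f}, p = {p_value:.4f}")
```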

Again, I don’t intend this to be a diatribe on the shortcomings of data sources for empirical legal research. But these are real obstacles to rigorous inquiry and, therefore, considerations we must keep in mind as we forge ahead in conducting research on the issues of the day. If we rally around the need for consistency and transparency in data—and make the needed adjustments to our data practices—we will be doing ourselves a great service.

Data, Research, and Racial Equity

In addition to a global pandemic, this year has seen a great deal of social unrest around racial equity—and rightly so. The unjustified killings of George Floyd, Breonna Taylor, and so many others are just the tip of the systemic racism iceberg. But what does this mean for how we conduct empirical legal research? It presents us with the opportunity—and, in my opinion, the duty—to take an introspective and critical look at how data and research practices may contribute, even inadvertently, to the problem.

The low-hanging fruit here is the growing affinity for algorithms. While these tools present amazing opportunities for the future of the legal system, they are also fraught with danger if not designed and implemented with great care (see Cathy O’Neil’s excellent book Weapons of Math Destruction for more). When they are poorly designed, they can create feedback loops—think garbage in, garbage out, then garbage back in—that reinforce inequity and biased policies. In the criminal arena, the use of algorithms for sentencing based on recidivism likelihood scores and for predictive policing creates incredible—and completely unacceptable—inequity for the poor and people of color.
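
To see how such a loop perpetuates bias, consider a toy simulation (all numbers invented). Two neighborhoods have identical true crime rates, but the historical arrest record starts out skewed toward neighborhood A, and patrols are allocated according to that record:

```python
# You find crime where you look: patrols follow recorded arrests,
# and recorded arrests follow patrols.
TRUE_CRIME_RATE = 0.05            # identical in both neighborhoods
arrests = {"A": 60.0, "B": 40.0}  # biased starting record

for year in range(1, 6):
    total = arrests["A"] + arrests["B"]
    for hood in arrests:
        patrols = 100 * arrests[hood] / total            # allocate by history
        arrests[hood] += patrols * TRUE_CRIME_RATE * 10  # newly "discovered" crime
    share_a = arrests["A"] / (arrests["A"] + arrests["B"])
    print(f"year {year}: A's share of recorded arrests = {share_a:.2f}")
```

Run it and A's share never moves off 0.60: even though the underlying crime rates are identical, the recorded data never corrects toward the true 50/50 split. The initial bias is simply laundered into "objective" history, year after year.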

Another example of a harmful algorithm can be found in the U.S. News rankings. The U.S. News law school rankings are nonsensical for several reasons, but perhaps the most egregious is the feedback loop created by their reputational component—a whopping 40 percent of the ranking calculation. Two-fifths of a school’s score is based upon how other schools, lawyers, and judges nationwide view that school. Those perceptions are, in turn, shaped by that school’s U.S. News ranking. Vicious. Feedback. Loop. Law schools spend ungodly sums of money trying to climb the rankings and often pass those costs on to students—which, of course, prices out a great many socioeconomically disadvantaged would-be law students.
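
A toy model makes the dynamic visible. The 40 percent reputational weight below comes from the formula described above; everything else (two hypothetical schools of identical quality, and survey respondents who partly anchor on last year's published score) is invented for illustration.

```python
quality = {"X": 0.70, "Y": 0.70}     # identical underlying quality
reputation = {"X": 0.80, "Y": 0.60}  # School X starts better-regarded

for year in range(1, 11):
    # Overall score: 40% reputation, 60% everything else.
    score = {s: 0.4 * reputation[s] + 0.6 * quality[s] for s in quality}
    for s in reputation:
        # Next year's survey responses anchor on this year's score:
        reputation[s] = 0.7 * reputation[s] + 0.3 * score[s]
    print(f"year {year:2d}: X = {score['X']:.3f}, Y = {score['Y']:.3f}")
```

In this toy model, the initial perception gap decays only very slowly: a decade in, two schools that are identical on every underlying measure still carry different scores.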

Another, perhaps subtler, area where we are vulnerable to building in inequity is the very design of our research studies. Even when we do our best to ensure BIPOC are represented in our data, we can still fall short. I recently attended a virtual event hosted by the Urban Institute that focused on this issue—and it was eye-opening. I strongly encourage everyone reading this to check out the materials they provided on this topic, but I’ll highlight a few key points here.

At the highest level, we should be thinking about racial equity throughout the lifespan of our projects—from planning through reporting, we must implement an equity framework. For instance, we need to ensure diverse perspectives are at the table during the brainstorming and project planning phase. We should consider using qualitative data to complement quantitative findings, both to contextualize the data and to provide insight into people’s lived experiences. And we must not assume that white outcomes are normative.

The unfortunate reality is that the community of people who create and administer programs that collect data in the legal system—as well as the community that uses that data for research—is made up almost entirely of white people. We must make efforts to create more inclusive legal and research communities, and do so quickly. But in the meantime, we must put our data and research approaches under a microscope and make the needed changes to ensure we put equity at the forefront now and in the future.

The Future of Empirical Legal Research

“When you come out of the storm, you won’t be the same person who walked in. That’s what this storm’s all about.”

Haruki Murakami, Kafka on the Shore

I resisted the urge to begin with a quote about change, but I can’t resist the urge to end with one. Our world will never be the same after we emerge from this period of rapid transformation. We have to do all we can to ensure the future is better than the past. This means conducting well-designed research that acknowledges and does its best to mitigate limitations. It means looking closely and critically at every single one of our research and data practices to weed out those that are destructive. We can do this. And we must.