Although DNA is individual to you—a "fingerprint" of your genetic code—DNA samples don't always tell a complete story. The DNA samples used in criminal prosecutions are generally of low quality, making them particularly complicated to analyze. They are not very concentrated, not very complete, or are a mixture of multiple individuals' DNA—and often, all of these conditions are true. If a DNA sample is like a fingerprint, analyzing mixed DNA samples in criminal prosecutions can often be like attempting to isolate a single person's print from the doorknob of a public building after hundreds of people have touched it. Despite the challenges in analyzing these DNA samples, prosecutors frequently introduce those analyses at trial, using tools that have not been independently reviewed and jargon that can mislead the jury—giving a false sense of scientific certainty to a very uncertain process. This is why it is essential that any DNA analysis tool's source code be made available for evaluation. It is critical to determine whether the software is reliable enough to be used in the legal system, and what weight its results should be given.

A Breakdown of DNA Data

To understand why DNA software analyses can be so misleading, it helps to know a tiny bit about how the underlying analysis works. To start, DNA sequences that carry genetic information are commonly called genes. A more generic way to refer to a specific location in the DNA sequence is a "locus" (plural "loci"). The variants of a given gene, or of the DNA found at a particular locus, are called "alleles." To oversimplify: if a gene is like a highway, the numbered exits are loci, and alleles are the specific towns at each exit.

[P]rosecutors frequently introduce those analyses in trials, using tools that have not been reviewed and jargon that can mislead the jury—giving a false sense of scientific certainty to a very uncertain process.

Forensic DNA analysis typically focuses on around 13 to 20 loci and the allele present at each locus, which together make up a person's DNA profile. By looking at a sufficient number of loci, whose alleles vary widely across the population, a kind of fingerprint can be established. Put another way, knowing the specific towns and exits a driver passed can help you figure out which highway they drove on.
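To make the profile idea concrete, here is a minimal sketch in Python of what a DNA profile boils down to as data: a handful of loci, each with a pair of alleles. The locus names below are real loci used in forensic databases, but the allele values and the matching logic are simplified and hypothetical.

```python
# A minimal sketch of a DNA profile: each locus maps to the pair of
# alleles observed there (forensic STR alleles are usually written as
# repeat counts). Locus names are real forensic loci; the allele
# values are hypothetical.
reference_profile = {
    "D8S1179": (12, 14),
    "D21S11":  (28, 30),
    "TH01":    (6, 9.3),
    "FGA":     (20, 24),
}

crime_scene_profile = {
    "D8S1179": (12, 14),
    "D21S11":  (28, 30),
    "TH01":    (6, 9.3),
    "FGA":     (20, 24),
}

# Comparing two clean, single-source profiles is conceptually simple:
# every locus must match.
def profiles_match(a, b):
    return all(sorted(a[locus]) == sorted(b[locus]) for locus in a)

print(profiles_match(reference_profile, crime_scene_profile))  # True
```

A clean, single-source comparison like this is straightforward; the hard problems described below arise when the sample is degraded or is a mixture.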

To figure out the alleles present in a DNA sample, a scientist chops the DNA into fragments containing the alleles, then uses an electric charge to draw them through a gel in a method called electrophoresis. Different alleles travel at different rates, and the scientist can measure how far each one traveled and look up which allele corresponds to that length. The DNA is also stained with a dye, so the more of it there is, the darker that band will be on the gel.
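As a rough sketch of that lookup step, the toy Python below maps a measured fragment length to an allele designation. Real labs calibrate against sizing standards and vendor-defined bins; the numbers here are invented for illustration.

```python
# Hypothetical calibration table for one locus: a range of measured
# fragment lengths (in base pairs) maps to an allele designation.
ALLELE_BINS = {
    (119, 123): "8",
    (123, 127): "9",
    (127, 131): "10",
}

def call_allele(measured_length_bp):
    """Return the allele whose bin contains the measured length, if any."""
    for (low, high), allele in ALLELE_BINS.items():
        if low <= measured_length_bp < high:
            return allele
    return None  # outside every bin: no confident call

print(call_allele(124.6))  # "9"
```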

Analysts infer which alleles are present based on how far they traveled through the gel, and deduce what amounts are present based on how dark each band is—which can work well in an untainted, high-quality sample. Generally, the higher the concentration of cells from an individual, and the less the sample is contaminated by any other person's DNA, the more accurate and reliable the generated DNA profile.

The Difficulty of Analyzing DNA Mixtures

Our DNA is found in all of our cells. The more cells we shed, the higher the concentration of DNA we leave behind, which generally also means more accurate DNA testing. However, our DNA can also be transferred from one object to another. So it's possible for your DNA to be found on items you've never had contact with, or at locations you've never been. For example, if you're sitting in a doctor's waiting room and scratch your face, your DNA may be found on the magazines on the table next to you that you never flipped through. Your DNA left on a jacket you lent a friend can transfer onto items they brush by or at locations they travel to.

Given the ease with which DNA is deposited, it is no surprise that DNA samples from crime scenes are often a mixture of DNA from multiple individuals, or "donors." Investigators gather DNA samples by swiping a cotton swab wherever the perpetrator may have deposited their DNA, such as a firearm, a container of contraband, or the body of a victim. In many cases where the perpetrator's bodily fluids are not involved, the sample may contain only a small amount of the perpetrator's DNA, sometimes less than a few cells' worth, and is likely to also contain the DNA of others. This makes trying to identify whether a person's DNA is found in a complex DNA mixture a very difficult problem. It's like having to figure out whether someone drove on a specific interstate when all you have is an incomplete and possibly inaccurate list of towns and exits they passed, any of which could have been from any one of the roads they used. You don't know the number of roads they drove on, and can only guess at which towns and exits were connected.

Running these DNA mixture samples through electrophoresis produces much noisier results, which often contain artifacts that suggest additional alleles at a locus or obscure alleles that are present. Human analysts then decide which alleles appear dark enough in the gel to count and which are light enough to ignore. Traditional DNA analysis, at least, worked in this binary way: an allele either counted or did not count as part of a specific DNA donor profile.
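The binary approach can be sketched in a few lines of Python: any peak above an analytical threshold counts, and everything below it is thrown away. The threshold and peak heights below are hypothetical, but they show how a faint allele from a minor contributor can vanish from the analysis entirely.

```python
# A sketch of the traditional all-or-nothing approach: peaks (here,
# dye intensity in arbitrary units) at or above an analytical
# threshold count as alleles; everything below is discarded as noise.
# The threshold and peak values are hypothetical.
ANALYTICAL_THRESHOLD = 50

observed_peaks = {"8": 210, "9": 48, "10": 160, "12": 35}

called_alleles = {
    allele for allele, height in observed_peaks.items()
    if height >= ANALYTICAL_THRESHOLD
}

# Alleles 8 and 10 survive; 9 and 12 are dropped entirely, even
# though one of them might be a real allele from a minor contributor.
print(called_alleles)
```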

Probabilistic Genotyping Software and Its Problems

Enter probabilistic genotyping software. The proprietors of these programs—the two biggest players are STRmix and TrueAllele—claim that their products, using statistical modeling rather than the binary approach, can determine the likelihood that a given DNA profile, or combination of profiles, contributed to a DNA mixture. Prosecutors often describe the analysis from these programs this way: it is X times more likely that the defendant, rather than a random person, contributed to this DNA mixture sample.
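That "X times more likely" figure is a likelihood ratio: the probability of seeing the evidence if the defendant contributed, divided by the probability of seeing it if a random person did. Here is a toy Python sketch with made-up probabilities, just to show the arithmetic behind the headline number.

```python
# Toy likelihood ratio with invented probabilities.
p_evidence_given_defendant = 0.020    # P(evidence | defendant contributed)
p_evidence_given_random    = 0.00002  # P(evidence | a random person instead)

likelihood_ratio = p_evidence_given_defendant / p_evidence_given_random
print(f"{likelihood_ratio:,.0f}x more likely")  # 1,000x more likely
```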

However, these tools, like any statistical model, can be constructed poorly. Whether, which, and how assumptions are incorporated into them can cause the results to vary. They can be analogized to the election forecast models from FiveThirtyEight, The Economist, and The New York Times. They all use statistical modeling, but their final numbers differ because of myriad design differences among the publishers. Probabilistic genotyping software is the same: all the programs use statistical modeling, but the output probability is affected by how each model is built. Like the different election models, different probabilistic DNA programs take diverging approaches to which factors are considered, counteracted, or ignored, and at what thresholds. Additionally, input from human analysts, such as the hypothetical number of people who contributed to the DNA mixture, also changes the calculation. If this is less rigorous than you expected, that's exactly the point—and the problem. In our highway analogy, this is like a software program that purports to tell you how likely it is that you drove on a specific road based on a list of towns and exits you passed. Not only is the result affected by the completeness and accuracy of the list, but the map the software uses, and the data available to it, matter tremendously as well.
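To illustrate how much the assumptions matter, here is a deliberately simplistic Python sketch (not any vendor's actual model) in which the same evidence, scored under different assumed allele "drop-out" rates, produces likelihood ratios that differ by more than a factor of two.

```python
# A toy model of assumption sensitivity. The drop-out rate is the
# assumed chance that a real allele failed to show up in the results.
# All numbers here are invented for illustration.
def toy_likelihood_ratio(dropout_rate):
    # Suppose 3 of the defendant's 4 alleles appear in the mixture.
    # Prosecution hypothesis: defendant contributed, one allele dropped out.
    p_prosecution = (1 - dropout_rate) ** 3 * dropout_rate
    # Defense hypothesis: an unknown person contributed, whose alleles
    # each happen to appear in the mixture by chance (say, 10% of the time).
    p_defense = 0.10 ** 3 * 0.90
    return p_prosecution / p_defense

for dropout in (0.05, 0.20, 0.40):
    print(f"assumed drop-out {dropout:.0%}: LR = {toy_likelihood_ratio(dropout):.0f}")
# assumed drop-out 5%:  LR = 48
# assumed drop-out 20%: LR = 114
# assumed drop-out 40%: LR = 96
```

Same evidence, three different "how many times more likely" figures; the only thing that changed was one parameter buried in the model.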

If this is less rigorous than you expected, that’s exactly the point—and the problem. 

Because of these complex variables, a probability result is always specific to how the program was designed, the conditions at the lab, and any additional or discretionary input used during the analysis. In practice, different DNA analysis programs have produced substantially different probabilities for whether a defendant's DNA appeared in the same DNA sample, with breathtaking discrepancies of millions-fold in some cases.

And yet it is impossible to determine which result, or which software, is the most accurate. There is no objective truth against which those numbers can be compared. We simply cannot know the true probability that a person contributed to a DNA mixture. In controlled testing, we know whether a person's DNA was part of a DNA mixture or not, but there is no way to determine whether it was 100 times more likely, or a million times more likely, that the donor's DNA rather than an unknown person's contributed to the mixture. And while there is no reason to assume that the tool that outputs the highest statistical likelihood is the most accurate, the software's designers may nevertheless be incentivized to program their products in ways that tend to output larger numbers, because "1 quintillion" sounds more precise than "10,000"—especially when there is no way to objectively evaluate the accuracy.

DNA Software Review is Essential

Because of these issues, it is critical to examine the source code of any DNA analysis software used in the legal system. We need to know exactly how these statistical models are built, and examining the source code is the only way to discover non-obvious coding errors. Yet the companies that created these programs have fought against the release of their source code—even when it would be examined only by the defendant's legal team and sealed under a court order. In the rare instances where the code has been reviewed, researchers have found programming errors with the potential to implicate innocent people.

Forensic DNA analyses have the whiff of science—but without source code review, it’s impossible to know whether or not they pass the smell test. Despite the opacity of their design and the impossibility of measuring their accuracy, these programs have become widely used in the legal system. EFF has challenged—and continues to challenge—the failure to disclose the source code of these programs. The continued use of these tools, the accuracy of which cannot be ensured, threatens the administration of justice and the reliability of verdicts in criminal prosecutions.
