Genetics, Genealogy, and Personal Privacy


On TV, DNA evidence is portrayed as a slam-dunk way to pin a suspect to a crime. But what if you have DNA, but no suspect?

In April 2018, Joseph James DeAngelo, better known as the “Golden State Killer,” was arrested over 30 years after committing burglaries, rapes, and murders that terrorized Californians. This arrest was made possible with advances in DNA sequencing, but the genetic evidence did not directly lead authorities to him. Instead, they used the DNA to search public databases. That search identified DeAngelo’s third cousin, who had uploaded their DNA information to the website GEDmatch. From there, a quick investigation of the cousin’s family led authorities to DeAngelo.

This high-profile arrest is just the first of many cases in which authorities used large public databases to identify a suspect via distant relatives. This technique has been called “long-range familial searches.” Dr. Yaniv Erlich, a computational biologist at Columbia University, found that in a six-month period at least 25 arrests were made in the U.S. based on long-range familial searches.

Such arrests have raised ethical questions surrounding genetic data privacy. GEDmatch contains genetic data for over 1 million people, largely due to consumer genetic testing kits offered by companies such as AncestryDNA and 23andMe. While these companies do not currently share their databases with authorities or third parties, individuals can upload their raw data to third-party websites, such as GEDmatch. Once in these public databases, the data can be accessed by law enforcement, criminals, and private companies to identify distant relatives.

In other words, sharing your genetic data puts not only you at risk of identification, but also distant relatives you might not even know. Indeed, Erlich estimates that “with a database size of ~3 million U.S. individuals of European descent…, more than 99% of people with this ethnicity would have at least a single third-cousin match.”

AncestryDNA and 23andMe already have databases with information about 10 million and 5 million people, respectively. Although these companies do not currently share their data with law enforcement, that may not always be the case. Another consumer genetic testing company, FamilyTreeDNA, was recently exposed for sharing a database of over two million individuals with the FBI. FamilyTreeDNA did not notify customers about the change in its terms of service.

Even if a company never shares its database with law enforcement officials, there is the risk of hackers stealing the information. While a large-scale theft of credit card information can be remedied by customers canceling their credit cards, no such analog exists for genetic data. You can’t change your DNA, so once someone has it, there is nothing that can be done.

Additionally, local police departments can collect crime scene DNA samples, obtain results in-house, and upload the data to CODIS, the national DNA database. While law enforcement may benefit from rapid DNA tests, such techniques are prone to abuse. Police departments can collect genetic data not only at major crime scenes, but also from individuals who seem suspicious. Racial profiling and implicit biases may inflate how often members of a certain group are entered in CODIS or other databases. For example, the Chinese government has recently come under scrutiny for collecting genetic data from a majority Muslim ethnic group without informed consent. With this data, the Chinese government can track and identify individuals who do not conform to the Communist Party, and place them in “re-education” camps.

Currently, there are no clear standards for when it is appropriate for law enforcement and government agencies to collect someone’s DNA. Policymakers should pass legislation that protects individuals from DNA collection for minor crimes and misdemeanors. Law enforcement should not collect DNA until someone is convicted of a crime. Otherwise, law enforcement may abuse the power to arrest people solely for the purpose of collecting their DNA. As an alternative, law enforcement could collect DNA upon arrest, but only enter the suspect’s information into databases if they are found guilty.

The U.S. Department of Health and Human Services does not consider genetic data to be identifiable information. However, Erlich found that long-range familial searches can identify research subjects even though the original, publicly available data had been anonymized. To protect individuals from unnecessary identification, the scientific community must establish ethical standards for personal genetic data privacy. Such standards will help shape policy regarding DNA collection and long-range familial searches.

Ethical standards will also pressure consumer genetic testing companies to protect and inform their customers. Companies must protect their customers from unnecessary and unwanted identification. Companies should also clearly notify consumers when their terms of service change and give consumers the opportunity to purge their information from company databases. Without clear ethical standards, there is no way to hold consumer genetic testing companies accountable.

In the meantime, what can you do to protect your genetic data? If you are considering purchasing a consumer genetic testing kit, ask yourself if learning about your ancestry is worth the risk of someone else identifying you or your relatives. If you have already purchased a kit, think twice before uploading your data to a third-party website. Unless you are trying to find your birth parents or a long-lost relative, uploading your data is probably not worth the risk. Have a conversation with your family so they know the risks of long-range familial searches. Lastly, encourage your legislators to propose policies that protect personal genetic data and limit when law enforcement can collect DNA samples.

Due to advances in DNA sequencing, it is now possible to obtain large-scale personal genetic data at a relatively low cost. Genetic and genealogical databases are large enough to identify an individual based on distant relatives’ genetics. Thus, new laws and ethical standards are necessary to protect individuals from unnecessary and unwanted identification.
