Gene mutations can be benign or cancerous – a new way to tell them apart could lead to better treatments

Most of the roughly 40 trillion cells in your body have nearly identical copies of your genome — DNA inherited from your parents, containing instructions for everything from converting food into energy to fighting infections. Healthy cells become cancerous due to harmful mutations in the genome. If a cell’s genome is damaged by ultraviolet light, for example, it can lead to mutations that signal the cell to grow out of control and form a tumor.

Identifying the genetic changes that make healthy cells malignant can help doctors select therapies that specifically target the tumor. For example, about 25% of breast cancers are HER2-positive, which means cells in this type of tumor have mutations that cause them to produce more of a protein called HER2 that helps them grow. Treatments that specifically target HER2 have dramatically increased survival rates for this type of breast cancer.

Scientists can now easily read cellular DNA to identify mutations. The challenge is that the human genome is massive and mutations are a normal part of evolution. The human genome is long enough to fill a 1.2-million-page book, and any two people can differ at about 3 million positions. Finding a cancer-causing mutation in a tumor is like finding a needle in a pile of needles.

I am a computer scientist exploring large and complex genetic datasets to answer fundamental questions about biology and disease. My research team and I recently published a study using the DNA of thousands of healthy people to help identify disease-causing mutations using the principle of natural selection.

While genetic mutations are part of everyday life, some can lead to cancer.

Using big data to find cancerous mutations

To determine which mutations in a patient are cancerous, the gold standard is to compare two samples from the patient: one from the tumor and one from healthy tissue (usually blood). Because the two samples come from the same person, most of their DNA is identical; focusing only on the genetic regions where the samples differ greatly narrows down the location of a possible cancer-causing mutation.
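
As a rough illustration of the tumor-versus-healthy comparison, here is a minimal sketch in Python. The variant tuples and the function name `tumor_only` are made up for the example; real pipelines work from sequencing reads and are far more involved.

```python
# Hypothetical variants, written as (chromosome, position, reference base, alternate base).
tumor_variants = {
    ("chr1", 1000, "T", "C"),
    ("chr2", 5000, "G", "A"),
    ("chr7", 42000, "A", "G"),
}
normal_variants = {
    ("chr1", 1000, "T", "C"),
    ("chr7", 42000, "A", "G"),
}


def tumor_only(tumor, normal):
    """Return variants seen in the tumor sample but not in the matched healthy sample."""
    return tumor - normal


# Variants shared by both samples are inherited; what remains arose in the tumor.
candidates = tumor_only(tumor_variants, normal_variants)
print(sorted(candidates))
```

Only the variant that is absent from the healthy sample remains as a candidate, which is why the matched sample narrows the search so sharply.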

The problem is that healthy tissue isn’t always harvested from patients, for reasons ranging from clinical costs to narrow research protocols.

One way around this problem is to look at massive public DNA databases. Since carcinogenic mutations are detrimental to survival, natural selection tends to eliminate them over time in successive generations. Of all the mutations in a tumor, those that occur less frequently in a given population are more likely to be harmful than changes shared by many people. By counting the frequency with which a mutation occurs in these databases, researchers can distinguish between common and probably benign genetic changes and those that are rare and potentially cancerous.
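
The frequency idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the tiny cohort, the function names, and the 1% cutoff are not taken from any real database or from our study.

```python
def population_frequency(variant, cohort):
    """Fraction of individuals in the cohort who carry the variant."""
    if not cohort:
        return 0.0
    carriers = sum(1 for person in cohort if variant in person)
    return carriers / len(cohort)


def rare_variants(tumor_variants, cohort, max_frequency=0.01):
    """Keep tumor variants carried by at most max_frequency of the cohort.

    The 1% default is an illustrative cutoff, not an established standard.
    """
    return [v for v in tumor_variants
            if population_frequency(v, cohort) <= max_frequency]


# Hypothetical cohort: each person is the set of variants they carry.
cohort = [
    {"chr1:1000:T>C", "chr7:42000:A>G"},
    {"chr1:1000:T>C"},
    {"chr1:1000:T>C", "chr9:300:G>T"},
    set(),
]
tumor = ["chr1:1000:T>C", "chr2:5000:G>A"]

print(population_frequency("chr1:1000:T>C", cohort))
print(rare_variants(tumor, cohort))
```

The variant carried by three of the four people is treated as common and probably benign, while the one never seen in the cohort is kept as a candidate.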

Given the power of this approach, there has recently been a flurry of projects aimed at collecting and sharing the DNA sequences of hundreds to thousands of individuals. These projects include the 1000 Genomes Project, the Simons Genome Diversity Project, gnomAD, and All of Us. There will likely be many more in the future.

Estimating the likelihood that a mutation causes disease based on how often it occurs in a population is common for small genetic changes called single nucleotide variants (SNVs). An SNV affects a single position among the 3 billion nucleotides of the human genome. It could, for example, change a thymine (T) to a cytosine (C).

Most researchers and clinical pathologists use a catalog of variants that have been detected in thousands of samples. If an SNV identified in a tumor is not listed in the catalog, we can assume that it is rare and may cause cancer. This works well for SNVs because detection of these mutations is usually accurate, with few false negatives.

However, this process breaks down for genetic changes that span longer stretches of DNA, called structural variants (SVs). SVs are more complex because they include insertions, deletions, inversions and duplications of sequences. Compared with the much simpler SNVs, SVs have higher detection error rates. False negatives are relatively common, which leaves the catalogs incomplete and makes it hard to compare mutations against them. Finding a tumor SV that is not listed in a catalog may mean that it is rare and likely to cause cancer, or simply that it was missed when the catalog was created.

Focus on verification

My colleagues and I solved these problems by moving from a process focused on detection to a process focused on verification. Detection is difficult: it requires processing complex data to determine whether there is enough evidence to support the existence of a mutation. Verification, on the other hand, limits the decision to the simple question of whether or not the available evidence supports the existence of a specific event. Instead of looking for a needle in a pile of needles, we simply ask whether the needle we have is the one we want.

Our method takes advantage of this strategy by searching the raw data of thousands of DNA samples for any evidence in support of a specific SV. In addition to the efficiency advantages of only looking at data flanking the target variant, if there is no such evidence, we can confidently conclude that the target variant is rare and potentially pathogenic.
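
To make the verification idea concrete, here is a toy sketch in Python. It is not our actual method: the `Signal` record, the `slop` tolerance and the sample names are all hypothetical, and real evidence comes from raw sequencing reads rather than tidy records. The sketch only shows the shape of the question, namely whether any sample holds evidence consistent with one specific SV.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """Hypothetical summary of one piece of evidence hinting at a structural variant."""
    chrom: str
    start: int
    end: int
    kind: str  # e.g. "deletion", "inversion", "duplication"


def supports(signal, target, slop=100):
    """True if the signal matches the target's type and both breakpoints within slop bases."""
    return (
        signal.chrom == target.chrom
        and signal.kind == target.kind
        and abs(signal.start - target.start) <= slop
        and abs(signal.end - target.end) <= slop
    )


def samples_with_evidence(target, cohort, slop=100):
    """Count the samples holding at least one signal consistent with the target SV."""
    return sum(
        1 for signals in cohort.values()
        if any(supports(s, target, slop) for s in signals)
    )


# Hypothetical cohort of healthy samples and the signals found near the target.
cohort = {
    "sample_a": [Signal("chr1", 10_020, 14_990, "deletion")],
    "sample_b": [Signal("chr1", 10_000, 15_000, "inversion")],
    "sample_c": [],
}
target = Signal("chr1", 10_000, 15_000, "deletion")
hits = samples_with_evidence(target, cohort)
print(f"{hits} of {len(cohort)} samples show evidence for the target SV")
```

A count of zero across thousands of samples would suggest the variant is rare, while any hits in healthy people would point toward a benign inherited variant.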

Using our method, we scanned SVs identified in previous cancer studies and found that thousands of SVs previously associated with cancer also appear in normal, healthy samples. This indicates that these variants are more likely to be benign inherited sequences than cancer-causing ones.

All of Us is a National Institutes of Health research program with the goal of making medicine more responsive to individual needs.

More importantly, our method performed as well as the traditional strategy that requires both tumor and healthy samples, opening the door to reducing costs and increasing the accessibility of high-quality analysis of cancerous mutations.

My team and I plan to expand our research to include large collections of tumors from different types of cancer, such as breast and lung. Determining which organ a tumor originated from is critical to prognosis and treatment, as it can indicate whether or not the cancer has metastasized. Since most tumors have specific mutational signatures, retrieving evidence of SVs in a specific tumor sample could help identify the patient’s tumor type and lead to faster treatment.

Ryan Layer, assistant professor of computer science, University of Colorado Boulder

This article is republished from The Conversation under a Creative Commons license. Read the original article.
