HudsonAlpha ED Guidebook 2021_22

NEW FINDINGS

predicts the structure of human proteins

Proteins fold their string of amino acids into specific three-dimensional structures to carry out their biological function. Knowing that three-dimensional structure allows scientists to understand how the protein works, identify what goes wrong when it has been altered and design medications to boost or silence its activity. Unfortunately, predicting protein structure is hard work and can take months of computer-based simulations and modeling. Fewer than one-third of all human proteins (collectively known as the proteome) have a known structure. DeepMind, a sister company to Google, is an artificial intelligence laboratory based in the UK. It developed AlphaFold2, a machine-learning platform that predicts protein structures with a very high degree of accuracy. The platform refines its model iteratively, over millions of rounds, improving its predictions based on prior experience. With a relatively high level of confidence, AlphaFold2 predicted structures for nearly the entire human proteome, as well as the proteomes of 20 model organisms such as mouse, fruit fly and E. coli. DeepMind has predicted more than 350,000 protein structures and plans to submit up to 130 million more by the end of 2021, nearly half of all known proteins. The predictions are freely available, and academic research teams can use AlphaFold2 at no charge. Even though these predictions must still be experimentally verified, the sudden availability of so many protein structures will likely transform many aspects of biology and human health.

REFERENCES: Tunyasuvunakool K. et al. Highly accurate protein structure prediction for the human proteome. Nature (2021) 596:590-596. DOI: 10.1038/s41586-021-03828-1. And AlphaFold protein structure database: https://alphafold.ebi.ac.uk/

Long-read sequencing identifies “missed” disease-causing variants

Many neurodevelopmental diseases are genetic in nature. Despite advances in genome sequencing technology, specific diagnoses for these disorders remain elusive. This is likely because certain disease-causing genetic variants are challenging to detect with typical sequencing approaches. Traditionally, genome sequencing is performed by generating millions of “short” sequences, called reads, generally around 150 base pairs long. These short reads are pieced back together like a puzzle using a human reference genome as a template. However, some short reads are hard to map accurately, especially those from regions containing highly repetitive stretches of DNA. These portions of the genome often go unanalyzed. One approach to overcome this limitation is to use a sequencing platform that produces longer reads. “Long-read” sequencers generate sequences up to 1,000 times longer than short-read systems. Fewer, bigger puzzle pieces mean fewer gaps in the assembled sequence. Greater genome coverage lets researchers and clinicians detect DNA variants more accurately. Recently, scientists used long-read sequencing to reanalyze the genomes of six families with children suspected of having a genetic neurodevelopmental disorder. The families had previously been sequenced using short-read technology, but no disease-causing genetic variant had been identified. Long-read sequencing found multiple genetic variants in each family that had previously been missed. Among these newly detected variants, disease-causing DNA changes were identified in two of the six children. If these findings extend to larger populations, long-read sequencing may supplement or even replace short-read analysis pipelines, improving diagnostic rates for rare genetic diseases.

REFERENCE: Hiatt S.M. et al. Long-read genome sequencing for the molecular diagnosis of neurodevelopmental disorders. HGC Advances (2021) 2:100023. DOI: 10.1016/j.xhgg.2021.100023.
The laboratories of HudsonAlpha faculty researchers Jane Grimwood PhD, Jeremy Schmutz and Greg Cooper PhD contributed to this work.
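
The mapping problem described in this article can be illustrated with a toy example. The sketch below is not the method used in the study; it is a minimal illustration, assuming exact string matching against a made-up 54-letter "reference". The sequences and the helper name `find_all` are hypothetical. A short read that falls entirely inside a repeat matches many positions, while a longer read that spans the repeat and its unique flanks matches only one.

```python
def find_all(genome: str, read: str) -> list[int]:
    """Return every 0-based position where `read` matches `genome` exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        start = genome.find(read, start + 1)  # allow overlapping matches
    return positions


# Toy reference (hypothetical): unique flanks around a repetitive stretch.
repeat = "CAG" * 10
genome = "ATGGCTTACGGA" + repeat + "TTCGATCCGTAA"

# A short read that lies entirely inside the repeat.
short_read = "CAGCAGCAG"

# A longer read that covers the whole repeat plus 4 letters of each flank.
long_read = genome[8:46]

print("short read maps to", len(find_all(genome, short_read)), "positions")
print("long read maps to", len(find_all(genome, long_read)), "position(s)")
```

Real aligners tolerate sequencing errors and work on billions of letters, but the underlying ambiguity is the same: a read shorter than the repeat carries no information about where in the repeat it came from.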

A replacement for karyotypes?

For over 50 years, clinicians have used photographs of stained chromosomes called karyotypes to identify chromosome duplications, deletions and large rearrangements associated with genetic disorders. In some cases, more precise methods of chromosome analysis, such as FISH and CNV microarrays, have replaced karyotyping, but each approach still has its limitations. However, a technology known as optical genome mapping may one day replace all three methods. Optical genome mapping begins by extracting DNA molecules hundreds of thousands of nucleotides long. These pieces are fluorescently labeled at commonly repeating DNA sequences and aligned in tiny nanochannels on a laboratory chip. Photographs are taken of the DNA fragments, and the pattern of labels is identified and compared to a reference genome. Chromosome deletions, insertions, duplications and other abnormalities appear as changes in the pattern. In a series of head-to-head tests, optical genome mapping outperformed karyotypes, FISH and CNV microarrays. In a panel of 99 chromosomal variants of the kind often identified at birth, the tool correctly identified every alteration. It also scored well when identifying chromosome alterations linked to blood cancers, which form over time. That said, more extensive validation is needed before this technology replaces existing detection methods.

REFERENCES: Mantere T. et al. Optical genome mapping enables constitutional chromosomal aberration detection. Am J Hum Genet (2021) 108:1409-1422. DOI: 10.1016/j.ajhg.2021.05.012. And Neveling K. et al. Next-generation cytogenetics: Comprehensive assessment of 52 hematological malignancy genomes by optical genome mapping. Am J Hum Genet (2021) 108:1423-1435. DOI: 10.1016/j.ajhg.2021.06.001.
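
The idea of comparing a label pattern to a reference can be sketched in a few lines of code. This is a simplified illustration, not the software used in the cited studies. It assumes the sample and the reference carry the same number of labels, so that a structural change shows up only as a change in the spacing between neighboring labels. The label positions, the 500-nucleotide tolerance and the function names are all hypothetical.

```python
def label_gaps(positions: list[int]) -> list[int]:
    """Distances between consecutive labels along one DNA molecule."""
    return [b - a for a, b in zip(positions, positions[1:])]


def compare_maps(reference: list[int], sample: list[int],
                 tolerance: int = 500) -> list[tuple[int, int]]:
    """Return (interval index, size change) for intervals whose length
    differs from the reference by more than `tolerance` nucleotides.
    A negative change suggests a deletion; a positive one, an insertion."""
    changes = []
    pairs = zip(label_gaps(reference), label_gaps(sample))
    for index, (ref_gap, obs_gap) in enumerate(pairs):
        delta = obs_gap - ref_gap
        if abs(delta) > tolerance:
            changes.append((index, delta))
    return changes


# Hypothetical label positions, in nucleotides from the molecule's start.
reference_labels = [0, 12_000, 30_000, 55_000, 61_000, 90_000]
sample_labels = [0, 12_100, 30_050, 45_050, 51_100, 80_000]

for index, delta in compare_maps(reference_labels, sample_labels):
    kind = "deletion" if delta < 0 else "insertion"
    print(f"interval {index}: possible {kind} of about {abs(delta)} nucleotides")
```

In this toy data, the third interval is about 10,000 nucleotides shorter than in the reference, which is the kind of pattern change a deletion would produce. Real analyses must also handle missing or extra labels and uneven stretching of the DNA, which this sketch ignores.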
