ICCFGG program 2022

#23 WAGS: Computational pipelines for the processing and analysis of animal whole-genome sequencing data

Jonah Cullen and Steven G. Friedenberg cull0084@umn.edu

Department of Veterinary Clinical Sciences, College of Veterinary Medicine, University of Minnesota, USA

Next-generation sequencing technologies have revolutionized genetics and ushered in today’s genomics era. Continued advances and decreased costs empower researchers to generate massive amounts of data, particularly for whole-genome sequencing (WGS). This revolution has been exceedingly fruitful in veterinary medicine, particularly for dogs, with researchers across the globe generating WGS data. However, WGS data processing is not always straightforward and may be inaccessible to many researchers. The Broad Institute is at the vanguard in developing open-source tools (e.g., Genome Analysis Toolkit [GATK]), analysis pipelines, and best practices for human WGS projects. Yet reconfiguring GATK pipelines for non-human samples and generating sample-specific input can be daunting and potentially error-prone when scaled to dozens or hundreds of samples. To address these hurdles, we developed publicly available, containerized, and scalable pipelines around the best practices favored by GATK developers, lowering the barrier to entry for researchers seeking to analyze WGS data. Our WAGS (Whole Animal Genome Sequencing) pipelines enable raw data processing, joint genotyping and annotation across samples, identification of novel variants, and calculation of allele frequencies by cohort. Our pipelines’ fast processing times (e.g., genomic variant call format files in ~15 h for 20X coverage) make clinical use of WGS a distinct possibility. To date, these pipelines have cataloged genetic variants from over 670 dogs of more than 50 breeds. Ongoing analyses demonstrate the usefulness of this catalog in identifying genetic variants associated with various Mendelian canine diseases. Additionally, our pipelines have been applied to many species, including the horse, tiger, red fox, and falcon.

#24 Full-length amplicon sequencing and analysis of the class I and class II major histocompatibility complex genes in dogs

Jonah N Cullen1, Farah Almeer1, Amy Treeful1, Lorna J Kennedy2, Steven G Friedenberg1 cull0084@umn.edu

1Department of Veterinary Clinical Sciences, College of Veterinary Medicine, University of Minnesota, Minneapolis, MN, USA; 2Centre for Integrated Genomic Medical Research, Stopford Building, University of Manchester, Manchester, UK

The class I and class II major histocompatibility complex (MHC) is a gene cluster encoding proteins involved in antigen presentation. The MHC is associated with over 100 human autoimmune diseases. Similarly, the canine MHC has been implicated in autoimmune diseases including type I diabetes and Addison’s disease. A challenge in evaluating genetic MHC variation is the difficulty of scalable, cost-effective, and efficient genotyping. Canine class I genes in particular are relatively unexplored because of their high GC content and recently reported novel haplotypes. Moreover, current MHC genotyping focuses on Sanger-based sequencing of one to two exons per gene, leaving the remainder of each gene unexplored. We previously demonstrated that these challenges can be overcome in a small set of dogs (n=8) using long-range sequencing. In this study we improve
