
Hundreds of thousands of human genomes are now being sequenced in order to characterize genetic variation and use this information to augment association mapping studies of complex disorders.

However, these approaches are biased against the discovery of structural variation in the more complex parts of the genome, so large-scale de novo assembly is needed. That is exactly what the GenomeDenmark project set out to do. After close to five years of work, the consortium has now finalized its efforts to establish a Danish reference genome.

The Danish study, under Prof. Karsten Kristiansen, has been able to thoroughly map non-simple variation among individuals by assembling 150 genomes sequenced from 50 family trios using a combination of paired-end and mate-pair libraries on the Illumina HiSeq2000 instrument at an average read depth of 78x.

Kristiansen says, “With a combined national effort and unique approach, we have demonstrated that even a relatively small consortium can provide an important new genomics standard to the scientific community. We see our approach as part of the basis for the precision medicine agenda and hope that the national funding bodies and government will exploit the opportunity to increase the Danish efforts in this area, to improve the future public health and capture a health economics impact.”

The scientists used three different programs to construct de novo assemblies for each of the 150 individuals sequenced: SOAPdenovo2, SGA, and ALLPATHS-LG. The paper, which appears in the journal Nature, reports that the assemblies had a median scaffold length of 21 megabases. According to the report, the 100 largest scaffolds in each of the 140 best assemblies typically covered more than 75 percent of the genome. To gauge the accuracy of these assemblies, the scientists aligned the scaffolds to the human reference genome. They then compared the assemblies to a long-read assembly based on BioNano mapping and PacBio sequencing, which led the authors to describe the quality of the de novo assemblies as “similar to those obtained using the more expensive long-read technology.”
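For readers who want to see how summary figures of this kind are calculated, here is a minimal Python sketch. It is not the consortium's pipeline: the function name `top_n_coverage`, the toy scaffold lengths, and the toy genome size are all made up for illustration.

```python
from statistics import median


def top_n_coverage(scaffold_lengths, genome_size, n=100):
    """Return the fraction of genome_size spanned by the n longest scaffolds."""
    largest = sorted(scaffold_lengths, reverse=True)[:n]
    return sum(largest) / genome_size


# Toy, invented scaffold lengths (arbitrary units), not data from the study.
lengths = [50, 30, 10, 5, 5]
toy_genome_size = 100

print(median(lengths))                                # 10
print(top_n_coverage(lengths, toy_genome_size, n=2))  # 0.8
```

With real data, the lengths would come from the scaffold records of an assembly, and the genome size would be an assumed estimate of the total genome length.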

Using the data, the Danish team was also able to resolve major histocompatibility complex haplotypes in half the trios, resulting in 100 complete MHC haplotypes. Additionally, they discovered 181 indels in the fixed major haplogroups R, I, and Q that had not been previously reported.

However, efforts in this direction are nothing new: researchers globally are increasingly seeking to produce reference genomes or panels for specific populations of interest. Last year, for instance, saw the publication of a Korean reference genome. Scientists at the Estonian Biocenter also recently generated a reference panel for their population.
