The entire DNA sequence of a living organism is that organism’s genome. The human genome is ~3 billion base pairs long and contains all the instructions necessary for forming a living, breathing person. Whole genome sequencing determines the precise order of every base pair in a genome, providing comprehensive information about both protein-coding regions and non-coding regions that may have other functions. Next-generation sequencing methods have made whole genome sequencing faster, cheaper, and more powerful than ever.
The first genome of a free-living organism to be fully sequenced was that of the bacterium Haemophilus influenzae. This bacterial genome was sequenced in 1995 using shotgun sequencing, which breaks the genome into small DNA pieces that are cloned into bacteria for growth, isolation, and sequencing. The sequences are then reassembled into the full genome using bioinformatics tools [1]. The Human Genome Project, which published its draft human genome sequence in 2001, also used shotgun sequencing, but because the human genome is so much larger than a bacterial genome, the entire human genome could not be fully sequenced at the time. Although shotgun sequencing was faster than the other sequencing methods then available, the project still took over ten years.
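The reassembly step of shotgun sequencing can be illustrated with a toy example. The sketch below is not the method used for any of the genomes mentioned above; it is a minimal greedy overlap merge, assuming error-free reads, a single strand, no repeats, and no read fully contained in another. The function names (`overlap`, `greedy_assemble`) and the example reads are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences with the largest overlap.

    Toy assumptions: error-free reads, one strand, no repeats,
    and no read contained entirely inside another read.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                k = overlap(a, b, min_len)
                if k > best_k:
                    best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge; return separate contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Three overlapping fragments of the made-up sequence ATGGCGTACGTTAGC
reads = ["GCGTACGT", "TACGTTAGC", "ATGGCGTA"]
print(greedy_assemble(reads))  # -> ['ATGGCGTACGTTAGC']
```

Real assemblers have to cope with sequencing errors, both DNA strands, and repetitive regions, which is one reason large genomes are much harder to reassemble than small bacterial ones.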
Next-generation sequencing (NGS) is a much faster sequencing approach that does not require cloning DNA fragments. Instead, DNA is extracted from an organism’s tissue and fragmented, and sequencing libraries are then created by adding adapters that are later recognized by the sequencing platform. The libraries are loaded onto the sequencer, which uses a platform-specific technology to detect nucleotides one by one. Using NGS, a human genome can be sequenced in a single day [2].
WGS is an approach that sequences an organism's entire DNA, while NGS is one of several sequencing technologies (as described below):
Whole genome sequencing can be performed on as little as 1 ng of DNA extracted from target tissue or cell samples. Libraries are then prepared using one of several available library preparation kits, which fragment the DNA and then add adapters to the ends of the resulting pieces. Most sequencing libraries contain indexing barcodes (short, fixed DNA sequences unique to each sample), enabling multiple samples to be sequenced at one time (i.e., multiplexing). These barcodes also allow reads from different libraries to be separated during sequence analysis. Companies and research cores offer a wide range of sequencing services, and a variety of free bioinformatics tools are available for data analysis.
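To show how barcodes let pooled reads be separated during analysis, here is a minimal demultiplexing sketch. It assumes, purely for illustration, that each read starts with a 6-base barcode followed by the sample's own sequence; on many platforms the index is read separately, and real tools usually tolerate barcode mismatches. The barcodes, sample names, and the `demultiplex` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample sheet: barcode sequence -> sample name
SAMPLE_BARCODES = {
    "ACGTAC": "sample_A",
    "TGCATG": "sample_B",
}


def demultiplex(reads, barcodes, barcode_len=6):
    """Group reads by their leading barcode (exact match only).

    Reads whose barcode is not in the sample sheet are collected
    under the key "undetermined". The barcode is trimmed off.
    """
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:barcode_len], read[barcode_len:]
        sample = barcodes.get(tag, "undetermined")
        bins[sample].append(insert)
    return dict(bins)


pooled_reads = [
    "ACGTACGGGTTT",  # barcode ACGTAC -> sample_A
    "TGCATGAAACCC",  # barcode TGCATG -> sample_B
    "NNNNNNTTTT",    # unknown barcode
]
print(demultiplex(pooled_reads, SAMPLE_BARCODES))
# -> {'sample_A': ['GGGTTT'], 'sample_B': ['AAACCC'], 'undetermined': ['TTTT']}
```

Keeping unmatched reads in a separate bin, rather than discarding them, makes it easier to spot a wrong sample sheet or a poor-quality index read.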
Whole genome sequencing (WGS) provides the most comprehensive data about a given organism, and NGS can deliver large amounts of data in a short amount of time. Profiling the entire genome facilitates discovery of novel genes and variants associated with disease, particularly those in non-coding areas of the genome. Although WGS can be more expensive and time-consuming than targeted sequencing approaches and technologies like microarrays, these key advantages of sequencing entire genomes may make it worthwhile:
WGS is a powerful tool for variant discovery with several downstream applications, including cancer research, genetic disease research, epidemiology, and genotyping. WGS can be used not only to determine variant frequencies (how often a difference occurs within a population of organisms), but also to associate genetic variants with disease through genome-wide association studies (GWAS). As the price of WGS decreases, it is becoming more common as a translational research tool. Having achieved the “$1000 genome,” multiple companies are pushing towards the next goal of the “$100 genome” [2-4].
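The two ideas in this paragraph, variant frequency and variant-disease association, can be made concrete with a small calculation. The sketch below assumes diploid genotypes coded as the number of alternate alleles per individual (0, 1, or 2) and uses a basic Pearson chi-square test on allele counts for a single variant. The genotype data and function names are made up for illustration; real GWAS pipelines typically use regression models with covariates and correct for testing many variants at once.

```python
def alt_allele_frequency(genotypes):
    """Fraction of alternate alleles among diploid individuals.

    `genotypes` holds 0, 1, or 2 alternate alleles per individual.
    """
    return sum(genotypes) / (2 * len(genotypes))


def allelic_chi_square(cases, controls):
    """Pearson chi-square statistic (1 degree of freedom) for a
    2x2 table of allele counts: alt/ref alleles in cases vs controls.
    """
    a = sum(cases)                # alt alleles in cases
    b = 2 * len(cases) - a        # ref alleles in cases
    c = sum(controls)             # alt alleles in controls
    d = 2 * len(controls) - c     # ref alleles in controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a row or column is empty; no test is possible
    return n * (a * d - b * c) ** 2 / denom


# Made-up genotypes at one variant site
cases = [2, 1, 1, 2, 1]       # individuals with the disease
controls = [0, 1, 0, 0, 1]    # individuals without it

print(alt_allele_frequency(cases))      # 7 alt alleles out of 10
print(alt_allele_frequency(controls))   # 2 alt alleles out of 10
print(allelic_chi_square(cases, controls))
```

A larger chi-square value means the allele counts differ more between cases and controls than expected by chance; a genome-wide study repeats a test of this kind at each variant across the genome.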
Of the roughly 3 billion base pairs in the human genome, only about 1–2% are translated into functional proteins. The areas of the genome that encode functional proteins are called exons. Sequencing only exons (whole exome sequencing; WES) is cheaper and faster than sequencing the entire genome and is a more than suitable approach for research groups that are only interested in protein-coding regions of the genome. The main differences between WGS and WES are: