Tag: whole genome

  • Nabsys Genome Mapping Technology Launches at ASHG 2023

    Introduction to genetic structural variation

    It is an exciting time to be involved in genetics and its application to healthcare. It was only a little over two decades ago that the first draft of the Human Genome Project was published, and last year the first telomere-to-telomere sequence of a single chromosome was achieved. The impact of next-generation sequencing is being seen in increasingly valuable clinical applications: as a companion diagnostic for targeted cancer therapy, as a method for non-invasive prenatal testing for trisomy, and for rare disease diagnostics. Yet many big problems remain unsolved.

    One of the larger problems in genetics (and, by association, in its application to healthcare) is the detection and characterization of structural variation. A single gene can be damaged by a multitude of mechanisms (such as non-homologous recombination), and the resulting structural variation is a different kind of variation from the Single Nucleotide Polymorphisms (SNPs) that genotyping microarrays measure, or from the insertion / deletion mutations (called indels, where anywhere from a single base to several dozen bases are inserted or deleted) that sequencing can also detect.

    The size of these insertions and deletions, however, can exceed the resolving power of next-generation sequencing (NGS), where read lengths can be limited to 150 to 300 bases. Insertions and deletions can be kilobases or even hundreds of kilobases long, and events of that size are invisible to NGS analyses.

    In a given individual’s whole-genome sequence, some 4 to 5 million SNPs and indels will be detected. The structural rearrangements (from more than 50 inserted or deleted bases up to several million bases, or even entire chromosome arms) go undetected. For clinical cases, a pathology cytogenetics laboratory routinely uses techniques such as Fluorescence In Situ Hybridization (known by its acronym FISH), karyotyping, and microarrays (typically aCGH, or array Comparative Genomic Hybridization) to detect structural rearrangements and specific gene fusions, in order to diagnose cancer and guide its treatment appropriately.

    Figure 1 below (kindly provided by Nabsys) compares conventional next-generation Sequencing by Synthesis (SBS) to genome mapping.

    Figure 1: Sequencing by Synthesis (typical NGS method) compared to genome mapping. Image kindly provided by Nabsys.

    There are an estimated 20,000 or more structural variants in a single human genome, yet even with current sequencing technology (including single-molecule sequencing from manufacturers such as Pacific Biosciences or Oxford Nanopore Technologies), large swaths of genome sequence can be rearranged and go undetected.

    For example, say there is a balanced structural variant, where a large multi-megabase region is inverted. It is called balanced because there is no gain or loss of DNA sequence; however, a stretch of several megabases sits in the completely opposite orientation. Even with the technical advances of single-molecule sequencing, where reads can be tens or even hundreds of kilobases long, detecting all the different kinds of variation across a wide range of sizes and complexity remains a challenge.

    Mapping versus long read sequencing

    One definite trend over the past few years has been a consistent increase in the throughput of short-read sequencing, alongside similar throughput increases in long-read sequencing. However, on a cost-per-gigabase basis, long-read sequencing remains 5-fold to 10-fold more expensive, severely limiting its use in clinical applications.

    Genome mapping using an optical method has been on the market for several years from Bionano Genomics, and is accepted as a complement to whole-genome or whole-exome sequencing for understanding the nature of structural variants and disease. Nabsys now offers better resolution of variants at lower cost, detecting SVs as small as 300 base pairs with >100 kb long segments of the genome electronically mapped.

    Nabsys OhmX™ technology

    For a Nabsys run, high-molecular-weight genomic DNA (50 kb to 500 kb) is first nicked using sequence-specific nickase enzymes, which can be used alone or in combination, then labeled and coated with a protein called RecA (which serves to stiffen the DNA for analysis). The samples are injected into the instrument, and the data are collected.

    Single DNA molecules are translocated through a silicon nanochannel, and the labeled locations are electronically detected to determine the distance between sequence-specific tags on individual molecules. Each electronic event is measured in time as the linear DNA molecule passes the detector, so a time-to-distance conversion is applied; with enough overlapping molecules across the entire genome, the data can be assembled into what is effectively a restriction map of overlapping fragments (see Figure 2).

    Figure 2: Individual molecules labeled with sequence specific labels, measured in a Nabsys OhmX Analyzer using a Nabsys OhmX-8 nanochannel device, and assembled into a Genome Map. Drawing courtesy of Nabsys.
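
    The time-to-distance conversion described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a molecule moves through the channel at a single constant velocity; the function name, the detection times, and the velocity are all hypothetical and are not taken from Nabsys documentation, and a real instrument presumably needs a more careful calibration.

```python
from typing import List


def tag_intervals_bp(event_times_us: List[int], velocity_bp_per_us: float) -> List[float]:
    """Convert tag detection times (microseconds) into inter-tag distances (bp).

    Assumes one constant translocation velocity for the whole molecule,
    which is a simplification made for this sketch only.
    """
    if velocity_bp_per_us <= 0:
        raise ValueError("velocity must be positive")
    times = sorted(event_times_us)
    # Distance between neighbouring tags = elapsed time x velocity.
    return [(later - earlier) * velocity_bp_per_us
            for earlier, later in zip(times, times[1:])]


# Hypothetical detection times for four tags on one molecule.
event_times_us = [0, 2000, 5000, 11000]
# Hypothetical translocation velocity: 1 bp per microsecond.
intervals = tag_intervals_bp(event_times_us, velocity_bp_per_us=1.0)
print(intervals)  # [2000.0, 3000.0, 6000.0]
```

    The resulting list of inter-tag distances is the per-molecule pattern that would then be overlapped with patterns from other molecules to build the map.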

    This capability was showcased a few years ago for microbial genomes, and a few publications [1, 2, 3] demonstrate proof of the approach, analyzing DNA maps at single-molecule resolution in bacterial genomes.

    With the recent commercial release of the Nabsys OhmX Analyzer system and OhmX-8 Detector consumables, a 10-fold increase in throughput has been achieved, along with 250 electronic detectors per channel. Nabsys uses a kit for efficient high-molecular-weight DNA extraction and labeling in preparation for loading onto the system. (The sample input requirement is 5 µg of starting material, sufficient for several instrument runs if necessary; less input can be used if DNA quantities are limited.) In addition, as there are no optics (only fluidics and electronics), the Nabsys instrument is much more compact and less expensive than the equivalent optical instrument, as well as less expensive to run.

    Applications for human disease: cancer and rare disease

    Cancer has been correctly described as a ‘disease of the genome’, and understanding the role structural variation plays in cancer progression and treatment is an important, ongoing area of research. Another important application of genome mapping is rare disease; it is currently estimated that about 70% of suspected Mendelian disorders go undiagnosed even with current short-read whole-genome sequencing [4].

    It remains to be seen whether better detection and characterization of structural variation can provide the needed insights into these two important research areas, which are currently limited by the cost of existing technology.

    Nabsys at ASHG 2023

    At the upcoming American Society of Human Genetics (ASHG) conference in Washington DC, November 2 to 5, 2023, Nabsys will be present in the Hitachi High-Tech America booth (Booth 1423). Hitachi will present their Human Chromosome Explorer bioinformatics pipeline for a low-cost, scalable Structural Variation validation and discovery platform.

    You can find out more about the Nabsys OhmX Analyzer here (a downloadable brochure is available on that page), and more information about the overall approach to electronic genome mapping (EGM) is here. A handy whitepaper about EGM can be found here (PDF).

    1. Passera A and Casati P et al. Characterization of Lysinibacillus fusiformis strain S4C11: In vitro, in planta, and in silico analyses reveal a plant-beneficial microbe. Microbiol Res. (2021) 244:126665. doi:10.1016/j.micres.2020.126665
    2. Weigand MR and Tondella ML et al. Screening and Genomic Characterization of Filamentous Hemagglutinin-Deficient Bordetella pertussis. Infect Immun. (2018) 86(4):e00869-17.  doi:10.1128/IAI.00869-17
    3. Abrahams JS and Preston A et al. Towards comprehensive understanding of bacterial genetic diversity: large-scale amplifications in Bordetella pertussis and Mycobacterium tuberculosis. Microb Genom. (2022) 8(2):000761. doi:10.1099/mgen.0.000761
    4. Rehm HL. Evolving health care through personal genomics. Nat Rev Genet. (2017) 18(4):259-267. doi:10.1038/nrg.2016.162
  • High Throughput NGS Systems: Throughput, Time and Cost Graphic

    It is an exciting time for next-generation sequencing (NGS) with new platforms being launched. Below is a chart that illustrates recent progress.

    High Throughput NGS Systems compared by Gb/run, Days Required and $ per Gb

    A chart from 7 years ago…

    A few days ago I was reminded of a chart from 2016 made by a blogger named Lex Nederbragt (now at the University of Oslo, Norway). Prompted by the competition in the NGS marketplace over both read length and throughput, he put together a handy chart with a lot of platforms on it. (You can see his original blog post, with links to the image, in a post called “Developments in high throughput sequencing – July 2016 edition”.)

    I was hopeful that that chart would have been updated in the intervening years, but alas his blogging moved platforms and then went quiet a year or so later. (And as a blogger myself, I can relate to the pressures of work and life as well as affiliation, which may or may not be conducive to this kind of activity.)

    The current NGS “arms race”

    So I thought about the current “arms race”: new platforms waiting in the wings (e.g. Ultima Genomics is the top sponsor of February 2024’s Advances in Genome Biology and Technology meeting), new platforms only now getting into the hands of customers (specifically Pacific Biosciences’ Revio, Element Biosciences’ AVITI, Singular Genomics’ G4, and Pacific Biosciences’ Onso), and the renewed efforts of BGI / MGI / Complete Genomics now that they are able to sell their systems in North America and Europe. Complicating the MGI / Complete Genomics story, about a year ago parent company BGI Genomics was added to the US Department of Defense list of blacklisted companies (a GenomeWeb story with more details is here); as far as I can tell, however, MGI / Complete Genomics continues to do business in the US.

    By pulling together specifications and prices (along with some handy source materials assembled by others) I constructed a list of some 21 existing (or, in the case of Ultima, soon to exist) offerings for sale, from Illumina’s iSeq all the way up to MGI’s monster DNBSEQ-T20x2. For each system I calculated a US Dollar per Gigabase cost for the highest-throughput configuration (the maximum combination of output, read length, and run time), and excluded all the other configurations. (For example, configurations using fewer flowcells, or shorter run times for tag-counting applications, were excluded.) I also noted the number of hours this highest-throughput run takes on each system.

    I then excluded all systems whose price per Gigabase of sequence was greater than $10 per Gb. (For those curious, if you figure 100 Gb of sequence per genome for 33x WGS coverage, $10 per Gb is the “$1,000 Genome”.) Any system above the magic “$1,000 Genome” mark is therefore not included, which gives the chart above: Gigabases per run on the X-axis, hours per run on the Y-axis, and bubble size scaled to US Dollars per Gigabase relative to the other systems; the smaller the bubble, the lower the per-Gb cost.
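
    The cutoff arithmetic can be sketched in a few lines. This is only an illustration of the filter described above, not the spreadsheet actually used for the chart: the function names are made up for this example, the per-Gb prices are the ones quoted elsewhere in this post, and the Revio figure is a rough placeholder for "about double" the cutoff.

```python
GB_PER_GENOME = 100.0      # ~33x human WGS, as assumed in the post
CUTOFF_USD_PER_GB = 10.0   # the "$1,000 Genome" line

# Per-Gb prices quoted in this post; the Revio value is approximate.
usd_per_gb = {
    "NovaSeq 6000": 4.84,
    "NovaSeq X (10B)": 3.20,
    "PromethION": 10.00,
    "DNBSEQ-T20x2": 0.99,
    "UG 100": 1.00,
    "Revio (approximate)": 20.00,
}


def genome_cost_usd(price_per_gb: float) -> float:
    """Cost of one 100 Gb genome at the given per-Gb price."""
    return round(price_per_gb * GB_PER_GENOME, 2)


def under_cutoff(prices: dict) -> dict:
    """Keep systems at or below the cutoff (only prices above $10/Gb are excluded)."""
    return {name: p for name, p in prices.items() if p <= CUTOFF_USD_PER_GB}


for name, price in sorted(under_cutoff(usd_per_gb).items(), key=lambda kv: kv[1]):
    print(f"{name}: ${price:.2f}/Gb -> ${genome_cost_usd(price):,.2f} per genome")
```

    Note that the comparison is "less than or equal to", so a system sitting at exactly $10 per Gb stays on the chart.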

    A few observations

    The market leader (with an estimated market share of about 75%) is of course Illumina, which is going through an upgrade cycle to the NovaSeq X: the per-Gb price of $4.84 on the NovaSeq 6000 drops to $3.20 with the latest iteration of the flowcell (the 10B, released in February 2023). A newer flowcell (the 25B) will drop that per-Gb price further, to $2.00 or so, in the latter half of this year.

    Element Biosciences has a ‘package deal’ to get to $2.00 per Gb; however, that depends on special discounting and a large purchase commitment, so I’ve left it at their current maximum-capacity use case.

    The Pacific Biosciences’ Revio did not make the cut because its cost is higher than $10/Gb (from the pricing I’ve seen it’s about double that), but the Oxford Nanopore PromethION made it at exactly $10/Gb. Pretty remarkable that you can get a long-read whole genome for $1,000 when you think about it, even if it takes several days to produce the data.

    The MGI / Complete Genomics systems are certainly price-competitive, and the DNBSEQ-T20x2 broke the chart at 72,000 Gigabases per run, at $0.99 per Gb. Yes, that’s 720 whole genomes at 33x every 4 days. Their other system, the T7, has a few installations worldwide from the period when they were effectively blocked from selling it in North America and Europe due to patent infringement (and an injunction).

    The new Ultima system (called the UG 100) has a relatively short run time (24 h) and a very low per-Gb price of $1.00, and at 3,000 Gb/run that works out to 30 whole genomes a day. Certainly a platform to watch, especially with the November 2023 ASHG conference coming up next month (in Washington DC) and the February 2024 AGBT conference (in Florida).
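
    For anyone who wants to check the genomes-per-run figures quoted above, here is a small sketch of the arithmetic. It assumes the same 100 Gb per 33x genome used earlier and takes "every 4 days" as a 96-hour run; the function names are hypothetical, and the per-day figure for the T20x2 is simply derived from those numbers rather than quoted from a specification.

```python
def genomes_per_run(gb_per_run: float, gb_per_genome: float = 100.0) -> float:
    """Number of ~33x genomes in one run, at 100 Gb per genome by default."""
    return gb_per_run / gb_per_genome


def genomes_per_day(gb_per_run: float, run_hours: float,
                    gb_per_genome: float = 100.0) -> float:
    """Average genomes per 24 hours, ignoring setup time between runs."""
    return genomes_per_run(gb_per_run, gb_per_genome) * 24.0 / run_hours


# DNBSEQ-T20x2: 72,000 Gb per run, one run every 4 days (96 h assumed).
print(genomes_per_run(72_000))        # 720.0
print(genomes_per_day(72_000, 96))    # 180.0

# UG 100: 3,000 Gb per run in 24 h.
print(genomes_per_day(3_000, 24))     # 30.0
```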

    I will be attending ASHG this year, and if you’d like to meet in-person during that conference be sure to reach out!