Observations about Helicos, a single molecule sequencer from 2008

Authored by Dale Yuzuki on September 4, 2023

A brief history of Helicos Biosciences

Does anyone remember Helicos Biosciences? Way back in 2009 (per Wikipedia), Helicos co-founder and Stanford professor Stephen Quake had his genome sequenced (and published in the prestigious journal Nature Biotechnology) for a reported $50K in Helicos reagents. That same year I remember hearing a talk given by Arul Chinnaiyan at the NCI with single-molecule RNA-Seq data. It was an exciting time.

Crunchbase indicates Helicos raised $77M and went public in 2007; they shipped their first Heliscope in 2008, only to be delisted in 2010 and then declare bankruptcy in 2012. Remember, the Solexa 1G / Illumina Genome Analyzer (“GA”) only started selling commercially in late 2006, so in those early days it was something of a dogfight. I sold Illumina microarrays for GWAS to the NIH from 2005 onward, through the first GAs and GA IIx’s, and then from 2010 sold the Applied Biosystems / Life Technologies SOLiD 4.

A few distinguishing features

For those familiar with library preparation for Illumina sequencing, it takes time and several rounds of PCR and PCR cleanup, along with quantification, before a sample is ready for sequencing. Helicos instead used poly-adenylated nucleic acid that bound to the flowcell, and the chemistry then sequenced the DNA directly, without any further amplification (in emulsions, nanoballs, or clusters, depending on the platform).

As the world’s “first true single molecule sequencer”, the Heliscope had no amplification bias and could read high-GC or low-GC stretches of DNA without any loss of accuracy. Bases were accurate to about 96%; the 4% error was not sequence dependent and was basically random, which simplified analysis. Only a tiny amount of sample was required: 3 ng of RNA or DNA. The two flowcells gave the instrument the capacity to run 50 samples in one run, which took 7 or 8 days to complete. At a 2008 price of about $325 per sample (for about 14M unique reads; this figure is from an old GenomeWeb interview), the per-sample price for RNA-Seq was attractive. It did, however, require a 48-sample experiment (two of the 50 lanes were reserved for controls), or some $15,600 for a single experiment, which naturally limited the market to high-throughput operations.
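As a quick back-of-the-envelope check on those numbers, here is a minimal Python sketch. The inputs are the figures quoted above; the variable names are my own, and the cost per million reads is a simple division I am adding for illustration, not a figure Helicos published.

```python
# Figures quoted in the text (2008 pricing); names are illustrative only.
PRICE_PER_SAMPLE_USD = 325        # about $325 per sample
TOTAL_LANES = 50                  # two flowcells, 25 lanes each
CONTROL_LANES = 2                 # lanes reserved for controls
READS_PER_SAMPLE_MILLIONS = 14    # about 14M unique reads per sample

# Lanes left for actual samples once controls are set aside.
sample_lanes = TOTAL_LANES - CONTROL_LANES

# Minimum spend to fill one run with samples.
experiment_cost_usd = sample_lanes * PRICE_PER_SAMPLE_USD

# Derived figure (my own arithmetic): dollars per million unique reads.
cost_per_million_reads_usd = PRICE_PER_SAMPLE_USD / READS_PER_SAMPLE_MILLIONS

print(f"Sample lanes per run: {sample_lanes}")
print(f"Minimum experiment cost: ${experiment_cost_usd:,}")
print(f"Cost per million unique reads: ${cost_per_million_reads_usd:.2f}")
```

The per-sample number looks cheap; it is the need to fill every sample lane in a run that drives the total.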

In 2008 the Solexa 1G and Genome Analyzers were all the rage

The Solexa acquisition occurred in Nov 2006, by which time several Solexa 1Gs had already been shipped and had started producing data in customer laboratories. Even at only 25 basepair (bp) reads, they produced about 800 megabases (Mb) of sequencing data per 3+ day sequencing run. I had been selling microarrays to the NIH for Illumina since 2005, and in early 2007 the first Solexa 1G was installed at the laboratory of Keji Zhao at NHLBI, who ended up being the first person to publish a ChIP-Seq paper using NGS (it was in the journal Cell in May 2007, here’s a PubMed link).

By the time Helicos made its first commercial sale in February 2008, Illumina had already sold 50 Genome Analyzers (as of the summer of 2007), had updated the instrument to do paired ends, and had extended the read lengths from 25 bases to 35 bases. Illumina had also announced progress toward 50-base read lengths.

Against this backdrop, Helicos designed and built their ‘Heliscope’ single molecule sequencer to be highly scaled: 50 channels; about 25 gigabases (Gb) of sequence data per 7 or 8 day run; read lengths of 25 to 55 basepairs, with an average of about 30 to 35; and an error rate of about 4% across sequences with G/C content ranging from 20% to 80%. The error model was random, with no systematic bias. This was their big selling point against Illumina, where both library preparation and cluster generation use a form of PCR amplification that introduces bias.
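To get a feel for what a random 4% per-base error means at these read lengths, here is a small Python sketch. It assumes every base errs independently with the same probability, which is my simplification of "random" and not a model taken from Helicos documentation; the function names are hypothetical.

```python
ERROR_RATE = 0.04  # about 4% per-base error, as quoted in the text


def expected_errors(read_length: int, error_rate: float = ERROR_RATE) -> float:
    """Expected number of erroneous bases in a read of the given length."""
    return read_length * error_rate


def error_free_fraction(read_length: int, error_rate: float = ERROR_RATE) -> float:
    """Fraction of reads with zero errors, ASSUMING independent per-base errors."""
    return (1.0 - error_rate) ** read_length


# Shortest, roughly average, and longest read lengths mentioned in the text.
for length in (25, 33, 55):
    print(
        f"{length} bp: about {expected_errors(length):.1f} errors per read, "
        f"{error_free_fraction(length):.0%} of reads error-free"
    )
```

Under that assumption a typical read carries roughly one error, which is why an unbiased error model mattered: random errors average out with coverage, while systematic ones do not.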

And according to a conversation I had this week, the price of the Heliscope was also scaled: the instrument was $1.2M at the start in 2008, and the price was steadily lowered to about $900K by 2011, when they ceased operations. Requiring 48 samples for an RNA-Seq experiment, taking an entire week to generate data, and costing over $15,000 was a tall order to fill; sequencing a whole genome for $50,000 was also not something many laboratories or individuals could afford in 2008.

Important aspects of the Heliscope

I was able to source a 2008 product sheet for the Heliscope (PDF). The on-board data storage capacity (remember, this is 2008) was a whopping 28 Terabytes. This was needed to store the enormous imaging data for the pair of flowcells, and to do all the image registration and base calling. Any way you look at it, a single run producing 25 Gigabases of sequencing data in 2008 was going to pose some challenges.

And this instrument was big: the spec sheet says the main Heliscope sequencer was four feet by three feet by six feet tall, and a whopping 1,890 lbs. An 800 lb block of Vermont granite was included at the bottom of the instrument to stabilize it against vibration. However, it’s clear from a photograph of the instrument that it was fitted with wheels, so you can say it was portable, as much as a 2,000 lb instrument is portable.

This was the world’s first single molecule sequencing technology (they trademarked the name, calling it True Single Molecule Sequencing (tSMS)™). The chemistry was not ‘real-time’ like the latest PacBio Revio™ or Oxford Nanopore PromethION™; it was sequencing-by-synthesis, adding a single base and then imaging the entire flowcell surface. With two flowcells (each with 25 lanes), one would be imaged while the other had its flowcell biochemistry performed. Impressively (or perhaps not that realistically?), they claimed that improvements in flowcell density and tSMS reagent efficiency would eventually produce 1 Gb of sequence per hour (about 7x the above numbers in terms of density and thus overall throughput).
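The "about 7x" figure can be sanity-checked against the run numbers quoted earlier (about 25 gigabases per 7 or 8 day run). A rough Python sketch, using my own variable names and assuming the instrument runs around the clock for the whole period:

```python
RUN_YIELD_GIGABASES = 25.0           # about 25 gigabases per run, as quoted
TARGET_GIGABASES_PER_HOUR = 1.0      # the promised future throughput


def gigabases_per_hour(run_days: float, run_yield: float = RUN_YIELD_GIGABASES) -> float:
    """Average hourly yield, assuming continuous operation for the whole run."""
    return run_yield / (run_days * 24)


# The text gives the run time as 7 or 8 days, so check both ends.
for days in (7, 8):
    rate = gigabases_per_hour(days)
    improvement = TARGET_GIGABASES_PER_HOUR / rate
    print(f"{days}-day run: {rate:.3f} gigabases/hour, target is {improvement:.1f}x higher")
```

Both ends of the run-time range land close to the 7x quoted, so the claim was at least internally consistent with the 2008 specifications.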

One source told me that in those days the flowcell had uneven densities of poly-T molecules, so some areas were unusable for base calling. If an area was too sparse, it was not worth the effort of scanning and analyzing; if it was too dense, the signals would collide and no usable sequence could be obtained. The original design, however, scanned the entire surface of all 50 channels; usable data or not, all the images were scanned and analyzed. There wasn’t the luxury of time and engineering resources to optimize this.

What was the cause of the ultimate demise of the Heliscope?

Not only was the instrument cost an issue, there was also the problem of getting to longer reads. In 2008 Illumina was getting 35bp reads and was on its way to 50bp reads, along with paired-end capability that meant a large increase in throughput. (For those unaware, in 2023 these reads now go out to 300bp.) Helicos could not catch up. Beyond the likely limits on detectability imposed by the optical system, the laser illumination used to excite the fluor labels on the nucleotides also had the potential to damage the DNA, leaving it unusable as a template. Thus Helicos could talk about extending the average read length from 35 bases (plus or minus 10 or 15, as it was a distribution of reads) to 50 or longer, but it just did not happen in the timeframe from 2008 to 2011, when Helicos stopped selling the Heliscope systems. It is my understanding that they did not sell many of these $1M systems, fewer than a dozen or so worldwide.

Selling a new instrument from a new company at the $1M price point is a tall order. One life science company that sold single-cell analysis equipment and consumables, Berkeley Lights (now renamed PhenomeX after an acquisition of the single-cell proteomics company Isoplexis), tried for years to reduce the size and cost of their flagship Beacon system, but was unable to, and has a limited market for their analyzer.

You can say Helicos paved the way for market reception of PacBio in 2012 and then Oxford Nanopore a few years later in 2015. The relatively high cost of single molecule sequencing (some 7x to 10x on a cost-per-base basis relative to sequence data coming off of Illumina’s flagship NovaSeq X) remains a large barrier.

Now that Element Biosciences, Singular Genomics, and Ultima Genomics (and let’s not forget PacBio’s Onso) are competing head-to-head with Illumina on short reads, is there room for innovation (and cost reduction) in single molecule long reads? I would certainly hope so.
