Single-Cell Sequencing With Fluidigm C1

The NGSC has recently purchased a Fluidigm C1 system that greatly eases sequencing and other measurements of single cells. The most common application is mRNA RNA-Seq libraries, but whole genome, targeted sequencing, and other custom applications are possible. Materials extracted from the cells can be sequenced at the NGSC or sent to the Molecular Profiling Core for PCR-based assessment of miRNAs or target RNA expression levels using the Fluidigm BioMark HD.

Table Of Contents

  1. Important Links
  2. Basics
  3. Preparing Cells
  4. RNA-Seq
  5. Experiment Design
  6. Genomic DNA
  7. Other Standard Applications
  8. Custom Applications
  9. Sample Annotation

1.  Important Links

The Fluidigm Double Capture Report discusses the rate and detection of cryptic two-cell captures, in which the RNA sample is a mixture from two cells rather than from a single cell.

2.  Basics

The C1 uses an 'integrated fluidic circuit' (IFC) to automatically capture individual cells from a single-cell suspension (see details below). The IFCs come in three size ranges to match the size of the cells to be captured. Up to 96 cells can be captured from the 1000 to 5000 cells (in 5\(\mu\)l) that are loaded. The cells are captured at random, i.e., no selection is made for markers, etc. Capturing takes about 1 hour.

After capture, the IFC is examined under a light microscope at the NGSC to evaluate the number and quality of the cells captured. If need be, one or two more rounds of capture can be attempted to fill the remaining empty capture sites, using another 5\(\mu\)l each time.

Cells can be stained for live/dead status, and any staining used in prior FACS sorting, GFP, etc. may be detected and annotated for later selection. The annotations are eventually recorded in an Excel spreadsheet.

Once the capture is completed, the IFC is returned to the C1 for nucleic acid preparation (cDNA or whole genome). This process is usually done overnight and is timed so that the IFC can be removed when NGSC staff return the next morning. The extracted nucleic acid for each cell is rapidly transferred to a 96-well plate for further processing.

The NGSC maintains a stock of IFCs and reagents for mRNA RNA-Seq libraries. We are happy to order other IFCs and reagents, but this must be scheduled a few weeks before the experiment is performed.

3.  Preparing Cells

Cell Size      Cell Density
[\(\mu\)m]     [cells/\(\mu\)l]
5-10           1000
10-17          500
17-25          200

Table: Recommended cell density for each size of IFC.

Each attempt to capture cells on the C1 uses 5\(\mu\)l of a single-cell suspension. The cell density should be between 200 cells/\(\mu\)l (large-cell chip) and 1000 cells/\(\mu\)l (small-cell chip). See Fluidigm's guide for more details on preparing cells. The table above shows the target cell densities for each size of IFC.
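The relationship between density and the number of cells loaded per capture attempt can be sketched as follows. This is a minimal illustration using only the recommended densities and the 5\(\mu\)l load volume given in this section; the function names and the dilution helper are our own and are not part of any Fluidigm software.

```python
# Recommended densities (cells per microliter), keyed by IFC cell-size range
# in micrometers, taken from the table in this section.
RECOMMENDED_DENSITY = {"5-10": 1000, "10-17": 500, "17-25": 200}
LOAD_VOLUME_UL = 5  # volume of suspension used per capture attempt


def cells_loaded(size_range_um, volume_ul=LOAD_VOLUME_UL):
    """Number of cells loaded in one capture attempt at the recommended density."""
    return RECOMMENDED_DENSITY[size_range_um] * volume_ul


def dilution_factor(measured_density, size_range_um):
    """Fold dilution needed to reach the recommended density.

    A value below 1 means the suspension is too dilute and would need
    to be concentrated instead.
    """
    return measured_density / RECOMMENDED_DENSITY[size_range_um]


for size in RECOMMENDED_DENSITY:
    print(size, "um IFC:", cells_loaded(size), "cells per attempt")
```

For example, a suspension counted at 2000 cells/\(\mu\)l for the 10-17 \(\mu\)m IFC would call for a 4-fold dilution under these assumptions.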

The NGSC has a cell counter which is used just prior to capture to assess the cell density and the fraction of live cells. Since the capture is random, a higher density of cells increases the chance of a high capture count. However, if the density is too high, then clumping and clogging may occur.

A captured cell is pictured to the right. Here are the major points to note.

4.  RNA-Seq

RNA extraction and conversion to amplified cDNA are done using the Clontech SMARTer chemistry. The SMARTer chemistry produces long, nearly full-length, polyA-enriched transcripts. A separate prep of cDNA is made for each cell.

Once the cDNA is ready, Illumina sequencing libraries are made using the Nextera XT kit, which uses ‘tagmentation’ to simultaneously fragment and add adapters. The kit is designed for high-plex pooling. It uses dual indexing and can be performed in a 96-well plate, so libraries can be prepared for all cDNA samples from the C1. The libraries are pooled, then cleaned up to produce a pooled set of libraries for sequencing. It is possible to combine the output from several captures into one plate for library prep. This allows one to aggregate low-efficiency captures or to ensure an even mixture of cell types from multiple captures of FACS-sorted cells.

If desired, cDNA can be assessed for target genes on the BioMark HD system to further characterize the captured cells prior to library prep. In this case, only cDNA from the cells of interest would be processed with Nextera to make libraries. Libraries could also be assessed using low-depth sequencing on the Illumina MiSeq or HiSeq prior to selection for deeper sequencing. If the pool is found to be substantially imbalanced, then we have to re-pool and clean up the pool again. This process can only be repeated a few times, and changes to the pool balance may be limited by sample volumes.

Spike-ins

The standard recommendation from Fluidigm includes the use of polyA RNA spikes to demonstrate proper cDNA preparation. The spikes consist of 3 E. coli sequences which are included in the IFC at the time the cellular RNA is extracted. The spike RNA is then converted to cDNA at the same time. Depending on the size of the cell, the standard amount of spike-in can result in a library that is between 10 and 50% spike-in sequence.

A more useful spike-in is the ERCC set. This contains a set of spikes at a range of concentrations. When properly diluted, the ERCC spikes can provide a sensitivity and linearity analysis of the cell-to-cDNA and cDNA-to-reads process. Details to follow.

5.  Experiment Design

The large sample count from a C1 creates a scenario that is different from the typical low-sample-count situation. An Illumina MiSeq will generate about 20 million reads per run. The HiSeq 2000 will generate up to 200 million reads per lane. If all 96 samples are run at once, the average sample would get about 200,000 reads (MiSeq) or 2 million reads (HiSeq). These numbers are too low for anything but a cursory assessment of the libraries. However, they should allow the user to identify failed or aberrant cells as well as to classify cells for selection for deeper sequencing.
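The per-sample depth quoted above is simple division, and a short sketch makes it easy to redo for a partial plate. The read totals are the approximate figures from this section; the function name is our own, and the calculation assumes a perfectly balanced pool, which real pools rarely are.

```python
# Approximate read yields quoted in this section.
READS_PER_UNIT = {
    "MiSeq run": 20_000_000,
    "HiSeq 2000 lane": 200_000_000,
}


def reads_per_sample(total_reads, n_samples=96):
    """Average reads per sample, assuming an evenly balanced pool."""
    return total_reads // n_samples


for unit, total in READS_PER_UNIT.items():
    print(unit, "->", reads_per_sample(total), "reads per sample")
```

Passing a smaller `n_samples` (for example, 48 selected cells) shows how selecting cells before deeper sequencing raises the depth per cell.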

If the number of cell types is small, the data from all cells of one type can be merged to create one more deeply-sequenced replicate. In this case, a single replicate comparison of the cell types could be performed using tools like DESeq. Additional capture and sequencing runs could be used to generate more biological replicates for more robust analyses.
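Merging cells of one type amounts to summing per-gene read counts across those cells. The following is a minimal sketch of that idea using plain dictionaries; the well names, gene names, counts, and cell-type labels are made up for illustration, and a real analysis would start from a count matrix produced by an alignment and counting pipeline.

```python
from collections import defaultdict


def merge_by_cell_type(counts_per_cell, cell_types):
    """Sum per-gene counts over all cells that share a cell-type label."""
    merged = defaultdict(lambda: defaultdict(int))
    for cell, gene_counts in counts_per_cell.items():
        cell_type = cell_types[cell]
        for gene, count in gene_counts.items():
            merged[cell_type][gene] += count
    # Convert nested defaultdicts to ordinary dicts for easier inspection.
    return {t: dict(genes) for t, genes in merged.items()}


# Hypothetical example data: three wells, two cell types.
counts = {
    "A1": {"GeneA": 10, "GeneB": 0},
    "A2": {"GeneA": 5, "GeneB": 3},
    "B1": {"GeneA": 1, "GeneB": 20},
}
types = {"A1": "type1", "A2": "type1", "B1": "type2"}

print(merge_by_cell_type(counts, types))
```

The merged table has one column per cell type and could then be passed to a tool such as DESeq for the single-replicate comparison described above.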

If the objective is to assess the variability of a set of cell types or to identify cell populations de novo, then the pool of all or selected libraries can be sequenced across multiple lanes to achieve the desired sequencing depth. Typically, an RNA-Seq experiment with about 30 million reads is thought to be equivalent to a microarray. It would take about 15 HiSeq 2000 lanes to reach this level of coverage for a pool of 96 individual samples. However, clustering and variability assessment should be possible with many fewer lanes, though the assessment would emphasize the more highly expressed genes.
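The lane estimate above can be reproduced with a short calculation. This sketch uses the figures from this section (96 samples, up to 200 million reads per HiSeq 2000 lane) and assumes an evenly balanced pool; the function name is our own.

```python
import math


def lanes_needed(target_reads_per_sample, n_samples=96, reads_per_lane=200_000_000):
    """Whole lanes required to reach a target depth for every sample."""
    total_reads = target_reads_per_sample * n_samples
    return math.ceil(total_reads / reads_per_lane)


# 30 million reads per cell for a full plate of 96 cells.
print(lanes_needed(30_000_000))
# A shallower target, e.g. 5 million reads per cell, needs far fewer lanes.
print(lanes_needed(5_000_000))
```

Since the per-lane yield is an upper bound, actual lane counts may need to be somewhat higher.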

6.  Genomic DNA

Here is a summary of the whole genome amplification process from Fluidigm:

Whole genome amplification as described in this protocol is achieved through multiple displacement amplification with the illustra GenomiPhi V2 DNA Amplification Kit (GE Healthcare Life Sciences). After cells are lysed and the DNA denatured, random hexamers anneal to single-stranded DNA. Phi29 polymerase is used to amplify the randomly primed regions of the genome. The amplification process continues with the displacement of the DNA strand and annealing of random hexamers, generating a high single-cell genomic yield for analysis. Multiple displacement amplification is isothermal and has a low error rate due to the proof-reading capability of the Phi29 polymerase. The average product length with multiple displacement amplification is ~10kb, and the average yield using a C1 IFC is 12 ng/\(\mu\)l per cell in about 13 \(\mu\)l final volume.
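As a rough guide to how much amplified DNA is available per cell, the average concentration and final volume quoted above can be multiplied. This is a sketch using only those two averages; the function name is our own, and individual wells may differ considerably from the average.

```python
def total_yield_ng(conc_ng_per_ul=12.0, volume_ul=13.0):
    """Total amplified DNA per cell in nanograms (concentration x volume)."""
    return conc_ng_per_ul * volume_ul


print(total_yield_ng(), "ng per cell on average")
```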

7.  Other Standard Applications

The C1 also offers targeted gene expression, miRNA expression, exome sequencing, and targeted DNA sequencing. Please contact us if you are interested in any of these applications.

8.  Custom Applications

Fluidigm released a programming tool and special programmable IFCs in late 2014. They will soon be sponsoring a library of protocols that can be run on the machine using this IFC. Details to follow once this is available.

9.  Sample Annotation

This section describes the C1 process from an informatics perspective at the NGSC. Each C1 run is given an identifier which looks like NFnnnnn, e.g., NF00021. During the visual examination under the microscope, wells may be annotated for the following attributes using a spreadsheet provided by Fluidigm.

Once the annotations are presented to the NGSC informatics staff, they are converted into a 96-well plate sample submission Excel file. Additional annotations, such as sequencing length, dual indexing barcodes, and the presumed pool name, are also added at this time. The sample name is set to the well location and a '_'-delimited string of the well annotations. Thus a sample in well B4 that has 2 cells and some debris will get a name like B4 2_d____. The sample type is set to cDNA or gDNA as appropriate. The samples are then loaded into the database.
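The naming scheme can be illustrated with a small sketch. The number and order of annotation fields (six positions here) and the space between the well and the annotation string are assumptions inferred from the example name above, not a description of the actual NGSC database code; `build_sample_name` is a hypothetical helper.

```python
def build_sample_name(well, annotations, sep=" "):
    """Join a well location and a '_'-delimited annotation string.

    `sep` (placed between the well and the annotations) is an assumption
    based on the example in the text; the real convention may differ.
    Blank annotations are kept as empty fields so positions stay fixed.
    """
    return well + sep + "_".join(str(a) for a in annotations)


# Hypothetical annotation values for well B4: 2 cells, debris ("d"),
# and four further attributes left blank.
name = build_sample_name("B4", [2, "d", "", "", "", ""])
print(name)
```

Keeping empty fields in place means the annotation string can later be split on '_' and each position read back unambiguously.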

If the whole plate is to be made into libraries, then the library samples are created when the libraries have been made. All sample information is transferred from the cDNA samples to the RNA-Seq samples. Libraries can be made one-to-one from a whole plate, from a few cDNA plates replated to form a new plate, or from as few as 6 individual samples at a time. Annotations for the libraries are set to reflect the final configuration. When plates are merged, the sample name retains the original capture plate, well, and observational information. The new plate location is stored in another part of the database.