10x Genomics provides support, reference transcriptomes, and primers for human samples and mouse strains (C57BL/6 and BALB/c). The reference sequences and/or primers may not be optimal for other mouse strains. Custom primers and reference sequences must be created for all non-human and non-mouse species. Improper primers or references could result in the loss of clonotypes.
The spiking of cell lines into samples (as a control) may have unintended consequences. Cell-spiking may introduce high background mRNA (particularly if the cell line has large or leaky cells) that could confound the calling algorithm.
The primary recommendations for V(D)J sequencing are:
Chemistry | Read configuration (bp) | Recommended read pairs per cell |
---|---|---|
v1 & v1.1 | Read 1: 26, i7: 8, i5: 0, Read 2: 91 | 5000 |
v2 | Read 1: 26, i7: 10, i5: 10, Read 2: 90 | 5000 |
Cell refers to a recovered, targeted cell. A recovered cell is a cell captured in a GEM. A targeted cell may be a T or B cell, depending on which enrichment primers are used.
Cell Ranger estimates the number of recovered targeted cells based on sequence data. 10x Genomics recommends 26 x 90 (or 26 x 91 for v1 chemistry) read pairs as it facilitates efficient sequencing of multiple library types including V(D)J and Gene Expression in a single sequencing run.
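As a rough planning aid, the per-cell recommendation can be scaled to a whole library. The sketch below is a minimal illustration, not a 10x Genomics tool: the function name and the example cell count are hypothetical, and the only value taken from this page is the 5000 read pairs per cell.

```python
# Recommended read pairs per cell for the primary configuration (from the table above).
RECOMMENDED_PAIRS_PER_CELL = 5000


def read_pairs_for_library(targeted_cells: int,
                           pairs_per_cell: int = RECOMMENDED_PAIRS_PER_CELL) -> int:
    """Return the total read pairs to request for one V(D)J library.

    `targeted_cells` is the expected number of recovered, targeted cells
    (T or B cells, depending on the enrichment primers used).
    """
    if targeted_cells < 0 or pairs_per_cell < 0:
        raise ValueError("cell count and read pairs per cell must be non-negative")
    return targeted_cells * pairs_per_cell


# Hypothetical example: a library expected to contain 3000 targeted cells.
total = read_pairs_for_library(3000)
print(f"{total:,} read pairs")  # 15,000,000 read pairs
```

Because Cell Ranger estimates the number of recovered targeted cells from the sequence data, the cell count used here is only a pre-sequencing expectation.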
For most samples, the recommended sequencing depth will approach the limit of what you would obtain at high depth, even using longer (150 x 150) reads. However, there are exceptions. Sequencing saturation in the following plots is computed as:
$$
\text{sequencing saturation} = \frac{\text{number of productive pairs obtained using } 26 \times 91 \text{ read pairs at given depth}}{\text{number of productive pairs obtained using } 150 \times 150 \text{ read pairs at high depth}}
$$
Here high depth is ten-fold higher than the recommendation.
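The ratio defined above can be sketched in a few lines. This is only an illustration of the arithmetic: the function name is hypothetical, and the productive-pair counts in the example are made-up placeholders, not measured results.

```python
RECOMMENDED_PAIRS_PER_CELL = 5000
# "High depth" is ten-fold higher than the recommendation.
HIGH_DEPTH_PAIRS_PER_CELL = 10 * RECOMMENDED_PAIRS_PER_CELL


def sequencing_saturation(pairs_at_depth: int, pairs_at_high_depth: int) -> float:
    """Productive pairs found with 26 x 91 reads at a given depth, divided by
    productive pairs found with 150 x 150 reads at high depth."""
    if pairs_at_high_depth <= 0:
        raise ValueError("the high-depth reference count must be positive")
    return pairs_at_depth / pairs_at_high_depth


# Placeholder counts for illustration only.
saturation = sequencing_saturation(pairs_at_depth=1900, pairs_at_high_depth=2000)
print(f"{HIGH_DEPTH_PAIRS_PER_CELL} read pairs per cell used as the reference depth")
print(f"saturation = {saturation:.2f}")  # saturation = 0.95
```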
The plasma cell enriched sample is from a tumor, and in addition to having many plasma cells, may have a significant fraction of dying cells.
Some samples do not saturate at the recommended sequencing depth. The most common cause is extreme variation in expression levels between cells in a sample. Cells with high mRNA expression use most of the sequencing resources, leaving little for the low expression cells. For example:
- Samples rich in plasma B cells: these cells express IG genes at levels roughly two orders of magnitude higher than other B cells.
- Tumor samples that contain many dying cells have abnormally high gene expression.
Samples for which overall expression levels are very low may also require greater sequencing depth to approach saturation.
However, saturation may not be needed to achieve experimental goals. For example, obtaining data from many dying cells may or may not be useful.
In general, use the 10x Genomics recommended sequencing depth. Sequencing depth may be adjusted based on the sample type and experiment needs:
- Sequence precious samples at higher depth in order to get data from as many cells as possible.
- Sequence at higher depth when studying low expression cells in a population of cells with heterogeneous expression.
- Sequence at lower depth if large savings on sequencing cost are more important than modest increases to yield.
Reads from a library that was sequenced across multiple flow cells can be pooled. Follow the steps in Specifying Input FASTQs to combine them in a single `cellranger vdj` run.
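One way to pool flow cells is to pass several FASTQ directories to a single run as a comma-separated `--fastqs` value. The sketch below only assembles and prints such a command. The helper name, run ID, sample name, and all paths are hypothetical, and the flags should be checked against Specifying Input FASTQs for the installed Cell Ranger version.

```python
import shlex


def build_vdj_command(run_id: str, reference: str,
                      fastq_dirs: list[str], sample: str) -> list[str]:
    """Assemble a `cellranger vdj` argument list that pools several FASTQ directories."""
    if not fastq_dirs:
        raise ValueError("at least one FASTQ directory is required")
    return [
        "cellranger", "vdj",
        f"--id={run_id}",
        f"--reference={reference}",
        # One directory per flow cell, joined with commas.
        f"--fastqs={','.join(fastq_dirs)}",
        f"--sample={sample}",
    ]


# Hypothetical paths and names, for illustration only.
cmd = build_vdj_command(
    run_id="sample1_vdj",
    reference="/refs/vdj_reference",
    fastq_dirs=["/data/flowcell_A/fastqs", "/data/flowcell_B/fastqs"],
    sample="sample1",
)
print(shlex.join(cmd))
```

The command is printed rather than executed so that it can be reviewed before it is run.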
10x Genomics also supports a 150 x 150 configuration:
Alternate read configuration | Recommended read pairs per cell |
---|---|
150 x 150 | 2000 |
This produces about the same number of read bases as the primary recommendation (26 x 91, 5000 read pairs per cell). The coverage response curves for these two configurations match closely.
Other read lengths may work but have not been tested. Coverage would need to be adjusted in the same fashion so that the number of read bases is about the same.
Use of a second read that is significantly shorter than 91 bases may not work well.
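The base-matching adjustment described above is simple arithmetic and can be sketched as follows. The function names are hypothetical, and the result is only a starting estimate for untested read lengths; it does not account for the caveat about short second reads.

```python
import math

# Primary recommendation: Read 1 length, Read 2 length, read pairs per cell.
PRIMARY = (26, 91, 5000)


def bases_per_cell(read1_len: int, read2_len: int, pairs_per_cell: int) -> int:
    """Total read bases per cell for a given configuration."""
    return (read1_len + read2_len) * pairs_per_cell


def matched_read_pairs(read1_len: int, read2_len: int,
                       target_bases: int = bases_per_cell(*PRIMARY)) -> int:
    """Read pairs per cell that give about the same read bases as the primary recommendation."""
    return math.ceil(target_bases / (read1_len + read2_len))


print(bases_per_cell(26, 91, 5000))    # 585000
print(bases_per_cell(150, 150, 2000))  # 600000
print(matched_read_pairs(150, 150))    # 1950
```

The matched value for 150 x 150 (1950 read pairs per cell) is close to the documented recommendation of 2000, which is consistent with the two configurations producing about the same number of read bases.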