Project: PRJNA1137951
Recent technological developments in single-cell RNA-seq CRISPR screens enable high-throughput investigation of the genome. Through transduction of a gRNA library into a cell population followed by transcriptomic profiling by scRNA-seq, it is possible to characterize the effects of thousands of genomic perturbations on global gene expression. A major source of noise in scRNA-seq CRISPR screens is ambient gRNAs, which are contaminating gRNAs that likely originate from other cells. If not properly filtered, ambient gRNAs can result in an excess of false positive gRNA assignments. Here, we utilize CRISPR barnyard assays to characterize ambient gRNA noise in single-cell CRISPR screens. We use these datasets to develop and train CLEANSER, a mixture model that identifies and filters ambient gRNA noise. This model takes advantage of the bimodal distribution of native and ambient gRNAs and includes both gRNA-specific and cell-specific normalization parameters, correcting for confounding technical factors that affect individual gRNAs and cells. The output of CLEANSER is the probability that a gRNA-cell assignment belongs to the native distribution rather than the ambient distribution. We find that ambient gRNA filtering methods impact differential gene expression analysis outcomes and that CLEANSER outperforms alternative approaches by increasing gRNA-cell assignment accuracy.
Overall design:
gRNA library transduction: HEK293T dCas9KRAB cells were seeded at a density of 5 × 10⁴ cells/cm² and NIH3T3 dCas9KRAB cells were seeded at a density of 1.25 × 10⁴ cells/cm² on 6-well plates in one biological replicate each. The cells were transduced with lentivirus using 8 μg/mL polybrene at a multiplicity of infection (MOI) of ~10 as determined by titration.
Two days post-transduction, cells were selected for 10 days with either puromycin (500 ng/mL for HEK293T dCas9KRAB + non-targeting library #1 cells; 1000 ng/mL for NIH3T3 dCas9KRAB + non-targeting library #2 cells) or blasticidin (20 μg/mL for HEK293T dCas9KRAB + non-targeting library #1 cells; 80 μg/mL for NIH3T3 dCas9KRAB + non-targeting library #2 cells). Seven days post-transduction, cells were trypsinized and seeded on 6-well plates in three conditions:
1) HEK293T dCas9KRAB + non-targeting library #1 cells at a density of 3.9 × 10⁴ cells/cm²
2) NIH3T3 dCas9KRAB + non-targeting library #2 cells at a density of 1.5 × 10⁴ cells/cm²
3) HEK293T dCas9KRAB + non-targeting library #1 cells at a density of 2.0 × 10⁴ cells/cm² and NIH3T3 dCas9KRAB + non-targeting library #2 cells at a density of 2.0 × 10⁴ cells/cm²
CRISPR barnyard single-cell RNA-seq: Ten days post-transduction, cells were washed three times, trypsinized, and strained through a 40 µm cell strainer. The cells were diluted to 1,000 cells/µL, and a fourth condition was prepared by mixing HEK293T dCas9KRAB + non-targeting library #1 cells with NIH3T3 dCas9KRAB + non-targeting library #2 cells. Eight lanes were loaded for single-cell transcriptome profiling, with one lane per condition for each of the CROP-seq and modified direct capture perturb-seq vectors. Approximately 10,000 cells were captured per lane of a 10x Chromium chip (Next GEM Chip G) using Chromium Next GEM Single Cell 3ʹ HT Reagent Kits v3.1 with Feature Barcoding technology for CRISPR Screening (10x Genomics, Inc., Document number CG000418, Rev D). CROP-seq protospacer sequences were amplified from barcoded cDNA as described previously.
CRISPR barnyard single-cell RNA-seq library sequencing: Final libraries were pooled and sequenced on a NovaSeq S4 flow cell (R1:28, I1:10, I2:10, R2:90), aiming for ~15,000 reads per cell for gene expression libraries and ~5,000 reads per cell for gRNA libraries.
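Illustrative sketch: the summary above describes a mixture model whose output is the probability that a gRNA-cell assignment is native rather than ambient. The Python below shows that general idea for a generic two-component Poisson mixture. It is not the CLEANSER implementation, and the choice of Poisson components is an assumption made only for illustration. The function names and all parameters (`size_factor` as a stand-in for a cell-specific normalization term, `ambient_rate` and `native_rate` as per-gRNA means, `prior_native` as the mixing weight) are hypothetical, and the numeric values are made up.

```python
import math


def poisson_logpmf(k, rate):
    """Log of the Poisson probability mass function at count k."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def posterior_native(count, size_factor, ambient_rate, native_rate, prior_native):
    """Posterior probability that an observed gRNA UMI count is native.

    Hypothetical parameters (not taken from the record):
      size_factor   cell-specific scaling applied to both component means
      ambient_rate  mean count of this gRNA under the ambient component
      native_rate   mean count of this gRNA under the native component
      prior_native  mixing weight of the native component (0 < value < 1)
    """
    log_native = math.log(prior_native) + poisson_logpmf(
        count, size_factor * native_rate
    )
    log_ambient = math.log(1.0 - prior_native) + poisson_logpmf(
        count, size_factor * ambient_rate
    )
    # Normalize in log space so that large counts do not underflow.
    top = max(log_native, log_ambient)
    w_native = math.exp(log_native - top)
    w_ambient = math.exp(log_ambient - top)
    return w_native / (w_native + w_ambient)


# Made-up example: low counts look ambient, high counts look native.
for umi in (1, 5, 30):
    p = posterior_native(umi, size_factor=1.0, ambient_rate=0.5,
                         native_rate=20.0, prior_native=0.1)
    print(f"UMI count {umi}: P(native) = {p:.4f}")
```

In a barnyard design such as the one described here, gRNAs detected in a cell of the other species can be treated as known ambient observations, which is one way a posterior of this kind could be checked against ground truth.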
General