First set up Zorn/Bascet according to the install instructions. This tutorial assumes that you have already produced a Seurat object using one of the count workflows, e.g. the KRAKEN2 workflow, the alignment-based workflow, the informative KMER workflow, or the count sketch workflow. You should have a count table of some sorts ready for analysis.
When to remove doublets
A doublet in single-cell metagenomics is a barcode that captures DNA from more than one cell. These can arise for at least two common reasons:
If more cells are collected than there are barcodes, you get barcode collisions. Droplet and split-pool methods rely on the cells being much fewer than the avaiable barcodes. By doing “Poisson loading” (diluting enough), doublets are typically kept <2%; but they still occur and must be handled
If cells stick together, the will also receive the same barcode. Ensuring a good single-cell suspension is crucial
Zorn/Bascet produces count tables and you can thus use many
single-cell tools to process the data, such as to remove doublets. Here
we give an example using scDblFinder.
Removing doublets using scDblFinder
scDblFinder
lives in Bioconductor and operates on a
SingleCellExperiment object, so the workflow is to wrap the
count matrix from your Seurat object, run the caller, then copy the
per-cell scores back into the Seurat object’s metadata.
library(SingleCellExperiment)
library(scDblFinder)
## Wrap the raw counts as a SingleCellExperiment (rows = features, cols = cells)
sce <- SingleCellExperiment::SingleCellExperiment(
list(counts = GetAssayData(adata,layer = "counts"))
)
## Call doublets
sce <- scDblFinder::scDblFinder(sce)
## Copy the per-cell score and class back into the Seurat object
adata$scDblFinder.score <- sce$scDblFinder.score
adata$scDblFinder.class <- sce$scDblFinder.classYou can now overlay the score and class on the UMAP to see whether doublets form a coherent population (often they sit between two species clusters, or they all form a “blob” in the middle of the UMAP):
FeaturePlot(adata, features = "scDblFinder.score")
DimPlot(adata, group.by = "scDblFinder.class")To remove doublets, just subset the Seurat object:
adata <- adata[, adata$scDblFinder.class == "singlet"]