Take debarcoded reads, merge them, and split them into a suitable number of shards.
Source: R/debarcoding.R
The reads from one cell are guaranteed to be present in only a single shard. This makes parallel processing simple, as each shard can be processed on a separate computer. Using more shards means that more computers can process the data in parallel. However, if you perform all the calculations on a single computer, having more than one shard will not result in a speedup. Setting numOutputShards above 1 is therefore only relevant when using a cluster of compute nodes.
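The guarantee that a cell never spans shards can be illustrated with a small base R sketch. This is only a toy model of the idea, not Bascet's actual implementation: the `reads` data frame, the `num_shards` variable, and the round-robin assignment are all assumptions made for illustration.

```r
# Hypothetical toy data: one row per read, tagged with the cell it came from.
reads <- data.frame(
  cell = c("cellA", "cellB", "cellA", "cellC", "cellB", "cellD"),
  seq  = c("ACGT", "TTGA", "GGCA", "CATG", "AATC", "CGTA"),
  stringsAsFactors = FALSE
)

# Plays the role of numOutputShards (assumed name, for illustration only).
num_shards <- 2

# Assign each distinct cell to a shard in round-robin order.
cells <- unique(reads$cell)
shard_of_cell <- setNames((seq_along(cells) - 1) %% num_shards + 1, cells)

# Every read inherits the shard of its cell, so a cell cannot be split.
reads$shard <- unname(shard_of_cell[reads$cell])

# One data frame per shard; each could be handed to a separate compute node.
shards <- split(reads, reads$shard)

# Sanity check: each cell occurs in exactly one shard.
shards_per_cell <- tapply(reads$shard, reads$cell,
                          function(s) length(unique(s)))
stopifnot(all(shards_per_cell == 1))
```

Because the shard is chosen per cell rather than per read, any per-cell analysis can run on one shard without consulting the others.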
Usage
BascetShardify(
  debstat,
  numOutputShards = 1,
  outputName = "filtered",
  overwrite = FALSE,
  runner = GetDefaultBascetRunner(),
  bascetInstance = GetDefaultBascetInstance()
)