ALLCools.clustering.doublets
Package Contents
- class MethylScrublet(sim_doublet_ratio=2.0, n_neighbors=None, expected_doublet_rate=0.1, stdev_doublet_rate=0.02, metric='euclidean', random_state=0, n_jobs=-1)
- fit(self, mc, cov, clusters=None, batches=None)
- simulate_doublets(self)
Simulate doublets by adding the counts of random observed cell pairs.
- pca(self)
- get_knn_graph(self, data)
- calculate_doublet_scores(self)
- call_doublets(self, threshold=None)
- plot(self)
- _plot_cluster_dist(self)
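The description of simulate_doublets ("adding the counts of random observed cell pairs") can be illustrated with a minimal NumPy sketch. This is not the ALLCools implementation: the helper name `simulate_doublets_sketch`, the choice to sample parents with replacement, and the toy `mc`/`cov` matrices (cells by features) are assumptions made for illustration only.

```python
import numpy as np


def simulate_doublets_sketch(mc, cov, sim_doublet_ratio=2.0, random_state=0):
    """Hypothetical helper: build simulated doublets by summing the
    mc and cov counts of randomly paired observed cells."""
    rng = np.random.default_rng(random_state)
    n_obs = mc.shape[0]
    # Number of simulated doublets relative to the number of observed cells.
    n_sim = int(n_obs * sim_doublet_ratio)
    # Two parent cells per simulated doublet. Sampling is with replacement,
    # so a cell may occasionally be paired with itself (an assumption).
    parents = rng.integers(0, n_obs, size=(n_sim, 2))
    sim_mc = mc[parents[:, 0]] + mc[parents[:, 1]]
    sim_cov = cov[parents[:, 0]] + cov[parents[:, 1]]
    return sim_mc, sim_cov, parents


# Toy data: 4 cells x 3 features; methylated counts never exceed coverage.
mc = np.array([[1, 0, 2], [0, 3, 1], [2, 2, 0], [1, 1, 1]])
cov = np.array([[2, 1, 4], [1, 5, 2], [3, 4, 1], [2, 2, 2]])
sim_mc, sim_cov, parents = simulate_doublets_sketch(mc, cov)
```

Summing both matrices keeps each simulated doublet's methylated count at or below its coverage, so methylation fractions can still be computed from the simulated data in the same way as for observed cells.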
- coverage_doublets(allc_dict: dict, resolution: int = 100, cov_cutoff=2, region_alpha=0.01, tmp_dir='doublets_temp_dir', cpu=1, keep_tmp=False)
Quantify high-coverage genome bins in each cell for doublet evaluation.
- Parameters
allc_dict – dict with cell_id as key, allc_path as value
resolution – genome bin resolution to quantify, bps
cov_cutoff – coverage cutoff; sites with cov_cutoff < cov <= 2 * cov_cutoff are counted
region_alpha – FDR-adjusted P-value cutoff
tmp_dir – temporary dir to save the results
cpu – number of CPUs to use
keep_tmp – whether to keep tmp_dir for debugging
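The interaction between resolution and cov_cutoff described above can be sketched as follows. This is only an illustration of the counting rule (cov_cutoff < cov <= 2 * cov_cutoff) applied per genome bin, not the ALLCools implementation: the helper `count_high_cov_sites`, the bin assignment by integer division, and the toy positions are assumptions, and the FDR-based region test controlled by region_alpha is not shown.

```python
import numpy as np


def count_high_cov_sites(positions, cov, resolution=100, cov_cutoff=2):
    """Hypothetical helper: count, per genome bin, the sites whose coverage
    satisfies cov_cutoff < cov <= 2 * cov_cutoff."""
    positions = np.asarray(positions)
    cov = np.asarray(cov)
    # Keep only sites inside the coverage band.
    keep = (cov > cov_cutoff) & (cov <= 2 * cov_cutoff)
    # Assign each kept site to a bin of `resolution` base pairs
    # (integer division is an assumed binning scheme).
    bin_ids = positions[keep] // resolution
    bins, counts = np.unique(bin_ids, return_counts=True)
    return dict(zip(bins.tolist(), counts.tolist()))


# Toy data for one chromosome of one cell.
positions = [5, 40, 99, 100, 150, 310]
cov = [3, 1, 4, 5, 3, 2]
result = count_high_cov_sites(positions, cov)
print(result)  # {0: 2, 1: 1}
```

With the default cov_cutoff=2, only sites with coverage 3 or 4 are counted; the site with coverage 5 and the site with coverage 2 both fall outside the band.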