`ALLCools._bam_to_allc`¶

This file is modified from methylpy https://github.com/yupenghe/methylpy

Author: Yupeng He

Module Contents¶

log[source]¶

_read_faidx(faidx_path)[source]¶: Read fadix of reference fasta file samtools fadix ref.fa

_get_chromosome_sequence_upper(fasta_path, fai_df, query_chrom)[source]¶: read a whole chromosome sequence into memory

_get_bam_chrom_index(bam_path)[source]¶

_bam_to_allc_worker(bam_path, reference_fasta, fai_df, output_path, region=None, num_upstr_bases=0, num_downstr_bases=2, buffer_line_number=100000, min_mapq=0, min_base_quality=1, compress_level=5, tabix=True, save_count_df=False)[source]¶: None parallel bam_to_allc worker function, call by bam_to_allc

_aggregate_count_df(count_dfs)[source]¶

bam_to_allc(bam_path, reference_fasta, output_path=None, cpu=1, num_upstr_bases=0, num_downstr_bases=2, min_mapq=10, min_base_quality=20, compress_level=5, save_count_df=False)[source]¶

Generate 1 ALLC file from 1 position sorted BAM file via samtools mpileup.

Parameters

bam_path – Path to 1 position sorted BAM file
reference_fasta – {reference_fasta_doc}
output_path – Path to 1 output ALLC file
cpu – {cpu_basic_doc} DO NOT use cpu > 1 for single cell ALLC generation. Parallel on cell level is better for single cell project.
num_upstr_bases – Number of upstream base(s) of the C base to include in ALLC context column, usually use 0 for normal BS-seq, 1 for NOMe-seq.
num_downstr_bases – Number of downstream base(s) of the C base to include in ALLC context column, usually use 2 for both BS-seq and NOMe-seq.
min_mapq – Minimum MAPQ for a read being considered, samtools mpileup parameter, see samtools documentation.
min_base_quality – Minimum base quality for a base being considered, samtools mpileup parameter, see samtools documentation.
compress_level – {compress_level_doc}
save_count_df – If true, save an ALLC context count table next to ALLC file.

Returns

a pandas.DataFrame for overall mC and cov count separated by mC context.

Return type

count_df

ALLCools._bam_to_allc

Contents

ALLCools._bam_to_allc¶

Module Contents¶

`ALLCools._bam_to_allc`¶