autometa.binning package¶
Submodules¶
autometa.binning.recursive_dbscan module¶
autometa.binning.summary module¶
Script to summarize Autometa binning results
-
autometa.binning.summary.fragmentation_metric(df: pandas.core.frame.DataFrame, quality_measure: float = 0.5) → int¶ Describes the quality of assembled genomes that are fragmented in contigs of different length.
Note
For more information see this metagenomics wiki from Matthias Scholz
Parameters: - df (pd.DataFrame) – DataFrame to assess fragmentation within metagenome-assembled genome.
- quality_measure (0 < float < 1) – Description of parameter quality_measure (the default is .50). I.e. default measure is N50, but could use .1 for N10 or .9 for N90
Returns: Minimum contig length to cover quality_measure of genome (i.e. percentile contig length)
Return type: int
-
autometa.binning.summary.get_metabin_stats(bin_df: pandas.core.frame.DataFrame, markers_fpath: str, cluster_col: str = 'cluster') → pandas.core.frame.DataFrame¶ Retrieve statistics for all clusters recovered from Autometa binning.
Parameters: - bin_df (pd.DataFrame) – Autometa binning table. index=contig, cols=[‘cluster’,’length’, ‘GC’, ‘coverage’, …]
- markers_fpath (str) – Path to autometa annotated and filtered markers table respective to kingdom binned.
- cluster_col (str, optional) – Clustering column by which to group metabins
Returns: dataframe consisting of various metagenome-assembled genome statistics indexed by cluster.
Return type: pd.DataFrame
-
autometa.binning.summary.get_metabin_taxonomies(bin_df: pandas.core.frame.DataFrame, ncbi: autometa.taxonomy.ncbi.NCBI, cluster_col: str = 'cluster') → pandas.core.frame.DataFrame¶ Retrieve taxonomies of all clusters recovered from Autometa binning.
Parameters: - bin_df (pd.DataFrame) – Autometa binning table. index=contig, cols=[‘cluster’,’length’,’taxid’, *canonical_ranks]
- ncbi (autometa.taxonomy.ncbi.NCBI instance) – Autometa NCBI class instance
- cluster_col (str, optional) – Clustering column by which to group metabins
Returns: Dataframe consisting of cluster taxonomy with taxid and canonical rank. Indexed by cluster
Return type: pd.DataFrame
-
autometa.binning.summary.main()¶
-
autometa.binning.summary.write_cluster_records(bin_df: pandas.core.frame.DataFrame, metagenome: str, outdir: str, cluster_col: str = 'cluster') → None¶ Write clusters to outdir given clusters df and metagenome records
Parameters: - bin_df (pd.DataFrame) – Autometa binning dataframe. index=’contig’, cols=[‘cluster’, …]
- metagenome (str) – Path to metagenome fasta file
- outdir (str) – Path to output directory to write fastas for each metagenome-assembled genome
- cluster_col (str, optional) – Clustering column by which to group metabins