autometa.binning package

Submodules

autometa.binning.recursive_dbscan module

autometa.binning.summary module

Script to summarize Autometa binning results

autometa.binning.summary.fragmentation_metric(df: pandas.core.frame.DataFrame, quality_measure: float = 0.5) → int

Describes the quality of assembled genomes that are fragmented in contigs of different length.

Note

For more information see this metagenomics wiki from Matthias Scholz

Parameters:
  • df (pd.DataFrame) – DataFrame to assess fragmentation within metagenome-assembled genome.
  • quality_measure (0 < float < 1) – Description of parameter quality_measure (the default is .50). I.e. default measure is N50, but could use .1 for N10 or .9 for N90
Returns:

Minimum contig length to cover quality_measure of genome (i.e. percentile contig length)

Return type:

int

autometa.binning.summary.get_metabin_stats(bin_df: pandas.core.frame.DataFrame, markers_fpath: str, cluster_col: str = 'cluster') → pandas.core.frame.DataFrame

Retrieve statistics for all clusters recovered from Autometa binning.

Parameters:
  • bin_df (pd.DataFrame) – Autometa binning table. index=contig, cols=[‘cluster’,’length’, ‘GC’, ‘coverage’, …]
  • markers_fpath (str) – Path to autometa annotated and filtered markers table respective to kingdom binned.
  • cluster_col (str, optional) – Clustering column by which to group metabins
Returns:

dataframe consisting of various metagenome-assembled genome statistics indexed by cluster.

Return type:

pd.DataFrame

autometa.binning.summary.get_metabin_taxonomies(bin_df: pandas.core.frame.DataFrame, ncbi: autometa.taxonomy.ncbi.NCBI, cluster_col: str = 'cluster') → pandas.core.frame.DataFrame

Retrieve taxonomies of all clusters recovered from Autometa binning.

Parameters:
  • bin_df (pd.DataFrame) – Autometa binning table. index=contig, cols=[‘cluster’,’length’,’taxid’, *canonical_ranks]
  • ncbi (autometa.taxonomy.ncbi.NCBI instance) – Autometa NCBI class instance
  • cluster_col (str, optional) – Clustering column by which to group metabins
Returns:

Dataframe consisting of cluster taxonomy with taxid and canonical rank. Indexed by cluster

Return type:

pd.DataFrame

autometa.binning.summary.main()
autometa.binning.summary.write_cluster_records(bin_df: pandas.core.frame.DataFrame, metagenome: str, outdir: str, cluster_col: str = 'cluster') → None

Write clusters to outdir given clusters df and metagenome records

Parameters:
  • bin_df (pd.DataFrame) – Autometa binning dataframe. index=’contig’, cols=[‘cluster’, …]
  • metagenome (str) – Path to metagenome fasta file
  • outdir (str) – Path to output directory to write fastas for each metagenome-assembled genome
  • cluster_col (str, optional) – Clustering column by which to group metabins

autometa.binning.unclustered_recruitment module

Module contents