wub.read_stats package

Submodules

wub.read_stats.contig_stats module

wub.read_stats.contig_stats.GC_per_read(seq_rec, fq=False)[source]

Calculates the number of bases, the GC content and, if FASTQ input is given, the mean Q score for each sequence.

Parameters:
  • seq_rec – sequence records with attributes, from Biopython
  • fq – boolean; set to True when the records come from a FASTQ file
Returns:

dataframe

Return type:

dataframe
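
As a rough illustration of the per-read quantities described above, here is a minimal standard-library sketch. It is not the wub implementation: the function name, the input tuples and the keys "seqlen", "GC" and "mean_q" are hypothetical, GC is assumed to be a percentage, and the mean Q score is assumed to be a plain arithmetic mean of Phred scores.

```python
def gc_per_read_sketch(reads):
    """Illustrative only; not wub's GC_per_read.

    reads: iterable of (name, sequence, qualities) tuples, where
    qualities is a list of Phred scores or None for FASTA input.
    """
    rows = []
    for name, seq, quals in reads:
        upper = seq.upper()
        gc_count = upper.count("G") + upper.count("C")
        row = {
            "name": name,
            "seqlen": len(seq),
            # Assumption: GC reported as a percentage of the read length.
            "GC": 100.0 * gc_count / len(seq) if seq else 0.0,
        }
        if quals:
            # Assumption: arithmetic mean of the per-base Phred scores.
            row["mean_q"] = sum(quals) / len(quals)
        rows.append(row)
    return rows


rows = gc_per_read_sketch([
    ("r1", "ACGT", [10, 20, 30, 40]),
    ("r2", "GGCC", None),
])
```

The resulting list of dicts could be passed to a dataframe constructor to get one row per read.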

wub.read_stats.contig_stats.L50(df, col, percent=50)[source]

Calculate the L50 by default; changing percent to 75 gives the L75.

Parameters:
  • df – dataframe with seqlen column
  • col – column with sequence length
  • percent – percentage to be calculated
Returns:

L50 value

Return type:

int
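
For orientation, L50 is conventionally the smallest number of sequences whose combined length reaches the given percentage of the total. The sketch below works on a plain list of lengths rather than a dataframe column; the function name is hypothetical and wub's own handling of ties or thresholds may differ.

```python
def l50_sketch(lengths, percent=50):
    """Illustrative only: count of the longest sequences needed to
    reach `percent` of the total length."""
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * percent / 100.0
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return count
    return 0  # empty input


lengths = [2, 3, 4, 5, 6]
l50 = l50_sketch(lengths)        # 6 + 5 = 11 >= 10, so two sequences
l75 = l50_sketch(lengths, 75)    # 6 + 5 + 4 = 15 >= 15, so three sequences
```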

wub.read_stats.contig_stats.N50(df, col, percent=50)[source]

Calculate the N50 by default; changing percent to 75 gives the N75.

Parameters:
  • df – dataframe with seqlen column
  • col – column with sequence length
  • percent – percentage to be calculated
Returns:

N50 Value

Return type:

int
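
N50 is conventionally the length of the sequence at which the cumulative sum of lengths, sorted longest first, reaches the given percentage of the total. A minimal sketch on a plain list of lengths follows; the function name is hypothetical and this is not necessarily how wub computes it.

```python
def n50_sketch(lengths, percent=50):
    """Illustrative only: length at which the cumulative sum of the
    descending-sorted lengths reaches `percent` of the total."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    threshold = sum(ordered) * percent / 100.0
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]  # fallback for percent above 100


lengths = [2, 3, 4, 5, 6]
n50 = n50_sketch(lengths)        # cumulative 6, 11: threshold 10 reached at length 5
n75 = n50_sketch(lengths, 75)    # cumulative 6, 11, 15: threshold 15 reached at length 4
```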

wub.read_stats.contig_stats.get_stats(df)[source]

Calculates the summary statistics.

Parameters:
  • df – dataframe from GC_per_read
Returns:

summary Series

Return type:

Series

wub.read_stats.contig_stats.readfast(fast)[source]

Reads a FASTA or FASTQ file.

Parameters:
  • fast – FASTQ or FASTA file
Returns:

list of records with attributes

Return type:

generator object

Module contents