Idowu Olawoye idolawoye

  • Toronto, ON
@idolawoye
idolawoye / kraken_bash.txt
Last active December 7, 2022 20:08
One liner to print out percentage of reads from a Kraken report file
grep -w -F -f taxon.txt *report.txt | awk 'BEGIN{OFS="\t"}{ print $1,$2}' > tb.txt
# taxon.txt contains the taxon name(s) to summarize, one per line, e.g. Mycobacterium tuberculosis complex
# *report.txt is the wildcard that selects multiple Kraken report files
# tb.txt is the output TSV file
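The same lookup can be sketched in Python. This assumes the usual six-column, tab-separated Kraken report layout (percentage, clade reads, direct reads, rank code, taxid, indented name). The function name `taxon_percentage` and the sample lines are made up for illustration. Unlike `grep -w -F`, which matches the name anywhere in a line, this sketch requires the name column to equal the taxon exactly.

```python
def taxon_percentage(report_lines, taxon):
    """Return (percentage, clade_reads) for the first line whose name
    column equals `taxon`, or None if no line matches."""
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip lines that do not look like report rows
        # The name column is indented with spaces to show taxonomy depth.
        if fields[5].strip() == taxon:
            return float(fields[0]), int(fields[1])
    return None


# Hypothetical report lines, for illustration only.
example = [
    " 95.00\t950\t10\tG\t1763\t  Mycobacterium\n",
    " 90.00\t900\t900\tG1\t77643\t    Mycobacterium tuberculosis complex\n",
]
print(taxon_percentage(example, "Mycobacterium tuberculosis complex"))
```

For a real file, pass an open file handle as `report_lines`.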
idolawoye / count_N_percentage.py
Created June 11, 2022 13:37
Python script to count number of Ns in a multifasta file
#!/usr/bin/env python
from Bio import SeqIO

fasta = "the_fasta_file.fasta"  # path to the multi-FASTA file
for record in SeqIO.parse(fasta, "fasta"):
    print("ID: %s" % record.id)
    print("Sequence length: %s" % len(record))
    # count() is case-sensitive, so lowercase "n" is not counted
    print("Number of Ns: %s" % record.seq.count('N'))
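The file name mentions a percentage, which the script does not print. A minimal sketch of that calculation follows, in plain Python so it runs without Biopython. `n_percentage` is a hypothetical helper, and counting lowercase `n` as well as `N` is an assumption, not something the original script does.

```python
def n_percentage(sequence):
    """Percentage of N/n characters in a sequence string.

    Returns 0.0 for an empty sequence to avoid dividing by zero.
    """
    if not sequence:
        return 0.0
    n_count = sequence.upper().count("N")
    return 100.0 * n_count / len(sequence)


# 4 of the 10 bases in this made-up sequence are N.
print(n_percentage("ACGTNNNNAC"))
```

Inside the loop over records, this could be called as `n_percentage(str(record.seq))`.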
idolawoye / assembly_coverage.txt
Created October 2, 2020 08:30
Calculate average genome coverage on aligned BAM files
samtools depth CIV3724802_ref_bwa_sorted.bam | awk '{sum+=$3} END { print "Average = ",sum/NR}'
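`samtools depth` prints three tab-separated columns (reference name, position, depth) and by default leaves out positions with zero depth unless `-a` is given, so the one-liner averages over the reported positions only. The sketch below reproduces that average in Python and, as an assumption added here, accepts an optional `genome_length` to average over the whole reference instead. `mean_depth` and the sample lines are hypothetical.

```python
def mean_depth(depth_lines, genome_length=None):
    """Mean of the depth column in `samtools depth` style lines.

    If `genome_length` is given, divide by it instead of by the number of
    reported positions, so that unreported positions count as zero depth.
    """
    total = 0
    positions = 0
    for line in depth_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        total += int(fields[2])
        positions += 1
    denominator = genome_length if genome_length else positions
    if not denominator:
        return 0.0
    return total / denominator


# Made-up depth lines: position 3 is missing, as if it had zero depth.
example = ["ref\t1\t10\n", "ref\t2\t20\n", "ref\t4\t30\n"]
print(mean_depth(example))     # mean over the 3 reported positions
print(mean_depth(example, 4))  # mean over a 4-base reference
```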
idolawoye / gist:0c219560f82e8981aefc716b78d1c019
Created April 8, 2019 11:24
Shell script for downloading bulk files
list=$(cat TEXT_FILE)  # list of record/file IDs, separated by whitespace
for i in $list
do
    echo "$i"
    SHELL COMMAND [OPTIONS] "$i"  # placeholder: the download command, given the file ID
done
idolawoye / gist:069615f51911b1c64d985cf816fa04be
Last active January 30, 2019 13:42
BWA mapping of different samples against a reference genome
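arr=( *.fastq )         # FASTQ files in the current directory, in glob (sorted) order
total_files=${#arr[@]}  # count the same files that the loop indexes
echo "mapping started" >> map.log
echo "---------------" >> map.log
for ((i=0; i<total_files; i+=2))  # step by 2: files are taken in pairs
{
    ref_genome=../ref.gb
    sample_name=$(echo "${arr[$i]}" | awk -F "_" '{print $1}')  # text before the first "_"
    echo "[mapping running for] $sample_name"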