Obtain full path of lineage from root to target species
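#===
# taxpath.rb
# Obtain full path of lineage from root to target species
#
# Usage: ruby taxpath.rb <infile> [taxid_column]
#   infile        tab-delimited text file
#   taxid_column  0-based index of the column holding the NCBI taxonomy ID (default: 0)
# Output: each input line with two columns appended: the lineage as taxonomy IDs
# and the lineage as scientific names, both ":"-joined and ordered root to target.

### conf ###
taxtab = "/home/DB/public/processed/NCBI/taxonomy/taxdump/nodes.dmp"
nametab = "/home/DB/public/processed/NCBI/taxonomy/taxdump/names.dmp"
###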
infile = ARGV[0]
tid_position = (ARGV[1] || 0).to_i

# taxonomy ID => parent taxonomy ID (first two fields of nodes.dmp)
rel = {}
File.foreach(taxtab) do |l|
  a = l.chomp.split(/\|/).map { |x| x.strip }
  rel[a[0].to_i] = a[1].to_i
end

# taxonomy ID => scientific name (names.dmp also holds synonyms, common names, etc.)
name = {}
File.foreach(nametab) do |l|
  a = l.chomp.split(/\|/).map { |x| x.strip }
  next unless a[3] == "scientific name"
  name[a[0].to_i] = a[1]
end
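File.foreach(infile) do |l|
  a = l.chomp.split(/\t/)
  tid = a[tid_position]
  out_path = nil
  out_path_txt = nil
  # An ID that is non-numeric or missing from nodes.dmp has no parent link,
  # so leave both output columns empty instead of looping on nil.
  if tid && /\A\d+\z/.match(tid.strip) && rel.key?(tid.to_i)
    q = tid.to_i
    path = [q]
    # Follow parent links up to the root (taxonomy ID 1, which is its own parent).
    until q == 1
      q = rel[q]
      break if q.nil? # parent link missing; stop with a partial path
      path << q
    end
    out_path = path.reverse.join(":")
    out_path_txt = path.map { |x| name[x] }.reverse.join(":")
  end
  puts((a + [out_path, out_path_txt]).join("\t"))
end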
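
The parent-link walk at the core of the script can be tried without downloading the NCBI dump. The following is a minimal sketch, not the script itself: the taxonomy IDs, names, and ranks in `NODES` and `NAMES` are invented for illustration, and `parse_dmp` and `lineage` are hypothetical helper names. It assumes only the `|`-delimited field layout the script relies on (ID and parent ID in `nodes.dmp`; ID, name, and name class in `names.dmp`).

```ruby
# Toy stand-ins for nodes.dmp and names.dmp. All IDs and names are invented;
# only the "|"-delimited field layout mirrors what the script reads.
NODES = <<~DMP
  1 | 1 | no rank |
  10 | 1 | clade |
  20 | 10 | genus |
  30 | 20 | species |
DMP

NAMES = <<~DMP
  1 | root | | scientific name |
  10 | clade A | | scientific name |
  20 | genus B | | scientific name |
  30 | species C | | scientific name |
  30 | C sp. | | synonym |
DMP

# Split each line on "|" and trim surrounding whitespace from every field.
def parse_dmp(text)
  text.each_line.map { |l| l.chomp.split("|").map(&:strip) }
end

# Return the taxonomy IDs from root to tid, or nil when tid is unknown,
# a parent link is missing, or the links form a cycle.
def lineage(tid, parent)
  return nil unless parent.key?(tid)
  path = [tid]
  seen = { tid => true }
  until tid == 1 # ID 1 is the root and is its own parent
    tid = parent[tid]
    return nil if tid.nil? || seen[tid]
    seen[tid] = true
    path << tid
  end
  path.reverse
end

parent = parse_dmp(NODES).to_h { |a| [a[0].to_i, a[1].to_i] }
names = parse_dmp(NAMES)
        .select { |a| a[3] == "scientific name" }
        .to_h { |a| [a[0].to_i, a[1]] }

ids = lineage(30, parent)
puts ids.join(":")                       # prints 1:10:20:30
puts ids.map { |x| names[x] }.join(":")  # prints root:clade A:genus B:species C
```

Returning `nil` for an unknown ID is one possible design choice; it lets the caller decide whether to print empty columns, as the script does, or to report the bad ID.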