Skip to content

Instantly share code, notes, and snippets.

@decodebiology
Last active February 24, 2017 17:18
Show Gist options
  • Select an option

  • Save decodebiology/30cf37a451ce411a4da698e38cc4e782 to your computer and use it in GitHub Desktop.

Select an option

Save decodebiology/30cf37a451ce411a4da698e38cc4e782 to your computer and use it in GitHub Desktop.
Extract CpG island shores and CpG Island Shelves from CpG Islands (CGI)
#!/usr/bin/perl
$/=undef;
## Run script: perl CpG_feature_extraction.pl CpG_islands.bed shores/shelves (This gives warnings for 'rm' command, just ignore it)
# Reference for definition of shores and shelves from:
# http://www.mdpi.com/2073-4425/5/2/347/htm
# http://www.mdpi.com/genes/genes-05-00347/article_deploy/html/images/genes-05-00347-g001-1024.png
extend_coordinates($ARGV[0] ,$ARGV[1]);
sub extend_coordinates
{
$type=$ARGV[1];
open(FILE, ">>${ARGV[0]}_${type}.bed");
open(IN,$ARGV[0]) or die $!;
$file=<IN>;
@myfile=split("\n",$file);
foreach $myfile(@myfile)
{
($chr,$start,$stop,$name)=split("\t",$myfile);
if($type eq "shores")
{
$up_start=$start-2000;
$up_end=$start;
$dn_start=$stop;
$dn_end=$stop+2000;
}
if($type eq "shelves")
{
$up_start=$start-4000;
$up_end=$start-2000;
$dn_start=$stop+2000;
$dn_end=$stop+4000;
}
print FILE "$chr\t$up_start\t$up_end\t${name}_up\n$chr\t$dn_start\t$dn_end\t${name}_dn\n";
}
}
close(FILE);
system( "rm ${ARGV[0]}_${ARGV[1]}.bed ${ARGV[0]}_${ARGV[1]}_noOverlap.bed" );
&extend_coordinates;
system( "sort -k1,1 -k2,2n ${ARGV[0]}_${ARGV[1]}.bed | awk '{if(\$1!=l||\$2>e){if(l)print l,b,e,\$4;l=\$1;b=\$2;e=\$3}else e=\$3}END{print l,b,e,\$4}' | sed 's/ /\t/g' > ${ARGV[0]}_${ARGV[1]}_noOverlap.bed " );
system( "rm ${ARGV[0]}_${ARGV[1]}.bed" );
@decodebiology
Copy link
Author

decodebiology commented Feb 24, 2017

For CGI Shores:

perl CpG_feature_extraction.pl CpG_islands.bed shores

For CGI Shelves:

perl CpG_feature_extraction.pl CpG_islands.bed shelves

Note: Still you need to use bedtools to get rid of CpG shores overlapping with CpG shelves.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment