96-Summer Bioinformatics Programming Practicum (II)
Bioinformatics with Perl
8/13~8/22 蘇中才
8/24~8/29 張天豪
8/31 曾宇鳳
Schedule
Date        Time         Subject                                            Speaker
8/13 (Mon)  13:30~17:30  Perl Basics                                        蘇中才
8/15 (Wed)  13:30~17:30  Programming Basics                                 蘇中才
8/17 (Fri)  13:30~17:30  Regular expression                                 蘇中才
8/20 (Mon)  13:30~17:30  Retrieving Data from Protein Sequence Database     蘇中才
8/22 (Wed)  13:30~17:30  Perl combines with BLAST and ClustalW              蘇中才
8/24 (Fri)  13:30~17:30  PDB database and structure files                   張天豪
8/27 (Mon)  8:30~12:30   Extracting ATOM information                        張天豪
8/27 (Mon)  13:30~17:30  Mapping of Protein Sequence IDs and Structure IDs  張天豪
8/31 (Fri)  13:30~17:30  Final and Examination                              曾宇鳳
Process Management
Introduction
system   system("date");
` `      `date`;
exec     exec("date");
Introduction - system
system(" ... ");
Example
system "date";
system 'ls', "-al", '/home/course1/';
system 'for i in *; do echo == $i ==; cat $i; done';
system "~/course5/output.pl";
Returns 0 on success
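The value returned by system is the wait status of the command, not the plain exit code. A minimal sketch of decoding it follows; the helper name exit_code is hypothetical, and $^X (the path of the running perl) is used only as a portable test command.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# exit_code(@cmd): run a command with system and return its exit code,
# or -1 if it could not be started or was killed by a signal.
# (exit_code is a hypothetical helper name, not part of Perl.)
sub exit_code {
    my @cmd = @_;
    my $status = system @cmd;
    return -1 if $status == -1;    # command could not be started
    return -1 if $status & 127;    # command was killed by a signal
    return $status >> 8;           # high byte holds the exit code
}

# $^X is the path of the running perl, used here as a test command.
print "success: ", exit_code($^X, '-e', 'exit 0'), "\n";
print "failure: ", exit_code($^X, '-e', 'exit 3'), "\n";
```

Passing the command as a list, as here, bypasses the shell, so no shell quoting is needed.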
Execute ‘date’ by system
#!/usr/bin/perl -w
# date1.pl : execute a shell command - date
my $ret = system "date";
print "return = $ret\n";
Execute ‘date’ by system without message
#!/usr/bin/perl -w
# date2.pl : execute a shell command - date
my $ret = system "date > /dev/null";
print "return = $ret\n";
Introduction - ` `
` ... `
Example
$now = `date`;
$result = `ls -al /home/course1/`;
$result = `for i in *; do echo == $i ==; cat $i; done`;
`~/course5/output.pl`;
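In list context, backticks return one element per output line, and the exit status of the command is left in $?. A small sketch, assuming a Unix shell is available; the helper name capture_lines is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# capture_lines($cmd): run a shell command with backticks and return
# its STDOUT as a list of chomped lines.
# (capture_lines is a hypothetical helper name.)
sub capture_lines {
    my ($cmd) = @_;
    my @lines = `$cmd`;    # list context: one element per output line
    chomp @lines;          # strip the trailing newline of every line
    return @lines;
}

my @lines = capture_lines('echo one; echo two');
print scalar(@lines), " lines, first = [$lines[0]]\n";
print "exit code = ", $? >> 8, "\n";
```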
Execute ‘date’ with backticks
#!/usr/bin/perl -w
# date3.pl : execute a shell command - date
my $ret = `date`;
print "return = [$ret]\n";
Print message to STDOUT and STDERR
#!/usr/bin/perl -w
# output.pl : output a message to STDOUT and STDERR
print STDOUT "print to STDOUT\n";
print STDERR "print to STDERR\n";
Print message to STDOUT and STDERR
[course5]$ ./output.pl
print to STDOUT
print to STDERR
[course5]$ ./output.pl > log
print to STDERR
[course5]$ cat log
print to STDOUT
[course5]$ (./output.pl 2>&1) > log
[course5]$ cat log
print to STDOUT
print to STDERR
Redirect STDOUT and STDERR with system
#!/usr/bin/perl -w
# redirect.pl : STDOUT and STDERR
my $ret = system "./output.pl";
print "redirect nothing ($ret)\n";
$ret = system "./output.pl 1>/dev/null";
print "redirect STDOUT to /dev/null ($ret)\n";
$ret = system "./output.pl 1>/dev/null 2>&1";
print "redirect STDOUT and STDERR to /dev/null ($ret)\n";
Capture STDOUT of output.pl with backticks
#!/usr/bin/perl -w
# exec_output1.pl : execute output.pl
my $ret = `./output.pl`;
chomp($ret);
print "return = [$ret]\n";
Capture STDOUT and STDERR of output.pl with backticks
#!/usr/bin/perl -w
# exec_output2.pl : execute output.pl
my $ret = `./output.pl 2>&1`;
chomp($ret);
print "return = [$ret]\n";
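The order of the redirections matters. A sketch of capturing only STDERR and discarding STDOUT, assuming a Bourne-style shell; the helper name capture_stderr is hypothetical, and an echo command stands in for output.pl so the sketch runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# capture_stderr($cmd): return only what $cmd writes to STDERR.
# "2>&1" first points STDERR at the pipe read by the backticks,
# then "1>/dev/null" throws STDOUT away.
# (capture_stderr is a hypothetical helper name.)
sub capture_stderr {
    my ($cmd) = @_;
    my $err = `($cmd) 2>&1 1>/dev/null`;
    return $err;
}

# Stand-in for output.pl: one line to STDOUT, one line to STDERR.
my $err = capture_stderr('echo out; echo err 1>&2');
chomp $err;
print "captured = [$err]\n";
```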
%ENV
Shell command: env
Example
$ENV{'PATH'}
$ENV{'HOME'}
$ENV{'HOSTNAME'}
$ENV{'USER'}
%ENV
#!/usr/bin/perl -w
# env.pl : execute a shell command - env
my @ret = `env`;
foreach (@ret) {
    chomp;
    if (/PATH/) {
        print "$_\n";
    }
}
print "PATH=$ENV{'PATH'}\n";
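%ENV can also be written to: commands started later with system or backticks inherit the changed environment. A minimal sketch, assuming a Unix shell; COURSE_DEMO and child_sees are hypothetical names used only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# child_sees($name): return the value of environment variable $name
# as seen by a child shell started with backticks.
# (child_sees is a hypothetical helper name.)
sub child_sees {
    my ($name) = @_;
    my $value = `echo "\$$name"`;    # \$ keeps the $ for the shell to expand
    chomp $value;
    return $value;
}

# COURSE_DEMO is a hypothetical variable name.
$ENV{COURSE_DEMO} = 'hello';
print "child saw: [", child_sees('COURSE_DEMO'), "]\n";

delete $ENV{COURSE_DEMO};            # removed for later child processes too
print "after delete: [", child_sees('COURSE_DEMO'), "]\n";
```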
Process Management
Arguments, Here-document
Arguments parsing
use Getopt::Std;
my %opt;
getopts( "hvf:", \%opt ) or usage();
usage() if $opt{h};
usage() if !defined $opt{f};

# print the usage message to STDERR and exit
sub usage {
    print STDERR <<"EOF";
usage: $0 [-hv] [-f file]
 -h      : this (help) message
 -v      : verbose output
 -f file : file containing usernames, one per line

example: $0 -v -f file
EOF
    exit;
}
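To see what getopts leaves behind, the option string "hvf:" can be tried on a fixed argument list. A sketch, assuming nothing beyond Getopt::Std; parse_args and the file name users.txt are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Std;

# parse_args(@args): parse -h, -v and -f FILE from a list of arguments.
# Returns a hash reference of options and an array reference of the
# arguments that remain, or an empty list if parsing fails.
# (parse_args is a hypothetical helper name.)
sub parse_args {
    local @ARGV = @_;                  # getopts always reads @ARGV
    my %opt;
    getopts('hvf:', \%opt) or return;
    return (\%opt, [@ARGV]);
}

# users.txt is a hypothetical file name.
my ($opt, $rest) = parse_args('-v', '-f', 'users.txt', 'extra');
print "verbose   = ", ($opt->{v} ? 1 : 0), "\n";
print "file      = $opt->{f}\n";
print "remaining = @$rest\n";
```

The colon after f in "hvf:" marks -f as taking a value; h and v are plain switches.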
Here-document
print <<EOF;
print me!!!!
print you!!!!
print us!!!!
EOF

print <<"" x 3;
print me!!!!

# The empty line above ends the here-document, so the line is printed 3 times.
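The quoting of the terminator decides whether variables are interpolated. A short sketch of the two forms; the variable name $name is only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $name = 'Perl';

# Double-quoted terminator: variables are interpolated.
my $interp = <<"EOF";
Hello, $name
EOF

# Single-quoted terminator: the text is taken literally.
my $literal = <<'EOF';
Hello, $name
EOF

print $interp;
print $literal;
```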
Exercise
system and ` `
Quiz – system & ` `
Do they work?
system 'for i in *; do echo == $i ==; cat $i; done';
$result = `for i in *; do echo == $i ==; cat $i; done`;
Why ?
Quiz – sleep10.pl
#!/usr/bin/perl -w
# sleep10.pl : sleep 10 seconds
foreach (1..10) {
    print "$_\n";
    sleep 1;
}
Quiz – system & ` `
Do they run in the background?
system './sleep10.pl &';
$result = `./sleep10.pl &`;
Why ?
Project
BLAST, ClustalW
Project1 - BLAST
Todo
Get the result from BLAST
Extract its homologs (E-value <= 10^-1)
Input: A protein sequence (FASTA format)
Output: All homologous sequences of the query sequence
BLAST
Get BLAST packages: ftp://ftp.ncbi.nih.gov/blast/
Get nr database: ftp://ftp.ncbi.nih.gov/blast/db/
Command:
~/tools/PSSM/BLAST/blastall -p blastp -i P53_HUMAN.fa -o output.txt -d /home/sbb/tools/PSSM/BLAST/db/SwissProt.v50.fa -m 9
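With -m 9, blastall writes tab-separated hit lines plus comment lines starting with #; the subject ID is the 2nd field and the E-value the 11th. One possible way to filter hits by E-value is sketched below. The helper name filter_hits and the sample rows are hypothetical, made up only to exercise the code; they are not real BLAST results.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# filter_hits($text, $max_evalue): return the subject IDs of all hit
# lines whose E-value is at most $max_evalue.
# (filter_hits is a hypothetical helper name.)
sub filter_hits {
    my ($text, $max_evalue) = @_;
    my @hits;
    for my $line (split /\n/, $text) {
        next if $line =~ /^#/ or $line !~ /\S/;    # skip comments and blanks
        my @f = split /\t/, $line;
        next unless @f >= 12;                       # not a complete hit line
        my ($subject, $evalue) = @f[1, 10];
        # defensive: accept a value written without a leading digit (e-105)
        $evalue = "1$evalue" if $evalue =~ /^e/i;
        push @hits, $subject if $evalue <= $max_evalue;
    }
    return @hits;
}

# Hypothetical sample rows, NOT real BLAST output.
my $sample = join "\n",
    "# illustrative comment line",
    join("\t", qw(query1 subjA 98.00 100 2 0 1 100 1 100 1e-50 200)),
    join("\t", qw(query1 subjB 30.00 50 35 0 1 50 1 50 5.0 25)),
    "";

print "$_\n" for filter_hits($sample, 0.1);
```

A subject with several alignments appears once per alignment, so a real script may want to remove duplicates, for example with a hash.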
Project2 - ClustalW
Todo
Do multiple sequence alignment by ClustalW
Input: A protein sequence with its homologs (FASTA format)
Output: The conservation score of each residue in the query sequence
ClustalW
Get ClustalW package: ftp://ftp.ebi.ac.uk/pub/software/unix/clustalw/
Command: ./clustalW <fasta sequences>
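The slides do not define the conservation score. One simple possibility, shown only as an assumption, is the fraction of the other aligned sequences that carry the same residue as the query in each column. The sketch works on already aligned strings of equal length; the helper name conservation and the toy alignment are hypothetical, and reading ClustalW output is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# conservation($query, @others): for every residue of the aligned query,
# return the fraction of the other aligned sequences that have the same
# residue in that column. All sequences must have the same length.
# (Assumed definition and hypothetical helper name.)
sub conservation {
    my ($query, @others) = @_;
    my @scores;
    for my $i (0 .. length($query) - 1) {
        my $q = substr $query, $i, 1;
        next if $q eq '-';    # gap in the query: no residue to score
        my $same = grep { substr($_, $i, 1) eq $q } @others;
        push @scores, @others ? $same / @others : 0;
    }
    return @scores;
}

# Toy alignment, made up for illustration.
my @scores = conservation('MK-L', 'MKAL', 'MR-L', 'MKAV');
printf "%.2f\n", $_ for @scores;
```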