Bioinformatics: Theory and Practice (生物信息学理论和实践), Tang Jijun (唐继军), jtang@cse.sc.edu, 13928761660

Post on 14-Dec-2015

Page 1

Bioinformatics: Theory and Practice (生物信息学理论和实践)

Tang Jijun (唐继军)

jtang@cse.sc.edu

Page 2

www.cse.sc.edu/~jtang/BJFU

Page 3

Homework

• GTTGCAGCAATGGTAGACTCAACGGTAGCAAT

AACTGCAGGACCTAGAGGAAAAACAGTAGGGATTAATAAGCCCTATGGAGCACCAGAAATTACAAAAGATGGTTATAAGGTGATGAAGGGTATCAAGCCTGAA

• Why does BLAST return no results with the default settings? Which options need to be chosen?

• What are the latest PubMed articles on the related species?

Page 4

Page 5

Working with Directories

• Directories are a means of organizing your files on a Linux computer.
• They are equivalent to folders on Windows and Macintosh computers.
• Directories contain files, executable programs, and sub-directories.
• Understanding how to use directories is crucial to manipulating your files on a Linux system.

Page 6

File & Directory Commands

• This is a minimal list of Linux commands that you must know for file management:

    ls (list)                  mkdir (make directory)
    cd (change directory)      pwd (print working directory)
    cp (copy)                  rm (remove)
    mv (move)                  more (view by page)
    cat (view entire file)     man (help)

• All of these commands can be modified with many options. Learn to use the Linux ‘man’ pages for more information.

Page 7

Navigation

• pwd (print working directory) shows the name and location of the directory where you are currently working:

    > pwd
    /home/jtang

• This is a “pathname”; the slashes separate sub-directories.
• The initial slash is the “root” of the whole filesystem.

• ls (list) gives you a list of the files in the current directory:

    > ls
    assembin4.fasta Misc test2.txt
    bin temp testfile

• Use the ls -l (long) option to get more information about each file:

    > ls -l
    total 1768
    drwxr-x--- 2 browns02 users 8192 Aug 28 18:26 Opioid
    -rw-r----- 1 browns02 users 6205 May 30 2000 af124329.gb_in2
    -rw-r----- 1 browns02 users 131944 May 31 2000 af151074.fasta

Page 8

Sub-directories

• cd (change directory) moves you to another directory:

    > cd Misc
    > pwd
    /u/browns02/Misc

• mkdir (make directory) creates a new sub-directory inside of the current directory:

    > ls
    assembler phrap space
    > mkdir subdir
    > ls
    assembler phrap space subdir

• rmdir (remove directory) deletes a sub-directory, but the sub-directory must be empty:

    > rmdir subdir
    > ls
    assembler phrap space

Page 9

Create new files

• nano
• vi/vim
• emacs

Page 10

Programming

• perl
• python
• c/c++
• R
• Java

Page 11

more

• Use the command more to view the contents of a file one screen at a time:

    > more t27054_cel.pep
    !!AA_SEQUENCE 1.0
    P1;T27054 - hypothetical protein Y49E10.20 - Caenorhabditis elegans
    Length: 534 May 30, 2000 13:49 Type: P Check: 1278 ..
    1 MLKKAPCLFG SAIILGLLLA AAGVLLLIGI PIDRIVNRQV IDQDFLGYTR
    51 DENGTEVPNA MTKSWLKPLY AMQLNIWMFN VTNVDGILKR HEKPNLHEIG
    101 PFVFDEVQEK VYHRFADNDT RVFYKNQKLY HFNKNASCPT CHLDMKVTIP
    t27054_cel.pep (87%)

• Hit the spacebar to page down through the file
• b (or Ctrl-B) moves back up a page
• At the bottom of the screen, more shows how much of the file has been displayed
• Similar command: less

Page 12

Copy & Move

• cp lets you copy a file from any directory to any other directory, or create a copy of a file with a new name in one directory:

    cp filename.ext newfilename.ext
    cp filename.ext subdir/newname.ext
    cp /u/jdoe01/filename.ext ./subdir/newfilename.ext

• mv allows you to move files to other directories, but it is also used to rename files.
• Filename and directory syntax for mv is exactly the same as for the cp command:

    mv filename.ext subdir/newfilename.ext

• NOTE: When you use mv to move a file into another directory, the file no longer exists in its original location.

Page 13

Delete

• Use the command rm (remove) to delete files.
• There is no way to undo this command!
• We have set the server to ask if you really want to remove each file before it is deleted.
• You must answer “Y” or else the file is not deleted.
• You can skip the question with the -f (force) option. rm -rf also deletes directories and everything inside them, so use it with great care.

Page 14

View File Permissions

• Use the ls -l command to see the permissions for all files in a directory:

    $ ls -l
    total 2
    -rw-r--r-- 1 jtang None 56 Feb 29 11:21 data.txt
    -rwxr-xr-x 1 jtang None 33 Feb 29 11:21 test.pl

• The username of the owner is shown in the third column. (The owner of the files listed above is jtang.)
• The owner belongs to the group “None” (fourth column).
• The access rights for these files are shown in the first column. This column consists of 10 characters known as the attributes of the file: r, w, x, and -
• The first character shows the file type (d for a directory); the other nine are three groups of three, for the owner, the group, and everyone else.

    r indicates read permission
    w indicates write (and delete) permission
    x indicates execute (run) permission
    - indicates no permission for that operation

Page 15

Change Protections

• Only the owner of a file can change its protections.
• To change the protections on a file, use the chmod (change mode) command. [Beware, this is a confusing command.]
• Taken all together, it looks like this:

    > chmod 644 data.txt

  This gives the owner read and write permission, and gives the group and the world read permission only.

• Other common modes: 600, 755, 700
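
The three digits of a chmod mode are octal numbers: each digit is the sum of 4 (read), 2 (write), and 1 (execute), for the owner, the group, and the world in that order. The following is only a rough sketch of that arithmetic in Perl; the helper name mode_to_string is made up for this illustration and is not part of the course material.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the slides): turn an octal mode such as 0644
# into the nine permission characters that ls -l prints after the file type.
sub mode_to_string {
    my ($mode) = @_;
    my @rwx    = ('r', 'w', 'x');
    my $string = '';
    # Walk the 9 permission bits from the highest (owner read)
    # down to the lowest (world execute).
    for my $bit (reverse 0 .. 8) {
        my $letter = $rwx[ (8 - $bit) % 3 ];
        $string .= ($mode & (1 << $bit)) ? $letter : '-';
    }
    return $string;
}

# The modes mentioned on the slide. A leading 0 makes a Perl number octal.
for my $mode (0644, 0600, 0755, 0700) {
    printf "%o  %s\n", $mode, mode_to_string($mode);
}
```

Reading the output next to an ls -l listing shows, for example, that 644 corresponds to rw-r--r-- and 755 to rwxr-xr-x.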

Page 16

Commands for Files

• Files are used to store information, for example, data or the results of some analysis.
• You will mostly deal with text files.
• Files on the RCR Alpha are automatically backed up to tape every night.
• cat dumps the entire contents of a file onto the screen.
• For a long file this can be annoying, but it can also be helpful if you want to copy and paste (use the buffer of your telnet program).

Page 17

FTP/SCP is Simple

• File Transfer Protocol is standard for all computers on any network.
• The best way to move lots of data to and from remote machines:
    • put raw data onto the server for analysis
    • get results back to the desktop for use in papers and grants
• Graphical FTP applications for desktop PCs:
    • On a Mac, use Fetch, CyberDuck (!)
    • On a Windows PC, use WS_FTP, FileZilla
    • WinSCP

Page 18

Some More Advanced Linux Commands

• grep: searches a file for a specific text pattern
• cut: copies one or more columns from a tab-delimited text file
• wc: word count
• | : the pipe, which sends the output of one command as input to the next
• > : redirects output to a file

Page 19

Perl

Page 20

Why Write Programs?

• Automate computer work that you do by hand: save time and reduce errors
• Run the same analysis on lots of similar data files (scale-up)
• Analyze data, make decisions
    • sort BLAST results by e-value and/or species of best match
• Build a pipeline
• Create new analysis methods

Page 21

Why Perl?

• Fairly easy to learn the basics
• Many powerful functions for working with text: search & extract, modify, combine
• Can control other programs
• Free and available for all operating systems
• Most popular language in bioinformatics
• Many pre-built “modules” are available that do useful things

Page 22

Get Perl

• You can install Perl on any type of computer

• Download and install Perl on your own computer:

www.perl.org

Page 23

Programming Concepts

• Program = a text file that contains instructions for the computer to follow
• Programming Language = a set of commands that the computer understands (via a “command interpreter”)
• Input = data that is given to the program
• Output = something that is produced by the program

Page 24

Programming

• Write the program (with a text editor)
• Run the program
• Look at the output
• Correct the errors (debugging)
• Repeat

(Computers are VERY dumb: they do exactly what you tell them to do, so be careful what you ask for…)

Page 25

Basic Concepts

• Variables and Assignment
• Conditions
• Loops
• Input/Output (I/O)
• Procedures/functions

Page 26

Strings

• Text is handled in Perl as a string
• This basically means that you have to put quotes around any piece of text that is not an actual Perl instruction.
• Perl has two kinds of quotes: single (') and double (")
• They are different: text in single quotes is printed exactly as written, while double quotes substitute the values of variables and escapes such as \n

Page 27

Print

• Perl uses the term “print” to create output
• Without a print statement, you won’t know what your program has done
• You need to tell Perl to put a newline at the end of a printed line
    • Use the "\n" (newline) escape sequence
    • Include the double quotes
• The “\” character is called an escape; Perl uses it a lot

Page 28

Your First Perl Program

• Open a new text file:

    > nano prog1.pl

• Type:

    #!/usr/bin/perl
    #my first perl program
    print "Hello world\n";

Page 29

Program details

• Perl programs always start with the line:

    #!/usr/bin/perl

    • This tells the computer that this is a Perl program and where to get the Perl interpreter
• All other lines that start with # are considered comments, and are ignored by Perl
• Lines that are Perl commands end with a ;

Page 30

Run your Perl program

    > perl prog1.pl      [use the perl interpreter to run your script]
    > chmod 755 *.pl     [make the file executable]
    > ./prog1.pl         [run it]

Page 31

#!/usr/bin/perl

$DNA = 'ACGT';

# Next, we print the DNA onto the screen
print $DNA, "\n";

# Single quotes: prints the literal text $DNA\n
print '$DNA\n';

# Double quotes: prints ACGT followed by a newline
print "$DNA\n";

exit;

Page 32

Numbers and Functions

• Perl handles numbers in most common formats:

    456
    5.6743
    6.3E-26

• Mathematical functions work pretty much as you would expect:

    4+7
    6*4
    43-27
    256/12
    2/(3-5)

Page 33

Do the Math (your 2nd Perl program)

#!/usr/bin/perl
print "4+5\n";
print 4+5 , "\n";
print "4+5=" , 4+5 , "\n";

[Note: use commas to separate multiple items in a print statement; whitespace is ignored]

Page 34

Variables

• To be useful at all, a program needs to be able to store information from one line to the next
• Perl stores information in variables
• A variable name starts with the “$” symbol, and it can store strings or numbers
• Variables are case sensitive
• Give them sensible names
• Use the “=” sign to assign values to variables:

    $one_hundred = 100;
    $my_sequence = "ttattagcc";

Page 35

You can do Math with Variables

#!/usr/bin/perl
#put some values in variables
$sequences_analyzed = 200 ;
$new_sequences = 21 ;
#now we will do the work
$percent_new_sequences = ( $new_sequences / $sequences_analyzed ) * 100 ;
print "% of new sequences = " , $percent_new_sequences;

Output:

% of new sequences = 10.5

Page 36

String Operations

• Strings (text) in variables can be used for some math-like operations
• To concatenate (join) strings, use the dot . operator:

    $seq1 = "ACTG";
    $seq2 = "GGCTA";
    $seq3 = $seq1 . $seq2;
    print $seq3;

Output:

    ACTGGGCTA

Page 37

#!/usr/bin/perl
# Storing DNA in a variable, and printing it out

# First we store the DNA in a variable called $DNA
$DNA = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';

# Next, we print the DNA onto the screen
print $DNA;

# Finally, we'll specifically tell the program to exit.
exit;

Page 38

#!/usr/bin/perl -w

$DNA1 = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';
$DNA2 = 'ATAGTGCCGTGAGAGTGATGTAGTA';

print "Here are the original two DNA fragments:\n\n";
print $DNA1, "\n";
print $DNA2, "\n\n";

# Using "string interpolation"
$DNA3 = "$DNA1$DNA2";

print "Here is the concatenation of the first two fragments (version 1):\n\n";
print "$DNA3\n\n";

# An alternative way using the "dot operator":
$DNA3 = $DNA1 . $DNA2;
print "Here is the concatenation of the first two fragments (version 2):\n\n";
print "$DNA3\n\n";

exit;

Page 39

#!/usr/bin/perl -w
$DNA = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';

print "Here is the starting DNA:\n\n";
print "$DNA\n\n";

# Transcribe the DNA to RNA by substituting all T's with U's.
$RNA = $DNA;

$RNA =~ s/T/U/g;

# Print the RNA onto the screen
print "Here is the result of transcribing the DNA to RNA:\n\n";

print "$RNA\n";

# Exit the program.
exit;
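
As a small aside on the substitution operator used above: s///g also returns the number of replacements it made, which can be handy for a quick sanity check. The sketch below wraps the transcription step in a subroutine; the name transcribe is an assumption made for this example, and the code uses strict and lexical (my) variables, which the slides have not introduced.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the slides): return the RNA version of a
# DNA string together with the number of T's that were replaced.
sub transcribe {
    my ($dna) = @_;
    my $rna = $dna;
    # s///g returns the number of substitutions made,
    # or an empty string if nothing matched, hence the "|| 0".
    my $replaced = ($rna =~ s/T/U/g) || 0;
    return ($rna, $replaced);
}

my ($rna, $count) = transcribe('ACGGGAGGACGGGAAAATTACTACGGCATTAGC');
print "$rna\n";
print "Number of T's replaced: $count\n";
```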

Page 40

Exercises

• Create a dir named Exercises in your home dir
• Create a folder Class1 in your Exercises dir
• Create three perl programs
    • Prog2: Concatenate three DNAs
    • Prog3: Convert a DNA to one with lower cases
        • A->a, C->c, G->g, T->t
• Chmod, Test and Debug

Page 41

#!/usr/bin/perl -w

$DNA = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';

print "$DNA\n\n";

$revcom = reverse $DNA;

# Caution: these substitutions run one after another, so each pair undoes
# itself. After s/A/T/g every A is a T, and s/T/A/g then turns ALL T's
# (old and new) into A. The result is not a correct complement.
$revcom =~ s/A/T/g;
$revcom =~ s/T/A/g;
$revcom =~ s/G/C/g;
$revcom =~ s/C/G/g;

# Print the reverse complement DNA onto the screen
print "Here is the reverse complement DNA:\n\n";

print "$revcom\n";

Page 42

#!/usr/bin/perl -w

$DNA = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';

print "$DNA\n\n";

$revcom = reverse $DNA;

# See the text for a discussion of tr///
$revcom =~ tr/ACGTacgt/TGCAtgca/;

# Print the reverse complement DNA onto the screen
print "Here is the reverse complement DNA:\n\n";

print "$revcom\n";

exit;
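
To see why tr/// is the right tool here, it may help to run both approaches side by side on a short sequence. This is a minimal sketch; the subroutine names revcom and revcom_sequential are invented for the comparison, and strict with lexical (my) variables goes slightly beyond what the slides use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reverse complement using tr///, which translates
# every character in a single pass, so no base is changed twice.
sub revcom {
    my ($dna) = @_;
    my $revcom = scalar reverse $dna;
    $revcom =~ tr/ACGTacgt/TGCAtgca/;
    return $revcom;
}

# Hypothetical helper: the four-substitution approach, kept only to show
# how later substitutions overwrite the results of earlier ones.
sub revcom_sequential {
    my ($dna) = @_;
    my $revcom = scalar reverse $dna;
    $revcom =~ s/A/T/g;
    $revcom =~ s/T/A/g;   # turns the new T's back into A's
    $revcom =~ s/G/C/g;
    $revcom =~ s/C/G/g;   # turns the new C's back into G's
    return $revcom;
}

my $dna = 'AACG';
print "tr///      : ", revcom($dna), "\n";
print "sequential : ", revcom_sequential($dna), "\n";
```

With the input AACG, the tr/// version gives CGTT, while the sequential version gives GGAA, which contains no T or C at all.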

Page 43

Exercise

• Change your previous program so that it converts to lowercase more easily

Page 44

More

• In Exercises, create a dir named Class2
• Using nano, create a file named NM_021964fragment.pep
• Put some amino acid sequence into it
• Save and quit

Page 45

#!/usr/bin/perl -w

# The filename of the file containing the protein sequence data
$proteinfilename = 'NM_021964fragment.pep';

# First we have to "open" the file
open(PROTEINFILE, $proteinfilename);

# Read the first line of the file into $protein
$protein = <PROTEINFILE>;

# Now that we've got our data, we can close the file.
close PROTEINFILE;

# Print the protein onto the screen
print "Here is the protein:\n\n";

print $protein;

exit;

Page 46

More

• Using nano, add two more lines to NM_021964fragment.pep
• Save and quit

Page 47

#!/usr/bin/perl -w

$proteinfilename = 'NM_021964fragment.pep';

open(PROTEINFILE, $proteinfilename);

# First line
$protein = <PROTEINFILE>;
print "\nHere is the first line of the protein file:\n\n";
print $protein;

# Second line
$protein = <PROTEINFILE>;
print "\nHere is the second line of the protein file:\n\n";
print $protein;

# Third line
$protein = <PROTEINFILE>;
print "\nHere is the third line of the protein file:\n\n";
print $protein;

close PROTEINFILE;
exit;

Page 48

Exercise

• Create a file named dna.fasta
• Add two lines to this file:

    >DNA1
    ATGCGGGATGGAGCGCGC

• Write a program that opens the file and prints the DNA name and the sequence
• How can you avoid printing the “>”?

Page 49

#!/usr/bin/perl -w

# The filename of the file containing the protein sequence data
$proteinfilename = 'NM_021964fragment.pep';

# First we have to "open" the file
open(PROTEINFILE, $proteinfilename);

# Read the protein sequence data from the file, and store it
# into the array variable @protein
@protein = <PROTEINFILE>;

# Print the protein onto the screen
print @protein;

# Close the file.
close PROTEINFILE;

exit;

Page 50

#!/usr/bin/perl -w
# "scalar context" and "list context"

@bases = ('A', 'C', 'G', 'T');

print "@bases\n";

$a = @bases;

print $a, "\n";

($a) = @bases;

print $a, "\n";

exit;
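
The program above relies on the fact that the same array gives different results depending on what is on the left of the assignment. A short annotated sketch may make the two contexts easier to tell apart; the variable names are made up for this example, and scalar() and $#bases are standard Perl features that the slides do not cover.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @bases = ('A', 'C', 'G', 'T');

my $count      = @bases;          # scalar context: the number of elements
my ($first)    = @bases;          # list context: the first element
my $also_count = scalar(@bases);  # scalar() forces scalar context explicitly
my $last_index = $#bases;         # index of the last element (count minus 1)

print "count=$count first=$first last_index=$last_index\n";

# The dot operator also puts the array in scalar context
print "There are " . @bases . " bases\n";
```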

Page 51

#!/usr/bin/perl -w
# array indexing

@bases = ('A', 'C', 'G', 'T');

print "@bases\n";

print $bases[0], "\n";

print $bases[1], "\n";

print $bases[2], "\n";

print $bases[3], "\n";

exit;

Page 52

#!/usr/bin/perl -w
# array indexing

@coins = ("Quarter","Dime","Nickel");

# Note: $coins is a separate scalar variable that was never assigned, not
# the array @coins, so this prints nothing useful and -w reports a warning
print $coins;

print $coins[0], "\n";

exit;

Page 53

#!/usr/bin/perl -w
# array indexing

@coins = qw(Quarter Dime Nickel);

print $coins[0], "\n";

exit;

Page 54

#!/usr/bin/perl -w
# joining array elements

@coins = qw(Quarter Dime Nickel);

$x = join('', @coins);
print $x;

print join(' ', @coins);

exit;

Page 55

#!/usr/bin/perl -w
# splitting a string

$coins = "Quarter Dime Nickel";

@y = split(' ', $coins);

print $y[0], "\n";

# There is no comma in $coins, so the whole string ends up in $y[0]
@y = split(',', $coins);

print $y[0];

exit;
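
split and join are often used together to take a delimited line apart and put it back together with a different separator. The sketch below is one possible illustration; the sample line and variable names are invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A made-up tab-delimited line, like one row of a results table
my $line = "gene1\tchr2\t1500\t+";

my @fields = split /\t/, $line;    # break the line at each tab
my $csv    = join ',', @fields;    # glue the pieces back with commas

# split ' ' is a special case: it skips leading whitespace and treats
# any run of whitespace as a single separator
my @words = split ' ', "  Quarter   Dime Nickel";

# If the separator never occurs, the whole string comes back as one element
my @whole = split ',', 'Quarter Dime Nickel';

print scalar(@fields), " fields: $csv\n";
print scalar(@words), " words, ", scalar(@whole), " comma-separated piece\n";
```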

Page 56

String functions

• chomp
• Length of a string
• Substring

Page 57

#!/usr/bin/perl -w

$proteinfilename = 'NM_021964fragment.pep';

open(PROTEINFILE, $proteinfilename);

$protein = <PROTEINFILE>;

close PROTEINFILE;

$len = length $protein;

print $len, "";

exit;

Page 58

#!/usr/bin/perl -w

$proteinfilename = 'NM_021964fragment.pep';

open(PROTEINFILE, $proteinfilename);

$protein = <PROTEINFILE>;

close PROTEINFILE;

chomp $protein;

$len = length $protein;

print $len, "";

exit;

Page 59

#!/usr/bin/perl -w

$proteinfilename = 'NM_021964fragment.pep';

open(PROTEINFILE, $proteinfilename);

$protein = <PROTEINFILE>;

close PROTEINFILE;

chomp $protein;

$st1 = substr($protein, 0, 2);

print $st1, "";

exit;

#or substr $protein, 0, 2;

Page 60

#!/usr/bin/perl -w

$proteinfilename = 'NM_021964fragment.pep';

open(PROTEINFILE, $proteinfilename);

$protein = <PROTEINFILE>;

close PROTEINFILE;

chomp $protein;

$st1 = substr($protein, 3);

print $st1, "";

exit;

#or substr $protein, 3;
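
The two forms of substr shown above (with and without a length) are enough to cut a sequence into fixed-size pieces. As a hedged illustration, the following sketch splits a DNA string into codons; the subroutine name codons is hypothetical, and any leftover bases that do not fill a whole codon are simply dropped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: cut a DNA string into consecutive 3-letter codons.
# Trailing bases that do not form a complete codon are ignored.
sub codons {
    my ($dna) = @_;
    my @codons;
    for (my $i = 0; $i + 3 <= length($dna); $i += 3) {
        push @codons, substr($dna, $i, 3);   # 3 characters starting at $i
    }
    return @codons;
}

my $dna = 'ATGCGGGATGG';
print join(' ', codons($dna)), "\n";

# Without a length, substr returns everything from the offset to the end
print substr($dna, 3), "\n";
```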

Page 61

Exercise

• Create a DNA fasta file with one > and three lines of sequence data
• Show those lines on the screen
• Show the number of characters in the sequence
• How can we show them on one line?
• Play with the substr function
• Can we tell how many A's are in the sequence?

Page 62

#!/usr/bin/perl -w

$proteinfilename = 'NM_021964fragment.pep';

unless ( open(PROTEINFILE, $proteinfilename) ) {
    print "Could not open file $proteinfilename!\n";
    exit;
}

while( $protein = <PROTEINFILE> ) {
    print " ###### Here is the next line of the file:\n";
    print $protein;
}

# Close the file.
close PROTEINFILE;

exit;
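
The same open-check-and-loop pattern can be combined with chomp and substr to read a small FASTA file. The sketch below is one possible approach, not the course's reference solution: the subroutine name read_fasta_sequence is made up, it assumes a single sequence per file, and it uses the three-argument open with a lexical filehandle and "or die", a common modern idiom that differs from the bareword PROTEINFILE style above. A temporary file (via the core File::Temp module) is used only to keep the example self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a FASTA file that holds ONE sequence and
# return its name (without the ">") and the sequence joined into one line.
sub read_fasta_sequence {
    my ($filename) = @_;
    open(my $fh, '<', $filename)
        or die "Could not open file $filename: $!\n";
    my ($name, $sequence) = ('', '');
    while (my $line = <$fh>) {
        chomp $line;                          # remove the trailing newline
        if (substr($line, 0, 1) eq '>') {
            $name = substr($line, 1);         # everything after the ">"
        } else {
            $sequence .= $line;               # append this line of sequence
        }
    }
    close $fh;
    return ($name, $sequence);
}

# Write a small demo file so the example runs on its own
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print $tmp_fh ">DNA1\nATGCGG\nGATGGA\nGCGCGC\n";
close $tmp_fh;

my ($name, $sequence) = read_fasta_sequence($tmp_name);
print "$name: $sequence (", length($sequence), " bases)\n";
```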

Page 63

Bigger Exercise

• Create a DNA fasta file with one > and several lines of sequence data
• Show those lines on the screen
• Show the number of characters in the sequence
• How can we show them on one line?

Page 64

Comparison

• String comparison (are they the same, > or <)
    • eq (equal)
    • ne (not equal)
    • ge (greater or equal)
    • gt (greater than)
    • lt (less than)
    • le (less or equal)
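
These operators compare strings character by character, which is not the same as comparing numbers. Perl has a separate set of numeric operators (==, !=, <, >, <=, >=) for that. The short sketch below, with made-up variable names, shows where the two disagree.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# As strings, '10' sorts before '9' because the character '1' comes before '9'
my $string_less  = ('10' lt '9') ? 'true' : 'false';

# As numbers, 10 is of course not less than 9
my $numeric_less = (10 < 9) ? 'true' : 'false';

# String equality is exact, so upper and lower case are different
my $same_seq  = ('ACGT' eq 'ACGT') ? 'true' : 'false';
my $same_case = ('acgt' eq 'ACGT') ? 'true' : 'false';

print "'10' lt '9'      : $string_less\n";
print "10 < 9           : $numeric_less\n";
print "'ACGT' eq 'ACGT' : $same_seq\n";
print "'acgt' eq 'ACGT' : $same_case\n";
```

For sequences and other text, the string operators are the ones to use; the numeric ones are meant for counts, lengths, and scores.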

Page 65

Conditions

• if () {}
• elsif () {}
• else {}

Page 66

#!/usr/bin/perl -w
$word = 'MNIDDKL';
if ($word eq 'QSTVSGE') {
    print "QSTVSGE\n";
} elsif ($word eq 'MRQQDMISHDEL') {
    print "MRQQDMISHDEL\n";
} elsif ( $word eq 'MNIDDKL' ) {
    print "MNIDDKL-the magic word!\n";
} else {
    print "Is \"$word\" a peptide?\n";
}
exit;