protein sequencing and identification with mass spectrometry · pdf filean introduction to...
Post on 05-Feb-2018
231 Views
Preview:
TRANSCRIPT
www.bioalgorithms.infoAn Introduction to Bioinformatics Algorithms
Protein Sequencing and
Identification With Mass
Spectrometry
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Outline• Tandem Mass Spectrometry
• De Novo Peptide Sequencing
• Spectrum Graph
• Protein Identification via Database Search
• Identifying Post Translationally Modified Peptides
• Spectral Convolution
• Spectral Alignment
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Amino Acids vs. Nucleic Acids
Amino Acids: Amine, Carboxylic Acid, R-group
Nucleic Acids:Sugar, Phosphate, Base
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Protein Backbone
H...-HN-CH-CO-NH-CH-CO-NH-CH-CO-…OH
Ri-1 Ri Ri+1
AA residuei-1 AA residuei AA residuei+1
N-terminus C-terminus
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Breaking of Protein Backbone
H...-HN-CH-CO NH-CH-CO-NH-CH-CO-…OH
Ri-1 Ri Ri+1
AA residuei-1 AA residuei AA residuei+1
N-terminus C-terminus
H+
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Breaking Peptides into Fragment Ions
• Proteases, e.g. trypsin, break protein intopeptides.
• A Tandem Mass Spectrometer further breaksthe peptides down into fragment ions andmeasures the mass of each piece.
• Mass Spectrometer electrically accelerates thefragmented ions; heavier ions accelerate slowerthan lighter ones.
• Mass Spectrometers measure mass/chargeratio of an ion.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Mass SpectrometryMatrix-assisted Laser Desorption/Ionization
From lectures by Vineet Bafna (UCSD)
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Tandem Mass SpectrometryRT: 0.01 - 80.02
5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80Time (min)
0
10
20
30
40
50
60
70
80
90
100
Relative Abundance
1389 19911409 21491615 16211411 2147
161119951655
15931387
21551435 1987 2001 21771445 16611937
22051779 21352017
1313 22071307 23291105 17071095
2331
NL:1.52E8Base Peak F: + c Full ms [ 300.00 - 2000.00]
S#: 1708 RT: 54.47 AV: 1 NL: 5.27E6T: + c d Full ms2 638.00 [ 165.00 - 1925.00]
200 400 600 800 1000 1200 1400 1600 1800 2000
m/z
0
5
10
15
20
25
30
35
40
45
50
55
60
65
70
75
80
85
90
95
100
Relative Abundance
850.3
687.3
588.1
851.4425.0
949.4
326.0524.9
589.2
1048.6397.1226.9
1049.6489.1
629.0
Scan 1708
LC
S#: 1707 RT: 54.44 AV: 1 NL: 2.41E7F: + c Full ms [ 300.00 - 2000.00]
200 400 600 800 1000 1200 1400 1600 1800 2000
m/z
0
5
10
15
20
25
30
35
40
45
50
55
60
65
70
75
80
85
90
95
100
Relative Abundance
638.0
801.0
638.9
1173.8872.3 1275.3
687.6944.7 1884.51742.11212.0783.3 1048.3 1413.9 1617.7
Scan 1707
MS
MS/MSIon
Source
MS-1collision
cell MS-2
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Using Tandem Mass Spectrometry
SSeeqquueennccee
S#: 1708 RT: 54.47 AV: 1 NL: 5.27E6T: + c d Full ms2 638.00 [ 165.00 - 1925.00]
200 400 600 800 1000 1200 1400 1600 1800 2000
m/z
0
5
10
15
20
25
30
35
40
45
50
55
60
65
70
75
80
85
90
95
100
Relative Abundance
850.3
687.3
588.1
851.4425.0
949.4
326.0524.9
589.2
1048.6397.1226.9
1049.6489.1
629.0
MS/MS instrumentMS/MS instrument
Database search•Sequestde Novo interpretation•Sherenga
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Tandem Mass Spectrum• Tandem Mass Spectrometry (MS/MS): mainly
generates partial N- and C-terminal peptides
• Spectrum consists of different ion typesbecause peptides can be broken in severalplaces.
• Chemical noise often complicates thespectrum.
• Represented in 2-D: mass/charge axis vs.intensity axis
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Tandem Mass Spectrum: An Example
Secondary Fragmentation
Ionized parent peptide
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
N- and C-terminal Peptides
N-term
inal
pep
tides
C-te
rmin
al p
eptid
es
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Terminal peptides and ion types
Peptide
Mass (D) 57 + 97 + 147 + 114 = 415
Peptide
Mass (D) 57 + 97 + 147 + 114 – 18 = 397
without
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Peptide Fragmentation
y3
b2
y2 y1
b3a2 a3
HO NH3+
| |
R1 O R2 O R3 O R4
| || | || | || |H -- N --- C --- C --- N --- C --- C --- N --- C --- C --- N --- C -- COOH | | | | | | | H H H H H H H
b2-H2O
y3 -H2O
b3- NH3
y2 - NH3
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
De novo Peptide Sequencing
S#: 1708 RT: 54.47 AV: 1 NL: 5.27E6T: + c d Full ms2 638.00 [ 165.00 - 1925.00]
200 400 600 800 1000 1200 1400 1600 1800 2000
m/z
0
5
10
15
20
25
30
35
40
45
50
55
60
65
70
75
80
85
90
95
100
Relative Abundance
850.3
687.3
588.1
851.4425.0
949.4
326.0524.9
589.2
1048.6397.1226.9
1049.6489.1
629.0
SequenceSequence
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Theoretical Spectrum
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Theoretical Spectrum (cont’d)
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Theoretical Spectrum (cont’d)
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Building Spectrum Graph• How to create vertices (from peaks)
• How to create edges (from mass differences)
• How to score paths
• How to find best path
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
S E Q U E N C E
b
Mass/Charge (M/Z)Mass/Charge (M/Z)
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
a
Mass/Charge (M/Z)Mass/Charge (M/Z)
S E Q U E N C E
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
S E Q U E N C E
Mass/Charge (M/Z)Mass/Charge (M/Z)
a is an ion type shift in b
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
y
Mass/Charge (M/Z)Mass/Charge (M/Z)
E C N E U Q E S
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
y with corresponding intensities
Mass/Charge (M/Z)Mass/Charge (M/Z)
E C N E U Q E S
Inte
nsity
Inte
nsity
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Mass/Charge (M/Z)Mass/Charge (M/Z)
Inte
nsity
Inte
nsity
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Mass/Charge (M/Z)Mass/Charge (M/Z)
Inte
nsity
Inte
nsity
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
noise
Mass/Charge (M/Z)Mass/Charge (M/Z)
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
MS/MS Spectrum
Mass/Charge (M/z)Mass/Charge (M/z)
Inte
nsity
Inte
nsity
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Mass Differences Correspond to Amino Acids
ss
ssss
ee
eeee
ee
ee
ee
ee
ee
qquu
uu
uu
nn
nn
nn
ee
cc
cc
cc
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Ion Types• Some peaks correspond to fragment ions,
others are just random noise
• Knowing ion types _={_1, _2,…, _k} lets us
distinguish fragment ions from noise
• We can learn ion types _i and their
probabilities qi by analyzing a large test
sample of annotated spectra.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Example of Ion Type
• _={_1, _2,…, _k}
• _={b, b-NH3, b-H2O}
• Corresponding values of _={0, 17, 18}
• *Note: In reality the _ value of ion type b is -1
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Peptide Sequencing ProblemGoal: Find a peptide with maximal match between
an experimental and theoretical spectrum.Input:
• S: experimental spectrum• _: set of possible ion types• m: parent mass
Output:• P: peptide with mass m, whose theoretical
spectrum matches the experimental Sspectrum the best
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Vertices• Masses of potential N-terminal peptides
• Vertices are generated by reverse shift
• Every peak s in a spectrum generates
vertices
• V(s) = {s+_1, s+ _2, …, s+ _k}
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Vertices (cont’d)
• Vertices of the spectrum graph:
• {vinit}∪V(s1) ∪V(s2) ∪... ∪V(sm) ∪{vfin}
• Where _={_1, _2,…, _k} are ion types.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Reverse Shifts
• Two peaks b-H2O and b are given by the MassSpectrum
• With a +H2O shift, if two peaks coincide that is apossible vertex.
Mass/Charge (M/Z)Mass/Charge (M/Z)
Inte
nsity
Inte
nsity Red: Mass Spectrum
Blue: shift (+H2O)
b/b-H2O+H2O
b-H2O b+H2O
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Example of Reverse Shift
Shift in H2O and NH3
Shift in H2O
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Edges
• Two vertices with mass difference
corresponding to an amino acid A:
• Connect with an edge labeled by A
• Gap edges for di- and tri-peptides
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Paths
• Path in the graph corresponds to an aminoacid sequence
• There are many paths, how to find the correctone?
• We need scoring to evaluate paths
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Path Score• p(P,S) = probability that peptide P produces
spectrum S = {s1,s2,…sq}
• p(P, s) = the probability that peptide Sgenerates a peak s
• Scoring = computing probabilities
• p(P,S) = !s_S p(P, s)
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
• For a position t that represents ion type dj :
qj, if peak is generated at t
p(P,st) =
1-qj , otherwise
Peak Score
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Peak Score (cont’d)
• For a position t that is not associated with anion type:
qR , if peak is generated at t
pR(P,st) =
1-qR , otherwise
• qR = the probability of a noisy peak that doesnot correspond to any ion type
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Finding Optimal Paths in the Spectrum Graph
• For a given MS/MS spectrum S, find apeptide P’ maximizing p(P,S) over allpossible peptides P:
• Peptides = paths in the spectrum graph
• P’ = the optimal path in the spectrum graph
p(P,S)p(P',S) Pmax=
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Ions and Probabilities• Tandem mass spectrometry is characterized
by a set of ion types {•‰1,•‰2,..,•‰k} and theirprobabilities {q1,...,qk}
¶U•‰i-ions of a partial peptide are producedindependently with probabilities qi
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Ions and Probabilities
• A peptide has all k peaks with probability
• and no peaks with probability
• A peptide also produces a ``random noise''with uniform probability qR in any position.
∏=
k
iiq
1
∏=
−k
iiq
1
)1(
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Ratio Test Scoring for Partial Peptides
• Incorporates premiums for observed ionsand penalties for missing ions.
• Example: for k=4, assume that for a partialpeptide P’ we only see ions •‰1,•‰2,•‰4.
The score is calculated as:RRRR q
q
q
q
q
q
q
q 4321
)1(
)1(⋅
−
−⋅⋅
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Scoring Peptides
• T- set of all positions.
• Ti={t _1,, t _2,..., ,t _k,}- set of positions thatrepresent ions of partial peptides Pi.
• A peak at position t_j is generated withprobability qj.
• R=T- U Ti - set of positions that are notassociated with any partial peptides (noise).
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Probabilistic Model
• For a position t _j ∈ Ti the probability p(t, P,S) thatpeptide P produces a peak at position t.
• Similarly, for t∈R, the probability that P produces arandom noise peak at t is:
−=
otherwise1
position tat generated ispeak a if),,( j
j
j
q
qSPtP δ
−=
otherwise1
position tat generated ispeak a if)(
R
RR q
qtP
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Probabilistic Score
• For a peptide P with n amino acids, the scorefor the whole peptides is expressed by thefollowing ratio test:
∏∏= =
=n
i
k
j iR
i
R j
j
tp
SPtp
Sp
SPp
1 1 )(
),,(
)(
),(
δ
δ
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Role of de novo Interpretation
• Interpreting MS/MS of novel peptides
• Automatic validation of MS/MS database
matches.
• Leveraging homology matching across
species
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Post-Translational ModificationsProteins are involved in cellular signaling and
metabolic regulation.
They are subject to a large number of biologicalmodifications.
Almost all protein sequences are post-translationally modified and 200 types ofmodifications of amino acid residues areknown.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Examples of Post-TranslationalModification
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Difficulties in Finding Post-Translational ModificationsCurrently post-translational modifications
cannot be inferred from DNA sequences.
Finding post-translational modificationsremains an open problem even after thehuman genome is completed.
Post-translational modifications increase thenumber of “letters” in amino acid alphabetand lead to a combinatorial explosion.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Sequencing of Modified Peptides
De novo peptide sequencing is invaluable foridentification of unknown proteins:
However, de novo algorithms are designed forworking with high quality spectra with goodfragmentation and without modifications.
Another approach is to compare a spectrumagainst a set of known spectra in a database.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Functional Proteomics
• Problem: Given a large collection ofuninterpreted spectra, find out which spectracorrespond to similar peptides.
• A method that cross-correlates relatedspectra (e.g., from normal and diseasedindividuals) would be valuable in functionalproteomics.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Protein identification Problem
• Input: A database of proteins, anexperimental spectrum S, a set of ion types_, and a parent mass m.
• Output: A peptide of mass m from thedatabase with the best match to spectrumS.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
MS/MS Database Search
Database search in mass-spectrometry has been verysuccessful in identification of already known proteins.
Experimental spectrum can be compared with theoreticalspectra database peptides to find the best fit.
SEQUEST (Yates et al., 1995)
But reliable algorithms for identification of modifiedpeptides are not yet known.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Search for Modified Peptides:Virtual Database ApproachYates et al.,1995: an exhaustive search in a
virtual database of all modified peptides.
Exhaustive search leads to a large combinatorialproblem, even for a small set of modificationstypes.
Problem (Yates et al.,1995). Extend the virtualdatabase approach to a large set ofmodifications.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Modified Peptide Identification Problem
Input: Experimental spectrum S Database of peptides Parameter k (# of mutations/modifications) A set of ion types _
Parent mass m
Output: a peptide with the best match to thespectrum S that is at most kmutations/modifications apart from a databasepeptide.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Peptide Identification Problem: Challenge
Very similar peptides may have very differentspectra!
Goal: Define a notion of spectral similarity thatcorrelates well with the sequence similarity.
If peptides are a few mutations/modificationsapart, the spectral similarity between theirspectra should be high.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Deficiency of the Shared Peaks Count
Shared peaks count (SPC): intuitive measureof spectral similarity.
Problem: SPC diminishes very quickly as thenumber of mutations increases.
Only a small portion of correlations betweenthe spectra of mutated peptides is capturedby SPC.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
SPC Diminishes Quickly
S(PRTEIN) = {98, 133, 246, 254, 355, 375, 476, 484, 597, 632}
S(PRTEYN) = {98, 133, 254, 296, 355, 425, 484, 526, 647, 682}
S(PGTEYN) = {98, 133, 155, 256, 296, 385, 425, 526, 548, 583}
no mutations
SPC=10
1 mutation
SPC=5
2 mutations
SPC=2
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Spectral Convolution
)0)((
))((,
12
12
122211
22111212
:
SS
xSSssSsSs
}S,sS:ss{sSSx
−
−−∈∈
∈∈−=−=
:peak) (SPC count peaks shared The
with pairs of Number
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Elements of S2 S1 represented as elements of a difference matrix. Theelements with multiplicity >2 are colored; the elements with multiplicity =2are circled. The SPC takes into account only the red entries
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Spe
ctra
lC
onvo
lutio
n
1
2
3
4
5
0 -150 -100 -50 0 50 100150
x
Spectral Convolution: An Example
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Spectral Comparison: Difficult Case
S = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100}
Which of the spectra
S’ = {10, 20, 30, 40, 50, 55, 65, 75,85, 95}
or
S” = {10, 15, 30, 35, 50, 55, 70, 75, 90, 95}
fits the spectrum S the best?
SPC: both S’ and S” have 5 peaks in common with S.
Spectral Convolution: reveals the peaks at 0 and 5.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Spectral Comparison: Difficult Case
S S’
S S’’
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Limitations of the Spectrum Convolutions
Spectral convolution does not reveal thatspectra S and S’ are similar, while spectra Sand S” are not.
Clumps of shared peaks: the matchingpositions in S’ come in clumps while thematching positions in S” don't.
This important property was not captured byspectral convolution and was overlooked inthe previous MS/MS algorithms.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Edit Distance Between SpectraA = {a1 < … < an} : an ordered set of natural
numbers.A shift Δi transforms {a1, …., an}
Into {a1, ….,ai-1,ai+Δi,…,an+ Δi }
e.g.
20 30 40 50 60 70 80 90
10 20 30 35 45 55 65 75 85
10 20 30 35 45 55 62 72 82
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Spectral Alignment Problem
• Find a series of k shifts that make the sets
A={a1, …., an} and B={b1,….,bn}
as similar as possible.
• k-similarity between sets
• D(k) - the maximum number of elements incommon between sets after k shifts.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Spectral Alignment vs. Sequence Alignment
• Manhattan-like graph with different alphabetand scoring.
• Axes in the graph correspond to peaks in thetwo spectra.
• In this case, score is 1 if the diagonal linegoes through a peak on both axes, 0otherwise.
• Movement can be diagonal or perpendicular(but only k times total).
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Spectral Alignment =Sequence Alignment in 0-1 Alphabet
• Convert spectrum to a string with eachindex being 1 if it corresponds to a peakand 0 otherwise.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Spectral ProductA={a1, …., an} and B={b1,…., bn}
Spectral product A⊗B: two-dimensional matrix withnm 1s corresponding to all pairs of
indices (ai,bj) and remaining
elements being 0s.
10 20 30 40 50 55 65 75 85 95
δ
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
SPC: the number of 1s atthe main diagonal.
δ-shifted SPC: the numberof 1s on the diagonal (i,i+ δ)
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Spectral Alignment: k-similarityk-similarity between spectra: the maximum number
of 1s on a path through this graph that uses at mostk+1 diagonals.
k-optimal spectral
alignment = a path.
The spectral alignmentallows one to detectmore and more subtlesimilarities betweenspectra by increasing k.
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
SPC reveals onlyD(0)=3 matchingpeaks.
Spectral Alignmentreveals morehidden similaritiesbetween spectra:D(1)=5 and D(2)=8and detectscorrespondingmutations.
Use of k-Similarity
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Black lines represent the paths for k=0
Red lines represent the paths for k=1
blue line in Fig.(b) represents the path for k=2
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Spectral Convolution’ Limitation The spectral convolution considers diagonals
separately without combining them into feasiblemutation scenarios.
D(1) =10 shift function score = 10 D(1) =6
10 20 30 40 50 55 65 75 85 95
10
20
30
40
50
60
70
80
90
100
10 15 30 35 50 55 70 75 90 95
10
20
30
40
50
60
70
80
90
100
δ δ
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Dynamic Programming forSpectral AlignmentDij(k): the maximum number of 1s on a path to
(ai,bj) that uses at most k+1 diagonals.
Running time: O(n4 k)
otherwisekD
jijiifkDkD
ji
ji
jijiij ,1)1(
),(~)','(,1)(max)(
''
''
),()','({
+−
+=
<
)(max)( kDkD ijij
=
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Edit Graph for Fast Spectral Alignment
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Fast Spectral Alignment Algorithm
+−
+=
−− 1)1(
1)(max)(
1,1
),(
kM
kDkD
ji
jidiagij
)(max)( ''),()','(
kDkM jijiji
ij<
=
=
−
−
)(
)(
)(
max)(
1,
,1
kM
kM
kD
kM
ji
ji
ij
ij
Running time: O(n2 k)
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Spectral Alignment: Complications
• Simultaneous analysis of N- and C-terminalions
• Taking into account the intensities andcharges
• Analysis of minor ions
• Much more complicated!
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info
Spectral Alignment: Complications
Spectra are combinations of an increasing (N-terminal ions) and a decreasing (C-terminalions) number series.
These series form two diagonals in thespectral product, the main diagonal and theperpendicular diagonal.
The described algorithm deals with the maindiagonal only.
top related