![Page 1: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/1.jpg)
Sequence Alignment
Kun-Mao Chao (趙坤茂)
Department of Computer Science and Information Engineering
National Taiwan University, Taiwan
E-mail: [email protected]
WWW: http://www.csie.ntu.edu.tw/~kmchao
![Page 2: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/2.jpg)
2
Bioinformatics
![Page 3: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/3.jpg)
3
Bioinformatics and Computational Biology-Related Journals:
• Bioinformatics (previously called CABIOS)
• Bulletin of Mathematical Biology
• Genome Research
• Genomics
• IEEE/ACM Transactions on Computational Biology and Bioinformatics
• Journal of Bioinformatics and Computational Biology
• Journal of Computational Biology
• Journal of Molecular Biology
• Nature
• Nucleic Acids Research
• Science
![Page 4: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/4.jpg)
4
Bioinformatics and Computational Biology-Related Conferences:
• Intelligent Systems for Molecular Biology (ISMB)
• Pacific Symposium on Biocomputing (PSB)
• The Annual International Conference on Research in Computational Molecular Biology (RECOMB)
• Workshop on Algorithms in Bioinformatics (WABI)
• The IEEE Computer Society Bioinformatics Conference (CSB)
![Page 5: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/5.jpg)
5
Bioinformatics and Computational Biology-Related Books:
• Calculating the Secrets of Life: Applications of the Mathematical Sciences in Molecular Biology, by Eric S. Lander and Michael S. Waterman (1995)
• Introduction to Computational Biology: Maps, Sequences, and Genomes, by Michael S. Waterman (1995)
• Introduction to Computational Molecular Biology, by Joao Carlos Setubal and Joao Meidanis (1996)
• Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, by Dan Gusfield (1997)
• Computational Molecular Biology: An Algorithmic Approach, by Pavel Pevzner (2000)
• Introduction to Bioinformatics, by Arthur M. Lesk (2002)
![Page 6: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/6.jpg)
6
Useful Websites
• MIT Biology Hypertextbook
  – http://www.mit.edu:8001/afs/athena/course/other/esgbio/www/7001main.html
• The International Society for Computational Biology:
  – http://www.iscb.org/
• National Center for Biotechnology Information (NCBI, NIH):
  – http://www.ncbi.nlm.nih.gov/
• European Bioinformatics Institute (EBI):
  – http://www.ebi.ac.uk/
• DNA Data Bank of Japan (DDBJ):
  – http://www.ddbj.nig.ac.jp/
![Page 7: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/7.jpg)
7
Sequence Alignment
![Page 8: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/8.jpg)
8
Dot Matrix
Sequence A: CTTAACT
Sequence B: CGGATCAT
[Figure: dot matrix with Sequence B (C G G A T C A T) across the columns and Sequence A (C T T A A C T) down the rows]
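A dot matrix of this kind can be sketched in a few lines of Python. This is only an illustration: the function names `dot_matrix` and `render` are made up here, and a dot is placed wherever the row symbol of Sequence A equals the column symbol of Sequence B.

```python
def dot_matrix(a, b):
    """Return a grid where cell [i][j] is True when a[i] == b[j]."""
    return [[x == y for y in b] for x in a]


def render(a, b):
    """Render the dot matrix as text, with b across the top and a down the side."""
    lines = ["  " + " ".join(b)]
    for x, row in zip(a, dot_matrix(a, b)):
        lines.append(x + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render("CTTAACT", "CGGATCAT"))
```

Diagonal runs of dots in the printed grid correspond to stretches where the two sequences agree.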
![Page 9: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/9.jpg)
9
Pairwise Alignment
Sequence A: CTTAACT
Sequence B: CGGATCAT
An alignment of A and B:
C---TTAACT (Sequence A)
CGGATCA--T (Sequence B)
![Page 10: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/10.jpg)
10
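Pairwise Alignment
Sequence A: CTTAACT
Sequence B: CGGATCAT
An alignment of A and B:
C---TTAACT
CGGATCA--T
• Insertion gap: the run of gap symbols in Sequence A (---)
• Match: a column with identical symbols (e.g., T/T)
• Mismatch: a column with different symbols (T/C)
• Deletion gap: the run of gap symbols in Sequence B (--)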
![Page 11: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/11.jpg)
11
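Alignment Graph
Sequence A: CTTAACT
Sequence B: CGGATCAT
[Figure: alignment graph with Sequence B (C G G A T C A T) across the columns and Sequence A (C T T A A C T) down the rows, shown together with the alignment below]
C---TTAACT
CGGATCA--T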
![Page 12: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/12.jpg)
12
A simple scoring scheme
• Match: +8 (w(x, y) = 8, if x = y)
• Mismatch: -5 (w(x, y) = -5, if x ≠ y)
• Each gap symbol: -3 (w(-,x)=w(x,-)=-3)
C - - - T T A A C T
C G G A T C A - - T
+8 -3 -3 -3 +8 -5 +8 -3 -3 +8 = +12 (alignment score)
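The column-by-column scoring above can be checked with a small Python sketch. The function name `score_alignment` and its keyword arguments are illustrative; the default weights are the ones from this scoring scheme.

```python
def score_alignment(row_a, row_b, match=8, mismatch=-5, gap=-3):
    """Sum the column scores of an alignment given as two equal-length rows."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    total = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            total += gap          # a gap symbol in either row
        elif x == y:
            total += match
        else:
            total += mismatch
    return total


if __name__ == "__main__":
    print(score_alignment("C---TTAACT", "CGGATCA--T"))
```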
![Page 13: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/13.jpg)
13
An optimal alignment: the alignment of maximum score
• Let A = a1a2…am and B = b1b2…bn.
• Si,j: the score of an optimal alignment between a1a2…ai and b1b2…bj
• With proper initializations, Si,j can be computed as follows.

$$S_{i,j} = \max \begin{cases} S_{i-1,j} + w(a_i, -) \\ S_{i,j-1} + w(-, b_j) \\ S_{i-1,j-1} + w(a_i, b_j) \end{cases}$$
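A minimal Python sketch of this recurrence follows. It assumes the simple scoring scheme (match +8, mismatch -5, gap symbol -3) and initializes row 0 and column 0 with cumulative gap penalties. The function name `global_alignment_table` is illustrative.

```python
def global_alignment_table(a, b, match=8, mismatch=-5, gap=-3):
    """Fill the (m+1) x (n+1) table S of optimal global alignment scores."""
    m, n = len(a), len(b)
    S = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        S[i][0] = S[i - 1][0] + gap        # a1..ai aligned against gaps only
    for j in range(1, n + 1):
        S[0][j] = S[0][j - 1] + gap        # b1..bj aligned against gaps only
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            w = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(
                S[i - 1][j] + gap,         # a_i aligned with a gap
                S[i][j - 1] + gap,         # b_j aligned with a gap
                S[i - 1][j - 1] + w,       # a_i aligned with b_j
            )
    return S


if __name__ == "__main__":
    table = global_alignment_table("CTTAACT", "CGGATCAT")
    print(table[-1][-1])  # optimal global score S[m][n]
```

The table takes O(mn) time and space; the optimal score is the bottom-right entry.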
![Page 14: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/14.jpg)
14
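Computing Si,j
[Figure: grid of entries Si,j with row index i and column index j. Entry Si,j is reached from Si-1,j with weight w(ai,-), from Si,j-1 with weight w(-,bj), and from Si-1,j-1 with weight w(ai,bj). The final entry is Sm,n.]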
![Page 15: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/15.jpg)
15
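Initializations

| | | C | G | G | A | T | C | A | T |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | -3 | -6 | -9 | -12 | -15 | -18 | -21 | -24 |
| C | -3 | | | | | | | | |
| T | -6 | | | | | | | | |
| T | -9 | | | | | | | | |
| A | -12 | | | | | | | | |
| A | -15 | | | | | | | | |
| C | -18 | | | | | | | | |
| T | -21 | | | | | | | | |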
![Page 16: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/16.jpg)
16
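S3,5 = ?

| | | C | G | G | A | T | C | A | T |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | -3 | -6 | -9 | -12 | -15 | -18 | -21 | -24 |
| C | -3 | 8 | 5 | 2 | -1 | -4 | -7 | -10 | -13 |
| T | -6 | 5 | 3 | 0 | -3 | 7 | 4 | 1 | -2 |
| T | -9 | 2 | 0 | -2 | -5 | ? | | | |
| A | -12 | | | | | | | | |
| A | -15 | | | | | | | | |
| C | -18 | | | | | | | | |
| T | -21 | | | | | | | | |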
![Page 17: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/17.jpg)
17
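S3,5 = 5

| | | C | G | G | A | T | C | A | T |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | -3 | -6 | -9 | -12 | -15 | -18 | -21 | -24 |
| C | -3 | 8 | 5 | 2 | -1 | -4 | -7 | -10 | -13 |
| T | -6 | 5 | 3 | 0 | -3 | 7 | 4 | 1 | -2 |
| T | -9 | 2 | 0 | -2 | -5 | 5 | 2 | -1 | 9 |
| A | -12 | -1 | -3 | -5 | 6 | 3 | 0 | 10 | 7 |
| A | -15 | -4 | -6 | -8 | 3 | 1 | -2 | 8 | 5 |
| C | -18 | -7 | -9 | -11 | 0 | -2 | 9 | 6 | 3 |
| T | -21 | -10 | -12 | -14 | -3 | 8 | 6 | 4 | 14 |

Optimal score: S7,8 = 14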
![Page 18: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/18.jpg)
18
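C T T A A C - T
C G G A T C A T

| | | C | G | G | A | T | C | A | T |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | -3 | -6 | -9 | -12 | -15 | -18 | -21 | -24 |
| C | -3 | 8 | 5 | 2 | -1 | -4 | -7 | -10 | -13 |
| T | -6 | 5 | 3 | 0 | -3 | 7 | 4 | 1 | -2 |
| T | -9 | 2 | 0 | -2 | -5 | 5 | 2 | -1 | 9 |
| A | -12 | -1 | -3 | -5 | 6 | 3 | 0 | 10 | 7 |
| A | -15 | -4 | -6 | -8 | 3 | 1 | -2 | 8 | 5 |
| C | -18 | -7 | -9 | -11 | 0 | -2 | 9 | 6 | 3 |
| T | -21 | -10 | -12 | -14 | -3 | 8 | 6 | 4 | 14 |

8 - 5 - 5 + 8 - 5 + 8 - 3 + 8 = 14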
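An optimal alignment can be recovered from the filled table by tracing back from Sm,n. The sketch below is one way to do it, assuming the same simple scoring scheme. The name `global_align` is illustrative, and the tie-breaking order (diagonal first, then up, then left) is an arbitrary choice, so another optimal alignment may exist.

```python
def global_align(a, b, match=8, mismatch=-5, gap=-3):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    m, n = len(a), len(b)
    S = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        S[i][0] = i * gap
    for j in range(1, n + 1):
        S[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            w = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j] + gap, S[i][j - 1] + gap, S[i - 1][j - 1] + w)

    # Trace back from (m, n), re-checking which predecessor produced each entry.
    row_a, row_b = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            w = match if a[i - 1] == b[j - 1] else mismatch
            if S[i][j] == S[i - 1][j - 1] + w:
                row_a.append(a[i - 1])
                row_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and S[i][j] == S[i - 1][j] + gap:
            row_a.append(a[i - 1])   # a_i against a gap
            row_b.append("-")
            i -= 1
        else:
            row_a.append("-")        # b_j against a gap
            row_b.append(b[j - 1])
            j -= 1
    return S[m][n], "".join(reversed(row_a)), "".join(reversed(row_b))


if __name__ == "__main__":
    score, top, bottom = global_align("CTTAACT", "CGGATCAT")
    print(score)
    print(top)
    print(bottom)
```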
![Page 19: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/19.jpg)
19
Now try this example in class
Sequence A: CAATTGA
Sequence B: GAATCTGC
Their optimal alignment?
![Page 20: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/20.jpg)
20
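Initializations

| | | G | A | A | T | C | T | G | C |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | -3 | -6 | -9 | -12 | -15 | -18 | -21 | -24 |
| C | -3 | | | | | | | | |
| A | -6 | | | | | | | | |
| A | -9 | | | | | | | | |
| T | -12 | | | | | | | | |
| T | -15 | | | | | | | | |
| G | -18 | | | | | | | | |
| A | -21 | | | | | | | | |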
![Page 21: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/21.jpg)
21
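S4,2 = ?

| | | G | A | A | T | C | T | G | C |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | -3 | -6 | -9 | -12 | -15 | -18 | -21 | -24 |
| C | -3 | -5 | -8 | -11 | -14 | -4 | -7 | -10 | -13 |
| A | -6 | -8 | 3 | 0 | -3 | -6 | -9 | -12 | -15 |
| A | -9 | -11 | 0 | 11 | 8 | 5 | 2 | -1 | -4 |
| T | -12 | -14 | ? | | | | | | |
| T | -15 | | | | | | | | |
| G | -18 | | | | | | | | |
| A | -21 | | | | | | | | |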
![Page 22: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/22.jpg)
22
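S5,5 = ?

| | | G | A | A | T | C | T | G | C |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | -3 | -6 | -9 | -12 | -15 | -18 | -21 | -24 |
| C | -3 | -5 | -8 | -11 | -14 | -4 | -7 | -10 | -13 |
| A | -6 | -8 | 3 | 0 | -3 | -6 | -9 | -12 | -15 |
| A | -9 | -11 | 0 | 11 | 8 | 5 | 2 | -1 | -4 |
| T | -12 | -14 | -3 | 8 | 19 | 16 | 13 | 10 | 7 |
| T | -15 | -17 | -6 | 5 | 16 | ? | | | |
| G | -18 | | | | | | | | |
| A | -21 | | | | | | | | |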
![Page 23: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/23.jpg)
23
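S5,5 = 14

| | | G | A | A | T | C | T | G | C |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | -3 | -6 | -9 | -12 | -15 | -18 | -21 | -24 |
| C | -3 | -5 | -8 | -11 | -14 | -4 | -7 | -10 | -13 |
| A | -6 | -8 | 3 | 0 | -3 | -6 | -9 | -12 | -15 |
| A | -9 | -11 | 0 | 11 | 8 | 5 | 2 | -1 | -4 |
| T | -12 | -14 | -3 | 8 | 19 | 16 | 13 | 10 | 7 |
| T | -15 | -17 | -6 | 5 | 16 | 14 | 24 | 21 | 18 |
| G | -18 | -7 | -9 | 2 | 13 | 11 | 21 | 32 | 29 |
| A | -21 | -10 | 1 | -1 | 10 | 8 | 18 | 29 | 27 |

Optimal score: S7,8 = 27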
![Page 24: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/24.jpg)
24
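| | | G | A | A | T | C | T | G | C |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | -3 | -6 | -9 | -12 | -15 | -18 | -21 | -24 |
| C | -3 | -5 | -8 | -11 | -14 | -4 | -7 | -10 | -13 |
| A | -6 | -8 | 3 | 0 | -3 | -6 | -9 | -12 | -15 |
| A | -9 | -11 | 0 | 11 | 8 | 5 | 2 | -1 | -4 |
| T | -12 | -14 | -3 | 8 | 19 | 16 | 13 | 10 | 7 |
| T | -15 | -17 | -6 | 5 | 16 | 14 | 24 | 21 | 18 |
| G | -18 | -7 | -9 | 2 | 13 | 11 | 21 | 32 | 29 |
| A | -21 | -10 | 1 | -1 | 10 | 8 | 18 | 29 | 27 |

C A A T - T G A
G A A T C T G C
-5 +8 +8 +8 -3 +8 +8 -5 = 27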
![Page 25: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/25.jpg)
25
Global Alignment vs. Local Alignment
• global alignment:
• local alignment:
![Page 26: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/26.jpg)
26
An optimal local alignment
• Si,j: the score of an optimal local alignment ending at ai and bj
• With proper initializations, Si,j can be computed as follows.

$$S_{i,j} = \max \begin{cases} 0 \\ S_{i-1,j} + w(a_i, -) \\ S_{i,j-1} + w(-, b_j) \\ S_{i-1,j-1} + w(a_i, b_j) \end{cases}$$
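The local recurrence differs from the global one only in the extra 0 option and in the all-zero first row and column. A minimal Python sketch, assuming match +8, mismatch -5, and gap symbol -3, is given below. The function name `local_alignment_table` is illustrative.

```python
def local_alignment_table(a, b, match=8, mismatch=-5, gap=-3):
    """Return (best_score, (i, j), S) for the local alignment recurrence."""
    m, n = len(a), len(b)
    S = [[0] * (n + 1) for _ in range(m + 1)]   # row 0 and column 0 stay 0
    best, best_pos = 0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            w = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(
                0,                          # start a new local alignment here
                S[i - 1][j] + gap,
                S[i][j - 1] + gap,
                S[i - 1][j - 1] + w,
            )
            if S[i][j] > best:
                best, best_pos = S[i][j], (i, j)
    return best, best_pos, S


if __name__ == "__main__":
    best, pos, _ = local_alignment_table("CTTAACT", "CGGATCAT")
    print(best, pos)
```

The best local score is the maximum entry anywhere in the table, not necessarily the bottom-right one.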
![Page 27: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/27.jpg)
27
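Local alignment
Match: 8; Mismatch: -5; Gap symbol: -3

| | | C | G | G | A | T | C | A | T |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| C | 0 | 8 | 5 | 2 | 0 | 0 | 8 | 5 | 2 |
| T | 0 | 5 | 3 | 0 | 0 | 8 | 5 | 3 | 13 |
| T | 0 | 2 | 0 | 0 | 0 | 8 | 5 | 2 | 11 |
| A | 0 | 0 | 0 | 0 | 8 | 5 | 3 | ? | |
| A | 0 | | | | | | | | |
| C | 0 | | | | | | | | |
| T | 0 | | | | | | | | |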
![Page 28: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/28.jpg)
28
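Local alignment
Match: 8; Mismatch: -5; Gap symbol: -3

| | | C | G | G | A | T | C | A | T |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| C | 0 | 8 | 5 | 2 | 0 | 0 | 8 | 5 | 2 |
| T | 0 | 5 | 3 | 0 | 0 | 8 | 5 | 3 | 13 |
| T | 0 | 2 | 0 | 0 | 0 | 8 | 5 | 2 | 11 |
| A | 0 | 0 | 0 | 0 | 8 | 5 | 3 | 13 | 10 |
| A | 0 | 0 | 0 | 0 | 8 | 5 | 2 | 11 | 8 |
| C | 0 | 8 | 5 | 2 | 5 | 3 | 13 | 10 | 7 |
| T | 0 | 5 | 3 | 0 | 2 | 13 | 10 | 8 | 18 |

The best score: 18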
![Page 29: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/29.jpg)
29
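| | | C | G | G | A | T | C | A | T |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| C | 0 | 8 | 5 | 2 | 0 | 0 | 8 | 5 | 2 |
| T | 0 | 5 | 3 | 0 | 0 | 8 | 5 | 3 | 13 |
| T | 0 | 2 | 0 | 0 | 0 | 8 | 5 | 2 | 11 |
| A | 0 | 0 | 0 | 0 | 8 | 5 | 3 | 13 | 10 |
| A | 0 | 0 | 0 | 0 | 8 | 5 | 2 | 11 | 8 |
| C | 0 | 8 | 5 | 2 | 5 | 3 | 13 | 10 | 7 |
| T | 0 | 5 | 3 | 0 | 2 | 13 | 10 | 8 | 18 |

The best score: 18
A - C - T
A T C A T
8 - 3 + 8 - 3 + 8 = 18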
![Page 30: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/30.jpg)
30
Now try this example in class
Sequence A: CAATTGA
Sequence B: GAATCTGC
Their optimal local alignment?
![Page 31: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/31.jpg)
31
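Did you get it right?

| | | G | A | A | T | C | T | G | C |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| C | 0 | 0 | 0 | 0 | 0 | 8 | 5 | 2 | 8 |
| A | 0 | 0 | 8 | 8 | 5 | 5 | 3 | 0 | 5 |
| A | 0 | 0 | 8 | 16 | 13 | 10 | 7 | 4 | 2 |
| T | 0 | 0 | 5 | 13 | 24 | 21 | 18 | 15 | 12 |
| T | 0 | 0 | 2 | 10 | 21 | 19 | 29 | 26 | 23 |
| G | 0 | 8 | 5 | 7 | 18 | 16 | 26 | 37 | 34 |
| A | 0 | 5 | 16 | 13 | 15 | 13 | 23 | 34 | 32 |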
![Page 32: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/32.jpg)
32
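| | | G | A | A | T | C | T | G | C |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| C | 0 | 0 | 0 | 0 | 0 | 8 | 5 | 2 | 8 |
| A | 0 | 0 | 8 | 8 | 5 | 5 | 3 | 0 | 5 |
| A | 0 | 0 | 8 | 16 | 13 | 10 | 7 | 4 | 2 |
| T | 0 | 0 | 5 | 13 | 24 | 21 | 18 | 15 | 12 |
| T | 0 | 0 | 2 | 10 | 21 | 19 | 29 | 26 | 23 |
| G | 0 | 8 | 5 | 7 | 18 | 16 | 26 | 37 | 34 |
| A | 0 | 5 | 16 | 13 | 15 | 13 | 23 | 34 | 32 |

A A T - T G
A A T C T G
8 + 8 + 8 - 3 + 8 + 8 = 37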
![Page 33: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/33.jpg)
33
Affine gap penalties
• Match: +8 (w(x, y) = 8, if x = y)
• Mismatch: -5 (w(x, y) = -5, if x ≠ y)
• Each gap symbol: -3 (w(-,x)=w(x,-)=-3)
• Each gap is charged an extra gap-open penalty: -4.
C - - - T T A A C T
C G G A T C A - - T
+8 -3 -3 -3 +8 -5 +8 -3 -3 +8 = +12
Gap-open penalties: -4 for the first gap and -4 for the second gap
Alignment score: 12 - 4 - 4 = 4
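The per-gap charge can be checked with a small Python sketch that scores a given alignment. A gap is taken here to be a maximal run of gap symbols in the same row, which is an assumption about how runs are counted. The function name `score_with_gap_open` and its parameters are illustrative.

```python
def score_with_gap_open(row_a, row_b, match=8, mismatch=-5,
                        gap_symbol=-3, gap_open=-4):
    """Score an alignment where each gap (run of '-') pays gap_open once."""
    total = 0
    prev = None   # row that held a gap symbol in the previous column, if any
    for x, y in zip(row_a, row_b):
        if x == "-":
            cur = "a"
        elif y == "-":
            cur = "b"
        else:
            cur = None
        if cur is None:
            total += match if x == y else mismatch
        else:
            total += gap_symbol
            if cur != prev:        # first symbol of a new gap
                total += gap_open
        prev = cur
    return total


if __name__ == "__main__":
    print(score_with_gap_open("C---TTAACT", "CGGATCA--T"))
```

Setting `gap_symbol=0` turns the same function into a scorer for a constant penalty per gap.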
![Page 34: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/34.jpg)
34
Affine gap penalties
• A gap of length k is penalized x + k·y, where x is the gap-open penalty and y is the gap-symbol penalty.
Three cases for alignment endings:
1. ...x / ...x (an aligned pair)
2. ...x / ...- (a deletion)
3. ...- / ...x (an insertion)
![Page 35: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/35.jpg)
35
Affine gap penalties
• Let D(i, j) denote the maximum score of any alignment between a1a2…ai and b1b2…bj ending with a deletion.
• Let I(i, j) denote the maximum score of any alignment between a1a2…ai and b1b2…bj ending with an insertion.
• Let S(i, j) denote the maximum score of any alignment between a1a2…ai and b1b2…bj.
![Page 36: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/36.jpg)
36
Affine gap penalties
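$$D(i,j) = \max \begin{cases} D(i-1,j) - y \\ S(i-1,j) - x - y \end{cases}$$

$$I(i,j) = \max \begin{cases} I(i,j-1) - y \\ S(i,j-1) - x - y \end{cases}$$

$$S(i,j) = \max \begin{cases} S(i-1,j-1) + w(a_i, b_j) \\ D(i,j) \\ I(i,j) \end{cases}$$

(A gap of length k is penalized x + k·y.)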
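The three recurrences can be sketched in Python as follows. This is a minimal version under stated assumptions: x and y are passed as positive penalties (x = 4, y = 3, matching the earlier example), boundary cells are initialized so that a leading gap of length k costs x + k·y, and impossible states are set to negative infinity. The function name `affine_global_score` is illustrative.

```python
NEG_INF = float("-inf")


def affine_global_score(a, b, match=8, mismatch=-5, x=4, y=3):
    """Optimal global score when a gap of length k is penalized x + k*y."""
    m, n = len(a), len(b)
    S = [[NEG_INF] * (n + 1) for _ in range(m + 1)]
    D = [[NEG_INF] * (n + 1) for _ in range(m + 1)]   # ends with a deletion
    I = [[NEG_INF] * (n + 1) for _ in range(m + 1)]   # ends with an insertion
    S[0][0] = 0
    for i in range(1, m + 1):
        D[i][0] = -x - i * y
        S[i][0] = D[i][0]
    for j in range(1, n + 1):
        I[0][j] = -x - j * y
        S[0][j] = I[0][j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            D[i][j] = max(D[i - 1][j] - y, S[i - 1][j] - x - y)
            I[i][j] = max(I[i][j - 1] - y, S[i][j - 1] - x - y)
            w = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + w, D[i][j], I[i][j])
    return S[m][n]


if __name__ == "__main__":
    print(affine_global_score("CTTAACT", "CGGATCAT"))
```

Keeping D and I as separate tables lets a gap be extended for y alone, while opening a gap from S costs x + y.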
![Page 37: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/37.jpg)
37
Affine gap penalties
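[Figure: each cell holds three nodes S, D, and I. Extending a gap costs -y, opening a gap from S costs -x-y, and an aligned pair contributes w(ai,bj).]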
![Page 38: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/38.jpg)
38
Constant gap penalties
• Match: +8 (w(x, y) = 8, if x = y)
• Mismatch: -5 (w(x, y) = -5, if x ≠ y)
• Each gap symbol: 0 (w(-,x)=w(x,-)=0)
• Each gap is charged a constant penalty: -4.
C - - - T T A A C T
C G G A T C A - - T
+8 0 0 0 +8 -5 +8 0 0 +8 = +27
Constant gap penalties: -4 for the first gap and -4 for the second gap
Alignment score: 27 - 4 - 4 = 19
![Page 39: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/39.jpg)
39
Constant gap penalties
• Let D(i, j) denote the maximum score of any alignment between a1a2…ai and b1b2…bj ending with a deletion.
• Let I(i, j) denote the maximum score of any alignment between a1a2…ai and b1b2…bj ending with an insertion.
• Let S(i, j) denote the maximum score of any alignment between a1a2…ai and b1b2…bj.
![Page 40: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/40.jpg)
40
Constant gap penalties
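$$D(i,j) = \max \begin{cases} D(i-1,j) \\ S(i-1,j) - x \end{cases}$$

$$I(i,j) = \max \begin{cases} I(i,j-1) \\ S(i,j-1) - x \end{cases}$$

$$S(i,j) = \max \begin{cases} S(i-1,j-1) + w(a_i, b_j) \\ D(i,j) \\ I(i,j) \end{cases}$$

where x is a constant gap penalty for a gap.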
![Page 41: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/41.jpg)
41
Restricted affine gap penalties
• A gap of length k is penalized x + f(k)·y, where f(k) = k for k ≤ c and f(k) = c for k > c.
Five cases for alignment endings:
1. ...x / ...x (an aligned pair)
2. ...x / ...- (a deletion)
3. ...- / ...x (an insertion)
4. and 5. for long gaps
![Page 42: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/42.jpg)
42
Restricted affine gap penalties
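$$D(i,j) = \max \begin{cases} D(i-1,j) - y \\ S(i-1,j) - x - y \end{cases}$$

$$D'(i,j) = \max \begin{cases} D'(i-1,j) \\ S(i-1,j) - x - cy \end{cases}$$

$$I(i,j) = \max \begin{cases} I(i,j-1) - y \\ S(i,j-1) - x - y \end{cases}$$

$$I'(i,j) = \max \begin{cases} I'(i,j-1) \\ S(i,j-1) - x - cy \end{cases}$$

$$S(i,j) = \max \begin{cases} S(i-1,j-1) + w(a_i, b_j) \\ D(i,j);\ D'(i,j) \\ I(i,j);\ I'(i,j) \end{cases}$$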
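A Python sketch of the five-table recurrence is given below. It assumes positive penalties x and y, a cutoff c, and boundary cells set so that a leading gap of length k costs x + min(k, c)·y. The function name `restricted_affine_score` and the table names `D2` and `I2` (standing for D' and I') are illustrative.

```python
NEG_INF = float("-inf")


def restricted_affine_score(a, b, c, match=8, mismatch=-5, x=4, y=3):
    """Optimal global score when a gap of length k costs x + min(k, c) * y."""
    m, n = len(a), len(b)

    def grid():
        return [[NEG_INF] * (n + 1) for _ in range(m + 1)]

    S, D, D2, I, I2 = grid(), grid(), grid(), grid(), grid()
    S[0][0] = 0
    for i in range(1, m + 1):
        D[i][0] = -x - i * y          # gap charged per symbol
        D2[i][0] = -x - c * y         # gap charged the capped amount
        S[i][0] = max(D[i][0], D2[i][0])
    for j in range(1, n + 1):
        I[0][j] = -x - j * y
        I2[0][j] = -x - c * y
        S[0][j] = max(I[0][j], I2[0][j])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            D[i][j] = max(D[i - 1][j] - y, S[i - 1][j] - x - y)
            D2[i][j] = max(D2[i - 1][j], S[i - 1][j] - x - c * y)
            I[i][j] = max(I[i][j - 1] - y, S[i][j - 1] - x - y)
            I2[i][j] = max(I2[i][j - 1], S[i][j - 1] - x - c * y)
            w = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + w,
                          D[i][j], D2[i][j], I[i][j], I2[i][j])
    return S[m][n]


if __name__ == "__main__":
    print(restricted_affine_score("AAAA", "A", c=1))
```

For a gap of length k, the unprimed table charges x + k·y and the primed table charges x + c·y, so taking the maximum of the two scores applies the smaller penalty, x + min(k, c)·y.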
![Page 43: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/43.jpg)
43
D(i, j) vs. D’(i, j)
• Case 1: the best alignment ending at (i, j) with a deletion at the end has a last deletion gap of length ≤ c, so D(i, j) ≥ D'(i, j).
• Case 2: the best alignment ending at (i, j) with a deletion at the end has a last deletion gap of length ≥ c, so D(i, j) ≤ D'(i, j).
![Page 44: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/44.jpg)
44
max{S(i,j) - x - ky, S(i,j) - x - cy}
[Figure: gap length k compared with the cutoff c]
![Page 45: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/45.jpg)
45
k best local alignments
• Smith-Waterman (Smith and Waterman, 1981; Waterman and Eggert, 1987)
• FASTA (Wilbur and Lipman, 1983; Lipman and Pearson, 1985)
• BLAST (Altschul et al., 1990; Altschul et al., 1997)
![Page 46: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/46.jpg)
46
FASTA
1) Find runs of identities, and identify regions with the highest density of identities.
2) Re-score using PAM matrix, and keep top scoring segments.
3) Eliminate segments that are unlikely to be part of the alignment.
4) Optimize the alignment in a band.
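Step 1 can be illustrated with a small Python sketch that counts shared k-tuples per diagonal and reports the diagonal with the most identities. This is a simplification for illustration only, not FASTA's actual implementation. The function name `densest_diagonal`, the choice k = 3, and the example sequences are assumptions.

```python
from collections import Counter, defaultdict


def densest_diagonal(a, b, k=3):
    """Return (diagonal, count) for the diagonal j - i with most shared k-tuples."""
    positions = defaultdict(list)
    for i in range(len(a) - k + 1):
        positions[a[i:i + k]].append(i)        # where each k-tuple occurs in a
    diagonals = Counter()
    for j in range(len(b) - k + 1):
        for i in positions.get(b[j:j + k], []):
            diagonals[j - i] += 1              # identical k-tuples share a diagonal
    if not diagonals:
        return None
    return diagonals.most_common(1)[0]


if __name__ == "__main__":
    print(densest_diagonal("AGATCGAT", "TTGATCGATT"))
```

Restricting later steps to a band around the densest diagonals is what keeps the final optimization cheap.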
![Page 47: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/47.jpg)
47
FASTA
Step 1: Find runs of identities, and identify regions with the highest density of identities.
Sequence A
Sequence B
![Page 48: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/48.jpg)
48
FASTA
Step 2: Re-score using PAM matrix, and keep top scoring segments.
![Page 49: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/49.jpg)
49
FASTA
Step 3: Eliminate segments that are unlikely to be part of the alignment.
![Page 50: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/50.jpg)
50
FASTA
Step 4: Optimize the alignment in a band.
![Page 51: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/51.jpg)
51
BLAST
Basic Local Alignment Search Tool (by Altschul, Gish, Miller, Myers and Lipman)
The central idea of the BLAST algorithm is that a statistically significant alignment is likely to contain a high-scoring pair of aligned words.
![Page 52: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/52.jpg)
52
The maximal segment pair measure
A maximal segment pair (MSP) is defined to be the highest scoring pair of identical length segments chosen from 2 sequences. (For DNA: identities +5; mismatches -4.)
[Figure: the highest scoring pair]
• The MSP score may be computed in time proportional to the product of the lengths of the two sequences. (How?) An exact procedure is too time consuming.
• BLAST heuristically attempts to calculate the MSP score.
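One possible exact procedure, sketched here in Python, is the local alignment recurrence with the gap moves removed, so that each cell only extends the segment pair on its own diagonal. It runs in time proportional to the product of the two lengths. The function name `msp_score` and the example sequences are illustrative.

```python
def msp_score(a, b, match=5, mismatch=-4):
    """Best score of an ungapped pair of equal-length segments from a and b."""
    best = 0
    prev = [0] * (len(b) + 1)          # scores of segment pairs ending in the previous row
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            w = match if x == y else mismatch
            cur[j] = max(0, prev[j - 1] + w)   # extend along the diagonal or restart
            best = max(best, cur[j])
        prev = cur
    return best


if __name__ == "__main__":
    print(msp_score("AGATCGAT", "TTGATCGATT"))
```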
![Page 53: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/53.jpg)
53
BLAST
1) Build the hash table for Sequence A.
2) Scan Sequence B for hits.
3) Extend hits.
![Page 54: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/54.jpg)
54
BLAST
Step 1: Build the hash table for Sequence A. (3-tuple example)
For DNA sequences:
Seq. A = AGATCGAT (positions 1 to 8)

| 3-tuple | Positions in Seq. A |
|---|---|
| AAA | |
| AAC | |
| .. | |
| AGA | 1 |
| .. | |
| ATC | 3 |
| .. | |
| CGA | 5 |
| .. | |
| GAT | 2, 6 |
| .. | |
| TCG | 4 |
| .. | |
| TTT | |

For protein sequences:
Seq. A = ELVIS
Add xyz to the hash table if Score(xyz, ELV) ≥ T;
Add xyz to the hash table if Score(xyz, LVI) ≥ T;
Add xyz to the hash table if Score(xyz, VIS) ≥ T.
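For the DNA case, the hash table can be sketched in Python as a dictionary from each 3-tuple to its 1-based start positions in Sequence A. The function name `build_kmer_index` is illustrative. Only 3-tuples that actually occur are stored, which is a simplification of the full table of all 64 entries.

```python
from collections import defaultdict


def build_kmer_index(seq, k=3):
    """Map each k-tuple occurring in seq to its 1-based start positions."""
    index = defaultdict(list)
    for start in range(len(seq) - k + 1):
        index[seq[start:start + k]].append(start + 1)
    return dict(index)


if __name__ == "__main__":
    for kmer, positions in sorted(build_kmer_index("AGATCGAT").items()):
        print(kmer, positions)
```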
![Page 55: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/55.jpg)
55
BLAST
Step 2: Scan Sequence B for hits.
![Page 56: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/56.jpg)
56
BLAST
Step 2: Scan Sequence B for hits.
Step 3: Extend hits.
[Figure: extending a hit]
Terminate if the score of the extension fades away. (That is, when we reach a segment pair whose score falls a certain distance below the best score found for shorter extensions.)
BLAST 2.0 saves the time spent in extension, and considers gapped alignments.
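Steps 2 and 3 can be sketched in Python for exact DNA word hits. This is an illustration only, not BLAST's actual implementation. The function names `find_hits` and `extend_right`, the 0-based indices, the drop-off threshold `drop`, and the example sequences are assumptions, and only rightward, ungapped extension is shown.

```python
def find_hits(a, b, k=3):
    """Return (i, j) pairs (0-based) where a[i:i+k] == b[j:j+k]."""
    index = {}
    for i in range(len(a) - k + 1):
        index.setdefault(a[i:i + k], []).append(i)
    hits = []
    for j in range(len(b) - k + 1):            # scan Sequence B word by word
        for i in index.get(b[j:j + k], []):
            hits.append((i, j))
    return hits


def extend_right(a, b, i, j, k=3, match=5, mismatch=-4, drop=10):
    """Extend the hit at (i, j) to the right without gaps.

    Stops once the running score falls more than `drop` below the best
    score seen so far. Returns (best_score, best_length).
    """
    score = best = k * match
    best_len = k
    p, q = i + k, j + k
    while p < len(a) and q < len(b):
        score += match if a[p] == b[q] else mismatch
        if score > best:
            best, best_len = score, p - i + 1
        elif best - score > drop:
            break                              # the extension has faded away
        p += 1
        q += 1
    return best, best_len


if __name__ == "__main__":
    a, b = "AGATCGAT", "TTGATCGATT"
    for i, j in find_hits(a, b):
        print((i, j), extend_right(a, b, i, j))
```

A full extension would also run leftward from the hit and combine both directions.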
![Page 57: Alignment](https://reader036.vdocuments.pub/reader036/viewer/2022062500/5695d4881a28ab9b02a1c741/html5/thumbnails/57.jpg)
57
Remarks
• Filtering is based on the observation that a good alignment usually includes short identical or very similar fragments.
• The idea of filtration was used in both FASTA and BLAST.