Sequence Alignment (序列組合)
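Team members: B97570133 沈冠宇, B97570142 林哲宇, B97570145 翁以諾, B97570154 歐柏宏
Outline
Bioinformatics: sequence alignment is a method for finding the similarities between different sequences and determining how those similarities may relate to function, structure, and evolutionary relationships.
DNA sequence alignment is one of the specialties of Professor 白敦文 (nicknamed 葉問, "Ip Man") of our department, and it is also the topic of his project course this academic year.
Incidentally, Professor 白's lab has bought a so-called next-generation sequencer; there are only about 20 of them in all of Taiwan.
Human DNA uses only four letters: A, C, T, and G. Even so, according to Professor 白, storing it as a plain text file takes 3 gigabytes, and even with a next-generation sequencer the alignment consumes a great deal of time.
(The biology side is not pursued in depth in the computer science department.) (Professor 白 says: "My biology is still terrible…")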
Scoring Rule
Match / Mismatch
Scoring Rule (Cont'd)
a_i = b_j: Score = 2
a_i or b_j aligned with a blank: Score = -1
a_i ≠ b_j: Score = -1
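The scoring rule above can be sketched in Python as a per-column score plus a sum over all columns. The names `column_score` and `alignment_score` and the example alignment are hypothetical; the alignments drawn on the original slides are not part of this transcript.

```python
MATCH = 2      # a_i = b_j
MISMATCH = -1  # a_i != b_j
GAP = -1       # a_i or b_j aligned with a blank ("-")

def column_score(x, y):
    """Score one column of an alignment under the rule above."""
    if x == "-" or y == "-":
        return GAP
    return MATCH if x == y else MISMATCH

def alignment_score(top, bottom):
    """Sum the column scores of two equal-length, gapped strings."""
    if len(top) != len(bottom):
        raise ValueError("aligned strings must have the same length")
    return sum(column_score(x, y) for x, y in zip(top, bottom))

# Hypothetical alignment whose columns score -1+2-1+2-1-1-1
print(alignment_score("-ACGT-A", "GATG-CT"))
```

A gap is written as "-" here, so the input strings are assumed to be already aligned and of equal length.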
Find an alignment which has the highest score
Basically, this is somewhat similar to LCS (longest common subsequence) in dynamic programming.
1) A(i,j) = the score of the optimal alignment of a_1…a_i and b_1…b_j
2) A(0,0) = 0
3) A(i,0) = -i
4) A(0,j) = -j
5) If a_i = b_j then A(i,j) = A(i-1,j-1) + 2
   Else A(i,j) = Max{ A(i-1,j) - 1, A(i,j-1) - 1, A(i-1,j-1) - 1 }
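A minimal sketch of filling the table A with the recurrence above, using the slide's fixed scores (match +2, mismatch -1, blank -1). The function names are hypothetical and only the score is computed; recovering the alignment itself would need an additional backtracking step.

```python
def alignment_table(a, b):
    """Fill A(i, j) for all prefixes of a and b using rules 2) to 5)."""
    n, m = len(a), len(b)
    A = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        A[i][0] = -i                      # rule 3: i blanks
    for j in range(1, m + 1):
        A[0][j] = -j                      # rule 4: j blanks
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:      # rule 5, matching characters
                A[i][j] = A[i - 1][j - 1] + 2
            else:                         # rule 5, best of the three options
                A[i][j] = max(A[i - 1][j] - 1,
                              A[i][j - 1] - 1,
                              A[i - 1][j - 1] - 1)
    return A

def optimal_score(a, b):
    """Score of the best global alignment, read from the last cell."""
    return alignment_table(a, b)[len(a)][len(b)]

print(optimal_score("AC", "AG"))
```

The table has (len(a)+1) × (len(b)+1) cells, so time and memory both grow with the product of the two lengths.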
Find an alignment which has the highest score (Cont'd)
Score = -1+2-1+2-1-1-1 = -1
Score = -1+2-1-1+2-1-1 = -1
Score = -1+2-1-1+2-1-1 = -1
February 26, 2002 Sequence Alignment -- Gary Jackoway 12
DP algorithms have a strong relationship to recursion: define a base case and prove that you can extend.
If you already have the optimal solution to:
X…Y
A…B
then you know the next pair of characters will either be:
X…YZ  or  X…Y-  or  X…YZ
A…BC      A…BC      A…B-
(where "-" indicates a gap). So you can extend the match by determining which of these has the highest score.
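The link to recursion can be made concrete with a memoized recursive function: the base cases are the empty prefixes, and each call tries the three possible last columns listed above. This is a sketch under an assumption: these slides do not fix a scoring scheme, so it reuses the earlier rule (match +2, mismatch -1, gap -1). `best_score` is a hypothetical name.

```python
from functools import lru_cache

def best_score(a, b):
    """Best global alignment score of a and b, by memoized recursion."""

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Base cases: one prefix is empty, so the rest is all gaps.
        if i == 0:
            return -j
        if j == 0:
            return -i
        # Extension 1: a[i-1] over b[j-1] (match or mismatch).
        pair = solve(i - 1, j - 1) + (2 if a[i - 1] == b[j - 1] else -1)
        # Extension 2: gap over b[j-1].
        gap_top = solve(i, j - 1) - 1
        # Extension 3: a[i-1] over gap.
        gap_bottom = solve(i - 1, j) - 1
        return max(pair, gap_top, gap_bottom)

    return solve(len(a), len(b))

print(best_score("ACGT", "ACGT"))
```

Because of Python's recursion limit, this form only suits short sequences; the table-filling form computes the same values iteratively.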
Want to find local matching areas, even when far removed from each other in the sequence:
ACTTAGCAGACTAACGTAAC
CCATGACTAACGGGACCTAC
Smith-Waterman: Use Needleman-Wunsch but add:
IF value < 0, replace with 0 (and set backtrack to none).
When matrix is complete, backtrack from all local maxima, creating local matching alignments.
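A minimal Smith-Waterman sketch of the rule above. Two simplifications are assumed: it reuses the earlier scores (match +2, mismatch -1, gap -1), which these slides do not specify, and it backtracks only from the single highest cell rather than from all local maxima. `smith_waterman` is a hypothetical name.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for one best local alignment."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Negative values are replaced with 0: a local alignment
            # may start afresh at any cell.
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Backtrack from the best cell until a 0 cell is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))

score, x, y = smith_waterman("ACTTAGCAGACTAACGTAAC", "CCATGACTAACGGGACCTAC")
print(score)
print(x)
print(y)
```

The two example sequences share the run GACTAACG, so under these scores the best local score is at least 16.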
PAM: Percent Accepted Mutation Substitution Matrix (Dayhoff)
Substitution matrices based on sound evolutionary principles.
Find PAM1 by comparing groups of proteins known to be evolutionarily closely related.
Find PAM-n by multiplying PAM1 by itself n times (that is, raising it to the n-th power).
PAM60: ~60% similar, PAM250: ~20% similar.
The more distant the expected relationship, the higher the PAM-n that should be used.
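The "multiply PAM1 by itself n times" step can be sketched with a toy matrix. The 4-letter matrix below is invented purely for illustration (each letter stays unchanged with probability 0.99) and is not Dayhoff's 20×20 amino-acid data; `mat_mul`, `pam_n`, and `TOY_PAM1` are hypothetical names. The multiplication applies to the mutation probability matrix; the scoring matrices used in practice are log-odds values derived from it, a step not shown here.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def pam_n(pam1, n):
    """Raise the PAM1 probability matrix to the n-th power."""
    size = len(pam1)
    result = [[1.0 if i == j else 0.0 for j in range(size)]
              for i in range(size)]          # identity matrix
    for _ in range(n):
        result = mat_mul(result, pam1)
    return result

# Toy PAM1: 99% chance a letter is unchanged, 1% spread over the others.
TOY_PAM1 = [[0.99 if i == j else 0.01 / 3 for j in range(4)]
            for i in range(4)]

for n in (1, 60, 250):
    print(n, round(pam_n(TOY_PAM1, n)[0][0], 3))
```

The printed diagonal entry shrinks as n grows: the larger n is, the less likely a letter is to remain unchanged, which matches the use of higher PAM-n for more distant relationships.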
BLOSUM: BLOcks SUbstitution Matrix
Start with highly-conserved patterns (blocks) in a large set of closely related proteins.
Use the likelihood of substitutions found in those sequences to create a substitution probability matrix.
BLOSUM-n means that the sequences used were n% identical.
BLOSUM62 is "standard".
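As a rough illustration of turning substitutions observed in a block into probabilities, the sketch below counts every pair of letters in each column of a block and normalizes the counts. It is a simplified assumption, not the full BLOSUM construction: it skips the clustering by percent identity and the conversion to scores. `pair_frequencies` and `TOY_BLOCK` are hypothetical, and the block is made-up data.

```python
from collections import Counter
from itertools import combinations

def pair_frequencies(block):
    """Observed frequency of each unordered letter pair in a block.

    block: list of equal-length, ungapped sequences (one per row).
    """
    if len({len(seq) for seq in block}) > 1:
        raise ValueError("all sequences in a block must have equal length")
    counts = Counter()
    for column in zip(*block):               # walk the block column by column
        for x, y in combinations(column, 2): # every pair of rows
            counts[tuple(sorted((x, y)))] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: c / total for pair, c in counts.items()}

TOY_BLOCK = ["ACA", "ACG", "AAA"]            # invented example block
for pair, freq in sorted(pair_frequencies(TOY_BLOCK).items()):
    print(pair, round(freq, 3))
```

Pairs of identical letters, such as ("A", "A"), count conserved positions, while mixed pairs count observed substitutions.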