the problem of reconstructing k-articulated phylogenetic network
THE PROBLEM OF RECONSTRUCTING K-ARTICULATED PHYLOGENETIC NETWORK
Supervisor: Dr. Yiu Siu Ming
Second Examiner: Professor Francis Y.L. Chin
Student: Vu Thi Quynh Hoa
CONTENTS
1. Introduction
   Motivation
   Related Work
   Project Plan
2. Problem Definitions
3. Algorithms
   1-articulated Network Algorithm
   2-articulated Network Algorithm
INTRODUCTION – MOTIVATION
A phylogenetic network is a powerful way to model the evolutionary history of species because it can represent articulation events
Level-x networks: the running time of all existing algorithms grows exponentially as x increases
A k-articulated network is a more biologically natural model that can capture complex scenarios of articulation events with a smaller value of k
E.g. level-4 network vs. 2-articulated network
RELATED WORK
The problem of constructing phylogenetic networks has been approached in many ways, using different input types
Nakhleh et al. proposed a polynomial-time algorithm that constructs a level-1 network from two trees
Huynh et al. gave a polynomial-time algorithm that builds a galled network from a set of trees
Bryant and Moulton developed the NeighborNet method to construct a network from a distance matrix
Jansson, Nguyen and Sung gave an O(n^3)-time algorithm that constructs a galled network from a set of triplets
Extending to level-2 networks, Van Iersel et al. provided an O(n^8) algorithm
SCHEDULES – PROJECT PLAN
Objectives (Time)
Round 1
1. Reconstructing a restricted 1-articulated network from a set of binary phylogenetic trees (30 Sep 2011)
2. Reconstructing a restricted 2-articulated network from a set of binary phylogenetic trees (15 Nov 2011)
Round 2
3. Reconstructing a 1-articulated network from a distance matrix (15 Feb 2012)
4. Implementing one of the three problems (31 Mar 2012)
DEFINITIONS
Phylogenetic Tree
A rooted, unordered tree with distinctly labeled leaves, each representing a strain of the species
Phylogenetic Network
A rooted, directed acyclic graph in which:
   One node has indegree 0 (the root), and all other nodes have indegree 1 or 2
   All nodes with indegree 2 must have outdegree 1 (hybrid nodes)
   All other nodes with indegree 1 have outdegree 0 or 2
   Nodes with outdegree 0 are leaves, which are distinctly labeled
Node s is called a split node of a hybrid node h if h can be reached using two disjoint paths from the children of s
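The degree rules in this definition can be checked mechanically. Below is a minimal Python sketch, assuming the network is stored as a dict that maps each internal node to the list of its children. This representation and the name check_degrees are illustrative and do not come from the slides. Acyclicity is not checked here.

```python
from collections import defaultdict


def check_degrees(children):
    """Return a list of degree-rule violations; an empty list means none found.

    `children` maps each internal node to the list of its children
    (an assumed representation). Acyclicity is NOT checked.
    """
    nodes = set(children)
    indeg = defaultdict(int)
    for kids in children.values():
        for c in kids:
            nodes.add(c)
            indeg[c] += 1

    problems = []
    roots = [v for v in nodes if indeg[v] == 0]
    if len(roots) != 1:
        problems.append(f"expected exactly one root, found {len(roots)}")
    for v in sorted(nodes):
        i, o = indeg[v], len(children.get(v, ()))
        if i > 2:
            problems.append(f"{v}: indegree {i} exceeds 2")
        elif i == 2 and o != 1:
            problems.append(f"{v}: hybrid node must have outdegree 1, has {o}")
        elif i == 1 and o not in (0, 2):
            problems.append(f"{v}: node with indegree 1 must have outdegree 0 or 2, has {o}")
    return problems


# Hypothetical example: root s, hybrid node h with parents a and b.
net = {"s": ["a", "b"], "a": ["x", "h"], "b": ["h", "y"], "h": ["z"]}
print(check_degrees(net))  # prints []
```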
PHYLOGENETIC NETWORK
DEFINITIONS
k-articulated network
A phylogenetic network in which every split node corresponds to at most k hybrid nodes
A level-k network is a k-articulated network
A k-articulated network can model a level-x network (x > k)
[Figure: Level-2 network; 1-articulated network]
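To make the definition concrete, the sketch below computes, for a small network, which hybrid nodes each split node corresponds to, and from that the smallest k for which the network is k-articulated. It is a brute-force illustration only: it assumes the dict-of-children representation, reads "disjoint" as vertex-disjoint apart from the hybrid node itself, and enumerates all paths, which is exponential in general. All function names are hypothetical.

```python
from collections import defaultdict


def paths(children, src, dst):
    """Yield every directed path from src to dst as a list of nodes."""
    if src == dst:
        yield [dst]
        return
    for c in children.get(src, ()):
        for rest in paths(children, c, dst):
            yield [src] + rest


def hybrids_per_split_node(children):
    """Map each split node to the set of hybrid nodes it is a split node of."""
    indeg = defaultdict(int)
    for kids in children.values():
        for c in kids:
            indeg[c] += 1
    hybrids = [v for v, d in indeg.items() if d == 2]

    result = defaultdict(set)
    for s, kids in children.items():
        if len(kids) != 2:
            continue
        a, b = kids
        for h in hybrids:
            # s is a split node of h if some path a..h and some path b..h
            # share no node other than h itself (assumed reading of "disjoint").
            for pa in paths(children, a, h):
                if any(set(pa[:-1]).isdisjoint(pb[:-1])
                       for pb in paths(children, b, h)):
                    result[s].add(h)
                    break
    return dict(result)


def articulation_number(children):
    """Smallest k such that every split node corresponds to at most k hybrids."""
    per_split = hybrids_per_split_node(children)
    return max((len(hs) for hs in per_split.values()), default=0)


# Hypothetical examples: one hybrid under s, then two hybrids under s.
net1 = {"s": ["a", "b"], "a": ["x", "h"], "b": ["h", "y"], "h": ["z"]}
net2 = {"s": ["a", "b"], "a": ["h1", "c"], "b": ["h1", "d"],
        "c": ["h2", "x"], "d": ["h2", "y"], "h1": ["z1"], "h2": ["z2"]}
print(articulation_number(net1), articulation_number(net2))  # prints 1 2
```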
DEFINITIONS
A network is non-skew if all paths from any split node to its hybrid node have a length ≥ 2
A network is safe if the siblings of all hybrid nodes are not hybrid nodes
A network is restricted if it is non-skew and safe
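The "safe" condition is simple to test. A minimal sketch follows, again assuming a dict that maps each internal node to its children (an illustrative representation, not taken from the slides). Only safety is checked; the non-skew condition would additionally need the split nodes and is left out.

```python
def is_safe(children):
    """Return True if no hybrid node has a sibling that is also a hybrid node."""
    indeg = {}
    for kids in children.values():
        for c in kids:
            indeg[c] = indeg.get(c, 0) + 1
    hybrids = {v for v, d in indeg.items() if d == 2}
    # Siblings share a parent, so the network is unsafe exactly when
    # some parent has two hybrid children.
    return all(sum(c in hybrids for c in kids) <= 1
               for kids in children.values())


# Hypothetical examples.
safe_net = {"s": ["a", "b"], "a": ["x", "h"], "b": ["h", "y"], "h": ["z"]}
unsafe_net = {"r": ["p", "q"], "p": ["h1", "h2"], "q": ["h1", "h2"],
              "h1": ["x"], "h2": ["y"]}
print(is_safe(safe_net), is_safe(unsafe_net))  # prints True False
```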
DEFINITIONS
Given a hybrid node h and its parents p and q, a cut on edge (p, h) means removing the edge (p, h) from the network, and then for every node with indegree 1 and outdegree less than 2, contracting its outgoing edge
A network N is compatible with a phylogenetic tree T if N can be converted to T by performing a series of cuts one by one.
[Figure: hybrid node h with parents p and q, shown twice]
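The cut operation can be sketched in a few lines. This is an illustration under stated assumptions: the network is a dict mapping each internal node to its list of children, and "contracting" is read as suppressing every node left with indegree 1 and outdegree 1 by linking its parent directly to its child. The case where p is the root is not handled. The function name cut is hypothetical.

```python
def cut(children, p, h):
    """Return a new network obtained by cutting edge (p, h).

    The input dict is not modified. Nodes left with indegree 1 and
    outdegree 1 are suppressed (assumed reading of "contracting").
    """
    net = {v: list(kids) for v, kids in children.items()}
    net[p].remove(h)

    changed = True
    while changed:
        changed = False
        # Recompute the parent lists after every contraction.
        parents = {}
        for v, kids in net.items():
            for c in kids:
                parents.setdefault(c, []).append(v)
        for v, kids in list(net.items()):
            if len(kids) == 1 and len(parents.get(v, [])) == 1:
                par, child = parents[v][0], kids[0]
                # Replace v by its only child in the parent's child list.
                net[par][net[par].index(v)] = child
                del net[v]
                changed = True
                break
    return net


# Hypothetical example: hybrid node h with parents a and b.
net = {"s": ["a", "b"], "a": ["x", "h"], "b": ["h", "y"], "h": ["z"]}
print(cut(net, "a", "h"))  # prints {'s': ['x', 'b'], 'b': ['z', 'y']}
```

Cutting (a, h) leaves z below b, while cutting (b, h) leaves z below a, so this small network is compatible with both resulting trees.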
PROBLEM DEFINITION
Reconstructing a restricted k-articulated network (where k = 1, 2) from a set of binary trees
Given a set of binary phylogenetic trees Ti, i = 1, 2, …, k, with the same leaf label set, construct a restricted k-articulated network N (where k = 1, 2) with the minimum number of hybrid nodes that is compatible with each tree Ti
ALGORITHM
Divide and Conquer Technique
Dividing
Bipartition Tripartition Quadripartition
Conquering
?
1-ARTICULATED NETWORK ALGORITHM
Case 1: Each input tree is a single node – Base case
Case 2: Input tree set admits a leaf set bipartition
Case 3: Input tree set admits a leaf set tripartition
1-ARTICULATED NETWORK ALGORITHM
Case 1: Each input tree is a single node – Base case – O(1)
Return a network which is a single node with the same label
1-ARTICULATED NETWORK ALGORITHM
Case 2: Input tree set admits a leaf set bipartition – O(kn)
[Figure: input trees T1, T2, …, Tk; subnetworks N1 and N2; combination into network N with root r]
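One plausible reading of Case 2 is sketched below. It assumes that the tree set "admits a leaf set bipartition" when the root of every input tree splits the leaves into the same two sets; the slides do not spell this out, so treat it as an assumption. Trees are nested tuples with string leaves, and all function names are hypothetical.

```python
def leaves(tree):
    """Leaf label set of a tree given as a string leaf or a (left, right) tuple."""
    if isinstance(tree, str):
        return frozenset([tree])
    left, right = tree
    return leaves(left) | leaves(right)


def common_bipartition(trees):
    """Return (A, B) if every tree's root splits the leaves into A and B, else None."""
    splits = set()
    for t in trees:
        if isinstance(t, str):
            return None  # a single-node tree belongs to the base case
        left, right = t
        splits.add(frozenset([leaves(left), leaves(right)]))
    if len(splits) != 1:
        return None
    a, b = sorted(splits.pop(), key=sorted)  # fixed order for reproducibility
    return a, b


def split_trees(trees, part):
    """For each tree, return the root child whose leaf set equals `part`."""
    return [next(c for c in t if leaves(c) == part) for t in trees]


# Hypothetical input: t1 and t2 agree at the root, t3 does not.
t1 = (("a", "b"), ("c", "d"))
t2 = (("c", "d"), ("b", "a"))
t3 = (("a", "c"), ("b", "d"))
print(common_bipartition([t1, t2]) is not None)  # prints True
print(common_bipartition([t1, t3]))              # prints None
```

Under this reading, the subtrees returned by split_trees for each part would be solved recursively to give N1 and N2, which are then joined under a new root r.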
1-ARTICULATED NETWORK ALGORITHM
Case 3: Input tree set admits a leaf set tripartition – O(kn)
[Figure: input trees T1, T2, …, Tk; subnetworks N1, N2 and Nh; nodes x and y]
It takes O(kn) to find nodes x in N1 and y in N2
2-ARTICULATED NETWORK ALGORITHM
Case 1: Each input tree is a single node – Base case
Case 2: Input tree set admits a leaf set bipartition
Case 3: Input tree set admits a leaf set tripartition
Case 4: Input tree set admits a leaf set quadripartition
2-ARTICULATED NETWORK ALGORITHM
Case 4: Input tree set admits a leaf set quadripartition – O(kn)
[Figure: input trees T1, T2, …, Tk; subnetworks N1, N2 and Nh1; nodes x1, x2, y1, y2; root r]
It takes O(kn) to find nodes x1 & x2 in N1 and y1 & y2 in N2
TIME COMPLEXITY
Time complexity of the algorithms for reconstructing a restricted k-articulated network, in both cases k = 1, 2 (in the bounds below, k is the number of input trees and n the number of leaves):
Each recursive step takes O(kn) running time to check whether the input tree set admits a leaf set bipartition or tripartition, and then to combine the subnetworks returned
The number of nodes in the restricted 1-articulated network is O(n)
Therefore, the total time complexity is O(kn^2)
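The bound can be written out as a product. The step that links the two facts is an assumption made here, not a statement from the slides: each recursive step adds at least one node to the output network, so the number of recursive steps is bounded by the network size.

```latex
T_{\text{total}}
  \;\le\; \underbrace{O(n)}_{\text{recursive steps}}
  \times \underbrace{O(kn)}_{\text{cost per step}}
  \;=\; O(kn^{2})
```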
THANK YOU!