Structure, Volume 24
Supplemental Information
Structure of the EndoMS-DNA Complex
as Mismatch Restriction Endonuclease
Setsu Nakae, Atsushi Hijikata, Toshiyuki Tsuji, Kouki Yonezawa, Ken-ichi Kouyama, Kouta Mayanagi, Sonoko Ishino, Yoshizumi Ishino, and Tsuyoshi Shirai
-
SUPPLEMENTARY INFORMATION
Structure of the EndoMS-DNA complex as mismatch-restriction endonuclease
Setsu Nakae, Atsushi Hijikata, Toshiyuki Tsuji, Koki Yonezawa, Ken-ichi Kouyama, Kouta
Mayanagi, Sonoko Ishino, Yoshizumi Ishino, Tsuyoshi Shirai
Contents
Detailed Materials and Methods
Supplementary Figure S1. Related to Figure 2. Alignment of putative EndoMS orthologs
Supplementary Figure S2. Related to Figure 2. Molecular phylogeny and genome structure of putative EndoMS
orthologs
Supplementary Figure S3. Related to Figure 2. Structures of EndoMS and type II restriction enzymes
Supplementary Figure S4. Related to Figure 3. Sequence and affinity of DNAs for EndoMS substrates
Supplementary Figure S5. Related to Figure 3. DNA bound-structures of EndoMS
Supplementary Figure S6. Related to Figure 4. Electron microscopic images and class averages of the
TkoEndoMS-TkoPCNA- dsDNA complex
Supplementary Table S1. Related to Figure 2. Structures similar to N- or C-terminal domains of EndoMS in the
PDB
Supplementary Table S2. Related to Figure 3. DNA base-pair and base-step parameters of EndoMS-bound
and canonical B-form DNAs
Supplementary References
-
Detailed Materials and Methods
Database survey. Homologs of TkoEndoMS were screened using amino acid sequence (UniProt) and structure (PDB)
databases 1,2. The amino acid sequence of TkoEndoMS (UniProt entry Q5JER9) was used for the query, and the
UniProtKB database was searched with BLAST 3. A total of 1,336 sequences were found to have E-values lower than
0.01 and were aligned with ClustalW; their molecular phylogenies were constructed with the neighbor-joining method
4,5. Multiple sequence alignment and the phylogenetic tree of the 42 EndoMS/NucS homologs, which were selected
according to their clustering on the tree and overall conservation with TkoEndoMS, are shown in Figures S1 and S2,
respectively. Putative EndoMS/NucS orthologs were detected in Archaea (Euryarchaeota, Crenarchaeota, and
Thaumarchaeota) and a group of Eubacteria (Actinobacteria and Deinococcus-Thermus).
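The E-value screen described above can be sketched as follows. This is only an illustration: it assumes the BLAST hits are available in the 12-column tab-separated layout written by the BLAST+ command-line tools (E-value in column 11), which may not be how the original search was run. The function name, accessions, and scores are hypothetical.

```python
# Minimal sketch: keep BLAST hits whose E-value is lower than 0.01.
# Assumption: 12-column tab-separated output; column 2 is the subject ID
# and column 11 is the E-value.
E_VALUE_CUTOFF = 0.01


def select_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return subject IDs (first occurrence only) with E-value < cutoff."""
    selected = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id, evalue = fields[1], float(fields[10])
        if evalue < cutoff and subject_id not in selected:
            selected.append(subject_id)
    return selected


# Hypothetical hits for illustration (not real search results)
example = "\n".join([
    "Q5JER9\thypo_A\t69.7\t240\t70\t1\t1\t240\t1\t239\t1e-120\t350",
    "Q5JER9\thypo_B\t31.0\t200\t130\t4\t5\t200\t8\t205\t0.004\t40.1",
    "Q5JER9\thypo_C\t25.0\t90\t60\t3\t10\t95\t20\t108\t0.5\t30.2",
])
print(select_hits(example))
```

The retained sequences would then be passed to the alignment and tree-building steps.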
The structural similarities between EndoMS/NucS and other proteins were examined in the PDB. Subunit A
of PabNucS (PDB: 2VLD) was divided into N-terminal (residues Lys3 to Glu120) and C-terminal (residues Ser126 to
Pro233) domains, and each domain was separately used for a query in a structural search with the fast structure
superposition application in the SIRD database system (http://sird.nagahama-i-bio.ac.jp/sird/). Subunits/domains that
showed less than 4.0 Å root mean square deviation (RMSD) for more than 50 residues were retrieved (Table S1).
Redundant hits on similar structures (more than 20% sequence identity) were discarded.
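The retrieval criteria above (RMSD below 4.0 Å over more than 50 residues, then removal of hits with more than 20% sequence identity to one another) can be sketched as below. The greedy order used for the redundancy step, the dictionary layout, and all entries are assumptions for illustration; the SIRD system may apply these criteria differently.

```python
def retrieve_similar(hits, max_rmsd=4.0, min_residues=50):
    """Keep hits with RMSD < max_rmsd over more than min_residues residues."""
    return [h for h in hits if h["rmsd"] < max_rmsd and h["n_res"] > min_residues]


def discard_redundant(hits, pair_identity, max_identity=20.0):
    """Greedy redundancy filter (an assumption, not the documented method):
    visit hits by increasing RMSD and drop any hit sharing more than
    max_identity percent sequence identity with a hit already kept."""
    kept = []
    for hit in sorted(hits, key=lambda h: h["rmsd"]):
        redundant = any(
            pair_identity.get(frozenset((hit["id"], k["id"])), 0.0) > max_identity
            for k in kept
        )
        if not redundant:
            kept.append(hit)
    return kept


# Hypothetical structural-search hits
hits = [
    {"id": "hypA", "rmsd": 3.4, "n_res": 91},
    {"id": "hypB", "rmsd": 3.6, "n_res": 95},   # close homolog of hypA
    {"id": "hypC", "rmsd": 3.9, "n_res": 48},   # too few superposed residues
    {"id": "hypD", "rmsd": 4.3, "n_res": 120},  # RMSD too high
]
pair_identity = {frozenset(("hypA", "hypB")): 45.0}  # percent identity

result = discard_redundant(retrieve_similar(hits), pair_identity)
print([h["id"] for h in result])
```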
The results revealed that a few proteins showed structures similar to that of the N-terminal domain.
Although the proteins appeared to bind RNA, none of the structures of these proteins had been determined in complex
with RNA; thus, these proteins did not provide useful information for modeling of EndoMS-dsDNA interactions. In
contrast, many DNA-binding proteins were detected for the C-terminal domain. This was expected because the
C-terminal domain of EndoMS belongs to a fold family that contains a large variety of restriction, recombination,
and repair nucleases 6,7. A previous study reported that the C-terminal domain of PabNucS is structurally similar
to RecB 8. However, the current results emphasized the similarities between EndoMS and type II restriction enzymes.
The type II restriction enzyme SgrAI was superposed with the C-terminal domain of EndoMS with an RMSD of 3.4 Å
for 91 residues; this was the most similar structure observed among known structures. Additionally, the restriction
enzymes FokI, AspBHI, Cfr10I, Bse634I, R.BspD6I, and BsoBI were detected. From this analysis, all of the detected
structures in complex with DNA were those of restriction enzymes (Figure S3). Therefore, these structures were
considered in modeling the EndoMS-dsDNA complex structure, as explained below.
The UCSC Archaeal Genome Browser (http://archaea.ucsc.edu/) and ProOpDB
(http://operons.ibt.unam.mx/OperonPredictor/) were referenced in order to investigate genomic structures proximal to
the EndoMS gene 9,10. In ProOpDB, operons of the endoMS gene were searched, and operons consisting of the gene and
radA-related genes were conserved only in the archaeal Thermococcus genus. The genome structures of the predicted
operons were retrieved from the UCSC Archaeal Genome Browser for T. kodakaraensis KOD1, T. gammatolerans, T.
onnurineus, T. sp. strain 4557, T. sibiricus, and T. barophilus MP (Figure S2) 11-15. STRING (http://string-db.org/) and
IntAct (http://www.ebi.ac.uk/intact/) databases were referenced for the experimentally detected protein-protein
interactions and for prediction of the functional relationships with EndoMS, respectively 16,17.
Knowledge-based modeling of the TkoEndoMS-dsDNA complex. According to the results of the database survey, two
of the EndoMS C-terminal domains were assumed to tether dsDNA in a manner similar to that of restriction enzymes.
BsoBI was employed as the reference structure for the modeling because its catalytic domain exhibited the highest
similarity to that of EndoMS in terms of motif conservation and sizes of insertions/deletions (Figure S3a). The
N-terminal domains were thought to take part in a dimer formation. The C-terminal domain of PabNucS (residues
Ser126 to Pro233, PDB: 2VLD) was superposed onto the corresponding domains of BsoBI (PDB: 1DC1) in order to
construct a C-terminal domain dimer on dsDNA (Figure 6). The dsDNA was truncated, and nucleotides were replaced
according to the cocrystallized structure (G-T-mismatch-DNA1 in Figure S4a). The N-terminal domains of PabNucS
(residues Lys3 to Glu120) were dimerized as in the crystal structure of PabNucS. These N- and C-terminal models were
used as search models in molecular replacement, as detailed below.
Crystal structure analyses. The crystal structures of TkoEndoMS were determined with X-ray crystallography. The
mutant TkoEndoMS gene harboring the inactivating Asp165Ala (D165A) mutation was cloned, expressed, and purified as
described previously 18.
DNA-bound TkoEndoMS crystals were obtained under initial conditions using 80 mM sodium cacodylate
buffer (pH 6.5) containing 0.16 M calcium acetate, 14.4% (w/v) PEG8000, and 20% (w/v) glycerol for a 0.5-mL
reservoir and a mixture of 2 µL of the reservoir solution and 2 µL of protein solution in a 50 mM Tris-HCl (pH 8.0)
buffer containing 0.1 mM EDTA, 0.5 mM DTT, 10% (w/v) glycerol, 0.6 M NaCl, 200 µM TkoEndoMS dimer, and 200
µM dsDNA (T-G-mismatch-DNA1 in Figure S4a) for a hanging drop.
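As a worked example of the drop composition, the sketch below computes the concentrations at the moment the 2 µL of reservoir solution and 2 µL of protein solution are mixed, before any vapor equilibration against the reservoir. The helper name and dictionary keys are hypothetical, and only components whose values are quoted in the text are included (the reducing agent is left out for brevity).

```python
def mix(solution_a, volume_a, solution_b, volume_b):
    """Volume-weighted concentrations after combining two solutions."""
    total = volume_a + volume_b
    names = set(solution_a) | set(solution_b)
    return {
        name: (solution_a.get(name, 0.0) * volume_a
               + solution_b.get(name, 0.0) * volume_b) / total
        for name in names
    }


reservoir = {
    "sodium cacodylate (mM)": 80.0,
    "calcium acetate (M)": 0.16,
    "PEG8000 (% w/v)": 14.4,
    "glycerol (% w/v)": 20.0,
}
protein_solution = {
    "Tris-HCl (mM)": 50.0,
    "EDTA (mM)": 0.1,
    "glycerol (% w/v)": 10.0,
    "NaCl (M)": 0.6,
    "TkoEndoMS dimer (uM)": 200.0,
    "dsDNA (uM)": 200.0,
}

drop = mix(reservoir, 2.0, protein_solution, 2.0)  # volumes in microliters
for name in sorted(drop):
    print(f"{name}: {drop[name]:.3g}")
```

With equal volumes, every component unique to one solution is halved, while glycerol, present in both, ends up at the mean of the two values.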
Crystals grew at 18°C to approximate maximum dimensions of 0.3 × 0.3 × 0.01 mm³ within a few weeks.
X-ray diffraction data were collected from loop-mounted crystals under cryogenic conditions with a CCD detector
Quantum315 (ADSC) at BL38B1, SMART6500 (Bruker AXS) at BL44XU in SPring-8 (Hyogo, Japan), or Quantum
270 (ADSC) at BL17A in Photon Factory (Tsukuba, Japan). The crystals were soaked for 10–30 s in a crystal growth
buffer containing 15% (v/v) 2-methyl-2,4-pentanediol (MPD) for cryoprotection. The diffraction images were
processed using the MOSFLM program 19,20.
The crystal structure was solved with the molecular replacement method using the Phaser-MR application
of PHENIX or MOLREP of CCP4 suites 21,22. A solution of the EndoMS-dsDNA crystal structure was initially
attempted using the crystal structure of PabNucS (PDB: 2VLD) for search models. TkoEndoMS and PabNucS
demonstrated 69.7% identity in amino acid sequences. PabNucS as dimers, monomers, or separated N-terminal and
C-terminal domains was used in both all-atom and Cβ models for molecular replacements. However, no promising
solution was obtained.
Since the database survey clearly indicated the similarities between EndoMS and type II restriction
enzymes, knowledge-based models of C-terminal domain dimer with dsDNA and N-terminal domain dimer were
examined as search models. The domains in the models were rendered into Cβ models and were applied for Phaser-MR.
Under these conditions, a solution that was reasonable in terms of crystal packing, R-values, and electron density map
was obtained. The initial R/free R factors of the model were 0.377/0.366, while those after three cycles of rigid body,
coordinates, and simulated annealing refinements were 0.355/0.386 for reflections between 25.0 and 2.4 Å resolution.
In the map after refinement, the electron densities for most of the side chains were clearly observed as residual densities
(Figure S5a).
The model refinements were conducted by using COOT and the phenix.refine application of PHENIX 21,23.
Because EndoMS was in homodimer form with two-fold symmetry, dsDNA might bind to the protein in two different
(opposing) directions. Since the dual-conformation was actually implied in the electron density map as a residual
density, the dsDNA model was duplicated and those in opposite directions were built in as alternative coordinates of
equal occupancy (Figures S5c, d, e, f, and g). In the initial refinement process, high residual densities were observed
proximal to many of the bases, where only one of the alternative dsDNA models was considered (an example electron
density is shown in Figure S5f). A high electron density proximal to the Glu179 residues was interpreted as a
magnesium ion (Figure S5b), because the atoms positioned on this density showed interatomic distances of 2.15 Å on
average (s.d. 0.19 Å) to the coordinating oxygen atoms (values are those in the final models). Blobs of electron
density observed between proteins and DNA were modeled as MPD, a cryoprotectant used in diffraction experiments,
because corresponding densities were not observed when data were collected without the cryoprotectant but in the
presence of a higher (~35%) glycerol concentration (data not shown). Final R/free R factors of 0.190/0.244 were
obtained for this crystal structure (Table I).
Structural analyses of the TkoEndoMS-dsDNA complex were also attempted for DNAs with different
mismatched base pairs (G-G, T-T, A-C, or T-C) or background sequences (DNA1, DNA2, or DNA3; Figure S4a).
DNA1, DNA2, and DNA3 have no consensus base except for those in the mismatch pairs. Crystallization experiments
were carried out using the same conditions as described above, and crystals were obtained for mismatch base pairs of
G-T, G-G, or T-T regardless of the background sequence. On the other hand, no crystals grew for DNAs containing
A-C or T-C mismatches or a normal A-T base pair. Although screening for crystallization conditions was
repeated, no crystals were obtained for complexes containing these DNAs. The structures of T-T-mismatch-DNA1,
G-G-mismatch-DNA1, G-T-mismatch-DNA2, and G-T-mismatch-DNA3 complexes were solved via molecular
replacement using the structure of the G-T-mismatch-DNA1 complex as a search model and were refined with the same
procedures as described above.
The crystals of apo TkoEndoMS grew in a stock solution (50 mM Tris-HCl, pH 8.0, 0.1 mM EDTA, 0.5 mM
DTT, 10% [w/v] glycerol, 0.6 M NaCl, and 445 µM TkoEndoMS) stored at 4°C. X-ray diffraction data were collected
with a CCD detector Quantum 270 at BL17A in the Photon Factory under cryogenic conditions and processed as
described above. Analysis of reflection data with the Xtriage tool suggested that the crystal was twinned with an
operator (-h, l, k). Data collection from other apo-form crystals was carried out, and the crystals were always twinned
with fractions of 0.05–0.08. The crystal structure was solved with the molecular replacement method by separately
using N- and C-terminal domains of DNA-bound TkoEndoMS as search models. Model refinement was conducted as
mentioned above by applying the twin operator.
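The role of the twin operator can be illustrated with a short sketch. It assumes the usual description of hemihedral twinning, in which each observed intensity is a weighted sum of a reflection and its twin-related mate with twin fraction alpha; the function names and intensity values are hypothetical, and this is not the refinement code itself.

```python
def twin_mate(hkl):
    """Apply the twin operator (-h, l, k) to Miller indices (h, k, l)."""
    h, k, l = hkl
    return (-h, l, k)


def twinned_intensity(i_hkl, i_mate, alpha):
    """Observed intensity under the assumed hemihedral twinning model:
    I_obs(h) = (1 - alpha) * I(h) + alpha * I(twin mate of h)."""
    return (1.0 - alpha) * i_hkl + alpha * i_mate


reflection = (1, 2, 3)
mate = twin_mate(reflection)
print(mate)             # twin-related indices
print(twin_mate(mate))  # applying the operator twice restores the original

# Hypothetical untwinned intensities and a twin fraction within the
# 0.05-0.08 range reported for the apo crystals
print(twinned_intensity(100.0, 20.0, 0.05))
```

At twin fractions this low, each observed intensity is dominated by its own reflection, with a contribution of only a few percent from the twin mate.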
The quality of the models was evaluated by using the PROCHECK program 24. The parameters of dsDNA
were evaluated by using the Web 3DNA (w3DNA) web tool and compared with those of a canonical B-form DNA (PDB: 1BNA) (Table
S2) 25,26. The crystallographic parameters, data collection and refinement statistics, and PDB codes are summarized in
Table I. The molecular graphics were prepared with CHIMERA 27.
Evaluating binding affinities of TkoEndoMS for various mismatch-containing DNAs.
Electrophoretic mobility shift assays (EMSAs) were performed to quantify the DNA binding affinity of TkoEndoMS as
described previously 18. Various concentrations of TkoEndoMS were incubated with 5 nM 5'-Cy5-labeled dsDNA (15
bp) which had no mismatched base pair or contained a single base-pair mismatch (G-G, G-T, T-T, A-G, T-C, A-C, A-A,
and C-C) in a reaction solution (20 mM Tris-HCl, pH 8.0, 6 mM (NH4)2SO4, 2 mM MgCl2, 100 mM NaCl, 0.1 mg/ml
BSA, and 0.1 % Triton X-100) at 37°C for 5 min. Relatively low concentrations of protein (0.5, 1, 2.5, 5, and 10 nM as
a dimer) were examined for preferred mismatches (G-G, G-T, and T-T), and higher concentrations were tested for
non-preferred mismatches (A-G, T-C, A-C, A-A, and C-C) (Figure S4c). The protein-DNA complexes were
fractionated by 8% native PAGE in 0.5 × TBE buffer. A Typhoon Trio+ image analyzer and ImageQuant TL software
(GE Healthcare) were used to quantify the fluorescent signal in each DNA band. The apparent Kd values were
calculated by plotting the fraction of bound DNA against the EndoMS concentration and applying non-linear
regression to data from three independent experiments.
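The non-linear regression step can be sketched as below. The text does not state which binding model was fitted, so a simple one-site hyperbola (fraction bound = [P] / (Kd + [P])) is assumed here, ignoring depletion of free protein by the 5 nM DNA. The fitting routine is a generic golden-section search written for this example, and the data are synthetic values generated from the model rather than measurements.

```python
import math


def fraction_bound(protein_nM, kd_nM):
    """One-site binding isotherm (assumed model)."""
    return protein_nM / (kd_nM + protein_nM)


def sum_squared_error(kd_nM, conc, frac):
    return sum((fraction_bound(p, kd_nM) - f) ** 2 for p, f in zip(conc, frac))


def fit_kd(conc, frac, lo=1e-3, hi=1e3, tol=1e-6):
    """Least-squares estimate of Kd by golden-section search on log(Kd)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(lo), math.log(hi)
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sum_squared_error(math.exp(c), conc, frac) < sum_squared_error(math.exp(d), conc, frac):
            b = d
        else:
            a = c
    return math.exp((a + b) / 2.0)


# Protein concentrations used for the preferred mismatches (nM, as dimer)
conc = [0.5, 1.0, 2.5, 5.0, 10.0]
# Synthetic, noise-free fractions generated with a hypothetical Kd of 2 nM
frac = [fraction_bound(p, 2.0) for p in conc]

kd = fit_kd(conc, frac)
print(f"apparent Kd = {kd:.3f} nM")
```

With real gel quantifications, the fractions from the three replicates would be pooled or averaged before fitting, and a depletion-corrected (quadratic) isotherm may be more appropriate when Kd is close to the DNA concentration.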
Knowledge-based modeling of the TkoEndoMS-TkoPCNA-dsDNA complex. The structure of the
TkoEndoMS-TkoPCNA-dsDNA complex was modeled by assembling the structures from PDB as has been previously
applied for the P. furiosus DNA polymerase B-PCNA-dsDNA complex 28. Interface information for EndoMS-dsDNA,
PCNA-dsDNA, and EndoMS-PCNA was retrieved from the crystal structures determined in this study, the E. coli DNA
polymerase III β subunit-DNA complex (PDB: 3BEP), and the P. furiosus PCNA-RFCL PIP-box complex (PDB: 1ISQ),
respectively 29,30.
First, the crystal structure of the TkoEndoMS-dsDNA complex was assembled onto that of the
PCNA-dsDNA complex by superposing the dsDNAs while gradually shifting the nucleotide segments used for superposition
(Figure 6). The dsDNA on PCNA was extended in advance by 5 base pairs to the PIP-binding side of the trimer to
increase the margin for DNA superposition. The superposed structure, in which EndoMS was placed in contact with
PCNA, but not so close as to cause steric hindrance, was selected for the next step. Second, the PIP-box of RFCL was
assembled to the model by superposing the PCNA subunits from the PCNA-dsDNA and PIP-box-PCNA complexes.
Finally, the C-terminal residues, which were disordered in the crystal structures, were added to the model to complete
the EndoMS structure by connecting the PIP-box to the C-terminus of the TkoEndoMS crystal structure (Arg240).
Electron microscopy and single particle image analysis. The stock solutions of purified TkoEndoMS (5 µM, i.e., 2.5
µM dimer), TkoPCNA (7.5 µM, i.e., 2.5 µM trimer), and synthetic dsDNA (2.5 µM, T-G-mismatch-DNA1’ in Figure
S4a) were mixed in a buffer solution (50 mM Tris-HCl, 0.1 mM EDTA, 0.5 mM DTT, 10% glycerol, 0.6 M NaCl, pH
8.0) and incubated at room temperature for 10 min.
The sample solution was diluted 20-fold in a buffer (50 mM Tris-HCl, 50 mM NaCl, pH 8.0), and 3 µL was
applied to a copper grid supporting a continuous thin-carbon film. The sample was left for 1 min and then stained with
three drops of 2% (w/v) uranyl acetate. Images of molecules were recorded with a TemCam-F216 CMOS camera
(TVIPS) with a pixel size of 3.0 Å/pixel (a total of 38 images) using a JEM1010 EM (JEOL), operated at an
accelerating voltage of 100 kV. A minimum dose system (MDS) was employed to reduce the electron radiation damage
of the sample.
Image analyses were carried out using the EMAN2 program suite 31. A total of 5,130 particle images of the
assumed TkoEndoMS-TkoPCNA-DNA complex were manually selected with the e2boxer tool. No filter was applied to
the individual images prior to image analysis. Alignment, classification, and averaging of the particle images were
performed using the e2refine2d tool. The average number of particles per class average was 65 (Figure S6c). The model
of the TkoEndoMS-TkoPCNA-DNA complex was used to generate density map projections with e2pdb2em and
e2proc2d tools for comparisons.
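A quick consistency check relates the imaging and classification numbers given in this document. The 19.2 nm image size is taken from the legend of Figure S6. The resulting box size in pixels and the approximate number of classes are derived here by arithmetic and are not stated in the text.

```python
PIXEL_SIZE_ANGSTROM = 3.0   # pixel size of the recorded images
IMAGE_EDGE_NM = 19.2        # class-average image size (Figure S6 legend)
N_PARTICLES = 5130          # manually selected particle images
MEAN_PER_CLASS = 65         # average number of particles per class average

# 1 nm = 10 angstroms
box_edge_pixels = round(IMAGE_EDGE_NM * 10.0 / PIXEL_SIZE_ANGSTROM)
approx_class_count = round(N_PARTICLES / MEAN_PER_CLASS)

print(f"box edge: {box_edge_pixels} pixels")
print(f"approximate number of classes: {approx_class_count}")
```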
-
Table S1. Related to Figure 2. Structures similar to N- or C-terminal domains of EndoMS in PDB

PDB/Chain*  Schematic alignment**                               RMSD (Å)  Identity (%)  No. res  Coverage (%)  Note
N-terminal
2vldA       ==================================================  -         -             -        -             Query
1vu2K       ______=========-_____===--======-__________======-  3.5       7             63       81            Small nuclear ribonucleoprotein E
3sb2E       _____-=========-_____-==-_________-=====-__=======  3.5       15            60       90            RNA chaperone Hfq
C-terminal
2vldA       ==================================================  -         -             -        -             Query
3n78A++     _======-_--========_==========================____  3.4       8             91       27            Type II restriction endonuclease SgrAI
2fokB+      _-======__-===-===================_==============-  3.5       11            98       17            Type II restriction endonuclease FokI
3qp9D       __======_======-____=====-=======================-  3.5       8             90       19            C2-type Ketoreductase
4oc8A+      =-===============================__-=============_  3.5       12            102      26            Restriction endonuclease AspBHI
1cfrA+      _=======_=====-==================================_  3.6       13            102      36            Restriction endonuclease Cfr10I
3v21D++     _======-__====-==================================_  3.6       10            99       34            Type IIF restriction endonuclease Bse634I
1z22A       __=====-_-======____=====-========================  3.7       15            90       54            Rab23 GTPase
4impA       _-======_-=====--___=====--======================-  3.7       10            90       17            Polyketide synthase
2p14A+      =-======_-===-_-=================================_  3.8       10            99       53            Type IIS restriction endonuclease R.BspD6I
1a2kE       __=====-_=====--=___=====--=======================  3.9       6             90       43            Ras-family GTPase Ran
1xmxA       ========_-======================================-_  3.9       7             104      27            Hypothetical protein VC1899
3agpA       _-=====-_======-____====--========================  3.9       7             90       7             Probable SecDF protein-export membrane protein
3svtA       __=====-__-==--==-__-=============================  3.9       6             90       32            Short-chain type dehydrogenase/reductase
3u2qA       __======_-======-___-===--========================  3.9       10            90       23            Elongation factor Tu 1
4bxzE       --_=========____=================-_==============_  3.9       7             91       42            RNA polymerase subunit RPABC1
1dc1D++     _=======--=========-=============================_  4.0       8             102      32            Restriction enzyme BsoBI
1wtdB+      ========____-======================-==========____  4.0       17            90       33            Type II restriction endonuclease EcoO109I
2okfA       __===============================================-  4.0       13            106      82            FdxN element excision controlling factor protein
2reqC       __======-========--_=====_-======================_  4.0       6             93       12            Methylmalonyl CoA mutase

* PDB code and asym id (chain ID) are indicated with ‘+’ (DNA unbound) or ‘++’ (DNA bound) for type II restriction enzymes.
** The regions of the proteins superposed to EndoMS domains are indicated with ‘=’ (without gap) or ‘-’ (including gap).
-
Supplementary Table S2. Related to Figure 3. DNA base-pair and base-step parameters of EndoMS-bound and canonical B-form DNAs*

PDB: 5gke, EndoMS-bound dsDNA; 1bna, canonical B-form dsDNA.

PDB     Conf.  Watson  Crick   Shear  Stretch  Stagger  Buckle  Propeller  Opening  Shift  Slide  Rise   Tilt    Roll  Twist  Form
5gke    A      C1C     G15D    -1.13    -0.21     0.22   -5.08     -14.38     2.22      -      -     -      -       -      -  -
5gke    A      G2C     C14D    -0.53    -0.20    -0.52  -14.51     -15.33    -6.30  -0.29   0.42  3.66   7.15    8.91  43.71  B
1bna           C1A     G24B    -0.42    -0.27     0.06    2.76     -14.20    -3.67      -      -     -      -       -      -  -
5gke    A      C3C     G13D    -1.60     0.17     0.29   -6.61      -3.32     8.08   0.65  -0.25  3.11  -7.50    1.79  27.29  B
1bna           G2A     C23B    -0.02    -0.27     0.25   -4.46     -10.85    -4.02  -0.36   0.15  3.52  -3.40    6.42  40.31  B
5gke    A      T4C     A12D    -0.11    -0.39     0.10    1.64      -9.91    -6.11  -0.71   0.00  3.13  -1.70    2.98  35.52  B
1bna           C3A     G22B     0.00    -0.25     0.21   -6.94      -3.93    -2.35   0.50   0.23  3.52   0.80   -4.73  38.15  B
5gke    A      A5C     T11D    -0.17    -0.37     0.09    7.11       0.20    -3.88   0.39  -0.01  3.24   0.71    3.79  33.51  B
1bna           G4A     C21B    -0.37    -0.44    -0.18    9.31     -10.39    -1.30  -0.32   0.69  3.04   3.63    7.95  24.47  B
5gke    A      C6C     G10D    -0.51    -0.34    -0.10    6.38     -11.90    -2.06  -0.65  -0.72  3.31  -1.77    1.07  34.52  B
1bna           A5A     T20B     0.27    -0.22     0.03    5.03     -16.36     1.84   0.01   0.07  3.36  -2.68    3.16  40.90  B
5gke    A      A7C     T9D      0.58    -0.37     0.29    8.52     -16.26    -6.81   0.21  -0.10  3.47  -2.86    6.62  44.52  B
1bna           A6A     T19B    -0.09    -0.04     0.17    3.54     -18.13     5.56   0.10  -0.31  3.32  -0.70    0.95  35.35  B
5gke    A      T8C     G8D
5gke    A      G9C     C7D      0.48    -0.51     0.44   -1.27     -13.66    -3.91   0.02  -1.56  3.58  -2.40    1.44  17.07
1bna           T7A     A18B     0.32    -0.12     0.13    0.83     -17.70     7.93   0.33  -0.60  3.34   1.83   -2.75  34.76  B
5gke    A      T10C    A6D     -0.77    -0.24    -0.25    1.80     -15.84     1.80   0.13  -0.59  3.12   7.96    8.84  31.28
1bna           T8A     A17B     0.25    -0.21    -0.10   -1.33     -17.67     0.83  -0.31  -0.18  3.32   2.96    0.73  35.39  B
5gke    A      C11C    G5D     -0.31     0.12    -0.13   -7.02      -7.51     3.26   0.96   0.14  3.60   1.69    1.23  41.03  B
1bna           C9A     G16B    -0.02    -0.25    -0.06  -10.18     -17.25    -0.87   0.02  -0.03  3.39   0.33   -0.05  39.27  B
5gke    A      G12C    C4D     -0.90     0.09     0.24   -0.19      -8.89     3.23  -0.44   0.01  3.13  -5.84    6.99  27.40  B
1bna           G10A    C15B     0.09    -0.28     0.27    1.67      -5.31    -1.13   0.38   0.86  3.24  -3.29    3.86  29.40  B
5gke    A      T13C    A3D     -0.02    -0.14    -0.01   11.06      -8.95    -4.56  -0.26  -0.42  3.05   1.56    6.07  32.95  B
1bna           C11A    G14B     0.07    -0.28     0.59   -3.96     -18.05    -5.62  -1.30   0.42  3.68  -4.68  -12.20  40.78  B
5gke    A      C14C    G2D      0.17    -0.29    -0.04    0.68      -7.05    -5.65  -0.02  -0.33  3.52  -0.41    3.73  35.77  B
1bna           G12A    C13B    -0.53    -0.11     0.26    6.60       1.96    -3.86   0.77   0.06  3.23   3.14   -3.09  32.62  B
5gke    A      C15C    G1D      0.22    -0.42    -0.02    7.83     -17.29     2.66   1.24  -0.23  3.16   3.66    7.84  38.28
5gke    B      G1C     C15D     0.88    -0.16     0.15  -16.59      -7.85   -10.70      -      -     -      -       -      -  -
5gke    B      G2C     C14D    -0.85    -0.38    -0.45  -16.64     -16.11    -7.55   0.06  -0.50  3.44   9.19    8.43  29.44
5gke    B      A3C     T13D     0.12    -0.25     0.18   -3.52     -11.32     6.30   0.29   0.10  2.97  -6.12    0.41  37.50  B
5gke    B      C4C     G12D     0.58     0.06     0.12   -3.12      -0.76     3.49   0.10  -0.30  3.35   0.45    2.10  33.79  B
5gke    B      G5C     C11D    -0.41     0.02     0.09    5.70      -5.31     5.95   0.50   0.01  3.08   2.60    8.58  28.19  B
5gke    B      A6C     T10D     0.07     0.18    -0.15    5.60     -11.81     0.00  -1.07   0.18  3.42   0.10   -0.93  38.67  B
5gke    B      C7C     G9D      0.34     0.14     0.44    2.97     -13.28     1.64   0.26  -0.43  3.39  -6.21    7.68  38.10
5gke    B      G8C     T8D
5gke    B      T9C     A7D      0.12     0.13     0.11   -7.70     -18.01    -2.61   0.16  -1.45  3.55   4.33    3.18  15.36
5gke    B      G10C    C6D      0.18    -0.08    -0.11   -4.79      -9.37     0.01  -0.23   0.07  3.36   1.99    8.10  40.50  B
5gke    B      T11C    A5D      0.49    -0.37     0.00   -3.53      -3.61    -4.16   0.50  -0.53  3.30   0.11   -0.88  37.26  B
5gke    B      A12C    T4D      0.43    -0.33    -0.10    3.28      -6.62    -3.23  -0.41  -0.27  3.13   1.60    9.30  31.49  B
5gke    B      G13C    C3D     -0.72    -0.18     0.45    6.25      -3.56     2.51   0.16  -0.72  3.13  -5.45    2.71  26.21  B
5gke    B      C14C    G2D      0.65    -0.10     0.28   -1.70      -8.10     4.10   0.33  -0.48  3.53   1.79    2.12  40.44
5gke    B      G15C    C1D     -0.33    -0.11     0.44   13.00     -15.65     3.35  -0.22  -0.40  2.89  -1.72   12.19  25.95

* Base pairs superposed in Figures 3c and S3e are aligned with each other: each 1bna base pair is listed directly below the 5gke base pair (conformation A) with which it is aligned. Conf. shows alternative conformations A and B. Bases (in Watson and Crick strands) are indicated as base, residue number, and chain ID. Form indicates A, B, or other (blank) form of double helix.
-
Supplementary Figure S1. Related to Figure 2. Alignment of putative EndoMS orthologs
Sequence alignment of the selected EndoMS homologs. The sequences are indicated as gene ID, genus, and species,
followed by higher order taxonomy, where Arc and Bac represent kingdom Archaea and Eubacteria, respectively. Eur,
Cre, Tha, Act, and DeiThe indicate the phyla Euryarchaeota, Crenarchaeota, Thaumarchaeota, Actinobacteria, and
Deinococcus-Thermus, respectively (species names are truncated in the alignment). Invariant amino acids are shaded,
and the residues consistent with the signature motif of type II restriction enzymes are underlined in the alignment.
-
-
Supplementary Figure S2. Related to Figure 2. Molecular phylogeny and genome structure of putative EndoMS
orthologs
(a) Molecular phylogeny of the selected EndoMS homologs. The sequences are indicated as gene ID, genus, and species,
followed by higher order taxonomy, where Arc and Bac represent kingdom Archaea and Eubacteria, respectively. Eur,
Cre, Tha, Act, and DeiThe indicate the phyla Euryarchaeota, Crenarchaeota, Thaumarchaeota, Actinobacteria, and
Deinococcus-Thermus, respectively (species names are truncated in the alignment). Invariant amino acids are shaded,
and the residues consistent with the signature motif of type II restriction enzymes are underlined in the alignment. (b)
Probable EndoMS operons in Thermococcus genomes are shown for T. kodakaraensis KOD1, T. gammatolerans, T.
onnurineus, T. sp. strain 4557, T. sibiricus, and T. barophilus MP. Genes are shown as arrows, and possible operons are
boxed. Names of the proteins encoded by the genes are abbreviated as MTAP (5'-methylthioadenosine phosphorylase),
MTAD (S-adenosylhomocysteine deaminase), SK channel (calcium-gated potassium channel), GK (glycerate kinase),
and CDC6 (cell division control protein 6).
-
Supplementary Figure S3. Related to Figure 2. Structures of EndoMS and type II restriction enzymes
(a) Amino acid sequence alignment of TkoEndoMS and type II restriction enzymes with similar catalytic domain
structures, namely, the type II restriction endonuclease SgrAI (3n78_A), type IIF restriction endonuclease Bse634I
(3v21_D), and restriction enzyme BsoBI (1dc1_D). The secondary structures of each protein are indicated with yellow
and green boxes for β-sheets and α-helices, respectively. The residues consistent with the signature motif of the type II
restriction enzyme are indicated with asterisks. (b) The quaternary structures of type II restriction enzymes and
TkoEndoMS are shown on top. The structures are superposed at the catalytic domains. The corresponding C-terminal
catalytic domains (green and pink), N-terminal dimerizing domains (blue and tan), and dsDNA (red and orange) are
colored for each protein. The superposition between catalytic domains of EndoMS (blue) and restriction enzymes
(green, tan, and sky blue for 3n78, 3v21, and 1dc1, respectively) are shown at the bottom. Catalytic residues of
EndoMS are shown as ball-and-stick models.
-
Supplementary Figure S4. Related to Figure 3. Sequence and affinity of DNAs for EndoMS substrates
(a) Nucleotide sequences of the dsDNAs cocrystallized in X-ray analyses or complexed in EM analyses with
TkoEndoMS. The mismatch bases are indicated as bases not in-line with the rest of the sequence, and the cleavage
positions are indicated with vertical lines. A comparison among the background sequences of DNA1, DNA2, and
DNA3 is shown on the right. The strands were selected such that base identities were as high as possible. (b)
Representative gel image for evaluating binding affinities of TkoEndoMS for DNA containing G-G, G-T, or T-T
mismatch, which are preferably recognized by the enzyme. The mismatched base pair and the protein concentrations
are shown above, and apparent Kd values are indicated below the gel image. (c) Representative gel image for
evaluating binding affinities of TkoEndoMS for DNA containing A-G, T-C, A-C, A-A, or C-C mismatches, which were
not bound by the enzyme.
-
Supplementary Figure S5. Related to Figure 3. DNA bound-structures of EndoMS
(a) The electron density map of TkoEndoMS in complex with G-T-mismatch-DNA1 contoured at 1.2 σ is shown on the
model. The atoms are colored as in Figure 1a. (b) The omit (Fo - Fc) maps of TkoEndoMS in complex with
G-T-mismatch-DNA1 contoured at 5.0 σ are shown on the model. The nucleotides at the 5′ and 3′ sides of the cleaved bond
and the Mg2+ ion were omitted for the map shown in green. The mismatched nucleotides (only one of them is in sight) were
omitted for the map shown in magenta. (c) 2Fo - Fc map contoured at 1.2 σ around the binding site of flipped bases
(G8A and T8B in alternative dsDNA models). (d) The omit (Fo - Fc) map of the same region as panel c contoured at
3.7 σ, where T8B was omitted. (e) The omit (Fo - Fc) map of the same region as panel c contoured at 3.7 σ, where G8A
was omitted. (f) The map of residual electron density contoured at 3.5 σ around the T9A-A7A or G9B-C7B pair. The
carbon atoms in the bases in alternative dsDNA model A (T9A-A7A) and B (G9B-C7B, which were not modeled at this
point) are shown in gray and white, respectively. (g) Alternative binding of dsDNA on TkoEndoMS. The same dsDNA
models were bound in opposite directions to each other on TkoEndoMS homo-dimer, and colored in red (alternative
model A) and yellow (alternative model B). (h) Superposition of TkoEndoMS-dsDNA complex structures. The proteins
in T-G-mismatch-DNA1, T-T-mismatch-DNA1, G-G-mismatch-DNA1, T-G-mismatch-DNA2, and
T-G-mismatch-DNA3 complexes are shown in light blue, tan, light green, pink, and gray, respectively. (i) A model of
dsDNA in B-form (PDB: 1BNA, colored in blue and sky blue) was superposed onto the bound dsDNA
(T-G-mismatch-DNA1, red and yellow) by ignoring the flipped-out mismatched bases. The base-pair and base-step
parameters were compared between the dsDNAs and are summarized in Table S2, which shows that the dsDNA bound to
TkoEndoMS is in the B-form overall.
-
Figure S6. Related to Figure 4. Electron microscopic images and class averages of the TkoEndoMS-TkoPCNA-
dsDNA complex
(a) An electron micrograph of the negatively stained TkoEndoMS-TkoPCNA-dsDNA complex (scale bar represents
100 nm). A large aggregate and several isolated particles (enclosed in boxes with white lines) are noticeable. (b) Close-up
image of the boxed area (black line) in panel a. (c) Two-dimensional class averages of the selected particles. The
size of each individual image is 19.2 × 19.2 nm². The class averages indicated with white dots were compared with the
model projections in Figure 4c.
-
SUPPLEMENTARY REFERENCES
1. UniProt Consortium. Activities at the Universal Protein Resource (UniProt). Nucleic Acids Res 42, D191-8 (2013).
2. Berman, H., Henrick, K. & Nakamura, H. Announcing the worldwide Protein Data Bank. Nat Struct Biol 10,
980 (2003).
3. Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. Basic local alignment search tool. J Mol
Biol 215, 403-10 (1990).
4. Saitou, N. & Nei, M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol
Biol Evol 4, 406-25 (1987).
5. Thompson, J.D., Higgins, D.G. & Gibson, T.J. CLUSTAL W: improving the sensitivity of progressive
multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix
choice. Nucleic Acids Res 22, 4673-80 (1994).
6. Aggarwal, A.K. Structure and function of restriction endonucleases. Curr Opin Struct Biol 5, 11-9 (1995).
7. Aravind, L., Makarova, K.S. & Koonin, E.V. SURVEY AND SUMMARY: holliday junction resolvases and
related nucleases: identification of new families, phyletic distribution and evolutionary trajectories. Nucleic
Acids Res 28, 3417-32 (2000).
8. Ren, B. et al. Structure and function of a novel endonuclease acting on branched DNA substrates. EMBO J 28,
2479-89 (2009).
9. Chan, P.P., Holmes, A.D., Smith, A.M., Tran, D. & Lowe, T.M. The UCSC Archaeal Genome Browser: 2012
update. Nucleic Acids Res 40, D646-52 (2011).
10. Taboada, B., Ciria, R., Martinez-Guerrero, C.E. & Merino, E. ProOpDB: Prokaryotic Operon DataBase.
Nucleic Acids Res 40, D627-31 (2011).
11. Fukui, T. et al. Complete genome sequence of the hyperthermophilic archaeon Thermococcus kodakaraensis
KOD1 and comparison with Pyrococcus genomes. Genome Res 15, 352-63 (2005).
12. Lee, H.S. et al. The complete genome sequence of Thermococcus onnurineus NA1 reveals a mixed
heterotrophic and carboxydotrophic metabolism. J Bacteriol 190, 7491-9 (2008).
13. Mardanov, A.V. et al. Metabolic versatility and indigenous origin of the archaeon Thermococcus sibiricus,
isolated from a Siberian oil reservoir, as revealed by genome analysis. Appl Environ Microbiol 75, 4580-8 (2009).
14. Vannier, P., Marteinsson, V.T., Fridjonsson, O.H., Oger, P. & Jebbar, M. Complete genome sequence of the
hyperthermophilic, piezophilic, heterotrophic, and carboxydotrophic archaeon Thermococcus barophilus MP. J
Bacteriol 193, 1481-2 (2011).
15. Zivanovic, Y. et al. Genome analysis and genome-wide proteomics of Thermococcus gammatolerans, the most
radioresistant organism known amongst the Archaea. Genome Biol 10, R70 (2009).
16. Franceschini, A. et al. STRING v9.1: protein-protein interaction networks, with increased coverage and
integration. Nucleic Acids Res 41, D808-15 (2012).
17. Orchard, S. et al. The MIntAct project--IntAct as a common curation platform for 11 molecular interaction
databases. Nucleic Acids Res 42, D358-63 (2013).
18. Ishino, S. et al. Identification of a mismatch-specific endonuclease in hyperthermophilic archaea. Nucleic
Acids Res 44, 2977-2986 (2016).
19. Otwinowski, Z. & Minor, W. Processing of X-ray Diffraction Data Collected in Oscillation Mode. in Methods
in Enzymology, Volume 276: Macromolecular Crystallography, part A (eds. Carter, C.W., Jr. & Sweet,
R.M.) 307-326 (1997).
20. Battye, T.G., Kontogiannis, L., Johnson, O., Powell, H.R. & Leslie, A.G. iMOSFLM: a new graphical
interface for diffraction-image processing with MOSFLM. Acta Crystallogr D Biol Crystallogr 67, 271-81
(2011).
21. Adams, P.D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution.
Acta Crystallogr D Biol Crystallogr 66, 213-21 (2010).
22. Vagin, A. & Teplyakov, A. Molecular replacement with MOLREP. Acta Crystallogr D Biol Crystallogr 66,
22-5 (2010).
23. Emsley, P., Lohkamp, B., Scott, W.G. & Cowtan, K. Features and development of Coot. Acta Crystallogr D
Biol Crystallogr 66, 486-501 (2010).
24. Laskowski, R.A., Moss, D.S. & Thornton, J.M. Main-chain bond lengths and bond angles in protein structures.
J Mol Biol 231, 1049-67 (1993).
25. Zheng, G., Lu, X.J. & Olson, W.K. Web 3DNA--a web server for the analysis, reconstruction, and
visualization of three-dimensional nucleic-acid structures. Nucleic Acids Res 37, W240-6 (2009).
26. Drew, H.R. et al. Structure of a B-DNA dodecamer: conformation and dynamics. Proc Natl Acad Sci U S A 78,
2179-83 (1981).
27. Pettersen, E.F. et al. UCSF Chimera--a visualization system for exploratory research and analysis. J Comput
Chem 25, 1605-12 (2004).
28. Mayanagi, K. et al. Architecture of the DNA polymerase B-proliferating cell nuclear antigen (PCNA)-DNA
ternary complex. Proc Natl Acad Sci U S A 108, 1845-9 (2011).
29. Georgescu, R.E. et al. Structure of a sliding clamp on DNA. Cell 132, 43-54 (2008).
30. Matsumiya, S., Ishino, S., Ishino, Y. & Morikawa, K. Physical interaction between proliferating cell nuclear
antigen and replication factor C from Pyrococcus furiosus. Genes Cells 7, 911-22 (2002).
31. Tang, G. et al. EMAN2: an extensible image processing suite for electron microscopy. J Struct Biol 157, 38-46
(2007).