jcm accepts, published online ahead of print on 28...

34
1 Use of Illumina deep sequencing technology to differentiate hepatitis C virus variants. 1 2 Masashi Ninomiya, 1 Yoshiyuki Ueno, 1 Ryo Funayama, 2 Takeshi Nagashima, 2 Yuichiro 3 Nishida, 2 Yasuteru Kondo, 1 Jun Inoue, 1 Eiji Kakazu, 1 Osamu Kimura, 1 Keiko 4 Nakayama, 2 Tooru Shimosegawa 1 5 6 1. Division of Gastroenterology, Tohoku University of Medicine, Sendai, Japan, 2. 7 Division of Cell Proliferation, Tohoku University of Medicine, Sendai, Japan 8 9 10 11 Corresponding author: 12 Associate Prof. Dr. Yoshiyuki Ueno 13 Division of Gastroenterology, Tohoku University Graduate School of Medicine 14 1-1 Seiryo, Aobaku, Sendai, 9808574, Japan 15 Phone: 81-227177171; Fax: 81-227177177 16 E-mail: [email protected] 17 18 19 Word counts: Abstract: 230 20 21 Copyright © 2011, American Society for Microbiology. All Rights Reserved. J. Clin. Microbiol. doi:10.1128/JCM.05715-11 JCM Accepts, published online ahead of print on 28 December 2011 Copyright © 2012, American Society for Microbiology. All Rights Reserved. J. Clin. Microbiol. doi:10.1128/JCM.05715-11 JCM Accepts, published online ahead of print on 11 January 2012 on May 14, 2018 by guest http://jcm.asm.org/ Downloaded from

Upload: hoangtruc

Post on 11-Mar-2018

212 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

1

Use of Illumina deep sequencing technology to differentiate hepatitis C virus variants. 1

2

Masashi Ninomiya,1 Yoshiyuki Ueno,1 Ryo Funayama,2 Takeshi Nagashima,2 Yuichiro 3

Nishida,2 Yasuteru Kondo,1 Jun Inoue,1 Eiji Kakazu,1 Osamu Kimura,1 Keiko 4

Nakayama,2 Tooru Shimosegawa1 5

6

1. Division of Gastroenterology, Tohoku University of Medicine, Sendai, Japan, 2. 7

Division of Cell Proliferation, Tohoku University of Medicine, Sendai, Japan 8

9

10

11

Corresponding author: 12

Associate Prof. Dr. Yoshiyuki Ueno 13

Division of Gastroenterology, Tohoku University Graduate School of Medicine 14

1-1 Seiryo, Aobaku, Sendai, 9808574, Japan 15

Phone: 81-227177171; Fax: 81-227177177 16

E-mail: [email protected] 17

18

19

Word counts: Abstract: 230 20

21

Copyright © 2011, American Society for Microbiology. All Rights Reserved.J. Clin. Microbiol. doi:10.1128/JCM.05715-11 JCM Accepts, published online ahead of print on 28 December 2011

Copyright © 2012, American Society for Microbiology. All Rights Reserved.J. Clin. Microbiol. doi:10.1128/JCM.05715-11 JCM Accepts, published online ahead of print on 11 January 2012

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 2: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

2

ABSTRACT 22

Hepatitis C virus (HCV) is a positive-strand enveloped RNA virus that shows diverse 23

viral populations even in one individual. Though Sanger sequencing has been used for 24

determining viral sequences, deep sequencing technologies are much faster and can 25

detect large-scale sequencing. We demonstrate the successful use of Illumina deep 26

sequencing technology and subsequent analyses to determine the genetic variants and 27

amino acid substitutions of both treatment-naïve (No.1) and treatment-experienced 28

(No.7) isolates of HCV infection. As a result, almost the full nucleotide sequence of 29

HCV was detectable in patients No.1 and No.7. The reads were mapped to the HCV 30

reference sequence. The coverage showed 99.8% and the average depth indicated 69.5X 31

in patient No.7 and 99.4% (coverage) and 51.1X (average depth) in patient No.1. In 32

patient No.7, aa. 70 in the core region showed arginine and methionine at aa. 91 by the 33

Sanger sequencing. Major variants showed the same amino acid sequence, but minor 34

variants were detectable in 18% (6/34) with substitutions of methionine by (leucine) 35

at aa. 91. In NS3, the position of 8 amino acid bases showed mixed variants, T72T/I, 36

K213K/R, G237G/S, P264P/S/A, S297S/A, A358A/T, S457S/C and I615I/M, in patient 37

No.7. In patient No.1, 3 amino acids indicated mix variants, L14L/F/V, S61S/A and 38

I586T/I. In conclusion, deep sequencing technologies are a powerful tool for obtaining 39

more profound insight into the dynamics of variants in the HCV quasispecies in human 40

serum. 41

42

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 3: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

3

INTRODUCTION 43

Hepatitis C virus (HCV) is a positive-strand enveloped RNA virus of approximately 44

9600 nucleotide bases, consisting of a single open reading frame and two untranslated 45

regions, and belongs to the genus Hepacivirus within the family Flaviviridae (6). A 46

single open reading frame encodes a polyprotein of 3011 amino acids that is cleaved by 47

viral and cellular proteases into 10 different proteins. The three structural proteins, 48

which constitute the viral particle, include the core protein and the envelope 49

glycoproteins E1 and E2. Two regions in E2, known as hypervariable regions 1 and 2, 50

are reported to have extreme sequence variability. The seven nonstructural 51

components include the p7 polypeptide, the NS2-3 protease, the NS3 serine protease 52

and RNA helicase, the NS4A polypeptide, the NA4B and NS5A protein and the NS5B 53

RNA-dependent RNA polymerase (RdRp) (29). In both ends of the open reading frame, 54

untranslated regions called 5’UTR and 3’UTR are constituted. The nucleotide sequence 55

of the 5'UTR is relatively well conserved among different HCV genotypes. The HCV 56

5’UTR contains an internal ribosome entry site (IRES) that directs the 57

cap-independent initiation of virus translation and forms on characteristic four 58

stem-loop structures (17, 18). HCV displays very high genetic variability both in 59

populations and within infected individuals, where it exists as a cluster of closely 60

related but distinct variants, termed “quasispecies”, as occurs in many other RNA 61

viruses with a polymerase enzyme lacking proofreading ability (6, 8, 26). 62

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 4: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

4

Current standard treatment of chronic HCV infection is based on the combination of 63

pegylated interferon-α (Peg-IFN-α) and ribavirin (RBV). However, patients with a high 64

load of genotype 1b (>1x105 LogIU/ml) do not achieve high sustained virological 65

response (SVR) rates (<50%), even when the most effective combination treatment 66

(IFN plus RBV) is administered for 48 weeks (14, 25). Some investigations concerning 67

therapeutic prediction based on virological features revealed that the substitutions of 68

arginine by glutamine at amino acid (aa.) 70, and/or leucine by methionine at aa. 91 in 69

the core region were independent and significant factors associated with SVR, or that 70

more than 4 amino acid changes in the NS5A-interferon sensitivity determining region 71

(ISDR) with the aa. 2209 to 2248 sequence had high responses to IFN therapy 72

compared to that of HCV-J (mutant type), whereas the patients with no amino acid 73

changes (wild type) and with 1 to 3 amino acid changes (intermediate type) had low 74

responses (1, 2, 9, 10). 75

Recently, direct-acting antiviral (DAA) molecules active on HCV such as NS3/4A 76

protease inhibitor, nucleoside/nucleotide analogue inhibitors of RNA-dependent RNA 77

polymerase (RdRp), non-nucleoside inhibitors of RdRp and NS5A inhibitors have been 78

developed. These DAA molecules either alone or in combination with Peg-IFN plus 79

RBV have recently been described as showing high antiviral effects (15, 21). However, 80

the problem that we have to consider next is viral resistance to DAAs due to the 81

selection of viral variants that contain amino acid substitutions altering the drug 82

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 5: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

5

target and rendering it less susceptible to the drug’s inhibitory activity (35). 83

Additionally, the drug-resistant variants already preexist as minor populations within 84

a patient’s quasispecies. Drug exposure intensively inhibits replication of the 85

drug-sensitive viral population and the resistant variants gradually predominate in the 86

HCV population (7). In the future, to determine the most appropriate treatment for 87

HCV patients, analysis of the nucleotide or amino acid sequence of HCV will become 88

important. 89

Initial attempts to identify the HCV genome sequence relied on Sanger sequencing 90

and the use of PCR primers to relatively conserved regions, methods that would likely 91

fail if the virus has more variants (32-34). In recent years, new technologies that are 92

able to sequence viruses from environmental samples without using specific primers, 93

cloning and resorting to recombinant DNA techniques can obtain the sequence 94

information for the complete virome in an unbiased way. Metagenomic approaches, 95

such as by deep sequencing, have proven increasingly successful for identifying the 96

variants or mutations of the nucleotide sequence (23, 42, 45). 97

Here, we demonstrate the successful use of Illumina deep sequencing technology and 98

subsequent analyses to determine the genetic variants and amino acid substitutions of 99

both treatment-naïve (No.1) and treatment-experienced with IFN (No.7) isolates of 100

HCV infection without using specific HCV primers. 101

102

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 6: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

6

MATERIALS AND METHODS 103

104

Patients. Two patients with chronic hepatitis C infection with genotype 1b and one 105

healthy control were enrolled this study. Each serum sample was collected before 106

treatment with Peg-IFN-α and RBV, and stored at -20 oC until testing. In the 107

laboratory data, the HCV viral load was 6.8 log IU/ml in patient No.1 who was 108

treatment-naïve and 7.0 log IU/ml in patient No.7 treated with IFNα2b and RBV for 6 109

months in 2002, but with no treatment effect. (COBAS TaqMan HCV Test; Roche 110

Molecular Systems, Plasanton, CA). More clinical information is described in Table 1. 111

112

Sanger sequencing in the core region and NS5A-ISDR. Total RNA was extracted from 113

100 μl serum samples using MagMAX Viral RNA Isolation kit (Ambion, Austin, TX), 114

and the RNA preparation thus obtained was subjected to complementary DNA (cDNA) 115

synthesis with reverse transcriptase (SuperScriptIII RNaseH- reverse transcriptase; 116

Invitrogen) and to PCR amplification using Prime STAR HS DNA polymerase (TaKaRa 117

Bio, Shiga, Japan) with nested primers derived from the core region and NS5A-ISDR of 118

the HCV genome. Nested PCR of the core region of the HCV genome was carried out 119

with primers C008 (sense, 5’-AAC CTC AAA GAA AAA CCA AAC G-3’) and C011 120

(antisense, 5’-CAT GGG GTA CAT YCC GCT YG-3’) in the first round for 35 cycles (98 121

oC for 10 sec, 55 oC for 15 sec, and 72 oC for 1 min, with an additional 7 min in the last 122

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 7: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

7

cycle) and C009 (sense, 5’-CCA CAG GAC GTY AAG TTC CC-3’) and C010 (antisense, 123

5’-AGG GTA TCG ATG ACC TTA CC-3’) in the second round for 25 cycles. Nested 124

primers that were derived from NS5A-ISDR of the HCV genomes were designed to 125

amplify a 188 bp product with C004 (sense, 5’-ATG CCC ATG CCA GGT TCC AG-3’) 126

and C005 (antisense, 5’-AGC TCC GCC AAG GCA GAA GA-3’) in the first round and 127

C006 (sense, 5’-ACC GGA TGT GGC AGT GCT CA-3’) and C007 (antisense, 5’-GTA ATC 128

CGG GCG TGC CCA TA-3’) in the second round. The PCR products were sequenced 129

directly on both strands using the BigDye Terminator version 3.1 Cycle Sequencing Kit 130

on an ABI PRISM 3100 Genetic Analyzer (Applied Biosystems, Foster City, CA). 131

Sequence analysis was performed using Genetyx-Mac ver. 12.2.6 (Genetyx Corp., Tokyo, 132

Japan) and ODEN (version 1.1.1) from the DNA Data Bank of Japan (National 133

Institute of Genetics, Mishima, Japan) (19). 134

135

Library preparation and Illumina sequencing. Total RNA was extracted from 800 μl of 136

serum using MagMAX Viral RNA Isolation kit (Ambion) according to the 137

manufacturer’s protocol with the slight modification that carrier RNA was not included. 138

A library was prepared from approximately 200 ng of total RNA using mRNA-seq sample 139

prep kit (Illumina, San Diego, CA). The quality of the library was evaluated with 140

Bioanalyzer (Agilent, Santa Clara, CA). Before deep sequencing, we confirmed the 141

presence of the HCV genome in the libraries by conducting quantitative PCR with 142

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 8: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

8

StepOnePlus ™ (Applied Biosystems) using SYBR premix Ex Taq (TaKaRa, Siga, 143

Japan) with specific primers C112 (sense, 5’-GCW GTS CAR TGG ATG AAC CG-3’) and 144

C113 (antisense, 5’-GCT YTC MGG CAC RTA GTG CG-3’) derived from the 81 bp 145

region of HCV-NS4B, and then loaded each sample on two or three lanes of a flow cell. 146

Libraries were clonally amplified on the flow cell and sequenced on the Illumina Genome 147

Analyzer IIx (SCS 2.8 software, Illumina, San Diego, CA) with single-end sequence of 148

76-mer. Image analysis and base calling were performed using the RTA 1.8 software. 149

150

Analysis. 76-mer single-end reads were classified by strict barcodes, split into 151

individual reads, and stripped of any remaining primer sequences using CLC Genomics 152

Workbench (4.6) (http://www.clcbio.co.jp). Sequence reads aligned to the human 153

genome by hg19 (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips 154

/chromFa.tar.gz), GenBank (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/ 155

bigZips/mrna.fa.gz), RefSeq (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/ 156

bigZips/refMrna.fa.gz) and Ensembl (ftp://ftp.ensembl.org/pub/release62/fasta/ 157

homo_sapiens/cdna/Homo_sapiens.GRCh37.62.cdna.all.fa.gz and ftp://ftp.ensembl. 158

org/pub/release62/fasta/homo_sapiens/ncrna/Homo_sapiens.GRCh37.62.ncrna.fa.gz) 159

were removed in the first mapping analysis of the human genome. Sequence reads not 160

of human origin were aligned with 970 reference HCV sequences registered at 161

Hepatitis Virus Database Server (http://s2as02. genes.nig.ac.jp/index.html) using BWA 162

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 9: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

9

(0.5.9-r16) with the mismatch within 5 to 10 nucleotide bases (24). The reads could be 163

defined as of HCV origin by identification with reference to the HCV sequences with 164

the mismatches within 10 nucleotide bases. The duplicate reads were completely 165

excluded to avoid sequence bias by Samtools (0.1.16) (24). Additionally, the variants 166

compared with HCV-J were identified by Samtools. The result of the analysis was 167

displayed using an Integrative genomics viewer (IGV) (2.0.3) (36). 168

169

Ethics statement. Written informed consent was obtained from each individual and 170

the study for detecting host genome was approved by the Ethics Committee of Tohoku 171

University School of Medicine (2010-404). 172

173

174

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 10: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

10

RESULTS 175

176

Evaluation of the quality of the library. We conducted the deep sequencing with two 177

patients (No.1: treatment-naïve and No.7: treatment-experienced with IFN) who had 178

been infected with chronic hepatitis C of genotype 1b, and one healthy control (No.9) 179

(Table 1). Since there is only a small quantity of circulating RNAs including those of 180

viral origin in serum, it would be important to evaluate the quality of the libraries. The 181

library showed good quality in patient No.7 using Agilent bioanalyzer, but the primer 182

and adaptor dimers were mixed in the libraries of patient No.1 and control No.9 (Fig. 183

1A). Before deep sequencing, we evaluated whether the HCV genome was included in 184

the libraries, and the amplification plot of quantitative PCR showed that the HCV 185

genome could be detected in both patients No.1 and No.7 with the specific primers C112 186

and C113 derived from the region of NS4B (Fig. 1B). 187

188

Distribution of free RNA in human serum. To characterize the metagenomics of HCV 189

infection in human, we analyzed the samples by single-end deep sequencing on three 190

lanes from patient No.7 and on two lanes from patient No.1 and control No.9 using an 191

Illumina Genome Analyzer IIx. After trimming the reads to exclude ambiguious 192

nucleotides, primers or adaptor sequences, 96,079,465 high-quality 76-bp reads were 193

subjected to analysis. From the initial set of the reads, a total of 86,868,619 reads were 194

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 11: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

11

alignable to the human genomic DNA. 195

Then we mapped the remaining 9,210,846 reads: 1,453,419 reads in patient No.1, 196

7,099,434 in No.7 and 657,993 in No.9 (Table 2). 197

198

Mapping of the HCV genome sequence. The reads were aligned to 970 HCV genome 199

sequences using BWA, allowing mismatches within 5 to 10 nucleotide bases. 200

Accordingly, MD5-1 (accession no. AF165053) was expected to be the closest HCV 201

strain to that in patient No.1 and MD2-2 (accession no. AF165048) the closest to that in 202

patient No.7. The reads obtained from healthy subject No. 9 were not aligned to MD5-1 203

or MD2-2 when allowing mismatches within 10 nucleotide bases (Supple. data set). 204

Whereas some strains, for example HC-J4, HCV-KT9 or HC-J6 et.al, could be mapped 205

to the reads from healthy subject No.9, all the reads were aligned at 3’UTR of the 206

U-rich region and we could not evaluate whether they were of HCV origin, Therefore, 207

we constituted the HCV genome sequence of patient No.1 and No.7 without 3’UTR. In 208

this alignment, the duplicate reads were completely excluded. In patient No.1, 6303 209

reads were mapped on MD5-1 when allowing within 10 mismatched nucleotide bases. 210

The coverage showed 99.4% and the average depth indicated 51.1X (Fig. 3). In No.7, 211

8583 reads could be identified with MD2-2. The coverage showed 99.8% and the 212

average sequencing depth indicated 69.5X (Fig. 2). 213

Of note, only in 8 nt at 5’UTR and 52 nt at the core, region in patient No.1 and 18 nt at 214

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 12: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

12

E2 region in patient No.7 the genome sequence could not be obtained. 215

216

Amino acid substitutions in the core region and NS5A-ISDR. To identify potential 217

mutations at key sites in the genome that mediate the effect of IFN-based therapy, we 218

compared the genome of patient No.7 with HCV-J, which is known as the prototypical 219

HCV 1b strain and whose complete genomic sequence has been determined (20). A 220

previous study reported the substitutions of aa. 70 and/or 91 in the core region and that 221

the number within three bases of substitutions in aa. 2209-2248 (NS5A-ISDR) might 222

be associated with resistance to IFN-based therapy (1, 2, 9, 10). Aa. 70 in the core 223

region showed arginine and methionine at aa. 91 by the Sanger sequencing. In deep 224

sequencing, major variants showed the same amino acid sequence, but minor variants 225

were detectable in 18% (6/34) with substitutions of methionine by leucine at aa. 91 (Fig. 226

4A). In NS5A-ISDR, no substitution was indicated in major variants, the same as that 227

of the direct sequence by the Sanger sequencing, but 16% (14/89) showed minor variant 228

substitutions of aspartic acid by valine in aa. 2220 (Fig. 4B). We validated that more 229

than 10% of the detected variants were effective. In patient No.1, amino acid bases of 230

core aa. 70,(arginine) core aa. 91 (leucine) and the number of NS5A-ISDR mutations (0) 231

were the same as that by the Sanger sequencing. 232

233

Amino acid substitutions in NS3 and NS5B. The recent development of DAA 234

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 13: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

13

molecules, such as protease inhibitor and polymerase inhibitor, has raised the concern 235

that resistance may weaken the effect of DAA-based therapy (35). It is necessary to 236

obtain the amino acid sequence of NS3 or NS5B that bear substitutions which alter the 237

target of the drug. In NS3 of patient No.7, 26 amino acids were changed in comparison 238

with the prototype HCV-J. The position of 8 amino acid bases showed mixed variants: 239

T72T/I (57 75%/21 27%), K213K/R (18 26%/52 74%), G237G/S (41 46%/49 54%), 240

P264P/S/A (20 42%/19 40%/9 19%), S297S/A (63 81%/15 19%), A358A/T (20 21%/75 241

79%), S457S/C (56 64%/32 36%) and I615I/M (42 46%/49 54%) (Table 3). In NS5B, A 242

full amino acid sequence was observed. 31 amino acids were altered in comparison with 243

HCV-J. No mixed variants were shown (Table 3). With patient No.1, a full amino acid 244

sequence was detected in NS3 and NS5B. When compared with HCV-J, 31 amino acids 245

were altered in NS3. 3 amino acids indicated mix variants L14L/F/V (67 89%/6 8%/2 246

3%), S61S/A (48 71%/20 29%) and I586I/T (4 12%/30 88%) (Table 3). In NS5B, 31 amino 247

acids were converted (Table 3). Of note, more than 10% of the minor variants were 248

confirmed as effective. 249

250

251

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 14: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

14

DISCUSSION 252

253

In this study, we attempted to detect the HCV genome directly in human serum 254

without using specific primers and succeeded in determining and certifying nearly the 255

full genome sequence and high genetic diversity using deep Illumina sequencing. HCV 256

has been already reported as a highly variable virus with a quasispecies distribution, 257

large viral populations, and very rapid turnover in individual patients (6, 26). Previous 258

studies of metagenomic sequencing of other viruses from human clinical samples 259

mostly employed pyrosequencing (11, 12, 23, 30, 46). The longer reads from 260

pyrosequencing (250-450 bp) facilitate the assembly of the individual reads into contigs, 261

which facilitate the classification of the sequence data by homology-based BLAST 262

alignment. In contrast to metagenomic analysis using pyrosequencing, Illumina 263

short-read sequencing enables an order of magnitude greater depth that is reflected in 264

a very low detection level. A recent report revealed that viral transcripts could be found 265

at frequencies less than 1 in 1,000,000 (28). However, because of short reads, de novo 266

assembly without any reference is difficult to conduct, so it is not suitable to discover 267

an unknown viral genome. However, it seems quite useful for resequencing or the 268

detection of variants of known viruses for which there have already been reported 269

abundant nucleotide sequence data. 270

We defined the Ilumina 76-reads as of HCV origin relying on the 970 HCV genome 271

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 15: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

15

sequences in this study. HCV shows considerable genetic diversity and has been 272

classified into 11 genogroups or 6 groups by the core, E1 or NS5B region with 273

nucleotide divergence. Only about 75% similarity was shown even in the same group in 274

the variable region (38, 39, 44). The reads were aligned to each of the 970 HCV isolates 275

with mismatches within 10 nucleotide bases. Under this condition, we could not map 276

the reads of HCV isolates without the 3’UTR region in patient No.9, who had not been 277

infected with HCV. This is because, 3’UTR has a U-rich region and it is impossible to 278

decide the reads of HCV origin with specificity. Therefore, we aligned the reads to the 279

HCV full sequence without 3’UTR in patients No.1 and No.7. 280

Many variants known as quasispecies with nucleotide bases were detected in both 281

patients No.1 and No.7. Taking a close look at the mixed variants at amino acid 282

substitutions in NS3, No.7 showed 8 variants, while there were only 3 in No.1. Recent 283

studies reported that HCV genomic sequences in treatment relapsers displayed 284

significantly more mutations than those in the nonresponders. The HCV sequence 285

analysis of a 4-year post anti-viral therapy follow-up revealed that a vast majority of 286

mutations selected during the therapy phase in the relapsers were maintained, while 287

very few new mutations arose during the 4-year post-therapy span (5, 47). Based on the 288

experiments mentioned above, treatment with IFN may lead to the emergence of 289

mutations. Deep sequencing is considered to be a useful tool to detect viral variants 290

and determine the mutational rate without cloning. 291

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 16: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

16

Although, there were only a few detected reads of HCV origin in from the serum, 292

metagenomic analysis could be conducted with the enormous data sets generated by 293

deep sequencing. Consequently, almost the full genome sequence of HCV was 294

demonstrated by using computational analysis of the sequential alignments of 295

individual reads, with an average depth 51.1X and 69.5X. However, the regions in 296

which we could not obtain the sequence were 8 nt at 5’UTR and 52 nt at the core, in 297

patient No.1 and 18nt at E2 in patient No.7. Since 5’UTR forms on characteristic 298

stem-loop structures, Sanger sequencing is generally difficult (31). Similarly, even deep 299

Illumina sequencing appeared to be difficult. Since the E2 region, known as a 300

hypervariable region, is reported to have extreme sequence variability, it was predicted 301

that there would be too many mismatches with the reference HCV genome and the 302

reads could not be mapped by this analysis. The analysis in this hypervariable region 303

will require further work. 304

Comparing the quality of the libraries of patients No.1 and No.7, that of patient No.7 305

was well refined, hence the quality is important to get larger amounts of expected reads. 306

However, a lot of duplicate reads were found in patient No.7 and, in fact, to analyze the 307

full sequence of HCV, it is considered to be sufficient to use two lanes by Illumina 308

sequencing. 309

Amino acid substitutions of core aa. 70 and 91 and in NS5A-ISDR as well as genetic 310

polymorphisms in the host IL28B gene encoding IFN-λ-3 on chromosome 19 affect the 311

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 17: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

17

outcome of interferon-based therapies for chronic C patients (1, 2, 9,10, 16, 40, 43). 312

Even if only an amino acid substitution of a major variant is assumed, the accurate 313

therapeutic effect would be impossible to predict. The proportion of minor variants may 314

change the therapeutic effect and the variants cannot be detected only by direct 315

sequence using Sanger sequencing. In fact, patient No.7 showed methionine at aa. 91 316

in the core region and no substitution in the NS5A-ISDR by direct Sanger sequencing, 317

but deep sequencing indicated minor variants of 18% at aa. 91 in the core region, and of 318

15% in the NS5A-ISDR. 319

Recently, DAA molecules have been developed for HCV therapies and these drugs 320

may lead to the selection of resistant viruses if administered alone. The 321

first-generation NS3/4A inhibitors are telaprevir and boceprevir, and most of the 322

reported clinical data on drug resistance were obtained from patients treated with 323

telaprevir. As an illustration, based on in vitro studies, telaprevir resistance related to 324

amino acid substitutions, V36A/M/C, T54A/S, R155K/T/Q, A155V/T and A156T has 325

been reported (4, 22, 37). Substitutions that generated boceprevir resistance included 326

those detected in a patient treated with telaprevir plus V170A/T and V55A (13, 41). Of 327

the non-nucleoside inhibitors of RdRp causing efficiency at NS5B, few have been 328

reported in the in vivo resistance data (27). This is because the studies of antiviral 329

efficacy are generally limited to 3-5 days. And yet, for example, the S282T substitution 330

has been reported to confer a loss of in vitro sensitivity to nucleoside/nucleotide 331

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 18: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

18

analogue inhibitors (3). In the future, the triple combination of Peg-IFN-α, RBV and a 332

protease inhibitor or several combinations of DAAs will soon become the standard 333

therapy for treatment-naïve and treatment-experienced patients with HCV genotype 1. 334

In this study, though neither patients No.1 nor No.7 showed drug-resistant variants, it 335

would be very important for the selection of therapy to identify resistant or minor 336

variants prior to treatment. Additionally, when treatment has failed, it is necessary to 337

consider viral factors as a cause. 338

In conclusion, deep sequencing technologies are a powerful tool for obtaining more 339

profound insight into the dynamics of variants in the HCV quasispecies in human 340

serum. Although, the cost of deep sequencing is still much greater than the reagent 341

costs for Sanger sequencing, it is still attractive in clinical medicine because deep 342

sequencing is able to generate much more information on the viral genome sequence in 343

internal organs. The cost will decrease in the future as the technology of deep 344

sequencing develops. As DAAs combination treatment of HCV infection is developed, 345

the sequence information of variants in individual cases using deep sequencing will be 346

feasible for determining the optimal antiviral treatment. 347

348

ACKNOWLEDGMENTS 349

350

We thank M. Tsuda, N. Koshita and K. Kuroda for technical assistance. We also 351

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 19: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

19

acknowledge the support of the Biomedical Research Core of Tohoku University 352

Graduate School of Medicine. This study was supported in part by Grant-in-Aid for Young 353

Scientists (B) from Ministry of Education, Culture, Sports, Science, and Technology of Japan 354

(assignment no. 22790627), and by grants from Ministry of Health, Labor, and Welfare of Japan. 355

356

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 20: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

20

REFERENCES 357

358

1. Akuta, N., F. Suzuki, Y. Kawamura, H. Yatsuji, H. Sezaki, Y. Suzuki, T. Hosaka, 359

M. Kobayashi, Y. Arase, K. Ikeda, Y. Miyakawa, and H. Kumada. 2007. 360

Prediction of response to pegylated interferon and ribavirin in hepatitis C by 361

polymorphisms in the viral core protein and very early dynamics of viremia. 362

Intervirology 50:361-368. 363

2. Akuta, N., F. Suzuki, H. Sezaki, Y. Suzuki, T. Hosaka, T. Someya, M. Kobayashi, 364

S. Saitoh, S. Watahiki, J. Sato, M. Matsuda, Y. Arase, K. Ikeda, and H. Kumada. 365

2005. Association of amino acid substitution pattern in core protein of hepatitis 366

C virus genotype 1b high viral load and non-virological response to 367

interferon-ribavirin combination therapy. Intervirology 48:372-380. 368

3. Ali, S., V. Leveque, S. Le Pogam, H. Ma, F. Philipp, N. Inocencio, M. Smith, A. 369

Alker, H. Kang, I. Najera, K. Klumpp, J. Symons, N. Cammack, and W. R. Jiang. 370

2008. Selected replicon variants with low-level in vitro resistance to the 371

hepatitis C virus NS5B polymerase inhibitor PSI-6130 lack cross-resistance 372

with R1479. Antimicrob. Agents Chemother. 52:4356-4369. 373

4. Barbotte, L., A. Ahmed-Belkacem, S. Chevaliez, A. Soulier, C. Hezode, H. 374

Wajcman, D. J. Bartels, Y. Zhou, A. Ardzinski, N. Mani, B. G. Rao, S. George, A. 375

Kwong, and J. M. Pawlotsky. 2010. Characterization of V36C, a novel amino 376

acid substitution conferring hepatitis C virus (HCV) resistance to telaprevir, a 377

potent peptidomimetic inhibitor of HCV protease. Antimicrob. Agents 378

Chemother. 54:2681-2683. 379

5. Cannon, N. A., M. J. Donlin, X. Fan, R. Aurora, and J. E. Tavis. 2008. Hepatitis 380

C virus diversity and evolution in the full open-reading frame during antiviral 381

therapy. PLoS One 3:e2123. 382

6. Choo, Q. L., K. H. Richman, J. H. Han, K. Berger, C. Lee, C. Dong, C. Gallegos, 383

D. Coit, R. Medina-Selby, P. J. Barr, and et al. 1991. Genetic organization and 384

diversity of the hepatitis C virus. Proc. Natl. Acad. Sci. U S A 88:2451-2455. 385

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 21: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

21

7. Clavel, F., and A. J. Hance. 2004. HIV drug resistance. N. Engl. J. Med. 386

350:1023-1035. 387

8. Domingo, E., E. MartÌnez-Salas, F. Sobrino, J. C. de la Torre, A. Portela, J. 388

OrtÌn, C. LÛpez-Galindez, P. PÈrez-BreÒa, N. Villanueva, R. N·jera, S. 389

VandePol, D. Steinhauer, N. DePolo, and J. Holland. 1985. The quasispecies 390

(extremely heterogeneous) nature of viral RNA genome populations: biological 391

relevance -- a review. Gene 40:1-8. 392

9. Enomoto, N., I. Sakuma, Y. Asahina, M. Kurosaki, T. Murakami, C. Yamamoto, 393

N. Izumi, F. Marumo, and C. Sato. 1995. Comparison of full-length sequences of 394

interferon-sensitive and resistant hepatitis C virus 1b. Sensitivity to interferon 395

is conferred by amino acid substitutions in the NS5A region. J. Clin. Invest. 396

96:224-230. 397

10. Enomoto, N., I. Sakuma, Y. Asahina, M. Kurosaki, T. Murakami, C. Yamamoto, 398

Y. Ogura, N. Izumi, F. Marumo, and C. Sato. 1996. Mutations in the 399

nonstructural protein 5A gene and response to interferon in patients with 400

chronic hepatitis C virus 1b infection. N. Engl. J. Med. 334:77-81. 401

11. Feng, H., M. Shuda, Y. Chang, and P. S. Moore. 2008. Clonal integration of a 402

polyomavirus in human Merkel cell carcinoma. Science 319:1096-1100. 403

12. Finkbeiner, S. R., Y. Li, S. Ruone, C. Conrardy, N. Gregoricus, D. Toney, H. W. 404

Virgin, L. J. Anderson, J. Vinje, D. Wang, and S. Tong. 2009. Identification of a 405

novel astrovirus (astrovirus VA1) associated with an outbreak of acute 406

gastroenteritis. J. Virol. 83:10836-10839. 407

13. Flint, M., S. Mullen, A. M. Deatly, W. Chen, L. Z. Miller, R. Ralston, C. Broom, E. 408

A. Emini, and A. Y. Howe. 2009. Selection and characterization of hepatitis C 409

virus replicons dually resistant to the polymerase and protease inhibitors 410

HCV-796 and boceprevir (SCH 503034). Antimicrob. Agents Chemother. 411

53:401-411. 412

14. Fried, M. W., M. L. Shiffman, K. R. Reddy, C. Smith, G. Marinos, F. L. Goncales, 413

Jr., D. Haussinger, M. Diago, G. Carosi, D. Dhumeaux, A. Craxi, A. Lin, J. 414

Hoffman, and J. Yu. 2002. Peginterferon alfa-2a plus ribavirin for chronic 415

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 22: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

22

hepatitis C virus infection. N. Engl. J. Med. 347:975-982. 416

15. Gao, M., R. E. Nettles, M. Belema, L. B. Snyder, V. N. Nguyen, R. A. Fridell, M. 417

H. Serrano-Wu, D. R. Langley, J. H. Sun, D. R. O'Boyle, 2nd, J. A. Lemm, C. 418

Wang, J. O. Knipe, C. Chien, R. J. Colonno, D. M. Grasela, N. A. Meanwell, and 419

L. G. Hamann. 2010. Chemical genetics strategy identifies an HCV NS5A 420

inhibitor with a potent clinical effect. Nature 465:96-100. 421

16. Ge, D., J. Fellay, A. J. Thompson, J. S. Simon, K. V. Shianna, T. J. Urban, E. L. 422

Heinzen, P. Qiu, A. H. Bertelsen, A. J. Muir, M. Sulkowski, J. G. McHutchison, 423

and D. B. Goldstein. 2009. Genetic variation in IL28B predicts hepatitis C 424

treatment-induced viral clearance. Nature 461:399-401. 425

17. Honda, M., M. R. Beard, L. H. Ping, and S. M. Lemon. 1999. A phylogenetically 426

conserved stem-loop structure at the 5' border of the internal ribosome entry 427

site of hepatitis C virus is required for cap-independent viral translation. J. 428

Virol. 73:1165-1174. 429

18. Honda, M., L. H. Ping, R. C. Rijnbrand, E. Amphlett, B. Clarke, D. Rowlands, 430

and S. M. Lemon. 1996. Structural requirements for initiation of translation by 431

internal ribosome entry within genome-length hepatitis C virus RNA. Virology 432

222:31-42. 433

19. Ina, Y. 1994. ODEN: a program package for molecular evolutionary analysis and 434

database search of DNA and amino acid sequences. Comput. Appl. Biosci. 435

10:11-12. 436

20. Kato, N., M. Hijikata, Y. Ootsuyama, M. Nakagawa, S. Ohkoshi, T. Sugimura, 437

and K. Shimotohno. 1990. Molecular cloning of the human hepatitis C virus 438

genome from Japanese patients with non-A, non-B hepatitis. Proc Natl Acad Sci 439

U S A 87:9524-8. 440

21. Kieffer, T. L., C. Sarrazin, J. S. Miller, M. W. Welker, N. Forestier, H. W. Reesink, 441

A. D. Kwong, and S. Zeuzem. 2007. Telaprevir and pegylated 442

interferon-alpha-2a inhibit wild-type and resistant genotype 1 hepatitis C virus 443

replication in patients. Hepatology 46:631-639. 444

22. Kwong, A. D., L. McNair, I. Jacobson, and S. George. 2008. Recent progress in 445

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 23: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

23

the development of selected hepatitis C virus NS3.4A protease and NS5B 446

polymerase inhibitors. Curr. Opin. Pharmacol. 8:522-531. 447

23. Lataillade, M., J. Chiarella, R. Yang, S. Schnittman, V. Wirtz, J. Uy, D. Seekins, 448

M. Krystal, M. Mancini, D. McGrath, B. Simen, M. Egholm, and M. Kozal. 2010. 449

Prevalence and clinical significance of HIV drug resistance mutations by 450

ultra-deep sequencing in antiretroviral-naive subjects in the CASTLE study. 451

PLoS One 5:e10952. 452

24. Li, L., A. Kapoor, B. Slikas, O. S. Bamidele, C. Wang, S. Shaukat, M. A. Masroor, 453

M. L. Wilson, J. B. Ndjango, M. Peeters, N. D. Gross-Camp, M. N. Muller, B. H. 454

Hahn, N. D. Wolfe, H. Triki, J. Bartkus, S. Z. Zaidi, and E. Delwart. 2010. 455

Multiple diverse circoviruses infect farm animals and are commonly found in 456

human and chimpanzee feces. J. Virol. 84:1674-1682. 457

25. Manns, M. P., J. G. McHutchison, S. C. Gordon, V. K. Rustgi, M. Shiffman, R. 458

Reindollar, Z. D. Goodman, K. Koury, M. Ling, and J. K. Albrecht. 2001. 459

Peginterferon alfa-2b plus ribavirin compared with interferon alfa-2b plus 460

ribavirin for initial treatment of chronic hepatitis C: a randomised trial. Lancet 461

358:958-965. 462

26. Martell, M., J. I. Esteban, J. Quer, J. Genesca, A. Weiner, R. Esteban, J. 463

Guardia, and J. Gomez. 1992. Hepatitis C virus (HCV) circulates as a 464

population of different but closely related genomes: Quasispecies nature of HCV 465

genome distribution. J. Virol. 66:3225-3229. 466

27. McCown, M. F., S. Rajyaguru, S. Le Pogam, S. Ali, W. R. Jiang, H. Kang, J. 467

Symons, N. Cammack, and I. Najera. 2008. The hepatitis C virus replicon 468

presents a higher barrier to resistance to nucleoside analogs than to 469

nonnucleoside polymerase or protease inhibitors. Antimicrob. Agents 470

Chemother. 52:1604-1612. 471

28. Moore, R. A., R. L. Warren, J. D. Freeman, J. A. Gustavsen, C. Chenard, J. M. 472

Friedman, C. A. Suttle, Y. Zhao, and R. A. Holt. 2011. The sensitivity of 473

massively parallel sequencing for detecting candidate infectious agents 474

associated with human tissue. PLoS One 6:e19838. 475

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 24: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

24

29. Moradpour, D., F. Penin, and C. M. Rice. 2007. Replication of hepatitis C virus. 476

Nat. Rev. Microbiol. 5:453-463. 477

30. Nakamura, S., C. S. Yang, N. Sakon, M. Ueda, T. Tougan, A. Yamashita, N. Goto, 478

K. Takahashi, T. Yasunaga, K. Ikuta, T. Mizutani, Y. Okamoto, M. Tagami, R. 479

Morita, N. Maeda, J. Kawai, Y. Hayashizaki, Y. Nagai, T. Horii, T. Iida, and T. 480

Nakaya. 2009. Direct metagenomic detection of viral pathogens in nasal and 481

fecal specimens using an unbiased high-throughput sequencing approach. PLoS 482

One 4:e4219. 483

31. Ninomiya, M., M. Takahashi, T. Shimosegawa, and H. Okamoto. 2007. Analysis 484

of the entire genomes of fifteen torque teno midi virus variants classifiable into 485

a third group of genus Anellovirus. Archives of Virology 152:1961-1975. 486

32. Okamoto, H., M. Kojima, S. Okada, H. Yoshizawa, H. Iizuka, T. Tanaka, E. E. 487

Muchmore, D. A. Peterson, Y. Ito, and S. Mishiro. 1992. Genetic drift of hepatitis 488

C virus during an 8.2-year infection in a chimpanzee: variability and stability. 489

Virology 190:894-899. 490

33. Okamoto, H., K. Kurai, S. Okada, K. Yamamoto, H. Lizuka, T. Tanaka, S. 491

Fukuda, F. Tsuda, and S. Mishiro. 1992. Full-length sequence of a hepatitis C 492

virus genome having poor homology to reported isolates: comparative study of 493

four distinct genotypes. Virology 188:331-341. 494

34. Okamoto, H., H. Tokita, M. Sakamoto, M. Horikita, M. Kojima, H. Iizuka, and S. 495

Mishiro. 1993. Characterization of the genomic sequence of type V (or 3a) 496

hepatitis C virus isolates and PCR primers for specific detection. J. Gen. Virol. 497

74 ( Pt 11):2385-2390. 498

35. Pawlotsky, J. M. 2011. Treatment failure and resistance with direct-acting 499

antiviral drugs against hepatitis C virus. Hepatology 53:1742-1751. 500

36. Robinson, J. T., H. Thorvaldsdottir, W. Winckler, M. Guttman, E. S. Lander, G. 501

Getz, and J. P. Mesirov. 2011. Integrative genomics viewer. Nat. Biotechnol. 502

29:24-26. 503

37. Sarrazin, C., T. L. Kieffer, D. Bartels, B. Hanzelka, U. Muh, M. Welker, D. 504

Wincheringer, Y. Zhou, H. M. Chu, C. Lin, C. Weegink, H. Reesink, S. Zeuzem, 505

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 25: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

25

and A. D. Kwong. 2007. Dynamic hepatitis C virus genotypic and phenotypic 506

changes in patients treated with the protease inhibitor telaprevir. 507

Gastroenterology 132:1767-1777. 508

38. Simmonds, P., D. B. Smith, F. McOmish, P. L. Yap, J. Kolberg, M. S. Urdea, and 509

E. C. Holmes. 1994. Identification of genotypes of hepatitis C virus by sequence 510

comparisons in the core, E1 and NS-5 regions. J. Gen. Virol. 75 ( Pt 511

5):1053-1061. 512

39. Smith, D. B., S. Pathirana, F. Davidson, E. Lawlor, J. Power, P. L. Yap, and P. 513

Simmonds. 1997. The origin of hepatitis C virus genotypes. J. Gen. Virol. 78 ( Pt 514

2):321-8. 515

40. Suppiah, V., M. Moldovan, G. Ahlenstiel, T. Berg, M. Weltman, M. L. Abate, M. 516

Bassendine, U. Spengler, G. J. Dore, E. Powell, S. Riordan, D. Sheridan, A. 517

Smedile, V. Fragomeli, T. Müller, M. Bahlo, G. J. Stewart, D. R. Booth, and J. 518

George. 2009. IL28B is associated with response to chronic hepatitis C 519

interferon-α and ribavirin therapy. Nature Genetics 41:1100-1104. 520

41. Susser, S., C. Welsch, Y. Wang, M. Zettler, F. S. Domingues, U. Karey, E. Hughes, 521

R. Ralston, X. Tong, E. Herrmann, S. Zeuzem, and C. Sarrazin. 2009. 522

Characterization of resistance to the protease inhibitor boceprevir in hepatitis C 523

virus-infected patients. Hepatology 50:1709-1718. 524

42. Szpara, M. L., L. Parsons, and L. W. Enquist. 2010. Sequence Variability in 525

Clinical and Laboratory Isolates of Herpes Simplex Virus 1 Reveals New 526

Mutations. J. Virol. 84:5303-5313. 527

43. Tanaka, Y., N. Nishida, M. Sugiyama, M. Kurosaki, K. Matsuura, N. Sakamoto, 528

M. Nakagawa, M. Korenaga, K. Hino, S. Hige, Y. Ito, E. Mita, E. Tanaka, S. 529

Mochida, Y. Murawaki, M. Honda, A. Sakai, Y. Hiasa, S. Nishiguchi, A. Koike, I. 530

Sakaida, M. Imamura, K. Ito, K. Yano, N. Masaki, F. Sugauchi, N. Izumi, K. 531

Tokunaga, and M. Mizokami. 2009. Genome-wide association of IL28B with 532

response to pegylated interferon-α and ribavirin therapy for chronic hepatitis 533

C. Nature Genetics 41:1105-1109. 534

44. Tokita, H., H. Okamoto, H. Iizuka, J. Kishimoto, F. Tsuda, L. A. Lesmana, Y. 535

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 26: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

26

Miyakawa, and M. Mayumi. 1996. Hepatitis C virus variants from Jakarta, 536

Indonesia classifiable into novel genotypes in the second (2e and 2f), tenth (10a) 537

and eleventh (11a) genetic groups. J. Gen. Virol. 77 (Pt 2):293-301. 538

45. Verbinnen, T., H. Van Marck, I. Vandenbroucke, L. Vijgen, M. Claes, T. I. Lin, K. 539

Simmen, J. Neyts, G. Fanning, and O. Lenz. 2010. Tracking the evolution of 540

multiple in vitro hepatitis C virus replicon variants under protease inhibitor 541

selection pressure by 454 deep sequencing. J. Virol. 84:11124-11133. 542

46. Victoria, J. G., A. Kapoor, L. Li, O. Blinkova, B. Slikas, C. Wang, A. Naeem, S. 543

Zaidi, and E. Delwart. 2009. Metagenomic analyses of viruses in stool samples 544

from children with acute flaccid paralysis. J. Virol. 83:4642-4651. 545

47. Xiang, X., J. Lu, Z. Dong, H. Zhou, W. Tao, Q. Guo, X. Zhou, S. Bao, Q. Xie, and J. 546

Zhong. 2011. Viral sequence evolution in Chinese genotype 1b chronic hepatitis 547

C patients experiencing unsuccessful interferon treatment. Infection, Genetics 548

and Evolution 11:382-390. 549

550

551

552

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 27: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

27

FIGURE LEGENDS 553

FIG. 1 Evaluation of the quality of libraries. (A) The library was well refined in patient 554

No.7, but the primer and adaptor dimers were mixed in the libraries of patients No.1 555

and No.9 using Bioanalyzer (Agilent). The horizontal axis shows the DNA size and the 556

vertical axis the quantity of DNA. The blue arrows indicate the desired products and 557

the red arrows the primer or adaptor dimers. The peak of the wave at 35 bp and 10380 558

bp expressed the marker. (B) Amplification plots of No.1 (treatment-naïve) and No.7 559

(treatment-experienced) showed the presence of the HCV genome in the libraries by 560

conducting quantitative PCR with StepOnePlus ™ (Applied Biosystems). 561

562

FIG. 2 Mapping to the HCV reference genome. In patient No.1, 6303 reads were 563

mapped on MD5-1. The coverage showed 99.4% and the average depth indicated 51.1X. 564

In patient No.7, 8583 reads were aligned to MD2-2. The coverage showed 99.8% and 565

the average depth indicated 69.5%. 566

567

FIG. 3 Amino acid substitutions of (A) aa. 70 and aa. 91 in the core region and (B) aa. 568

2209-2248 of NS5A-ISDR. These regions were reported to affect the outcome of 569

interferon (IFN)-based therapies for chronic hepatitis C patients. Lower two lines show 570

the nucleotide sequence and amino acid sequence of HCV-J. Amino acid abbreviations: 571

F with phenylalanine, S (serine), Y (tyrosine), C (cysteine), W (tryptophan), L (leucine), 572

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 28: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

28

P (proline), H (histidine), Q (glutamine), R (arginine), I (isoleucine), M (methionine), T 573

(threonine), N (asparagine), K (lysine), V (valine), A (alanine), D (aspartic acid), E 574

(glutamic acid) and G (glycine). 575

576

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 29: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

TABLE 1. Clinical data of patients enrolled in this study.

Sample

(Stored date) Sex Age Diagnosis

HCV-RNA

(LogIU/ml)

Geno-

type

Viral

infection

Past treatment

(Period)

Therapeutic

effect

Core aa.

70

Core

aa. 91

NS5A-ISDR

number of

aa.

substitutions

No.1

(2008. Oct.) M 43

Chronic

Hepatitis C 6.8 1b

HBsAg (-),

HBsAb (-) None NA Wild Wild 0

No.7

(2010. May) M 57

Chronic

Hepatitis C 7.0 1b

HBsAg (-),

HBsAb (-)

IFNα2b+RBV

(2002. Mar.-Sep.)

Non-

responder Wild Mutation 0

No.9

(2011. Jan.) F 64 Control NA. NA

HBsAg (-),

HBsAb (-) NA NA NA NA NA

Abbreviations:

aa: amino acids, HBsAb: Hepatitis B surface antigen, HBsAb: Hepatitis B surface antibody, IFN: interferon, NA: not applicable, RBV,

ribavirin.

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 30: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

TABLE 2. Distribution of viral reads in human serum

Sample No. 1 (Treatment-naïve) No. 7 (Treatment-experienced) No. 9 (Healthy control)

Total reads 27,717,487 100.00% 94,151,356 100.00% 15,032,130 100.00%

Adaptor & Primer 6,502,508 23.46% 28,605,006 30.38% 5,713,994 38.01% Modified total reads 21,214,979 76.54% 65,546,350 69.62% 9,318,136 61.99%

Human origin 19,761,560 71.30% 58,446,916 62.08% 8,660,143 57.61%

Unknown 1,453,419 5.24% 7,099,434 7.54% 657,993 4.38%

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 31: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

TABLE 3. Amino acid substitutions in NS3 and NS5B, compared with HCV-J.

a The uppercase letter indicates the prototype nucleotide and the lowercase letter indicates the mutation. b The number and the percentage of amino acid bases are displayed in parentheses.

No. 1 (NS3) Nucleotide

position Amino acid

position Prototype amino acid

Nucleotide sequencea

Amino acid substitutionb

3426 7 S gCC A

3447 14 L (C/t/g)TT L/F/V (67 89%/ 6 8%/ 2 3%)

3495 30 D GAg E 3513 36 L gTt V 3588 61 S (T/g)CG S/A (48 71%/ 20 29%) 3591 62 K AgG R 3618 71 I gTC V 3645 80 Q CtG L 3663 86 P CaG Q 3747 114 V aTc I 3801 132 I gTC V 3855 150 V GcT A 3915 170 I gT(A/g) V 4149 248 I gTC V 4152 249 E GAc D 4191 262 G aGC S 4194 263 G Gct A 4218 271 C gGC G 4302 299 T tCC S 4392 329 I gTC V 4479 358 A aaC N 4551 382 T tCg S 4554 382 G GcC A 4563 386 L gTC V 4659 418 F TaT Y 4758 451 L gTG V 4782 459 A tCG S 4815 470 S gGg G 4935 510 S aCa T 5007 534 S gGC G 5076 557 L tTC F 5163 586 I A(T/c)A I/T (4 12%/ 30 88%) 5232 609 V aTC I 5268 621 A aCA T

No. 7 (NS3) Nucleotide

position Amino acid

position Prototype amino acid

Nucleotide sequencea

Amino acidsubstitutionb

3495 30 D GAg E 3513 36 L gTt V 3531 42 S aCT T 3549 48 V aTC I 3621 72 T A(C/t)C T/I (57 73%/ 21 27%) 3663 86 P CaG Q 3672 89 P tCC S 3687 94 M tTG L 3747 114 V aTc I 3801 132 I gTC V 3855 150 V GcT A 3915 170 I gTA V 4044 213 K A(A/g)g K/R (18 26%/ 52 74%) 4116 237 G (G/a)G(C/t) G/S (41 46%/ 49 54%) 4149 248 I gTt V 4152 249 E Gac D 4194 263 G Gct A

4197 264 P (C/t/g)CC P/S/A (20 42%/ 19 40%/9 19%)

4218 271 C gGC G 4296 297 S (T/g)CG S/A (63 81%/ 15 19%) 4302 299 T tCC S 4479 358 A (G/a)CC A/T (20 21%/ 74 79%) 4542 379 A tCA S 4551 382 T tCA S 4554 383 G acC T 4563 386 L aTC I 4659 418 F TaT Y 4758 451 L gTG V 4776 457 S T(C/g)(G/t) S/C (56 64%/ 32 36%) 4782 459 A tCg/a S 4815 470 S gcg A 5076 557 L tTC F 5172 589 K AgG R 5250 615 I AT(A/g) I/M (42 46%/ 49 54%) 5259 618 Y TtC F

No. 1 (NS5B) Nucleotide

position Amino acid

position Prototype amino acid

Nucleotide sequencea

Amino acid substitution

7659 25 P gCG A 7689 35 S Aac N 7701 39 S gCC A 7725 47 L CaG Q 7827 81 R Aaa K 7839 85 I gTA V 7878 98 K AgA R 7914 110 S AaC N 7932 116 V aTt I 7944 120 R CaC H 7956 124 E aAG K 7989 135 D aAc N 8025 147 V aTt I 8151 189 P tCC S 8205 207 T gCC A 8223 213 C aaC N 8238 218 S gCA A 8289 235 T gtT V 8370 262 V aTt I 8484 300 T tCT S 8532 316 N tgC C 8589 335 A aaC N 8598 338 A GtC V 8784 400 V GcT A 8937 451 C acT T 8976 464 E cAg Q 9153 523 K Aga R 9177 531 K AgG R 9252 556 N AgC S 9276 564 L gTG V 9306 574 L TgG W

No. 7 (NS5B) Nucleotide

position Amino acid

position Prototype amino acid

Nucleotide sequencea

Amino acid substitution

7599 5 T tCA S 7689 35 S Aa(c/t) N 7701 39 S gCC A 7725 47 L Caa Q 7827 81 R Aaa K 7839 85 I gTg V 7878 98 K AgA R 7914 110 S AaC N 7926 114 R Aaa K 7956 124 E aAG K 8151 189 P tCC S 8205 207 T gCC A 8223 213 C acC T 8238 218 S gCA A 8277 231 N AgT S 8289 235 T gtT V 8298 238 S gCA A 8370 262 V aTC I 8484 300 T tCT S 8532 316 N tgC C 8589 335 A agC S 8784 400 V GcT A 8937 451 C acT T 8940 452 Y cAC H 8946 454 I gTT V 8976 464 E cAA Q 9177 531 K AgG R 9213 543 S TtC F 9231 549 G aaC N 9252 556 N AgC S 9306 574 L TgG W

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 32: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

A)

B)

No.7

No.9

No.1

FIG. 1

Treatment-naive Treatment-experienced

Healthy control

2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40cycle

Amplification Plot

ΔRn

10

1

0.1

0.01

0.001

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 33: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

5'UTR E1 p7 NS3 NS4B NS5B

Core E2 NS2 NS4A NS5A

0 bp 1,000 bp 2,000 bp 3,000 bp 4,000 bp 5,000 bp 6,000 bp 7,000 bp 8,000 bp 9,000 bp

Coverage 99.4 % Average depth 51.1X

Coverage 99.8 % Average depth 69.5X

FIG. 2

No. 1 6,303 reads

No. 7 8,583 reads

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from

Page 34: JCM Accepts, published online ahead of print on 28 ...jcm.asm.org/content/early/2011/12/22/JCM.05715-11.full.pdf · 132 Sequence analysis was performed using Genet yx-Mac ver. 12.2.6

R L A G P R V G P G L S P G T L G P S M A T R V W G GR

K A R R P E G R T W A Q P G Y P W P L Y G N E G M G W A

G

A A G G C T C G C C G G C C C G A G G G T A G G A C C T G G G C T C A G C C C G G G T A C C C T T G G C C C C T C T A T G G C A A C G A G G G T A T G G G G T G G G C A

530 bp 540 bp 550 bp 560 bp 570 bp 580 bp 590 bp 600 bp 610 bp

XP S L K A T C T T H H D S P D A D L I E A N L L W R Q E M G G N I T R V E S E N

C C T T C T T T G A A G G C G A C A T G T A C T A C C C A T C A T G A C T C C C C G G A C G C T G A C C T C A T C G A G G C C A A C C T C C T G T G G C G G C A G G A G A T G G G C G G G A A C A T C A C C C G T G T G G A G T C A G A A A A T

6,960 bp 6,980 bp 7,000 bp 7,020 bp 7,040 bp 7,060 bp

A) Core aa. 70 and aa. 91

B) NS5A-ISDR

HCV-J

FIG. 3

on May 14, 2018 by guest

http://jcm.asm

.org/D

ownloaded from