
Aida, S. et al.

Paper:

Conditional Generative Adversarial Networks to Model iPSC-Derived Cancer Stem Cells

Saori Aida∗, Hiroyuki Kameda∗, Sakae Nishisako∗∗, Tomonari Kasai∗∗∗, Atsushi Sato∗∗∗, and Tomoyasu Sugiyama∗∗∗

∗School of Computer Science, Tokyo University of Technology
1404-1 Katakura, Hachioji, Tokyo 192-0982, Japan
E-mail: {aidasor, kameda}@stf.teu.ac.jp

∗∗Graduate School of Bionics, Computer and Media Sciences, Tokyo University of Technology
1404-1 Katakura, Hachioji, Tokyo 192-0982, Japan
E-mail: [email protected]

∗∗∗School of Bioscience and Biotechnology, Tokyo University of Technology
1404-1 Katakura, Hachioji, Tokyo 192-0982, Japan
E-mail: {kasaitnr, atsato, tsugiyama}@stf.teu.ac.jp

[Received February 21, 2019; accepted October 28, 2019]

The realization of effective and low-cost drug discovery is imperative to enable people to easily purchase and use medicines when necessary. This paper reports a smart system for detecting iPSC-derived cancer stem cells by using conditional generative adversarial networks. This system with artificial intelligence (AI) accepts a normal image from a microscope and transforms it into a corresponding fluorescent-marked fake image. The AI system learns 10,221 sets of paired pictures as input. Consequently, the system’s performance shows that the correlation between true fluorescent-marked images and fake fluorescent-marked images is at most 0.80. This suggests the fundamental validity and feasibility of our proposed system. Moreover, this research opens a new way for AI-based drug discovery in the process of iPSC-derived cancer stem cell detection.

Keywords: conditional generative adversarial networks, pix2pix, iPSC-derived cancer stem cells, drug discovery, artificial intelligence

1. Introduction

1.1. General Background

Human history is exceptionally long, and human society has been changing gradually. The 1st industrial revolution, around the late 18th century, which came about with the development of steam engines, accelerated this change. About a century later, at the beginning of the 20th century, the 2nd industrial revolution happened with the development of electric power generators. Further, in the late 20th century, the 3rd industrial revolution – the information revolution – burst forth with the development of the Internet, especially World Wide Web (WWW) technology.

Owing to these three industrial revolutions, our society is now in a mode of constant change, faster than ever before. Moreover, since the beginning of the 21st century, new, cutting-edge, and revolutionary technologies have emerged successively across every field. For example, in the industrial field, IoT (Internet of Things) technologies have been affirmatively adopted by the world-leading “Industry 4.0” project in Germany. In the financial field, blockchain technologies are being applied to FinTech for realizing smart contract systems. In academia, artificial intelligence (AI) technologies, especially the application of deep learning in various fields, have attracted considerable attention [1, 2].

Drug discovery is one such field. As is well known, it needs a significant amount of money, time, and human resources; at the same time, however, it very often fails to reach a practical stage where people can use the medicines in their daily lives. This is a serious problem that needs to be addressed expeditiously. This fact compelled the authors to start this study.

In this paper, we report our study and its outcomes to improve, that is, accelerate, the precursory drug discovery processes by applying a new AI technology – conditional generative adversarial networks (CGAN), a type of deep learning algorithm [3].

1.2. Background of Deep Learning Research

Deep learning originates from the idea of neural networks and has grown into many new ideas such as convolutional neural networks (CNN), recurrent neural networks (RNN), and so on. In 2014, an impactful idea in deep learning – generative adversarial networks (GAN) – was proposed by Goodfellow et al. [4]. GAN can, for example, colorize black-and-white photos, repair images with missing data, and transform one font style in text into another. That is to say, GAN is able to produce new but fake images by learning features from training datasets. However, GAN still has some severe disadvantages: it is at times unstable, it produces images inclined toward a certain type, and so on.

Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol.24, No.1, 2020

In order to overcome such disadvantages, many types of GAN have been proposed, such as deep convolutional GAN (DCGAN) [5], spatial transformer GAN (ST-GAN) [6], cycle-consistent adversarial networks (CycleGAN) [7], unified GAN (StarGAN) [8], progressive growing of GAN (PGGAN) [9], auxiliary classifier GAN (ACGAN) [10], self-attention GAN (SAGAN) [11], conditional GAN (CGAN) [3], information maximizing GAN (InfoGAN) [12], stacked GAN (StackGAN) [13], anomaly detection with GAN (AnoGAN) [14], and so on. Today, GANs are also applied across a wide range of fields such as dynamic facial expression synthesis [15], speaker identification [16], and network intrusion detection [17].

In this study, we adopted pix2pix, a type of CGAN that transforms an input image into another corresponding image, for automatically detecting iPSC-derived cancer stem cells [18], as the ability of CGAN to create fake images can be utilized to detect cells.

1.3. Background of Drug Discovery Research

Drug discovery has a long history. From the days of the ancient Mayans and ancient China, people had already developed various types of drugs from herbs or animals (bones and internal organs), following their own hunches and experiences. With the rise of modern science, the isolation and identification of a wide range of bioactive compounds cleared the way for a brand-new approach of synthetic drug discovery. At the time, drugs were discovered by chance, were related to experience or accidental events, and were evaluated using in vitro drug efficacy evaluation systems with the use of animals.

Not long after that, in vitro evaluation systems were developed that use enzymes and cultured cells, and mechanisms were also analyzed in more detail than before. At this stage, however, numerous trials and errors in biological activity evaluation were inevitable when using high-throughput screening systems on a synthetic organic compound library to identify candidate compounds.

Consequently, significant human effort, time (about 10–20 years), and money (more than several hundred million yen) were needed to discover new drugs. This is a serious problem with new drug discovery. On the other hand, there were still no theories to determine whether the drugs were best suited to patients. These facts meant drug candidates were often rejected because of their toxicity in pre-clinical experiments or clinical trials. This is another problem with new drug discovery.

After the draft of the human genome sequence was published publicly in 2001 [19, 20], a new, genome-based methodology of drug discovery emerged for theoretically, deductively, and exhaustively discovering new drugs. By applying computer simulation techniques to drug discovery based on knowledge of genomes, the success rate of new drug discovery is expected to be higher than before. Using these theoretical drug discovery methods, the dream of theoretically best-suited drugs is expected to be realized in the near future. This marks a paradigm shift toward genome-based, complete drug discovery. Such dreams can be realized using AI technology.

Against this background, a deep learning competition aimed at drug discovery was held on “Kaggle” in 2012. The competition showed deep learning to be affirmatively useful in drug discovery. In January 2019, the 1st annual meeting of the Japanese Association for Medical Artificial Intelligence was held in Tokyo. Around the same time, induced cancer stem cells (iCSC), that is, cells with the disease status of the patients, were produced from patients’ own cells and used to evaluate compounds of drug candidates. Indeed, the evaluation methods still remain under development, but AI-based drug discovery has started attracting much attention.

1.4. Our Research

Under these circumstances, we, a joint group of bionics and computer science researchers, started a new research project in 2017 that aims to implement a new AI-based software system for drug discovery in a much more effective, much less money-consuming, and much less time-consuming way than before, by making AI systems learn to model iPSC-derived cancer stem cells automatically. Specifically, we have developed new AI-based software to detect iPSC-derived cancer stem cells by producing fake fluorescent-marked pictures from microscopic image data.

In this paper, to realize an effective drug discovery process in terms of human resources, time, and money, a new AI method, pix2pix – a type of the deep learning method CGAN – was adopted to design and implement an AI system to exclusively detect iPSC-derived cancer stem cells with the use of miPS-LLC cells [21–23]. Furthermore, evaluation of the system showed its high performance in detecting such cells; thus, we can expect the feasibility of AI-based drug discovery in the near future.

2. Related Work

According to the world-famous book on AI by Russell and Norvig [24], AI’s history is at least two thousand years old. The term “artificial intelligence” was adopted at a conference at Dartmouth in 1956 by a small group of researchers.

Since then, AI research has produced many outcomes: knowledge representation, fundamental algorithms, expert systems, speech recognition, natural language processing, machine translation, intelligent robots, toy-level AI software systems, and so on. Neural networks are one such AI research area, inspired by human nerve cells and the nervous system. Deep learning is their evolutionary version.

Fig. 1. Fundamental architecture of GAN.

Around 2000, CNN emerged as a deep learning method and had a significant impact in the field of image recognition. RNN is another derivative of deep learning, mainly applied to natural language processing. On the other hand, GAN is applied to generate fake images similar to original input images. Fig. 1 shows the structure of GAN. GAN is originally constructed from two main modules, a generator and a discriminator. The generator produces fake data, and the discriminator distinguishes real data from fake data. GAN realizes intelligence through the working of the two modules as adversaries.
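The generator-discriminator interplay can be made concrete with a toy sketch. The following is not the paper’s pix2pix system but a minimal, self-contained 1-D GAN in NumPy under our own illustrative assumptions: a linear generator g(z) = az + b, a logistic discriminator, real data drawn from N(3, 0.5), and hand-derived gradients.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    # clipped for numerical safety
    return 1.0 / (1.0 + np.exp(-np.clip(x, -60.0, 60.0)))

# Generator g(z) = a*z + b; discriminator d(x) = sigmoid(w*x + c).
a, b = 1.0, 0.0   # generator parameters (fake data starts near N(0, 1))
w, c = 0.1, 0.0   # discriminator parameters
lr, batch = 0.05, 64

for _ in range(2000):
    # Discriminator step: push d(real) toward 1 and d(fake) toward 0.
    real = rng.normal(3.0, 0.5, batch)
    fake = a * rng.normal(size=batch) + b
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    w -= lr * np.mean(-(1 - d_real) * real + d_fake * fake)
    c -= lr * np.mean(-(1 - d_real) + d_fake)

    # Generator step: push d(fake) toward 1 (non-saturating loss).
    z = rng.normal(size=batch)
    d_fake = sigmoid(w * (a * z + b) + c)
    a -= lr * np.mean(-(1 - d_fake) * w * z)
    b -= lr * np.mean(-(1 - d_fake) * w)

samples = a * rng.normal(size=1000) + b   # fake data after training
```

With the two updates alternating, the generator’s output distribution drifts toward the real data, which is the same adversarial dynamic that pix2pix exploits at image scale.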

As described in Fig. 1, many derivatives of GAN have been proposed and implemented across a wide range of fields. CGAN is a unique example of GAN, which can transform an input image into another one, for example, from a black-and-white picture into a colored one.

In this study, in order to improve the drug discovery process, we adopted pix2pix, a type of CGAN, and implemented an AI system to automatically detect iPSC-derived cancer stem cells by generating a picture in which the cancer cells are fluorescent from another picture in which iPSC-derived cancer stem cells are normally captured with a microscope.

3. Overall Design of Our System

3.1. Research Goals

(1) Long-Term Goal

Our research project aims at creating AI software that can be applied to new drug discovery processes. Specifically, it aims to implement software for the automatic detection of iPSC-derived cancer stem cells to drastically reduce human-resource, time, and financial costs.

(2) Short-Term Goal

Our research project aims at implementing an AI system to automatically detect iPSC-derived cancer stem cells in miPS-LLC cells. In other words, we make the AI learn to model iPSC-derived cancer stem cells. It aims not only at developing AI software but also at building a hardware system that can scale to process big data swiftly and robustly. This paper reports our newly started work and its current results, aiming at this short-term goal.

3.2. Overview of AI-Based Drug Discovery Support System for the Short-Term Goal

Figure 2 shows a blueprint of our AI-based, smart iPSC-derived cancer stem cell detection system. From the viewpoint of computer science, the AI engine for big data at the top right of Fig. 2 is the main AI component, learning to detect iPSC-derived cancer stem cells from the big data storage at the top middle of the figure, with the use of the AI algorithm, that is, CGAN, to its right. The source code used by the system is from pix2pix, a type of CGAN. Datasets for learning and testing are created by the bionics researchers Nishisako and Sugiyama. They repeatedly take two types of pictures of iPSC-derived stem cells: one is an original shot, and the other is a processed shot with fluorescent-marked iPSC-derived cancer stem cells.

3.3. Hardware Settings

Before starting this project, we gathered a wide range of information on machine capabilities for running deep learning software. Based on the available information, we decided to use the following machines:

A) CPU
machine name: ASUS
CPU: Intel Core i5-7500 CPU @ 3.40 GHz
Main memory: 64 GB
OS: Windows 10 Education (64-bit)

B) GPU
machine name: NVIDIA DGX-1
CPU: Dual Intel 20-core Xeon (E5-2698 v4, 2.2 GHz)
GPU main memory: 16 GB per GPU
NVIDIA CUDA cores: 28,672
(Main memory: 512 GB, 2,133 MHz)
OS: Ubuntu Server (Linux)

3.4. AI Software Settings – Which AI Software to Use?

We surveyed algorithms and software applicable and suitable to our research project, and finally decided to adopt CGAN (as mentioned before), because the program pix2pix (an image-to-image translation technique with CGAN) is expected to be applicable and well suited to our AI system.

Fig. 2. Overview of smart iPSC-derived stem cell detection system.

3.5. Software Environment Settings – How to Create AI Software?

There are many options for creating the pix2pix AI software. Based on our survey of related work in books, papers, and Web pages, we considered the following three options for the AI software environment settings.

[Option 1]: We write our own code from scratch in Python or MATLAB.

[Option 2]: We actively use existing machine-learning software resources available on the Internet, such as TensorFlow or PyTorch.

[Option 3]: We use cloud-based machine-learning software such as Microsoft Azure ML, IBM Watson, or Google GCP.

[Option 1] is a standard way to solve the problem because we can easily revise the program the way we want to, but it would take longer because we are building deep learning software for the first time. As one of our subgoals in this study is to confirm the fundamental adequacy of our research approach, we did not select this option. [Option 3] is the best way to rapidly build an AI system; however, such a system has difficulties and restrictions when we want to revise it, so we did not select this option either. Consequently, we selected [Option 2] and used the existing open-source software PyTorch, which made it easier for us to implement the pix2pix program with CGAN.

3.6. Dataset Settings

So far, we have collected 10,221 sets of paired pictures, that is, a normally captured version of each picture and a processed version in which every single iPSC-derived cancer stem cell is fluorescent-marked. The number of these pictures was preliminarily set as an initial goal of data collection. The format of the pictures is TIFF (tagged image file format), because this format is usually used to save digital camera pictures. The picture size is 1,920 × 1,200 pixels.

4. Experiments

4.1. Preliminary Experiment – Is the System Configuration Adequate to Perform the Main Experiments 1 and 2 Effectively?

(1) Objective

Before performing the two main experiments described in Sections 4.2 and 4.3, we checked whether our system, described above, runs adequately with no serious problems. This is because a deep learning system, in general, requires a CPU and GPU with high specifications and a considerable amount of memory. Our AI system was checked in terms of its speed, memory usage, and so on.

(2) Method

After deciding on the hardware, we installed a program named PyTorch-pix2pix. This program is a fork of the original pix2pix – the open-source code of the CGAN algorithm in Python – and can be downloaded from GitHub.1

We adopted this code sample because many users repeatedly run this code from the site, properly and safely. The code sizes of the training program and the test program are approximately 150 and 40 lines, respectively. These programs were also used in the two main experiments described in Sections 4.2 and 4.3.

(3) Datasets

Procedure for making datasets: Our datasets were made as follows:

[Step 1] Gather flower pictures in color from the Web page.2

1. https://github.com/GINK03/pytorch-pix2pix
2. https://www.dropbox.com/s/yhstjhqmsy1cneb/


[Step 2] Decolorize each picture by issuing the following command: $ mogrify -colorspace Gray *.jpg

[Step 3] Make a pair of pictures, that is, the original picture in color and the corresponding decolorized one. Put them separately into two different directories, a training dataset and a test dataset, where corresponding files have the same file names.
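The three steps above can be sketched in code. This is not the authors’ pipeline; it is a minimal NumPy sketch in which arrays stand in for image files, the grayscale conversion uses the common ITU-R BT.601 luma weights (our assumption for what `-colorspace Gray` does), and the file names are hypothetical.

```python
import numpy as np

def decolorize(rgb):
    """Step 2: convert an H x W x 3 color array to grayscale,
    mimicking `mogrify -colorspace Gray` with BT.601 luma weights."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def make_pairs(color_images):
    """Step 3: pair each color picture with its decolorized twin under a
    shared (hypothetical) file name, as the directory layout requires."""
    return {f"{i:05d}.jpg": (decolorize(img), img)
            for i, img in enumerate(color_images)}

# toy stand-ins for the downloaded flower pictures
rng = np.random.default_rng(0)
flowers = [rng.random((8, 8, 3)) for _ in range(4)]
pairs = make_pairs(flowers)
```

Splitting `pairs` into training and test subsets then yields the two directories used below.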

Dataset profile: The total size and number of all datasets are about 168 MB and 18,776, respectively, where the number of paired data (pictures) is 9,388 (7,517 for training and 1,871 for the test). These paired pictures are used later in the learning and testing processes, respectively.

(4) Training and Test Procedure

Training and test of the AI system were done by thefollowing steps:

[Step 1] Train the AI system with the use of the training datasets by issuing the command below:
$ python train.py --dataset DS --nEpochs N --cuda
(Note: DS – directory name of the training datasets; N – the number of learning iterations.)

[Step 2] To test the validity of learning, the AI system reads decolorized pictures and creates pictures in color that resemble the original ones. These are also evaluated in terms of their correlation with each other, and so on.

(5) Results and Considerations

The AI system could produce flower pictures in color from decolorized pictures with no particular problems (Fig. 3). The learning time of the AI system was about 24 hours when the epoch number was 100. This value was adopted based on the results of the system pre-test; with this epoch value, the AI system showed the best performance. As the AI program PyTorch-pix2pix ran on our machine adequately, that is, with feasible computing time and no particular problems, we decided to use this system configuration for the main experiments 1 and 2: can AI detect iPSC-derived cancer stem cells from a microscopic picture?

As described above, our AI system has the ability to process big image datasets. The Python version is 2, not 3, and the image format is JPEG, not TIFF, because the pix2pix program we used accepts only JPEG-format datasets.

4.2. Main Experiment 1 – Can the AI System Detect iPSC-Derived Cancer Stem Cells?

(1) Objective

We investigate the feasibility of our AI system in terms of whether it can really detect iPSC-derived cancer stem cells automatically. In this experiment, PyTorch-pix2pix is adopted as the AI engine software.

(a) Input: decolorized image (b) Output: colored image

Fig. 3. Input and output of AI system in experiment 2.

(2) Method

PyTorch-pix2pix is first applied to the training datasets and then to the test datasets.

(3) Dataset

Datasets are elaborated in Section 3.6. 2,000 paired pictures are used in the learning phase, and another 221 microscopic pictures are used to test and evaluate the quality of the AI learning performance.

(4) Training and Test Procedure

The steps in this stage are the same as in the preliminary experiment described in Section 4.1.

(5) Results and Considerations

Fake fluorescent-marked pictures were successfully created, at least to human eyes, although this is subjective. The correlation r between the created pictures, that is, the fake fluorescent-marked ones, and the original fluorescent-marked ones is at most 0.80. The value is quite good, but a more thorough investigation should follow. The next step is to gradually increase the epoch value (the number of iterations) and the amount of data.
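The correlation measure used here can be reproduced as Pearson’s r over flattened pixel intensities; the paper does not spell out its exact computation, so the following is our assumed, minimal version with synthetic arrays in place of the real micrographs.

```python
import numpy as np

def image_correlation(img_a, img_b):
    """Pearson correlation r between two equal-shaped images,
    computed over the flattened pixel intensities."""
    a = np.asarray(img_a, float).ravel()
    b = np.asarray(img_b, float).ravel()
    return float(np.corrcoef(a, b)[0, 1])

rng = np.random.default_rng(42)
true_img = rng.random((64, 64))                         # stand-in: true fluorescent-marked image
fake_img = 0.8 * true_img + 0.2 * rng.random((64, 64))  # stand-in: imperfect AI output
r = image_correlation(true_img, fake_img)
```

An r of 1.0 would mean the fake image is a perfect linear reconstruction of the true one; the reported maximum of 0.80 sits well below that ceiling.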

Figure 4 shows an example of this AI system’s output. The picture on the left is an input to the AI system, which produces the picture in the middle; this produced picture is a fake version of the original picture on the right. This means we can easily obtain fake fluorescent-marked pictures directly from original pictures through our AI system.

4.3. Main Experiment 2 – Does the Number of Cells in the Input Picture Affect the Quality of Output Performance?

(1) Objective

In main experiment 1, each input picture contains many cells. In this experiment, we investigate the effect of the number of cells in the input pictures. This experiment also adopts PyTorch-pix2pix as the AI engine software.

(2) Method

Same as in experiment 1 described above.


(a) Input picture

(b) Output picture from AI

(c) Correct picture

Fig. 4. Input for test, output from AI, and original correct picture.

(3) Dataset

42 pictures with large numbers of iPSC-derived cancer stem cells were picked from the datasets described in Section 3.6 by a human, and each picture was divided into 16 pieces, so that every piece contains a small number of iPSC-derived cancer stem cells. The total number of pictures used in this experiment is 42 × 16 = 672. 640 pictures are used for training and 30 for the test. The epoch value was 10,000.
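The division of each picture into 16 pieces can be sketched as a 4 × 4 grid split; the grid shape is our assumption, as the paper states only the piece count.

```python
import numpy as np

def tile_image(img, rows=4, cols=4):
    """Split an image into rows x cols equal tiles; a 4 x 4 grid gives the
    16 pieces per picture used in main experiment 2."""
    h, w = img.shape[:2]
    th, tw = h // rows, w // cols
    return [img[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
            for r in range(rows) for c in range(cols)]

picture = np.zeros((1200, 1920))   # the paper's 1,920 x 1,200 pixel picture size
tiles = tile_image(picture)
total = 42 * len(tiles)            # 42 source pictures -> 672 tiles
```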

(4) Training and Test Procedure

The steps in this stage are the same as in the preliminary experiment described above.

(5) Results and Considerations

Initially, to determine the similarity between the fake pictures created by the AI and their corresponding originals, we calculated the mean correlation r between them. The original pictures are the ones taken with a microscope, and the fake ones are images created by the AI system, in which the iPSC-derived cancer stem cells detected by the AI system are colored in white fluorescence. The correlation between them was r = 0.35. This suggests the necessity of further improvements. Moreover, the detection rate of iPSC-derived cancer stem cells was 63.0% and the detection precision was 44.0%. These results show a good detection rate and detection precision for drug discovery. However, going by these results, we still have to improve the learning algorithm by changing the input data type and quality to confirm the fundamental validity of our AI system, while paying attention to the quality control of the dataset.
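For reference, detection rate and precision follow the usual recall/precision definitions over per-cell counts; the counts below are hypothetical, chosen only to illustrate values near the reported 63.0% and 44.0%, and are not the paper’s data.

```python
def detection_metrics(tp, fp, fn):
    """Detection rate (recall) and precision from per-cell counts:
    tp = correctly detected cells, fp = spurious detections, fn = missed cells."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return recall, precision

# hypothetical counts: 63 of 100 true cells found, with 80 false alarms
recall, precision = detection_metrics(tp=63, fp=80, fn=37)
```

Under this definition, a detection rate of 63.0% means roughly two of every three true cells are marked, while a precision of 44.0% means fewer than half of the marked regions are actual cells.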

5. Conclusion

This paper reported the first step of our new challenge of applying AI technology to establish a new drug discovery process. Concretely, we designed and implemented an AI-based system to automatically detect iPSC-derived cancer stem cells by learning to model them automatically from datasets. The fundamental validity of our system was experimentally investigated and confirmed. Indeed, further improvement is still needed for the proposed AI system in terms of its detection power, but our proposed method surely opens a new, more effective way of drug discovery than before.

Acknowledgements
This research is supported by Tokyo University of Technology with two grants, that is, the Advanced AI Research Grant of Bionics AI research (Professor Tomoyasu Sugiyama) and that of Computer Science AI research (Associate Professor Shino Iwashita).

References:
[1] R. Yu, X. Xu, and Z. Wang, “Influence of Object Detection in Deep Learning,” J. Adv. Comput. Intell. Intell. Inform., Vol.22, No.5, pp. 683-688, 2018.
[2] W. Liu, Y. Yang, and L. Wei, “Weather Recognition of Street Scene Based on Sparse Deep Neural Networks,” J. Adv. Comput. Intell. Intell. Inform., Vol.21, No.3, pp. 403-408, 2017.
[3] M. Mirza and S. Osindero, “Conditional Generative Adversarial Nets,” arXiv preprint arXiv:1411.1784, 2014.
[4] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative Adversarial Nets,” arXiv:1406.2661 [stat.ML], 2014.
[5] A. Radford, L. Metz, and S. Chintala, “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks,” arXiv preprint arXiv:1511.06434, 2015.
[6] C. H. Lin, E. Yumer, O. Wang, E. Shechtman, and S. Lucey, “ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing,” Proc. of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition, pp. 9455-9464, 2018.
[7] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks,” Proc. of the IEEE Int. Conf. on Computer Vision, pp. 2242-2251, 2017.


[8] Y. Choi et al., “StarGAN: Unified Generative Adversarial Net-works for Multi-Domain Image-to-Image Translation,” Proc. of theIEEE/CVF Conf. on Computer Vision and Pattern Recognition,pp. 8789-8797, 2018.

[9] T. Karras, T. Aila, S. Laine, and J. Lehtinen, “Progressive Grow-ing of GANs for Improved Quality, Stability, and Variation,” arXivpreprint arXiv:1710.10196, 2017.

[10] A. Odena, C. Olah, and J. Shlens, “Conditional Image Synthesiswith Auxiliary Classifier Gans,” Proc. of the 34th Int. Conf. on Ma-chine Learning, Vol.70, pp. 2642-2651, 2017.

[11] H. Zhang, I. Goodfellow, D. Metaxas, and A. Odena, “Self-Attention Generative Adversarial Networks,” arXiv preprint arXiv:1805.08318, 2018.

[12] X. Chen et al., “InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets,” Advances in Neural Information Processing Systems, pp. 2172-2180, 2016.

[13] H. Zhang et al., “StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks,” Proc. of the IEEE Int. Conf. on Computer Vision, pp. 5908-5916, 2017.

[14] T. Schlegl, P. Seeböck, S. M. Waldstein, U. Schmidt-Erfurth, and G. Langs, “Unsupervised anomaly detection with generative adversarial networks to guide marker discovery,” Int. Conf. on Information Processing in Medical Imaging, pp. 146-157, 2017.

[15] N. Gong, Y. Yang, Y. Liu, and D. Liu, “Dynamic Facial Expression Synthesis Driven by Deformable Semantic Parts,” Proc. of the 24th Int. Conf. on Pattern Recognition (ICPR), pp. 2929-2934, doi: 10.1109/ICPR.2018.8545831, 2018.

[16] G. Keskin, T. Lee, C. Stephenson, and O. H. Elibol, “Measuring the Effectiveness of Voice Conversion on Speaker Identification and Automatic Speech Recognition Systems,” arXiv:1905.12531 [eess.AS], 2019.

[17] J. Yang, T. Li, G. Liang, W. He, and Y. Zhao, “A Simple Recurrent Unit Model based Intrusion Detection System with DCGAN,” IEEE Access, Vol.7, pp. 83286-83296, doi: 10.1109/ACCESS.2019.2922692 (in print).

[18] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-Image Translation with Conditional Adversarial Networks,” Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1125-1134, 2017.

[19] International Human Genome Sequencing Consortium, “Initial sequencing and analysis of the human genome,” Nature, Vol.409, pp. 860-921, 2001.

[20] J. C. Venter, M. D. Adams et al., “The Sequence of the Human Genome,” Science, Vol.291, No.5507, pp. 1304-1351, 2001.

[21] M. Prieto-Vila, T. Yan, A. S. Calle, N. Nair, L. Hurtley, T. Kasai, H. Kakuta, J. Masuda, H. Murakami, A. Mizutani, and M. Seno, “iPSC-derived cancer stem cells provide a model of tumor vasculature,” American J. of Cancer Research, Vol.6, No.9, pp. 1906-1921, 2016.

[22] H. Kameda, S. Aida, S. Nishisako, T. Kasai, A. Sato, and T. Sugiyama, “Implementation and its Evaluation of a Prototype System to detect iPSC-derived Cancer Stem Cell,” The 1st Annual Meeting of the Japanese Association for Medical Artificial Intelligence, G-32, p. 85, 2019.

[23] S. Nishisako, S. Aida, H. Kameda, T. Kasai, A. Sato, and T. Sugiyama, “Deep-learning of cancer stem cell morphology for anti-cancer stem cell molecule screening,” Chem-Bio Informatics (CBI) Society Annual Meeting 2018, p. 85, 2018.

[24] S. Russell and P. Norvig, “Artificial Intelligence: A Modern Approach,” 2nd Edition, Prentice Hall, 2003.

Name: Saori Aida

Affiliation: School of Computer Science, Tokyo University of Technology

Address: 1404-1 Katakura, Hachioji, Tokyo 192-0982, Japan
Brief Biographical History:
2015- Post-Doctoral Researcher, National Institute of Advanced Industrial Science and Technology (AIST)
2016- Research Associate, Tokyo University of Technology
2017- Assistant Professor, Tokyo University of Technology
Main Works:
• “Overestimation of the number of elements in a three-dimensional stimulus,” J. of Vision, Vol.15, No.9, Article 23, pp. 1-16, 2015.
Membership in Academic Societies:
• Vision Society of Japan
• The Japanese Psychonomic Society (JPS)
• Society of Automotive Engineers of Japan (JSAE)

Name: Hiroyuki Kameda

Affiliation: School of Computer Science, Tokyo University of Technology

Address: 1404-1 Katakura, Hachioji, Tokyo 192-0982, Japan
Brief Biographical History:
1987- Senior Assistant Professor, Tokyo University of Technology
2007- Professor, Tokyo University of Technology
Main Works:
• “Computer-assisted cognitive remediation therapy increases hippocampal volume in patients with schizophrenia: a randomized controlled trial,” BMC Psychiatry, Vol.18, Article No.83, doi: 10.1186/s12888-018-1667-1, 2018.
• “Samawa Language: Part of Speech Tagset and Tagged Corpus for NLP Resources,” J. of Physics: Conf. Series, Vol.1061, Conf. 1, 2018.
Membership in Academic Societies:
• The Institute of Electronics, Information and Communication Engineers (IEICE)
• Japanese Cognitive Science Society (JCSS)



Name: Sakae Nishisako

Affiliation: Graduate School of Bionics, Computer and Media Sciences, Tokyo University of Technology

Address: 1404-1 Katakura, Hachioji, Tokyo 192-0982, Japan
Brief Biographical History:
2015- Undergraduate Student, School of Bionics and Biotechnology, Tokyo University of Technology
2019- Graduate Student, Graduate School of Bionics, Computer and Media Sciences, Tokyo University of Technology
Main Works:
• “Deep-learning of Cancer Stem Cell Morphology for Anti-Cancer Stem Cell Molecule Screening,” Chem-Bio Informatics (CBI) Society Annual Meeting, 2018.

Name: Tomonari Kasai

Affiliation: School of Bioscience and Biotechnology, Tokyo University of Technology

Address: 1404-1 Katakura, Hachioji, Tokyo 192-0982, Japan
Brief Biographical History:
2007- Assistant Professor, Okayama University
2018- Associate Professor, Tokyo University of Technology
Main Works:
• “A cancer stem cell model as the point of origin of cancer-associated fibroblasts in tumor microenvironment,” Scientific Reports, Vol.7, Article No.6838, 2017.
Membership in Academic Societies:
• The Japanese Cancer Association (JCA)
• The Molecular Biology Society of Japan (MBSJ)
• Japanese Association for Medical Artificial Intelligence (JMAI)

Name: Atsushi Sato

Affiliation: Professor, School of Bioscience and Biotechnology, Tokyo University of Technology

Address: 1404-1 Katakura, Hachioji, Tokyo 192-0982, Japan
Brief Biographical History:
1989- Researcher, Medical Devices & Diagnostics Research Laboratories, Toray Industries, Inc.
2001- Researcher, Brain Science Institute, Institute of Physical and Chemical Research (RIKEN)
2003- Associate Professor, School of Bionics, Tokyo University of Technology
2009- Professor, School of Bioscience and Biotechnology, Tokyo University of Technology
Main Works:
• Development of Lactoferrin with an Improved Plasma Half-Life as a Biotherapeutic Agent
• Development of Functional Peptides Isolated from Phage Display Random Peptide Libraries as Pharmaceutical Seeds
Membership in Academic Societies:
• Japanese Association for Lactoferrin
• The Pharmaceutical Society of Japan
• The Japanese Biochemical Society (JBS)

Name: Tomoyasu Sugiyama

Affiliation: Professor, School of Bioscience and Biotechnology, Tokyo University of Technology

Address: 1404-1 Katakura, Hachioji, Tokyo 192-0982, Japan
Brief Biographical History:
1990- Joined Kyowa Hakko Kogyo Co., Ltd.
2002- Visiting Scientist, The University of Tokyo
2003- Joined Tokyo University of Technology
Main Works:
• “A shRNA library constructed through the generation of loop-stem-loop DNA,” J. Gene. Med., Vol.12, No.11, pp. 927-933, 2010.
• “shRNA target prediction informed by comprehensive enquiry (SPICE): a supporting system for high-throughput screening of shRNA library,” EURASIP J. Bioinform. Syst. Biol. 2016, Article No.7, 2016.
Membership in Academic Societies:
• The Molecular Biology Society of Japan (MBSJ)
• The Society for Biotechnology, Japan (SBJ)
• The Japanese Biochemical Society (JBS)
