Automated Software Transplantation (Transcript)
Automated Software Transplantation
QRS 2016 Talk by Mark Harman
PhD work by Alexandru Marginean
CREST, University College London
Collaborators: Earl Barr, Yue Jia, Justyna Petke
Why Autotransplantation?
Video Player: why not handle H.264? Start from scratch, or check the open source repositories (~100 players).
Kate: C call graph? C indentation? Start from scratch, or check the open source repositories.
Recognition for Our Tool μScalpel
BBC World Service interview, WIRED article, many shares on social media, ACM Distinguished Paper Award at ISSTA '15.
Related Work
Areas: Autotransplantation, Automatic Error Fixing, In-Situ Code Reuse, Manual Code Transplants.
Related techniques: Clone Detection, Code Migration, Dependence Analysis, Feature Location, Code Salvaging, Feature Extraction, Synchronising Manual Transplants, Automatic Replay, Copy-Paste.
Related Work — In-Situ Code Reuse
Running program, debugger, binary organ.
Miles et al.: In situ reuse of logically extracted functional components.
Related Work — Automatic Error Fixing
Host and donors.
Sidiroglou-Douskos et al.: Automatic Error Elimination by Multi-Application Code Transfer.
Automated Software Transplantation
Donor: organ, organ entry. Host: implantation point. Organ test suite.
Manual work: organ entry, organ's test suite, implantation point.
μTrans
Inputs: donor; host; organ entry; implantation point; organ test suite.
Stage 1: Static Analysis.
Stage 2: Genetic Programming.
Stage 3: Organ Implantation.
Output: host beneficiary.
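The three-stage flow can be sketched as a simple pipeline. This is a hypothetical illustration of the data flow only; the real μSCALPEL operates on C source, and every name below is an assumption, not the tool's actual interface.

```python
# A minimal sketch of the three-stage μTrans pipeline. Stage bodies are
# placeholders; only the data flow between the stages follows the slide.

def static_analysis(donor, organ_entry):
    # Stage 1: slice the organ (plus its vein) out of the donor.
    return {"organ": f"slice of {donor} from {organ_entry}"}

def genetic_programming(analysis, host, organ_tests):
    # Stage 2: specialise the organ to the host, guided by the organ test suite.
    return {"specialised_organ": analysis["organ"], "tests": organ_tests}

def implantation(host, implantation_point, organ):
    # Stage 3: insert the specialised organ at the implantation point.
    return f"{host} + {organ['specialised_organ']} @ {implantation_point}"

beneficiary = implantation(
    "host", "main()",
    genetic_programming(static_analysis("donor", "encode()"), "host", ["t1"]))
print(beneficiary)  # host + slice of donor from encode() @ main()
```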
Stage 1 — Static Analysis
From the donor: organ entry, vein, organ; builds the dependency graph and the matching table. From the host: implantation point.
Examples: statement `x = 10;` maps to declaration `int x;`; donor variable `int X` matches host variables `int A, B, C`.
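The matching table can be sketched as a type-compatibility match between donor and host variables. The variable names and types below are illustrative, echoing the slide's `int X -> int A, B, C` example, not taken from the evaluation subjects.

```python
# Hypothetical sketch of Stage 1's matching table: each donor variable is
# matched to the set of in-scope host variables with a compatible type.

def build_matching_table(donor_vars, host_vars):
    """Map each donor variable to the host variables of the same type."""
    return {d: {h for h, t in host_vars.items() if t == dt}
            for d, dt in donor_vars.items()}

donor_vars = {"X": "int"}                                       # donor: int X
host_vars = {"A": "int", "B": "int", "C": "int", "s": "char*"}  # host scope
table = build_matching_table(donor_vars, host_vars)
assert table["X"] == {"A", "B", "C"}   # int X may map to A, B, or C
```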
Stage 2 — Genetic Programming
Donor statement pool: S1, S2, S3, S4, S5, …, Sn.

Matching Table:
Donor Variable ID | Host Variable ID (set)
V1D               | V1H, V2H
V2D               | V3H, V4H, V5H
…                 | …

Individual: variable-matching mappings (M1: V1D -> V1H; M2: V2D -> V4H; …) plus statements (S1, S7, S73).
Fitness Function: weak proxies (compilation) and strong proxies (passing the organ test suite).
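A fitness function of this shape might look like the following sketch, where compiling at all is the weak proxy and organ-test passes are the strong proxy. The weighting (a 1.0 bonus for compiling) is an assumption for illustration, not the talk's exact formula.

```python
# Sketch of a GP fitness in the spirit of the slide: weak proxy = the
# candidate organ compiles in the host; strong proxy = organ tests it passes.

def fitness(candidate, compiles, tests):
    if not compiles(candidate):
        return 0.0                       # weak proxy failed: no reward at all
    passed = sum(1 for t in tests if t(candidate))
    return 1.0 + passed                  # strong proxy: organ test suite

always_compiles = lambda c: True
tests = [lambda c: True, lambda c: True, lambda c: False]
print(fitness("organ-v1", always_compiles, tests))  # 3.0
```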
Stage 2 — GP Operators: Replace Mapping
Individual: mappings (M1: V1D -> V1H; M2: V2D -> V4H; …) plus statements (S1, S7, S73).
Replace Mapping: re-draw a donor variable's host match from the matching table (e.g. V1D: V1H -> V2H).
Stage 2 — GP Operators: Replace Statement
Replace a statement of the individual with another from the donor pool S1, S2, …, Sn (e.g. S7 -> S3).
Stage 2 — GP Operators: Remove Statement
Remove a statement of the individual from the pool S1, S2, …, Sn.
Stage 2 — GP Operators: Add Statement
Insert a statement from the donor pool S1, S2, …, Sn into the individual (e.g. S3).
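Under the representation above (a list of donor statement IDs plus a donor-to-host variable mapping), the four mutation operators can be sketched as below. The representation and helper names are assumptions; the real operators act on C ASTs.

```python
import random

# Illustrative sketches of the four GP mutation operators. An individual is
# (statements, mapping); the matching table constrains Replace Mapping.

def replace_mapping(individual, table, rng):
    stmts, mapping = individual
    var = rng.choice(sorted(mapping))                     # pick a donor variable
    new_map = dict(mapping, **{var: rng.choice(sorted(table[var]))})
    return stmts, new_map

def replace_statement(individual, pool, rng):
    stmts, mapping = individual
    i = rng.randrange(len(stmts))                         # swap one statement
    return stmts[:i] + [rng.choice(pool)] + stmts[i + 1:], mapping

def remove_statement(individual, rng):
    stmts, mapping = individual
    i = rng.randrange(len(stmts))                         # drop one statement
    return stmts[:i] + stmts[i + 1:], mapping

def add_statement(individual, pool, rng):
    stmts, mapping = individual
    i = rng.randrange(len(stmts) + 1)                     # insert from the pool
    return stmts[:i] + [rng.choice(pool)] + stmts[i:], mapping

ind = (["S1", "S7", "S73"], {"V1D": "V1H", "V2D": "V4H"})
table = {"V1D": {"V1H", "V2H"}, "V2D": {"V3H", "V4H", "V5H"}}
rng = random.Random(0)
print(remove_statement(ind, rng)[0])   # two statements remain
```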
Stage 2 — GP Operators: Crossover
Individual 1: statements S1, S7; mappings M1, M2.
Individual 2: statements S3, S9; mappings M3, M4.
Offspring 1-3 recombine the parents' statements (S1, S7 / S3, S9) and take each variable mapping from a randomly selected parent (random mapping selection, over M1-M4).
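One simple crossover scheme matching the slide's sketch: offspring combine the parents' statement lists, while each variable's mapping is drawn at random from either parent. This is an assumed, simplified scheme for illustration, not necessarily μSCALPEL's exact operator.

```python
import random

# Sketch of crossover with "random mapping selection": statements are
# recombined from both parents, and each variable mapping comes from a
# randomly chosen parent that defines it.

def crossover(parent1, parent2, rng):
    stmts1, map1 = parent1
    stmts2, map2 = parent2
    child_stmts = stmts1 + stmts2
    child_map = {v: rng.choice([m for m in (map1.get(v), map2.get(v))
                                if m is not None])
                 for v in set(map1) | set(map2)}
    return child_stmts, child_map

child_stmts, child_map = crossover(
    (["S1", "S7"], {"V1D": "V1H"}),
    (["S3", "S9"], {"V1D": "V2H", "V2D": "V4H"}),
    random.Random(1))
print(child_stmts)  # ['S1', 'S7', 'S3', 'S9']
```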
Result: the donor's organ, implanted into the host.
Research Questions
RQ1: Do we break the initial functionality? (Regression Tests)
RQ2: Have we really added new functionality? (Acceptance Tests)
RQ3: How about the computational effort?
RQ4: Is autotransplantation useful?
Empirical Study
15 transplantations, 300 runs, 5 donors, 3 hosts.
Case Studies: H.264 encoding transplantation; Kate — call graph generation & C indentation.
Validation
On the host beneficiary: regression tests (RQ1.1), augmented regression tests (RQ1.2), donor acceptance tests, acceptance tests (RQ2), manual validation.
Subjects
Minimal size: 0.4 KLOC. Max size: 422 KLOC. Average donor: 16 KLOC. Average host: 213 KLOC.
Subject    Type   Size (KLOC)  Reg. Tests  Organ Tests
Idct       Donor  2.3          -           3-5
Mytar      Donor  0.4          -           4
Cflow      Donor  25           -           6-20
Webserver  Donor  1.7          -           3
TuxCrypt   Donor  2.7          -           4-5
Pidgin     Host   363          88          -
Cflow      Host   25           21          -
SoX        Host   43           157         -
Case study subjects:
VLC        Host   422          27          -
Kate       Host   50           238         -
x264       Donor  63           -           5
Cflow      Donor  22           -           13
Indent     Donor  26           -           7
Experimental Methodology and Setup
μSCALPEL takes the donor (with organ entry and organ test suite) and the host (with implantation point), and produces the host beneficiary with the implanted organ.
Each transplantation is run x 20.
Measurements: LOC counted with CLOC; execution time with GNU Time; coverage information with Gcov; validation test suites.
Machine: 64-bit Ubuntu 14.10, 16 GB RAM, 8 threads.
Alexandru Marginean — Automated Software Transplantation
Empirical Study — RQ1, RQ2
Donor   Host    All Passed  Regression  Regression++  Acceptance
Idct    Pidgin  16          20          17            16
Mytar   Pidgin  16          20          18            20
Web     Pidgin  0           20          0             18
Cflow   Pidgin  15          20          15            16
Tux     Pidgin  15          20          17            16
Idct    Cflow   16          17          16            16
Mytar   Cflow   17          17          17            20
Web     Cflow   0           0           0             17
Cflow   Cflow   20          20          20            20
Tux     Cflow   14          15          14            16
Idct    SoX     15          18          17            16
Mytar   SoX     17          17          17            20
Web     SoX     0           0           0             17
Cflow   SoX     14          16          15            14
Tux     SoX     13          13          13            14
TOTAL           188/300     233/300     196/300       256/300
(Regression: RQ1.1; Regression++: RQ1.2; Acceptance: RQ2)
All Passed: 188/300. Regression (RQ1.1): 233/300. Regression++ (RQ1.2): 196/300. Acceptance (RQ2): 256/300.
Empirical Study — RQ3
Execution Time (minutes)
Donor   Host    Average  Std. Dev.  Total
Idct    Pidgin  5        7          97
Mytar   Pidgin  3        1          65
Web     Pidgin  8        5          160
Cflow   Pidgin  58       16         1151
Tux     Pidgin  29       10         574
Idct    Cflow   3        5          59
Mytar   Cflow   3        1          53
Web     Cflow   5        2          102
Cflow   Cflow   44       9          872
Tux     Cflow   31       11         623
Idct    SoX     12       17         233
Mytar   SoX     3        1          60
Web     SoX     7        3          132
Cflow   SoX     89       53         74
Tux     SoX     34       13         94
Total: Average 334 min; Std. Dev. 10 min (average); 72 hours overall.
Case Study — VLC
Transplant Time & Test Suites
Organ   Time (hours)  Regression  Regression++  Acceptance
H.264   26            1           1             1
VLC with the transplanted H.264 organ.
Case Study — Kate
Donor   Host  All Passed  Organ Test Suite  Regression  Regression++  Acceptance
Cflow   Kate  16          18                20          17            18
Indent  Kate  18          19                20          18            19
TOTAL         34/40       37/40             40/40       35/40         37/40
(Regression: RQ1.1; Regression++: RQ1.2; Acceptance: RQ2)
Execution Time
Donor   Host  Average (min)  Std. Dev. (min)  Total (hours)
Cflow   Kate  101            31               33
Indent  Kate  31             6                11
Total         132            18.5             44