TRANSCRIPT
Copyright 2014 FUJITSU LABORATORIES LIMITED
Semi-automatic Incompatibility Localization for Re-engineered Industrial Software
Susumu Tokumoto†1, Kazunori Sakamoto†2, Kiyofumi Shimojo†3, Tadahiro Uehara†1, Hironori Washizaki†3
Fujitsu Laboratories Limited (†1), National Institute of Informatics (†2), Waseda University (†3)
April 1, 2014
Reengineering a Legacy System
A legacy system evolves through many fixes and new features, which lowers its efficiency and reusability. Reengineering restores efficiency and reusability.
Can the new system keep the specifications of the old one?
⇒ Compatibility testing!
How to Test the Compatibility of the Reengineered System
Basic idea: record the inputs and outputs on the old system, then feed the same inputs to the new system and check that the outputs match. For example, if the old system returns out=1 for in=1 and out=4 for in=2, while the new system returns out=1 and out=5 for the same inputs, the in=2 case is incompatible.
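The record-and-check idea can be sketched in a few lines (a minimal illustration with hypothetical old_system/new_system stand-ins mirroring the slide's in/out values, not the actual industrial code):

```python
def old_system(x):
    # stand-in for the legacy implementation (recorded behavior)
    return {1: 1, 2: 4}[x]

def new_system(x):
    # stand-in for the re-engineered implementation (differs for x == 2)
    return {1: 1, 2: 5}[x]

# 1. Record inputs and outputs on the old system.
recorded = {x: old_system(x) for x in [1, 2]}

# 2. Replay the inputs on the new system and compare the outputs.
incompatible = [x for x, expected in recorded.items()
                if new_system(x) != expected]
print(incompatible)  # inputs whose outputs differ: [2]
```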
Automation with Symbolic Execution
Compatibility Testing using KLEE
Run a symbolic executor to generate test cases for a component, then use these test cases to test the new version of the component.

Original source code:

    int ffs(int word) {
        int i = 0;
        if (!word) return 0;
        for (;;)
            if (((1 << i++) & word) != 0)
                return i;
    }

    int main() {
        int x, r;
        klee_make_symbolic(&x, sizeof(x), "x");
        r = ffs(x);
        klee_expected(&r, sizeof(r), "r");
        return 0;
    }

New source code (contains a fault):

    int ffs(int i) {
        char n = 1;
        if (!(i & 0xfffe)) { n += 16; i >>= 16; }  /* fault */
        if (!(i & 0xff))   { n += 8;  i >>= 8;  }
        if (!(i & 0x0f))   { n += 4;  i >>= 4;  }
        if (!(i & 0x03))   { n += 2;  i >>= 2;  }
        return (i) ? (n + ((i + 1) & 0x01)) : 0;
    }

    int main() {
        int x, r;
        klee_make_symbolic(&x, sizeof(x), "x");
        r = ffs(x);
        klee_expected(&r, sizeof(r), "r");
        return 0;
    }

KLEE generates test cases with expected values (e.g. x=0 with expected r=0, and x=1 with expected r=1) from the original source code, then replays them against the new source code. Test results:

    Running test: klee-last/test000001.ktest
    KLEE: r is valid as expected.
    Running test: klee-last/test000002.ktest
    KLEE: ERROR: invalid expected value. (name=r, input=0x00000001, expected=0x00000001)

The x=0 case is compatible; the x=1 case is incompatible.
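The incompatibility KLEE reports above can also be reproduced with a plain differential check. A minimal sketch, transliterating both ffs versions into Python for non-negative inputs (not the KLEE workflow itself):

```python
def ffs_old(word):
    """Original: 1-based index of the first set bit, 0 if none."""
    if not word:
        return 0
    i = 0
    while True:
        i += 1
        if (1 << (i - 1)) & word:
            return i

def ffs_new(i):
    """Re-engineered bit-twiddling version; the 0xfffe mask is the fault
    (the low-16-bit test should use 0xffff, so bit 0 is mishandled)."""
    n = 1
    if not (i & 0xfffe):
        n += 16
        i >>= 16
    if not (i & 0xff):
        n += 8
        i >>= 8
    if not (i & 0x0f):
        n += 4
        i >>= 4
    if not (i & 0x03):
        n += 2
        i >>= 2
    return (n + ((i + 1) & 0x01)) if i else 0

# Replay the old version's outputs against the new one.
mismatches = [x for x in range(256) if ffs_old(x) != ffs_new(x)]
print(mismatches)  # the faulty mask makes exactly x == 1 incompatible
```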
Overview of Re-engineering Project
As-Is: The source code of the server products' monitor differs from that of the storage systems, but their SMTP libraries have similar features. The server products run on Linux with their own SMTP library, and the storage systems run on VxWorks with theirs; each side also has its own HW monitor library and HW monitor agent. The two SMTP libraries are written in C (19 KLOC and 13 KLOC).
To-Be: The two SMTP libraries are unified into a common SMTP library, wrapped by a compatibility layer and a product-specific layer, beneath the HW monitor libraries and agents on Linux and VxWorks.
Results of Automated Compatibility Testing
How efficient is our approach compared with traditional testing?

Results of traditional testing and our approach:

                        Traditional testing   Our approach
    Man-months          1.5                   4
    # of test cases     545                   10,846
    # of detected bugs  27                    +5

Most of the 4 man-months were consumed attacking the path explosion problem. One way to reduce redundant paths is to add constraints on the symbolic values.
Our approach found 5 more bugs that could not be detected by the traditional testing. The test cases that found these bugs are characterized by combinations of parameters and SMTP sequences, which are corner cases that are hard to recognize manually.
Issue: Investigating the Cause of Failures
Depending on the situation, developers sometimes trade some incompatibility for improved usability, performance, and quality.

Examples of allowable incompatibilities:

Original:
• The TCP port can be set to any non-negative number.
• Uses the blocking socket API.
• Checks the partial size even if the partial message is disabled.

Re-engineered:
• The TCP port can be set only to 0-65535.
• Uses the non-blocking socket API and select().
• Does not check the partial size if the partial message is disabled.
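The first example can be illustrated with a small sketch (hypothetical accept_port helpers, not the actual library code): a replayed test with an out-of-range port fails on the new version, yet the new behavior is the intended one, so the incompatibility is allowable.

```python
def accept_port_old(port):
    # original: any non-negative number was accepted
    return port >= 0

def accept_port_new(port):
    # re-engineered: restricted to the valid TCP port range
    return 0 <= port <= 65535

# A recorded test with port=70000 passes on the old version but
# fails on the new one; the failure reflects a deliberate
# improvement, not a bug, so it should not be "fixed".
print(accept_port_old(70000), accept_port_new(70000))  # True False
```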
Issue: Investigating the Cause of Failures
The process of debugging is: (1) investigate the cause of the failure; (2) if it is an error, fix it.

Erroneous incompatibilities can be fixed; once fixed, all failed test cases that share the same cause turn into successes accordingly.
Allowable incompatibilities cannot be fixed; failed test cases that share the same cause cannot be changed to success.

We should therefore check whether each failed test case is allowable, without fixing anything. In our case there are 4,002 failed test cases to be checked.
Spectrum-based Bug Localization
Spectrum-based bug localization uses the coverage and execution results of test cases to compute the suspiciousness of each statement.

Tarantula's suspiciousness:

    Susp(Sn) = (Fail(Sn)/Fail(All)) / (Fail(Sn)/Fail(All) + Pass(Sn)/Pass(All))

Ochiai's suspiciousness:

    Susp(Sn) = Fail(Sn) / sqrt(Fail(All) × (Fail(Sn) + Pass(Sn)))

Example program (the fault is at S6, which should be m = x):

    int mid(int x, int y, int z) {
    S1      int m = z;
    S2      if (y < z) {
    S3          if (x < y)
    S4              m = y;
    S5          else if (x < z)
    S6              m = y;      /* fault */
    S7      } else {
    S8          if (x > y)
    S9              m = y;
    S10         else if (x > z)
    S11             m = x;
            }
    S12     return m;
    }

Coverage matrix and suspiciousness for six tests, given as (x,y,z) triples (P = pass, F = fail):

           (3,3,5) (1,2,3) (3,2,1) (5,5,5) (5,3,4) (2,1,3)   Susp.
    S1        ●       ●       ●       ●       ●       ●      50.0%
    S2        ●       ●       ●       ●       ●       ●      50.0%
    S3        ●       ●                       ●       ●      62.5%
    S4                ●                                        0%
    S5        ●                               ●       ●      71.4%
    S6        ●                                       ●      83.3%
    S7                        ●       ●                        0%
    S8                        ●       ●                        0%
    S9                        ●                                0%
    S10                               ●                        0%
    S11                                                        0%
    S12       ●       ●       ●       ●       ●       ●      50.0%
    Result    P       P       P       P       P       F

For example, Susp(S6) = (1/1) / (1/1 + 1/5) = 83.3%, so the faulty statement gets the highest suspiciousness.
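The Tarantula computation on this matrix can be reproduced with a short script (a minimal sketch, using the coverage sets read off the slide):

```python
# Which of the six tests execute each statement
# (tests indexed 0..5; test 5, mid(2,1,3), is the failing one).
coverage = {
    "S1": {0, 1, 2, 3, 4, 5}, "S2": {0, 1, 2, 3, 4, 5},
    "S3": {0, 1, 4, 5},       "S4": {1},
    "S5": {0, 4, 5},          "S6": {0, 5},
    "S7": {2, 3},             "S8": {2, 3},
    "S9": {2},                "S10": {3},
    "S11": set(),             "S12": {0, 1, 2, 3, 4, 5},
}
failing = {5}
passing = {0, 1, 2, 3, 4}

def tarantula(stmt):
    fail_ratio = len(coverage[stmt] & failing) / len(failing)
    pass_ratio = len(coverage[stmt] & passing) / len(passing)
    if fail_ratio + pass_ratio == 0:
        return 0.0   # statement never executed
    return fail_ratio / (fail_ratio + pass_ratio)

susp = {s: tarantula(s) for s in coverage}
# The faulty statement S6 scores highest: (1/1) / (1/1 + 1/5)
print(max(susp, key=susp.get), round(susp["S6"], 3))  # S6 0.833
```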
Details of Applying Bug Localization
Target source code:
• Written in C, 13 KLOC (3,560 executable statements)
• 4,003 failed test cases out of 10,876 test cases
• Statement coverage: 86.3%
• 9 causes of incompatibility were detected beforehand by a combination of other methods

Bug localization tool:
• OCCF (Open Code Coverage Framework) by Sakamoto et al.
• Easy to change the suspiciousness formula (Tarantula, Ochiai, Jaccard, Russell, etc.)

Method of detecting causes of incompatibility:
• Search for causes starting from the most suspicious lines
Histogram of Statements (Tarantula)
[Histogram: number of statements per suspiciousness bin (from 0 ≦ susp ≦ 0.1 up to 0.9 < susp ≦ 1), with cumulative % of statements and cumulative % of detected causes]
90% of the causes are detected within the top 10% of statements.
Histogram of Statements (Ochiai)
[Histogram: number of statements per suspiciousness bin (from 0 ≦ susp ≦ 0.1 up to 0.9 < susp ≦ 1), with cumulative % of statements and cumulative % of detected causes]
Only 44% of the causes are detected within the top 10% of statements.
Comparison of Tarantula and Ochiai
In this application, Tarantula is superior to Ochiai. The application contained multiple causes of failure, so for most statements Fail(Sn) is much smaller than Fail(All).

Tarantula's suspiciousness is calculated from ratios of passed and failed test cases:

    Susp(Sn) = (Fail(Sn)/Fail(All)) / (Fail(Sn)/Fail(All) + Pass(Sn)/Pass(All))

Here Fail(Sn)/Fail(All) is the ratio of failed test cases covering Sn, and Pass(Sn)/Pass(All) is the ratio of passed test cases covering Sn. If Pass(Sn) is 0, Susp(Sn) is 1 even though Fail(Sn) is much smaller than Fail(All).

Ochiai's suspiciousness is calculated from absolute (non-ratio) values:

    Susp(Sn) = Fail(Sn) / sqrt(Fail(All) × (Fail(Sn) + Pass(Sn)))

Even if Pass(Sn) is 0, the suspiciousness does not reach 1: when Fail(Sn) << Fail(All), the value becomes small.
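The difference can be checked numerically. A worked comparison with assumed illustrative counts (not measured project data): a statement covered by only 2 of 4,000 failing tests and by no passing test.

```python
from math import sqrt

fail_sn, pass_sn = 2, 0          # statement covered by 2 failing, 0 passing tests
fail_all, pass_all = 4000, 6000  # totals over the whole suite (illustrative)

# Tarantula: ratio form; with pass_sn == 0 the ratio collapses to 1.0
tarantula = (fail_sn / fail_all) / (fail_sn / fail_all + pass_sn / pass_all)

# Ochiai: absolute form; fail_sn << fail_all keeps the score small
ochiai = fail_sn / sqrt(fail_all * (fail_sn + pass_sn))

print(round(tarantula, 3), round(ochiai, 3))  # 1.0 0.022
```

With many independent causes, each cause fails only a few tests, so Tarantula's ratio form still ranks such statements at the top, which suited this application.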
Why Are There Low Susp. Values?
List of incompatible points:

    ID  File name      Line     Susp.
    1   dir_c/src01.c  524-535  0.978
    2   dir_c/src01.c  443-445  0.527
    3   dir_c/src01.c  507-513  1.000
    4   dir_r/src07.c  292-296  0.538
    5   dir_r/src03.c  312      0.686
    6   dir_r/src06.c  216      0.477
    7   dir_r/src06.c  197      0.962
    8   dir_r/src07.c  216-234  1.000
    9   dir_r/src10.c  266      1.000

Before re-engineering:

    switch (isPart) {
    case 0:  /* no partial size check */
        break;
    case 1:
        if (!((Partial_size == 0) ||
              (Partial_size >= PART_SIZE_MIN && Partial_size <= PART_SIZE_MAX))) {
            /* partial size error */
            return -1;
        }
        break;
    default:  /* invalid isPart value */
        return -1;
    }

After re-engineering:

    if (Partial_size >= PART_SIZE_MIN && Partial_size <= PART_SIZE_MAX) {
        :
    } else {
        return -1;
    }

Because the code is missing in the re-engineered version, no suspicious lines of code can be found for such an incompatibility, which lowers the suspiciousness scores.
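Why an omission defeats spectrum-based localization can be seen in a small sketch (assumed toy coverage, not the project's data): when the fault is missing code, failing and passing runs may cover exactly the same statements, so no statement stands out.

```python
# Toy spectra: every run covers the same statements, because the
# faulty behavior comes from code that no longer exists.
runs = [
    ({"S1", "S2", "S3"}, "pass"),
    ({"S1", "S2", "S3"}, "pass"),
    ({"S1", "S2", "S3"}, "fail"),  # fails due to the *absent* check
]
stmts = {"S1", "S2", "S3"}
n_fail = sum(1 for _, r in runs if r == "fail")
n_pass = len(runs) - n_fail

def tarantula(s):
    f = sum(1 for cov, r in runs if s in cov and r == "fail") / n_fail
    p = sum(1 for cov, r in runs if s in cov and r == "pass") / n_pass
    return f / (f + p)

print({s: tarantula(s) for s in sorted(stmts)})
# every statement gets the same score (0.5): nothing is singled out
```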
Discussion
Does symbolic execution really help compatibility testing? We found five more bugs but spent 4 man-months. The most important effect of automated compatibility testing is detecting edge-case bugs, which would cost more to find by manual testing. The original project had no automated test suite, so the generated test cases become a testing asset.
Is the spectrum-based bug localization technique appropriate for finding incompatibilities? In our case, the localized 10% of the code equals about 200 executable statements, which is a realistic amount to inspect manually. Trying other techniques, such as model-based debugging, is a candidate for future work.
Conclusion
We presented compatibility testing using symbolic execution and showed how to apply it to a re-engineering project; 5 more bugs were detected by our testing method.
We applied bug localization to search for the causes of incompatibility in system re-engineering; 90% of the incompatibilities were localized in 10% of the code by Tarantula.