Download - Wcre2010 shihab
1
Predicting Re-opened Bugs: A Case Study on the Eclipse Project
Emad Shihab, A. Ihara, Y. Kamei, W. Ibrahim, M. Ohira, B. Adams, A. E. Hassan and K. Matsumoto
[email protected], Queen’s University, Canada
NAIST, Japan
2
When you discover a bug …
Report bug -> Fix bug -> Verify fix -> Close bug
Re-opened
Bug report
3
Degrade quality …
4
Increase maintenance costs …
5
Unnecessary re-work…
6
Research questions …
1. Which attributes indicate re-opened bugs?
2. Can we accurately predict if a bug will be re-opened using the extracted attributes?
7
Approach overview
Mine code and bug repositories
Extract attributes
Determine best attributes
Predict re-opened bugs
8
Our dimensions …
Work habit
Bug report
Bug fix
People
9
Work habit attributes
1. Time (Hour of day)
2. Weekday
3. Day of month
4. Month
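As a minimal sketch, the four work habit attributes could be derived from a single timestamp. The slide does not say which event the timestamp belongs to (for example, when the bug was reported or closed), so the `work_habit_attributes` helper and its input are assumptions for illustration only.

```python
from datetime import datetime

# Fixed English names so the result does not depend on the system locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def work_habit_attributes(timestamp: datetime) -> dict:
    """Derive the four work habit attributes from one timestamp.

    Hypothetical helper: the source does not specify which bug event
    the timestamp should come from.
    """
    return {
        "time": timestamp.hour,                    # hour of day, 0-23
        "weekday": WEEKDAYS[timestamp.weekday()],  # weekday() is 0 for Monday
        "day_of_month": timestamp.day,             # 1-31
        "month": timestamp.month,                  # 1-12
    }


attrs = work_habit_attributes(datetime(2010, 10, 15, 14, 30))
print(attrs)
```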
10
Bug report attributes
1. Component
2. Platform
3. Severity
4. Priority
5. CC list
6. Priority changed
7. Description size
8. Description text
9. Number of comments
10. Comment size
11. Comment text
Metadata
Textual data
11
Bug fix attributes
1. Time to resolve (in days)
2. Last status
3. Number of edited files
12
People attributes
1. Reporter name
2. Reporter experience
3. Fixer name
4. Fixer experience
13
Research question 1
Which attributes indicate re-opened bugs?
Comment text, description text and fix location (component) are the best indicators
14
Top node analysis setup
1. Build 10 decision trees for each attribute set
2. Record the frequency and level of each attribute
3. Repeat using all attributes
Decision tree prediction model
15
[Figure: example decision tree. Split nodes shown: No. files (>= 5 / < 5), Dev exp (>= 3 / < 3), Month, Time (>= 12 / < 12), Time to resolve (>= 6 / < 6, >= 24 / < 24). Leaves: Re-opened, Not Re-opened. Depth labels: Level 1, Level 2, Level 3.]
16
Top node analysis example with 3 trees
Tree 1: Comment (level 1); Time, No. comments (level 2)
Tree 2: Comment (level 1); Time, No. files (level 2)
Tree 3: No. files (level 1); Time, Description size (level 2)

Level    Frequency  Attribute
Level 1  2          Comment
         1          No. files
Level 2  3          Time
         1          No. comments
         1          No. files
         1          Description size
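The counting step of the top node analysis can be sketched as follows, using the three example trees from this slide. The nested-list representation of a tree and the `top_node_frequencies` helper are assumptions made for illustration; the slides do not describe how the trees were stored or traversed.

```python
from collections import Counter

# Each tree is a list of levels; each level lists the attributes used as
# split nodes at that depth (index 0 is level 1, the root).
trees = [
    [["Comment"], ["Time", "No. comments"]],
    [["Comment"], ["Time", "No. files"]],
    [["No. files"], ["Time", "Description size"]],
]


def top_node_frequencies(trees, max_level=2):
    """Count how often each attribute appears at each of the top levels."""
    counts = {level: Counter() for level in range(1, max_level + 1)}
    for tree in trees:
        for level, attributes in enumerate(tree[:max_level], start=1):
            counts[level].update(attributes)
    return counts


freq = top_node_frequencies(trees)
for level, counter in freq.items():
    for attribute, count in counter.most_common():
        print(f"Level {level}: {count} x {attribute}")
```

Attributes that appear often at level 1 or level 2 are the ones the trees rely on first, which is why the analysis records both frequency and level.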
17
Which attributes best indicate re-opened bugs?
Work habit attributes
9 X Month
1 X Time (Hour of day)
Weekday
Day of month
18
Which attributes best indicate re-opened bugs?
Bug report attributes
Component
Platform
Severity
Priority
CC list
Priority changed
Description size
Description text
Number of comments
Comment size
10 X Comment text
Metadata
Textual data
19
Which attributes best indicate re-opened bugs?
Bug fix attributes
7 X Time to resolve
3 X Last status
Number of files in fix
20
Which attributes best indicate re-opened bugs?
People attributes
5 X Reporter name
5 X Fixer name
Reporter experience
Fixer experience
21
Combining all attributes
Level    Frequency  Attribute
Level 1  10         Comment text
Level 2  19         Description text
         1          Component
22
Research question 2
Can we accurately predict if a bug will be re-opened using the extracted attributes?
Our models can correctly predict re-opened bugs with 63% precision and 85% recall
Decision tree prediction model
23
[Figure: example decision tree. Split nodes shown: No. files (>= 5 / < 5), Dev exp (>= 3 / < 3), Month, Time (>= 12 / < 12), Time to resolve (>= 6 / < 6, >= 24 / < 24). Leaves: Re-opened, Not Re-opened. Depth labels: Level 1, Level 2, Level 3.]
24
Performance measures
                          Actual
                          Re-opened  Not re-opened
Predicted  Re-opened      TP         FP
           Not re-opened  FN         TN

Re-opened precision: TP / (TP + FP)
Re-opened recall: TP / (TP + FN)
Not re-opened precision: TN / (TN + FN)
Not re-opened recall: TN / (TN + FP)
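The four measures can be computed directly from the confusion matrix cells. The sketch below is illustrative: the `performance_measures` helper is a hypothetical name, and the counts passed to it are made up, not results from the study.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def performance_measures(tp, fp, fn, tn):
    """Precision and recall for both classes, from confusion matrix counts."""
    return {
        "reopened_precision": safe_ratio(tp, tp + fp),
        "reopened_recall": safe_ratio(tp, tp + fn),
        "not_reopened_precision": safe_ratio(tn, tn + fn),
        "not_reopened_recall": safe_ratio(tn, tn + fp),
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
measures = performance_measures(tp=30, fp=10, fn=20, tn=140)
for name, value in measures.items():
    print(f"{name}: {value:.2%}")
```

Reporting both classes matters here because re-opened bugs are the minority: a model can score well on the not re-opened class while still missing most re-opened bugs.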
25
Predicting re-opened bugs
Precision and recall (%)
           Work habits  Bug report  Bug fix  People
Precision  33           63          21       27
Recall     74           83          83       67
26
Predicting NOT re-opened bugs
Precision and recall (%)
           Work habits  Bug report  Bug fix  People
Precision  93           97          93       91
Recall     71           91          39       66
27
Combining all attributes
Precision and recall (%)
           Re-opened  NOT re-opened
Precision  63         97
Recall     85         90
28
Bug comments are important …
Bug report is most important set
What words are important?
Comment text most important bug report attribute
29
Important words
Re-opened    Not Re-opened
control      verified
background   duplicate
debugging    screenshot
breakpoint   important
blocked      testing
platforms    warning
30
31
Predicting re-opened bugs
               Work habits       Bug report        Bug fix           People
Re-opened      Pr: 33%, Re: 74%  Pr: 63%, Re: 83%  Pr: 21%, Re: 83%  Pr: 27%, Re: 67%
Not Re-opened  Pr: 93%, Re: 71%  Pr: 97%, Re: 91%  Pr: 93%, Re: 39%  Pr: 91%, Re: 66%
32
Predicting re-opened bugs
Work habits Bug report Bug fix People
33
Predicting NOT re-opened bugs
Work habits       Bug report        Bug fix           People
Pr: 93%, Re: 71%  Pr: 97%, Re: 91%  Pr: 93%, Re: 39%  Pr: 91%, Re: 66%
34
Predicting re-opened bugs
Re-opened: Pr: 63%, Re: 85%
Not Re-opened: Pr: 97%, Re: 90%
35
Approach overview
Mine code and bug repositories
Attributes of re-opened bugs
Predict re-opened bugs
Measure performance
36
Predicting re-opened bugs
[Chart: precision and recall for each attribute set (Work habits, Bug report, Bug fix, People)]
37
Which attributes best indicate re-opened bugs?
Work habits: Month (9), Time (1)
Bug report: Comment text (10)
Bug fix: Time to fix (7), Last status (3)
People: Fixer (5), Reporter (5)
38
Bug report
39
A typical work day…