wcre2010 shihab

Post on 15-Apr-2017


1

Predicting Re-opened Bugs: A Case Study on the Eclipse Project

Emad Shihab, A. Ihara, Y. Kamei, W. Ibrahim, M. Ohira, B. Adams, A. E. Hassan and K. Matsumoto

emads@cs.queensu.ca

SAIL, Queen’s University, Canada

NAIST, Japan

2

When you discover a bug …

Report bug → Fix bug → Verify fix → Close bug

Re-opened

Bug report

3

Degrade quality …

4

Increase maintenance costs …

5

Unnecessary re-work…

6

Research questions …

1. Which attributes indicate re-opened bugs?

2. Can we accurately predict if a bug will be re-opened using the extracted attributes?

7

Approach overview

1. Mine code and bug repositories
2. Extract attributes
3. Determine best attributes
4. Predict re-opened bugs

8

Our dimensions …

Work habit
Bug report
Bug fix
People

9

Work habit attributes

1. Time (Hour of day)
2. Weekday
3. Day of month
4. Month

10

Bug report attributes

1. Component
2. Platform
3. Severity
4. Priority
5. CC list
6. Priority changed
7. Description size
8. Description text
9. Number of comments
10. Comment size
11. Comment text

Metadata

Textual data

11

Bug fix attributes

1. Time to resolve (in days)
2. Last status
3. Number of edited files

12

People attributes

1. Reporter name
2. Reporter experience
3. Fixer name
4. Fixer experience
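The four attribute dimensions listed so far (work habit, bug report, bug fix, people) could be collected into one record per bug report. The following is a minimal Python sketch, not the authors' schema: every field name, every type, and the `DIMENSIONS` and `select` helpers are assumptions made for illustration, since the slides name the attributes but do not say how they are stored.

```python
from dataclasses import dataclass, fields
from typing import Dict, List


@dataclass
class BugAttributes:
    """One record per bug report; field names and types are hypothetical."""
    # Work habit attributes
    hour_of_day: int
    weekday: str
    day_of_month: int
    month: int
    # Bug report attributes
    component: str
    platform: str
    severity: str
    priority: str
    cc_list: List[str]
    priority_changed: bool
    description_size: int
    description_text: str
    num_comments: int
    comment_size: int
    comment_text: str
    # Bug fix attributes
    time_to_resolve_days: float
    last_status: str
    num_edited_files: int
    # People attributes
    reporter_name: str
    reporter_experience: int
    fixer_name: str
    fixer_experience: int
    # Label to predict
    reopened: bool


# Attribute names grouped by dimension, so a model can be trained per set.
DIMENSIONS: Dict[str, List[str]] = {
    "work_habit": ["hour_of_day", "weekday", "day_of_month", "month"],
    "bug_report": ["component", "platform", "severity", "priority", "cc_list",
                   "priority_changed", "description_size", "description_text",
                   "num_comments", "comment_size", "comment_text"],
    "bug_fix": ["time_to_resolve_days", "last_status", "num_edited_files"],
    "people": ["reporter_name", "reporter_experience",
               "fixer_name", "fixer_experience"],
}


def select(record: BugAttributes, dimension: str) -> dict:
    """Return only the attributes of one dimension for a given record."""
    return {name: getattr(record, name) for name in DIMENSIONS[dimension]}
```

Grouping the names by dimension makes it straightforward to build one model per attribute set and another on all attributes combined.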

13

Research question 1

Which attributes indicate re-opened bugs?

Comment text, description text and fix location (component) are the best indicators

14

Top node analysis setup

1. Build 10 decision trees for each attribute set
2. Record the frequency and level of each attribute
3. Repeat using all attributes

Decision tree prediction model

15

[Figure: example decision tree with three levels (Level 1, Level 2, Level 3). Nodes shown: No. files (>= 5 / < 5), Dev exp (>= 3 / < 3), Time (>= 12 / < 12), Month, Time to resolve (>= 6 / < 6; >= 24 / < 24). Leaves: Re-opened, Not Re-opened.]

16

Top node analysis example with 3 trees

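Tree 1: Comment (level 1); Time, No. comments (level 2)
Tree 2: Comment (level 1); Time, No. files (level 2)
Tree 3: No. files (level 1); Time, Description size (level 2)

Level     Frequency   Attribute
Level 1   2           Comment
Level 1   1           No. files
Level 2   3           Time
Level 2   1           No. comments
Level 2   1           No. files
Level 2   1           Description size
...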
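The three-tree example can be reproduced with a small counting routine. This is a sketch under stated assumptions: the trees are hand-encoded as lists of levels rather than produced by a decision tree learner, and the function name `top_node_analysis` and the nested-list representation are hypothetical.

```python
from collections import Counter, defaultdict

# Each tree is a list of levels; each level lists the attributes found at
# that depth. These three trees mirror the slide's example only.
trees = [
    [["Comment"], ["Time", "No. comments"]],
    [["Comment"], ["Time", "No. files"]],
    [["No. files"], ["Time", "Description size"]],
]


def top_node_analysis(trees, max_level=2):
    """Count how often each attribute appears at each of the top levels."""
    counts = defaultdict(Counter)
    for tree in trees:
        for level, attributes in enumerate(tree[:max_level], start=1):
            counts[level].update(attributes)
    # Most frequent attributes first within each level.
    return {level: counter.most_common() for level, counter in counts.items()}


result = top_node_analysis(trees)
for level, ranking in sorted(result.items()):
    for attribute, frequency in ranking:
        print(f"Level {level}: {frequency} x {attribute}")
```

Attributes that appear often near the root are the ones the trees rely on most, which is why both frequency and level are recorded.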

17

Which attributes best indicate re-opened bugs?

Work habit attributes

9 X Month
1 X Time (Hour of day)
Weekday
Day of month

18

Which attributes best indicate re-opened bugs?

Bug report attributes

Component
Platform
Severity
Priority
CC list
Priority changed
Description size
Description text
Number of comments
Comment size
10 X Comment text

Metadata

Textual data

19

Which attributes best indicate re-opened bugs?

7 X Time to resolve
3 X Last status
Number of files in fix

Bug fix attributes

20

Which attributes best indicate re-opened bugs?

5 X Reporter name
5 X Fixer name
Reporter experience
Fixer experience

People attributes

21

Combining all attributes

Level     Frequency   Attribute
Level 1   10          Comment text
Level 2   19          Description text
Level 2   1           Component

22

Research question 2

Can we accurately predict if a bug will be re-opened using the extracted attributes?

Our models can correctly predict re-opened bugs with 63% precision and 85% recall

Decision tree prediction model

23

[Figure: example decision tree with three levels (Level 1, Level 2, Level 3). Nodes shown: No. files (>= 5 / < 5), Dev exp (>= 3 / < 3), Time (>= 12 / < 12), Month, Time to resolve (>= 6 / < 6; >= 24 / < 24). Leaves: Re-opened, Not Re-opened.]

24

Performance measures

Confusion matrix (rows: predicted, columns: actual)

                          Actual re-opened   Actual not re-opened
Predicted re-opened       TP                 FP
Predicted not re-opened   FN                 TN

Re-opened precision: TP / (TP + FP)

Re-opened recall: TP / (TP + FN)

Not re-opened precision: TN / (TN + FN)

Not re-opened recall: TN / (TN + FP)
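The four measures follow directly from the confusion matrix counts. A minimal Python sketch, assuming a hypothetical helper `precision_recall` and made-up counts that are not results from the study:

```python
def precision_recall(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute precision and recall for both classes from confusion counts."""

    def ratio(numerator: int, denominator: int) -> float:
        # Guard against an empty class or no predictions for a class.
        return numerator / denominator if denominator else 0.0

    return {
        "re-opened": {
            "precision": ratio(tp, tp + fp),
            "recall": ratio(tp, tp + fn),
        },
        "not re-opened": {
            "precision": ratio(tn, tn + fn),
            "recall": ratio(tn, tn + fp),
        },
    }


# Illustrative counts only (not data from the slides).
measures = precision_recall(tp=6, fp=2, fn=4, tn=8)
print(measures["re-opened"])      # precision 6/8, recall 6/10
print(measures["not re-opened"])  # precision 8/12, recall 8/10
```

Reporting both classes matters here because re-opened bugs are the minority class, so a model can score well on "not re-opened" while missing most re-opened bugs.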

25

Predicting re-opened bugs: precision and recall (%)

Attribute set   Precision   Recall
Work habits     33          74
Bug report      63          83
Bug fix         21          83
People          27          67

26

Predicting NOT re-opened bugs: precision and recall (%)

Attribute set   Precision   Recall
Work habits     93          71
Bug report      97          91
Bug fix         93          39
People          91          66

27

Combining all attributes

Precision and recall (%)

Class             Precision   Recall
Re-opened         63          85
NOT re-opened     97          90

28

Bug comments are important …

Bug report is the most important attribute set

What words are important?

Comment text is the most important bug report attribute

29

Important words

Re-opened: control, background, debugging, breakpoint, blocked, platforms

Not re-opened: verified, duplicate, screenshot, important, testing, warning

30

31

Predicting re-opened bugs

Attribute set   Re-opened            Not re-opened
Work habits     Pr: 33%, Re: 74%     Pr: 93%, Re: 71%
Bug report      Pr: 63%, Re: 83%     Pr: 97%, Re: 91%
Bug fix         Pr: 21%, Re: 83%     Pr: 93%, Re: 39%
People          Pr: 27%, Re: 67%     Pr: 91%, Re: 66%

32

Predicting re-opened bugs

Work habits Bug report Bug fix People

33

Predicting NOT re-opened bugs

Work habits: Pr: 93%, Re: 71%
Bug report: Pr: 97%, Re: 91%
Bug fix: Pr: 93%, Re: 39%
People: Pr: 91%, Re: 66%

34

Predicting re-opened bugs

Bug report re-opened: Pr: 63%, Re: 85%
Bug report NOT re-opened: Pr: 97%, Re: 90%

35

Approach overview

1. Mine code and bug repositories
2. Attributes of re-opened bugs
3. Predict re-opened bugs
4. Measure performance

36

[Figure: bar chart, "Predicting re-opened bugs". Precision and recall by attribute set: Work habits, Bug report, Bug fix, People.]

37

Which attributes best indicate re-opened bugs?

Work habits: Month (9), Time (1)
Bug report: Comment text (10)
Bug fix: Time to fix (7), Last status (3)
People: Fixer (5), Reporter (5)

38

Bug report

39

A typical work day…

40

Bug report attributes

1. Component
2. Platform
3. Severity
4. Priority
5. CC list
6. Priority changed
7. Description size
8. Description text
9. Number of comments
10. Comment size
11. Comment text

Metadata

Textual data
