Software Cost Estimation
Strictly speaking effort!
Kangnung National University (강릉대학교), Department of Computer Engineering, Ki-Tae Kwon (권기태)
Slide 2: 20 April 2023
Agenda
1. Background
2. “Current” techniques
3. Machine learning techniques
4. Assessing prediction systems
5. Future avenues
Slide 3
1. Background
Scope:
software projects
early estimates
effort ≠ cost
estimate ≠ expected answer
Slide 4
What the Papers Say...
From Computing, 26 November 1998:
Defence system never worked
MoD project loses £34m
The Ministry of Defence has been forced to write off £34.6 million on an IT project it commissioned in 1988 and abandoned eight years later, writes Joanne Wallen. The Trawlerman system, designed ...
Slide 5
The Problem
Software developers need to predict, e.g.
effort, duration, number of features
defects and reliability
But ...
little systematic data
noise and change
complex interactions between variables
poorly understood phenomena
Slide 6
So What is an Estimate?
An estimate is a prediction based upon probabilistic assessment.
[Figure: probability (p) plotted against effort, starting at 0, marking the most likely value and the point with equal probability of under / over estimate]
Slide 7
Some Causes of Poor Estimation
We don’t cope with political problems that hamper the process.
We don’t develop estimating expertise.
We don’t systematically use past experience.
Tom DeMarco, Controlling Software Projects: Management, Measurement and Estimation. Yourdon Press: NY, 1982.
Slide 8
2. “Current” Techniques
Essentially a software cost estimation system is an input vector mapped to an output.
expert judgement
COCOMO
function points
DIY models
Barry Boehm, “Software Engineering Economics,” IEEE Transactions on Software Engineering, vol. 10, pp. 4-21, 1984.
Slide 9
2.1 Expert Judgement
Most widely used estimation technique
No consistently “best” prediction system
Lack of historical data
Need to “own” the estimate
Experts plus … ?
Slide 10
Expert Judgement Drawbacks
BUT
Lack of objectivity
Lack of repeatability
Lack of recall / awareness
Lack of experts!
Preferable to use more than one expert.
Slide 11
What Do We Know About Experts?
Most commonly practised technique.
Dutch survey revealed 62% of estimators used intuition supplemented by remembered analogies.
UK survey - time to estimate ranged from 5 minutes to 4 weeks.
US survey found that the only factor with a significant positive relationship with accuracy was responsibility.
Slide 12
Information Used
Design requirements
Resources available
Base product/source code (enhancement projects)
Software tools available
Previous history of product
...
Slide 13
Information Needed
Rules of thumb
Available resources
Data on past projects
Feedback on past estimates
...
Slide 14
Delphi Techniques?
Methods for structuring group communication processes to solve complex problems.
Characterised by:
iteration
anonymity
Devised by Rand Corporation (1948). Refined by Boehm (1981).
Slide 15
Stages for Delphi Approach
1. Experts receive spec + estimation form
2. Discussion of product + estimation issues
3. Experts produce individual estimate
4. Estimates tabulated and returned to experts
5. On the returned table, each expert can identify only their own estimate
6. Experts meet to discuss results
7. Estimates are revised
8. Cycle continues until an acceptable degree of convergence is obtained
Slide 16
Wideband Delphi Form
Project: X134 Date: 9/17/03
Estimator: Hyolee
Estimation round: 1
0 10 20 30 40 50
x x* x x! x x x
Key: x = estimate; x* = your estimate; x! = median estimate
Slide 17
Observing Delphi Groups
Four groups of MSc students
Developing a C++ prototype for some simple scenarios
Requested to estimate size of prototype (number of delimiters)
Initial estimates followed by 2 group discussions
Recorded group discussions plus scribes
Slide 18
Delphi Size Estimation Results
Estimation Mean Median Min Max
Initial 371 160.5 23 2249
Round 1 219 40 23 749
Round 2 271 40 3 949
Absolute errors
Slide 19
Converging Group
[Figure: size estimates (0 to 450) at Initial Size, Round1 Size and Round2 Size for Series1, Series2 and Series3, with the true size marked]
Slide 20
A Dominant Individual
[Figure: size estimates (0 to 3000) at Initial Size, Round1 Size and Round2 Size for Series1, Series2 and Series3, with the true size marked]
Slide 21
2.2 COCOMO
Best known example of an algorithmic cost model.
Series of three models: basic, intermediate and detailed.
Models assume relationships between:
size (KDSI) and effort
effort and elapsed time
$MM = a \cdot KDSI^{b}$
$TDEV = c \cdot MM^{d}$
http://sunset.usc.edu/COCOMOII/cocomo.html
Barry Boehm, “Software Engineering Economics,” IEEE Transactions on Software Engineering, vol. 10, pp. 4-21, 1984.
Slide 22
COCOMO contd.
Model coefficients are dependent upon the type of project:
organic: small teams, familiar application
semi-detached
embedded: complex organisation, software and/or hardware interactions
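The two COCOMO relationships and the three project types can be sketched in a few lines. The coefficient values below are the basic COCOMO constants published by Boehm (1981); they do not appear on the slides and should be treated as an assumption that needs local recalibration. The function name and the 32 KDSI example are purely illustrative.

```python
# Basic COCOMO sketch: MM = a * KDSI^b and TDEV = c * MM^d.
# Assumption: (a, b, c, d) are the published basic-COCOMO constants
# (Boehm, 1981); they are not taken from the slides.
BASIC_COCOMO = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kdsi, mode):
    """Return (effort in man-months, elapsed time in months)."""
    a, b, c, d = BASIC_COCOMO[mode]
    mm = a * kdsi ** b       # size -> effort
    tdev = c * mm ** d       # effort -> elapsed time
    return mm, tdev


# Hypothetical 32 KDSI project estimated under each project type.
for mode in BASIC_COCOMO:
    mm, tdev = basic_cocomo(32, mode)
    print(f"{mode:14s} MM = {mm:6.1f}  TDEV = {tdev:4.1f} months")
```

The spread between the three rows illustrates why mis-classifying the development type matters so much for this family of models.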
Slide 23
COCOMO Cost Drivers
• product attributes
• computer attributes
• personnel attributes
• project attributes
Drivers hard to empirically validate.
Many are inappropriate for the 1990s e.g. database size.
Drivers not independent e.g. MODP and TOOL.
Slide 24
COCOMO Assessment
Very influential, non-proprietary model.
Drivers help the manager understand the impact of different factors upon project costs.
Hard to port to different development environments without extensive re-calibration.
Vulnerable to mis-classification of development type
Hard to estimate KDSI at the start of a project.
Slide 25
2.3 What are Function Points?
A synthetic (indirect) measure derived from a software requirements specification of the attribute functionality.
This conforms closely to our notion of specification size.
Uses:
effort prediction
productivity
Slide 26
Function Points (a brief history)
Albrecht developed FPs in mid 1970's at IBM.
Measure of system functionality as opposed to size.
Weighted count of function types derived from specification:
interfaces
inquiries
inputs / outputs
files
A. Albrecht and J. Gaffney, “Software function, source lines of code, and development effort prediction: a software science validation,” IEEE Transactions on Software Engineering, vol. 9, pp. 639-648, 1983.
C. Symons, “Function Point Analysis: Difficulties and Improvements,” IEEE Transactions on Software Engineering, vol. 14, pp. 2-11, 1988.
Slide 27
Function Point Rules
Weighted count of different types of functions:
external input types (4) e.g. file names
external output types (5) e.g. reports, msgs.
inquiries (4) i.e. interactive inputs needing a response
external files (7) i.e. files shared with other software systems
internal files (10) i.e. invisible outside system
The unadjusted count (UFC) is the weighted sum
of the count of each type of function.
Slide 28
Function Types
Type Simple Average Complex
External input 3 4 6
External output 4 5 7
Logical int. file 7 10 15
Ext. interface 5 7 10
Ext. inquiry 3 4 6
Slide 29
Adjusted FPs
14 factors contribute to the technical complexity factor (TCF), e.g. performance, on-line update, complex interface.
Each factor is rated 0 (n.a.) - 5 (essential).
TCF = 0.65 + (sum of factors)/100
Thus TCF may range from 0.65 to 1.35, and
FP = UFC*TCF
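As a worked illustration of the counting rules, the sketch below computes UFC, TCF and FP. The weights are copied from the Function Types table; the dictionary keys, function names and the example counts and ratings are hypothetical.

```python
# Weights from the "Function Types" table: (simple, average, complex).
WEIGHTS = {
    "external_input": (3, 4, 6),
    "external_output": (4, 5, 7),
    "logical_internal_file": (7, 10, 15),
    "external_interface": (5, 7, 10),
    "external_inquiry": (3, 4, 6),
}
LEVEL = {"simple": 0, "average": 1, "complex": 2}


def unadjusted_count(counts):
    """counts maps (function type, complexity level) -> number of functions."""
    return sum(
        WEIGHTS[ftype][LEVEL[level]] * n for (ftype, level), n in counts.items()
    )


def technical_complexity_factor(ratings):
    """TCF = 0.65 + (sum of the 14 factor ratings) / 100."""
    if len(ratings) != 14 or any(not 0 <= r <= 5 for r in ratings):
        raise ValueError("expected 14 ratings, each between 0 and 5")
    return 0.65 + sum(ratings) / 100


def function_points(counts, ratings):
    """FP = UFC * TCF."""
    return unadjusted_count(counts) * technical_complexity_factor(ratings)


# Hypothetical specification: 10 average inputs, 5 average outputs,
# 2 simple internal files, every technical factor rated 3.
example_counts = {
    ("external_input", "average"): 10,
    ("external_output", "average"): 5,
    ("logical_internal_file", "simple"): 2,
}
example_ratings = [3] * 14
print(unadjusted_count(example_counts))
print(round(function_points(example_counts, example_ratings), 2))
```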
Slide 30
Technical Complexity Factors
Data communications
Distributed functions
Performance
Heavily used configuration
Transaction rate
Online data entry
End user efficiency
Online update
Complex processing
Reusability
Installation ease
Operational ease
Multiple sites
Facilities change
Slide 31
Function Points and LOC
Language LOC per FP
Assembler 320
C 150 (128)
COBOL 106 (105)
Modula-2 71 (80)
4GL 40 (20)
Query languages 16 (13)
Spreadsheet 6
Behrens (1983), IEEE TSE 9(6).
C. Jones, “Applied Software Measurement,” McGraw-Hill (1991).
Slide 32
FP Based Predictions
Simplest form is:
effort = FC + p * FP
Need to determine local productivity, p and fixed costs, FC.
[Figure: Effort v FPs at XYZ Bank; EFFORT (10000 to 40000) plotted against FP (500 to 2000)]
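One way to determine FC and p locally is an ordinary least squares fit over completed projects. The sketch below assumes that approach; the project history is invented for illustration and is not the XYZ Bank data.

```python
def fit_linear(fps, efforts):
    """Ordinary least squares for effort = FC + p * FP; returns (FC, p)."""
    n = len(fps)
    mean_fp = sum(fps) / n
    mean_effort = sum(efforts) / n
    sxx = sum((x - mean_fp) ** 2 for x in fps)
    sxy = sum((x - mean_fp) * (y - mean_effort) for x, y in zip(fps, efforts))
    p = sxy / sxx                      # slope: effort per function point
    fc = mean_effort - p * mean_fp     # intercept: fixed cost
    return fc, p


# Hypothetical past projects (function points, effort in hours).
history_fp = [500, 800, 1200, 1500, 2000]
history_effort = [9500, 15800, 24500, 29000, 40500]

fc, p = fit_linear(history_fp, history_effort)
print(f"FC = {fc:.0f} hours, p = {p:.2f} hours per FP")
print(f"Predicted effort for 1000 FP: {fc + p * 1000:.0f} hours")
```

With only a handful of projects the fitted FC can easily come out negative or implausible, which is one reason the calibration data should come from a homogeneous environment.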
Slide 33
All environments are not equal
Productivity figures in FPs per 1000 hours:
IBM 29.6
Finnish 99.5
Canada 58.9
Mermaid 37.0
US 28.5
training
personnel
management
techniques
tools
applications
etc.
Slide 34
Function Point Users
Widely used (e.g. government, financial organisations) with some success:
monitor team productivity
cost estimation
Most effective where homogeneous environment
Variants include Mk II Points and Feature Points
Slide 35
Function Point Weaknesses
Subjective counting (Low and Jeffery report 30% variation between different analysts).
Hard to automate.
Hard to apply to maintenance work.
Not based upon organisational needs, e.g. is it productive to produce functions irrelevant to the user?
Oriented to traditional DP type applications.
Hard to calibrate.
Frequently leads to inaccurate prediction systems.
Slide 36
Function Point Strengths
The necessary data can be available early on in a project.
Language independent.
Layout independent (unlike LOC)
More accurate than estimated LOC?
What is the alternative?
Slide 37
2.4 DIY models
[Figure: Predicting effort using number of files; ACT (250 to 1000) plotted against FILES (75 to 225)]
Slide 38
A Non-linear Model
To introduce an exponent for economies or diseconomies of scale:
$effort = p \cdot S^{e}$
where e > 0 (e < 1 gives economies of scale, e > 1 gives diseconomies).
An empirical study of 60 projects at IBM Federal Systems Division during the mid 1970s concluded that effort could be modelled as:
$effort\ (PM) = 5.2 \cdot KLOC^{0.91}$
Slide 39
Productivity and Size
Effort (PM) Size (KLOC) KLOC/PM
42.27 10 0.24
79.42 20 0.25
182.84 50 0.27
343.56 100 0.29
2792.57 1000 0.36
Productivity and Project Size using the Walston and Felix Model
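The table can be reproduced directly from the model effort (PM) = 5.2 * KLOC^0.91. The short sketch below does so; the function name is illustrative.

```python
def walston_felix_effort(kloc):
    """Effort in person-months from the model effort = 5.2 * KLOC^0.91."""
    return 5.2 * kloc ** 0.91


# Same column order as the table: effort (PM), size (KLOC), KLOC/PM.
for kloc in (10, 20, 50, 100, 1000):
    effort = walston_felix_effort(kloc)
    print(f"{effort:8.2f} {kloc:5d} {kloc / effort:.2f}")
```

Because the exponent is below 1, effort grows more slowly than size, so productivity (KLOC/PM) rises as projects get larger.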
Slide 40
Productivity v Size
[Figure: productivity (0 to 0.4) plotted against KLOC (0 to 1200)]
Slide 41
Bespoke is Better!
Model Researcher MMRE
Basic COCOMO Kemerer 601%
FP Kemerer 103%
SLIM Kemerer 772%
ESTIMACS Kemerer 85%
COCOMO Miyazaki & Mori 166%
Intermediate COCOMO Kitchenham 255%
Slide 42
So Where Are We?
• A major research topic.
• Poor results “off the shelf”.
• Accuracy improves with calibration but still mixed.
• Needs accurate, (largely) quantitative inputs.
Slide 43
3. Machine Learning Techniques
A new area but demonstrating promise.
System “learns” how to estimate from a training set.
Doesn’t assume a continuous functional relationship.
In theory more robust against outliers, more flexible types of relationship.
Du Zhang and Jeffrey Tsai, “Machine Learning and Software Engineering,” Software Quality Journal, vol. 11, pp. 87-119, 2003.
Slide 44
Different ML Techniques
Case based reasoning (CBR) or analogical reasoning
Neural nets
Neuro-fuzzy systems
Rule induction
Meta-heuristics e.g. GAs, simulated annealing
Slide 45
Case Based Reasoning
[Figure: the CBR cycle. A problem becomes a new case; RETRIEVE finds a retrieved case among the previous cases; REUSE produces a solved case with a suggested solution; REVISE produces a tested / repaired case with a confirmed solution; RETAIN stores it alongside the previous cases and general knowledge.]
Slide 46
Using CBR
Characterise a project e.g.
no. of interrupts
size of interface
development method
Find similar completed projects
Use completed projects as a basis for estimate (with adaptation)
Slide 47
Problems
Finding the analogy, especially in a large organisation.
Determining how good the analogy is
Need for domain knowledge and expertise for case adaptation.
Need for systematically structured data to represent each case.
Slide 48
ANGEL
http://dec.bmth.ac.uk/ESERG/ANGEL/
ANaloGy Estimation tooL (ANGEL)
Slide 49
ANGEL Features
Shell
n features (continuous or categorical)
Brute force search for optimal subset of features: O(2^n - 1)
Measures Euclidean distance (standardised dimensions)
Uses k nearest cases.
Simple adaptation strategy (weighted mean).
With k=1 becomes a NN technique
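The core of estimation by analogy can be sketched as a k nearest neighbour search. This is not ANGEL's own code: it assumes min-max scaling as the standardisation, continuous features only, an unweighted mean as the adaptation strategy, and no feature subset search. The project data is invented.

```python
import math


def min_max_scaler(rows):
    """Build a function that rescales each feature to 0..1 (assumed standardisation)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid division by zero

    def scale(row):
        return [(v - lo) / s for v, lo, s in zip(row, lows, spans)]

    return scale


def estimate_by_analogy(cases, efforts, target, k=2):
    """Mean effort of the k completed projects closest to the target project."""
    scale = min_max_scaler(cases + [target])
    scaled_target = scale(target)
    ranked = sorted(
        (math.dist(scale(case), scaled_target), effort)
        for case, effort in zip(cases, efforts)
    )
    nearest = ranked[:k]
    return sum(effort for _, effort in nearest) / len(nearest)


# Hypothetical completed projects: [function points, number of files, team size].
past_cases = [[100, 10, 3], [120, 12, 3], [400, 45, 8], [420, 50, 9]]
past_efforts = [900, 1100, 4200, 4600]

print(estimate_by_analogy(past_cases, past_efforts, [110, 11, 3], k=2))
```

A brute force search over feature subsets would wrap this function in a loop over all 2^n - 1 non-empty subsets, which is why it only scales to a modest number of features.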
Slide 50
CBR Results
A study of 275 projects from 9 datasets suggests that CBR outperforms more traditional statistical methods e.g. stepwise regression.
M. Shepperd and C. Schofield, IEEE Trans. on Softw. Eng., 23(11), pp. 736-743, 1997.
Slide 51
Sensitivity Analysis
[Figure: % MMRE (0 to 200) plotted against No. of Projects (3 to 31) for series T1, T2 and T3]
Slide 52
Independent Replication
Niessink and van Vliet (1997)
Stensrud and Myrtveit (1998, 99)
Jeffery and Walkerden (1999)
no search for best subset of features
Briand and El Emam (1998)
approx. 30 features so exhaustive search for best subset not possible
homogeneity + well defined relationships favour regression techniques
Slide 53
Artificial Neural Nets
[Figure: a multi-layer feed forward ANN. Input layer (FP, # files, # screens, team size), hidden layers, output layer (effort)]
Slide 54
ANN Results
Study                 Learning Algorithm    n         Results
Venkatachalam         BP                    63        “Promising”
Wittig & Finnie       BP                    81, 136   MMRE = 29%
Jorgenson             BP                    109       MMRE = 100%
Serluca               BP                    28        MMRE = 76%
Karunanithi et al.    Cascade-Correlation   N/A       “More accurate than algorithmic models”
Samson et al          BP                    63        MMRE = 428%
Srinivasan & Fisher   BP                    78        MMRE = 70%
Hughes                BP                    33        MMRE = 55%
BP = back propagation learning algorithm
Slide 55
ANN Lessons
need large training sets
deal with heterogeneous datasets
opaque (poor explanatory power)
sensitive to choices of topology and learning algorithm
problems of over adaptation (neuro-fuzzy approaches?)
Slide 56
Rule Induction
IF module_size > 100 THEN
    high_development_effort
ELSE
    IF developer_experience < 2 THEN
        low_development_effort
    ELSE
        moderate_development_effort
C. Mair, G. Kadoda, M. Lefley, K. Phalp, C. Schofield, M. Shepperd, and S. Webster, “An investigation of machine learning based prediction systems,” J. of Systems Software, vol. 53, pp. 23-29, 2000.
Slide 57
Machine Learning Summary
Need training sets
ANNs require training sets of significant size (n ≈ 50)
Configuring the system can be a hard search problem
Don’t need to specify the form of the relationship in advance
Can produce more accurate results than other methods
Adapts as new cases acquired
Slide 58
4. Assessing Estimation Systems
accuracy
tolerant of measurement error
explanatory power
ease of use
availability of inputs
...
Slide 59
Assessing Model Performance
Absolute error
Percentage error and mean percentage error
Magnitude of relative error and mean magnitude of relative error (MMRE)
PRED(n)
Sum of the squares of the residuals (SSR)
...
Slide 60
Absolute Error
$E_{pred} - E_{act}$
But it fails to take into account the size of the project. A 6 PM error is serious if the predicted effort is only 3 PM, yet a 6 PM error for a 3,000 PM project is a triumph.
Slide 61
Percentage Error
$$\frac{E_{pred} - E_{act}}{E_{act}}$$
or for more than one estimate the mean percentage error:
$$\frac{1}{n}\sum_{i=1}^{n}\left(\frac{E_{pred} - E_{act}}{E_{act}}\right)_i$$
where n is the number of estimates.
• Reveals any systematic bias to a predictive model, e.g. if the model always over-estimates then the percentage error will be positive.
• A weakness is that it will mask compensating errors
Slide 62
MMRE
MMRE is defined as:
$$MMRE = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{|E_{pred} - E_{act}|}{E_{act}}\right)_i$$
Masks any systematic bias but highlights overall accuracy.
Penalises regression derived models based on least squares algorithms.
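The difference between mean percentage error and MMRE is easiest to see on a pair of compensating errors. The sketch below uses two invented projects, one over-estimated and one under-estimated by the same proportion; the function names are illustrative.

```python
def mean_percentage_error(predicted, actual):
    """Signed relative error averaged over all estimates; reveals bias."""
    return sum((p - a) / a for p, a in zip(predicted, actual)) / len(actual)


def mmre(predicted, actual):
    """Mean magnitude of relative error; reveals overall accuracy."""
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical efforts in person-months: +50% and -50% errors.
predicted = [150, 50]
actual = [100, 100]

print(mean_percentage_error(predicted, actual))  # compensating errors cancel
print(mmre(predicted, actual))                   # magnitudes do not cancel
```

The signed measure reports no error at all here, while MMRE reports an average miss of half the actual effort.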
Slide 63
PRED(n)
Conte et al. suggest an MMRE of ≤ 25% as an indicator of an acceptable prediction model.
PRED(25) measures the % of predictions that lie within 25% of actual values.
PRED(25) ≥ 75% is a typical target (seldom achieved!)
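PRED(25) can be computed with a few lines of code. The sketch below expresses the result as a fraction rather than a percentage; the function name and the four example projects are assumptions made for illustration.

```python
def pred(predicted, actual, level=0.25):
    """Fraction of predictions whose relative error is within `level` of actual."""
    within = sum(
        1 for p, a in zip(predicted, actual) if abs(p - a) / a <= level
    )
    return within / len(actual)


# Hypothetical efforts: relative errors of 10%, 20%, 50% and 5%.
predicted = [110, 80, 300, 95]
actual = [100, 100, 200, 100]

print(pred(predicted, actual))  # three of four within 25%
```

Unlike MMRE, this measure is insensitive to how badly the remaining predictions miss, which is why the two are usually reported together.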
Slide 64
Sum of the Squared Residuals
If you are risk averse it penalises large deviations more than small ones
$SSR = \sum (E_{pred} - E_{act})^{2}$
Can also compute mean square error.
Slide 65
A Comparison Case Study
Statistic LSR Robust Median
R-squared 0.28 0.25 0.26
MMRE 0.78 0.62 0.62
Pred (25) 45% 35% 35%
Balanced MMRE 0.84 0.78 0.77
Slide 66
So What’s Going On?
central tendency (mean, median)
spread (variance, kurtosis + skewness)
The ith residual is $y_i - \hat{y}_i$
M. J. Shepperd, M. H. Cartwright, and G. F. Kadoda, “On building prediction systems for software engineers,” Empirical Software Engineering, vol. 5, pp. 175-182, 2000.
Slide 67
Estimation Objectives
Objective Indicator Type
Risk averse sum of squares spread
Error minimising median absolute error spread
Portfolio total error centre
Slide 68
5. Summary
Accuracy is a non-trivial concept
No ‘best’ technique
Algorithmic models need to be calibrated
Simple linear models can be surprisingly effective
ANNs need large, not necessarily homogeneous training sets
Evidence to suggest that CBR is often the most accurate and most robust technique
Slide 69
Some Estimation Guidelines
Collect data
Use more than one estimating technique.
Minimise the number of cost drivers / coefficients in a model to facilitate calibration:
smaller, more homogeneous data sets
look for simple solutions first
Exploit any local structure or standardisation.
Remember an estimate is a probabilistic statement (bounds?).
Provide feedback for estimators.
Slide 70
Future Avenues
Great need for useful prediction systems
Consider the nature of the prediction problem
Combining prediction systems
Collaboration with experts
Managing with little or no systematic data
Slide 71
Experts plus … ?
Experiment by Myrtveit and Stensrud using project managers at Andersen Consulting
Asked subjects to make predictions
Found expert+tool significantly better than either expert or tool alone.
What type of estimation systems are easiest to collaborate with?
I. Myrtveit and E. Stensrud, “A controlled experiment to assess the benefits of estimating with analogy and regression models,” IEEE Trans. on Softw. Eng., vol. 25, pp. 510-525, 1999.