Calibration with Historical Data

Using a company's internal historical data improves estimation accuracy. First, define what data is collected and how it is collected, so that the data set is consistent. Calibrations such as average lines of code per staff hour can then be derived from it.

Description

Cost estimates always rely on calibration, whether explicit or implicit. In small projects the outcome is influenced mainly by individual capabilities. The larger the project gets, the more its efforts are either supported or undermined by organizational characteristics. The project outcome is influenced by:

  •  Software complexity
  •  Execution time constraints
  •  Required documentation
  •  Required reliability
  •  Stable vs. volatile requirements
  •  Possibility of removing or replacing a team member
  •  Interruptions from other projects
  •  Effectiveness of design, construction, quality assurance and testing practices
  •  Staff turnover

Historical data accounts for all of these factors. They are determined by the organization, and most of them are difficult to control within a single project. They are also often the subject of unfounded optimism, such as "The team constellation will be better this time" or "The requirements are more stable than in the last project". Using historical data means assuming that the next project will perform like the previous ones. The cost estimation specialist Lawrence Putnam says that "Productivity is an organizational attribute that cannot easily be varied from project to project". With historical data, productivity is computed as the average of past projects and should not be assumed to be higher or lower than that.
Besides defining the estimation process, it is important that all practitioners receive appropriate training.
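
The averaging described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the project figures are invented, and the function names are hypothetical.

```python
from statistics import mean

# Hypothetical past projects as (lines of code, staff hours).
# The numbers are invented for illustration only.
HISTORY = [(12_000, 3_000), (20_000, 4_000), (9_000, 3_000)]


def average_productivity(history):
    """Return the average lines of code per staff hour over past projects."""
    return mean(loc / hours for loc, hours in history)


def estimate_effort(size_loc, history):
    """Estimate staff hours, assuming the next project performs like the average."""
    return size_loc / average_productivity(history)


if __name__ == "__main__":
    print(average_productivity(HISTORY))     # lines of code per staff hour
    print(estimate_effort(10_000, HISTORY))  # staff hours for a 10,000 LOC project
```

The sketch averages the per-project rates, so every project counts equally. Dividing total size by total effort instead would weight large projects more heavily and generally gives a different value; whichever is chosen should be applied consistently.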

How To

Estimation with historical data only requires a small set of data including:

  • Size: Lines of code, function points, stories, Web pages, database tables or any other size-related measure can be used. Two issues have to be settled: how the code is counted and what is counted (only released code, third-party code, code from previous versions, blank lines, comments, interfaces).
  • Effort: Define in what units effort is counted (hours, days or something else) and what is included (holidays, vacations, unpaid overtime, meetings, support, travel time).
  • Time: Calendar time is sometimes difficult to determine. Often it is hard to say how long a project really lasted. Before measuring calendar time, define when a project is considered to have started and ended. Well-defined project launch and project completion milestones are helpful here.
  • Defects: Define what is counted as a defect (change requests, defects detected by developers or testers), when defects are counted (during the development process, after a specific milestone or after the software has been released) and how the severity of a defect is classified.
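
One way to keep these four measures consistent across projects is to record them in a single fixed structure. The sketch below is an assumption about how such a record could look; the class name, field names and sample values are hypothetical, and the counting rules in the comments are examples of decisions the organization has to make itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ProjectRecord:
    """One completed project, measured according to the agreed definitions."""
    name: str
    size_loc: int        # e.g. released code only, without blank lines and comments
    effort_hours: float  # e.g. staff hours including meetings, excluding vacations
    start: date          # agreed project launch milestone
    end: date            # agreed project completion milestone
    defects: int         # e.g. defects found by testers before release

    def __post_init__(self):
        # Reject records that cannot be used for calibration.
        if self.size_loc <= 0 or self.effort_hours <= 0:
            raise ValueError("size and effort must be positive")
        if self.end < self.start:
            raise ValueError("end date must not precede start date")
        if self.defects < 0:
            raise ValueError("defect count must not be negative")

    @property
    def calendar_days(self):
        return (self.end - self.start).days

    @property
    def loc_per_staff_hour(self):
        return self.size_loc / self.effort_hours

    @property
    def defects_per_kloc(self):
        return self.defects / (self.size_loc / 1000)


if __name__ == "__main__":
    record = ProjectRecord("Example", 12_000, 3_000.0,
                           date(2008, 1, 7), date(2008, 7, 7), 60)
    print(record.calendar_days, record.loc_per_staff_hour, record.defects_per_kloc)
```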

McConnell recommends starting with this small set of data, so that you do not end up with data that is defined inconsistently across projects, which would make the data meaningless. Historical data is also easier to collect during project development than afterwards, since it is often difficult to reconstruct.
The final calibration is mostly linear, but some models need to be adjusted for different size ranges.
Some examples of models to create are listed below:

  •  Average lines of code per staff month
  •  Stories delivered per calendar month by a 3-person team
  •  Average staff hours for use case creation, construction and delivery
  •  Average lines of code per function point

Calibration with Industry-Average Data

For this approach you use data from other organizations that develop the same kind of software as the project to be estimated.

Productivity rates of different organizations vary by a factor of 10. With industry-average data you can only guarantee a confidence of 25 to 75 percent. An industry-average estimate must account for differences in productivity. The estimates are less accurate due to the variability and uncertainty in the productivity assumptions between different companies.
Industry-average data should only be used if no historical data is available. Industry averages raise questions about their inherent variability, how representative the contributing systems are, and how stable the averages are over time.
Data collection is expensive and time-consuming for individual organizations. An example of a standard set of industry-specific average data is that of the ISBSG, the International Software Benchmarking Standards Group, which provides multi-organizational data sets.
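
To see what a factor-of-10 spread in productivity means for a single estimate, the sketch below brackets an industry-average estimate with a low and a high value. It assumes, purely for illustration, that the industry average lies at the geometric middle of the spread; the function name and the input figures are hypothetical and not taken from any real data set.

```python
import math


def effort_range(size_loc, industry_loc_per_staff_month, spread_factor=10.0):
    """Return (low, nominal, high) effort in staff months.

    Assumption: the most productive organization is `spread_factor` times
    as productive as the least productive one, and the industry average
    sits at the geometric middle of that spread.
    """
    if industry_loc_per_staff_month <= 0 or spread_factor < 1:
        raise ValueError("productivity must be positive and spread_factor >= 1")
    half_spread = math.sqrt(spread_factor)
    nominal = size_loc / industry_loc_per_staff_month
    return nominal / half_spread, nominal, nominal * half_spread


if __name__ == "__main__":
    low, nominal, high = effort_range(10_000, 500.0)
    print(f"{low:.1f} to {high:.1f} staff months, nominal {nominal:.1f}")
```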

Calibration with Project Data

Using data from the project itself for estimation accounts for influences that are unique to the project. It can also be used if no historical data from other projects is available.
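
A minimal sketch of this idea, assuming the project tracks delivered size and effort spent so far: the rate observed in the project to date is used to estimate the remaining effort. The function name, the units (stories and staff hours) and the sample figures are hypothetical.

```python
def remaining_effort(completed_size, effort_spent, total_size):
    """Estimate remaining effort from the productivity observed so far.

    Size can be in any consistent unit (e.g. stories); effort in any
    consistent unit (e.g. staff hours).
    """
    if completed_size <= 0 or effort_spent <= 0:
        raise ValueError("need some completed work and recorded effort")
    if total_size < completed_size:
        raise ValueError("total size must not be smaller than completed size")
    rate = completed_size / effort_spent  # size units per effort unit
    return (total_size - completed_size) / rate


if __name__ == "__main__":
    # 30 of 100 stories delivered after 120 staff hours.
    print(remaining_effort(30, 120, 100))
```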

last modified by superadmin on 2009/07/27 10:35


Creator: superadmin on 2009/07/27 10:27