Florida Atlantic University
Empirical Software Engineering Laboratory


Projects

Current projects


Past Sponsors


Project for EMERALD, a Nortel Networks business unit

EMERALD

Enhanced Measurement for Early Risk Assessment of Latent Defects (EMERALD) is a project of Nortel for assessing reliability risk for software developers and managers. Florida Atlantic University has been a participant in the EMERALD Project since 1995. The focus of the work has been software metrics and software quality modeling. Current plans include software quality modeling of systems developed under the object-oriented paradigm, and refining models of systems developed under the procedural paradigm. Technology transfer to the EMERALD project is also an integral part of the work. Participation in EMERALD thus far has resulted in technical reports, software tools, and publications in refereed journals or conference proceedings.


Project for NSF

Information Theory-Based Measurement of Software Designs

Information Theory-Based Measurement of Software Designs is a research project that is defining and validating a new generation of software design metrics based on information theory, which, in turn, will facilitate a new level of cost-effective improvement to software quality. Dr. Taghi M. Khoshgoftaar is the Principal Investigator.

Software is a significant component of a wide range of products, from cameras to airline reservation systems. As society depends more and more on software-based products, high quality software becomes a necessity. All too often, poor software quality threatens public safety, costs money, or inconveniences customers. More than ever before, high software quality will be required in the economy of the twenty-first century. Models based on software metrics can predict quality factors early enough for corrective action to be cost-effective, and thus, contribute to the quality of the delivered product.

Much of software-metrics research to date has been based on analysis of code and control flow graphs, which are available rather late in development. In contrast, this project is developing metrics on abstractions of software that are created during high-level design, so that quality problems may be detected earlier. Software engineers often discuss the quality of software design in terms of vague attributes, such as ``complexity'', ``cohesion'', and ``coupling''. Since many design abstractions are represented by graphs, metrics of graph attributes will likely find broad application. This project is defining and validating specific metrics that have the formal properties recently proposed for attributes of directed graphs in general, namely, size, length, complexity, cohesion, and coupling.

Software is, in essence, information. It is the product of a large number of design decisions on many levels. Each decision is an element of information. Many traditional software metrics count features, as if all items were equally important in the design process. Information theory is an alternative approach that focuses on the amount of information in an attribute, rather than a count. This project is establishing a new generation of information theory-based software design metrics by defining and validating metrics for size, length, complexity, cohesion, and coupling.
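The contrast between counting features and measuring information can be illustrated with a small sketch. It is not the project's metric definition: labeling each node by its set of neighbors and taking the Shannon entropy of those labels is an assumption made here for illustration, and the names shannon_entropy and node_patterns are hypothetical. The two example graphs have the same number of nodes and edges, so a count-based metric cannot tell them apart, while the entropy differs.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def node_patterns(edges, nodes):
    """Label each node by its set of neighbors (an illustrative assumption)."""
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return [frozenset(neighbors[n]) for n in nodes]

nodes = ["A", "B", "C", "D"]
star = [("A", "B"), ("A", "C"), ("A", "D")]   # B, C, and D share one pattern
chain = [("A", "B"), ("B", "C"), ("C", "D")]  # every node has its own pattern

# Same counts (4 nodes, 3 edges), different amounts of information.
star_entropy = shannon_entropy(node_patterns(star, nodes))
chain_entropy = shannon_entropy(node_patterns(chain, nodes))
print(f"star:  {star_entropy:.3f} bits")
print(f"chain: {chain_entropy:.3f} bits")
```

In this sketch the chain scores higher because each of its nodes is connected differently, whereas three of the star's four nodes are interchangeable.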

The first step is to formulate sound definitions, followed by validation of the new design metrics through empirical case studies and analysis of properties. The Principal Investigator's extensive experience with software quality modeling is being brought to bear on these issues. Collaboration with a large telecommunications equipment manufacturer is providing the basis for credible case studies, where the system studied is being developed by a group of professionals in an industrial environment, and is large enough to be comparable to other industry projects. The data-collection phase acquires historical data on projects where actual software quality is known. A case study builds and evaluates models that could have been developed during the historical project, calculating predictions that could have been made, and evaluating the accuracy of those predictions against the actual software quality. Such studies illustrate the usefulness of a proposed metric in a real-world setting, and provide practical examples of how to create software quality models. This project is an important step toward the long-term goal to develop modeling methods for predicting software quality. Software metrics are the foundation of such models.


Project for NSF

Discovering Software Process Measures Using Genetic Programming

Discovering Software Process Measures Using Genetic Programming is a research project that is developing automated methods for finding algorithms that measure novel and useful attributes of software development processes using genetic programming. Dr. Taghi M. Khoshgoftaar is the Principal Investigator, Dr. Matthew P. Evett is co-Principal Investigator, and Dr. Edward B. Allen is Senior Research Associate.

High quality software is essential for mission-critical systems. However, assuring high quality often entails time-consuming, costly development processes, such as more rigorous design and code reviews, automatic test-case generation, extended testing, strategic assignment of key personnel, and reengineering of high-risk portions of a system. One cost-effective strategy is to target enhancement activities to those modules that are most likely to have faults. When the methods being developed by this project are applied to a software development project, the resulting software process metrics will be novel, specialized for the development organization, and useful for predicting software quality.

A ``feature tracking system'' tracks the implementation of detailed requirements in much the same way that a problem reporting system tracks the resolution of problems. This project is focusing on data stored in both kinds of systems. Many developers use such systems to manage implementing requirements and fixing bugs. Although these kinds of systems appear to be a rich source of information on development process history, this readily available data has been almost completely unused for predicting software quality.

Genetic programming is a promising technology that is well-suited to searching for novel combinations of algorithmic elements that satisfy complex criteria. GP is a member of the evolutionary computation family of adaptive search techniques, whose defining characteristic is an adaptation mechanism based upon neo-Darwinian natural selection. Several characteristics of GP make it particularly appropriate here:

  1. GP offers an automated way to discover process metrics that are unique to the particular software development organization under study.
  2. GP enables the exploitation of identifiers, categories, dates, and text, as well as counts and quantities.
  3. GP makes no prior assumptions about the functional form of relationships among algorithmic elements.
  4. Objective functions in GP can be defined in many ways, and in particular, a ``useful'' metric is defined in this project by statistics over a set of modules.
  5. GP systems can be designed so that the resulting measurement tools execute quickly.
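Item 4 above can be made concrete with a sketch of one possible objective function. This is not a full GP system (there is no population, crossover, or mutation) and it is not the project's actual definition of ``useful''. It assumes that a candidate metric is an expression tree over fields of a tracking-system record, and that usefulness is the absolute Pearson correlation between the metric and the number of faults over a set of modules. All field names and data values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def evaluate(tree, record):
    """Evaluate a tree: a field name, a numeric constant, or (op, left, right)."""
    if isinstance(tree, str):
        return record[tree]
    if isinstance(tree, (int, float)):
        return tree
    op, left, right = tree
    a, b = evaluate(left, record), evaluate(right, record)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    raise ValueError(f"unknown operator: {op}")

def fitness(tree, modules):
    """Assumed objective: |correlation| between the candidate metric and faults."""
    values = [evaluate(tree, m) for m in modules]
    faults = [m["faults"] for m in modules]
    return abs(pearson(values, faults))

# Hypothetical per-module records drawn from a tracking system.
modules = [
    {"open_reports": 1, "days_open": 5, "features": 3, "faults": 0},
    {"open_reports": 2, "days_open": 4, "features": 1, "faults": 1},
    {"open_reports": 4, "days_open": 6, "features": 2, "faults": 4},
    {"open_reports": 3, "days_open": 3, "features": 3, "faults": 2},
]

# Candidate metrics that a GP search might propose.
candidates = {
    "features": "features",
    "open_reports": "open_reports",
    "open_reports * days_open": ("*", "open_reports", "days_open"),
}
scores = {name: fitness(tree, modules) for name, tree in candidates.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

A GP system would use a score like this to decide which candidate expressions survive and recombine in the next generation.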

This project's contributions to the state of the art are:

  1. Empirical exploration of data stored by feature tracking systems to discover software process metrics that are useful as independent variables in software quality models.
  2. Similarly, empirical exploration of problem-reporting-system data to discover useful software process metrics.
  3. An automated method for goal-driven discovery and measurement of specialized software process metrics using GP.
  4. A demonstration of the application of GP to a new domain, namely, software measurement.

This improvement in the state of the art will enable software developers to achieve practical, cost-effective process improvement guided by predictions from software quality models at a new level of accuracy and robustness. Moreover, costly enhancement techniques, such as automatic test-case generation, will become more worthwhile when they are targeted to modules with high risk of faults.

Neural Network Models of Software Quality

Our research team has found that artificial neural networks can model software quality with better results than competing statistical techniques. There remain numerous empirical research questions regarding the practical application of neural network technology to predicting software quality. This project is exploring additional network architectures and training algorithms. Our goal is to improve model accuracy and to apply the technology to additional kinds of data. Training neural networks for industrial applications requires substantial computer power and sophisticated software tools.


Fuzzy Case-Based Reasoning for Software Engineering

Our empirical research applies fuzzy logic and case-based reasoning (CBR) techniques to problems in software engineering. Experiments are addressing issues related to solving problems with fuzzy case-based reasoning techniques. Software engineering problems such as prediction, classification, cost estimation, and design are being considered. Specific fuzzy case-based reasoning techniques applicable to these problems are being evaluated in the laboratory. The ultimate goal is to identify robust, fast, reliable, and practical methods that contribute to software development. Conducting the necessary experiments on the large volumes of data collected by industry will draw on the Empirical Software Engineering Laboratory's significant computing power and sophisticated software tools.


Project for Nortel SEAL

Software Metrics and Models: Predicting Bugs

Software is a major component of modern telecommunications systems. Society's reliance on such systems mandates high reliability. The size and complexity of telecommunications software systems present major challenges to the developers who must deliver highly reliable software. A defect in software is called a ``fault'' or a ``bug''. Few defects imply high quality.

Nortel is a major manufacturer of telecommunications systems. The Software Engineering and Analysis Laboratory (SEAL) of Nortel Technology is responsible for managing research and development of software dependability technology for Nortel. SEAL also helps Nortel software developers introduce that technology into their projects. The SEAL strategy for software dependability includes the following techniques to provide a solid ``Pillars of Assurance''.

The Empirical Software Engineering Laboratory at Florida Atlantic University is one of SEAL's research partners. We are focusing on complexity and maintainability issues, such as predicting faults.

``An ounce of prevention is worth a pound of cure.'' It is widely recognized that discovering and correcting faults early in the development life cycle is much less expensive than during testing and especially after release. Controlling faults in software presupposes that one can predict which modules will be bug-prone early enough to take preventive action. If predictions are made during design, then design reviews could be more rigorous for bug-prone modules. Similarly, if predictions are made during coding, then code walkthroughs and unit testing could be more thorough. If predictions are available at the start of integration or system testing, then test strategies could focus on those parts of the system that are most likely to have faults. Predictions that identify potential bug-prone modules can be very valuable to software developers.

Software metrics are the foundation for such predictions [6]. Many organizations rush to collect volumes of software metric data without a plan for utilizing it in the course of development. Collecting software metrics is not enough. One must translate measurements into predictions. Software quality models are the tools needed for predicting faults, based on (1) the experience of an earlier project and (2) metrics of the current product and development process. The Software Engineering Analysis Laboratory (SEAL) at Nortel is systematically employing a modeling methodology that translates software metric data into quality predictions. The methodology was developed by the Empirical Software Engineering Laboratory (ESEL) at Florida Atlantic University. Our goal is to develop models that will predict quality factors related to faults for each module. A quantitative model is an equation (or algorithm) where the dependent variable is a function of one or more independent variables. If one supplies values for the independent variables, then one can calculate the value of the dependent variable. A software quality model has independent variables that may be measured earlier in the life cycle than the dependent variable, quality. Thus, the calculated dependent variable value is a prediction of what one expects its measured value to be.

A variable such as the number of faults is the dependent variable that we want to predict. Total faults are directly measurable only after software has become operational. In contrast, software product metrics and process metrics can be measured during development. A suitable software quality model can make predictions when it is not too late to take economical compensatory actions. We develop models based on data from a completed past project where the quality factor is available for each module. Then the model is ready to apply to a current similar project, using only software metrics to make a quality prediction for each module. With predictions in hand, a project can adapt the development process so that product reliability is improved.
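The workflow described above (fit a model on a completed project where faults are known, then apply it to a current project using only metrics) can be sketched as follows. The text does not name a modeling technique at this point, so ordinary least squares with a single independent variable is an assumption made here for illustration. The function name fit_line and all data values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return intercept, slope

# Past project: the metric (independent variable) and the measured
# number of faults (dependent variable) are both known for each module.
past_metric = [10, 20, 30, 40]
past_faults = [1, 3, 5, 7]
intercept, slope = fit_line(past_metric, past_faults)

# Current project: only the metric is available during development.
current_metric = {"mod_a": 15, "mod_b": 35}
predictions = {name: intercept + slope * x for name, x in current_metric.items()}
for name, faults in predictions.items():
    print(f"{name}: predicted faults = {faults:.1f}")
```

Modules with the highest predicted values would be the natural candidates for more rigorous reviews or additional testing.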

Our modeling methodology [5] has been used numerous times to develop models that predict some aspect of software quality [1, 3, 4]. Other companies such as Northrop Grumman are also improving their software quality using techniques pioneered by Nortel's SEAL [2].

References

[1] T. M. Khoshgoftaar, E. B. Allen, N. Goel, A. Nandi, and J. McMullan. Detection of software modules with high debug code churn in a very large legacy system. In Proceedings of the Seventh International Symposium on Software Reliability Engineering, White Plains, NY, Oct. 1996. IEEE Computer Society.

[2] T. M. Khoshgoftaar, E. B. Allen, R. Halstead, and G. P. Trio. Detection of fault-prone software modules during a spiral life cycle. In Proceedings of the International Conference on Software Maintenance, Monterey, CA, Nov. 1996. IEEE Computer Society.

[3] T. M. Khoshgoftaar, E. B. Allen, K. S. Kalaichelvan, and N. Goel. Early quality prediction: A case study in telecommunications. IEEE Software, 13(1):65--71, Jan. 1996.

[4] T. M. Khoshgoftaar, E. B. Allen, K. S. Kalaichelvan, and N. Goel. The impact of software evolution and reuse on software quality. Empirical Software Engineering: An International Journal, 1(1):31--44, 1996.

[5] T. M. Khoshgoftaar, E. B. Allen, K. S. Kalaichelvan, and N. Goel. Predictive modeling of software quality for very large telecommunications systems. In Proceedings of the International Communications Conference, volume 1, pages 214--219, Dallas, TX, June 1996. IEEE Communications Society.

[6] T. M. Khoshgoftaar and P. Oman. Software metrics: Charting the course. Computer, 27(9):13--15, Sept. 1994. Guest editors of special issue on software metrics.


Software Testability Measurement

This project, in cooperation with Reliable Software Technologies Corp., is exploring the characteristics of software testability. First, this project is empirically exploring relationships between testability and static software product metrics. Second, using case studies, this project is empirically investigating the hypothesis that testability is a valid model component for reliability prediction. Testability is defined as the probability that a set of tests will discover a fault given that one exists. For a fault to be discovered, (1) a statement with a fault must be executed, (2) the data state must become corrupted, and (3) the faulty data state must be propagated to an output state. Measuring testability implies estimating the joint probability of these events occurring given an input. This is a computation-intensive process.
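The three conditions can be illustrated with a toy program and a seeded fault. The sketch below is not the project's measurement tool. It assumes a uniform input domain of 1 to 100 and enumerates it exhaustively so the result is exact, whereas a real estimate would sample inputs from an operational profile, which is where the computational cost arises. The function run and the seeded fault are hypothetical.

```python
def run(x, seeded):
    """Toy program. With seeded=True, the branch for x > 90 contains a fault.

    Returns (fault location executed, internal data state, output).
    """
    executed = x > 90
    if executed:
        y = 2 * x + (1 if seeded else 0)  # seeded fault: off by one
    else:
        y = x
    return executed, y, y > 190

inputs = range(1, 101)  # assumed uniform input domain
executed = infected = propagated = 0
for x in inputs:
    _, y_ok, out_ok = run(x, seeded=False)
    hit, y_bad, out_bad = run(x, seeded=True)
    executed += hit                 # (1) the faulty statement is executed
    infected += y_bad != y_ok       # (2) the data state becomes corrupted
    propagated += out_bad != out_ok # (3) the corruption reaches the output

testability = propagated / len(inputs)
print(f"executed={executed} infected={infected} propagated={propagated}")
print(f"estimated testability = {testability:.2f}")
```

Here the faulty statement runs for 10 of the 100 inputs and always corrupts the data state, but the corruption changes the output for only one input, so the fault is hard to detect by testing.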


Project for Northrop Grumman

Software Quality Models Supporting JSTARS

The Joint Surveillance Target Attack Radar System, JSTARS, is being developed by Northrop Grumman for the U.S. Air Force in support of the U.S. Army. The system consists of an E-8 aircraft with a multimode radar system and mobile ground stations. Computer systems are both in the aircraft and on the ground. The system performs ground surveillance, providing real-time detection, location, classification, and tracking of moving and fixed objects. During Operation Desert Storm in 1991, two prototypes were pressed into service. JSTARS gave air and ground commanders a real-time tactical view of the battlefield every day of the war.

Tactical military software is required to have high reliability. Each software function is often considered mission-critical, and lives often depend on failure-free operation of the software. Under the spiral life cycle model, identifying fault-prone modules early in an iteration can lead to a more reliable prototype. High reliability of each iteration translates into a highly reliable final product.

Florida Atlantic University worked with Northrop Grumman researchers to support JSTARS developers with models based on software metrics that classify software modules as either fault-prone or not fault-prone. Rather than predicting the number of faults, classification models serve quite well when early identification of the most troublesome modules is needed. Validation case studies of the models illustrated how classification prior to testing can help management to focus resources on those modules that cause the bulk of problems during testing and maintenance. Participation in the project has resulted in technical reports, models and publications in refereed conference proceedings.
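A classification model of this kind can be sketched with a single metric and a cutoff. The text does not describe the form of the models used in the project, so the threshold rule below is an assumption made for illustration, as are the data values and the names classify and CUTOFF. The sketch reports the two kinds of misclassification that a validation study would examine.

```python
CUTOFF = 15  # hypothetical threshold on a hypothetical software metric

def classify(metric_value, cutoff=CUTOFF):
    """Return True if the module is predicted to be fault-prone."""
    return metric_value > cutoff

# (metric value, actually fault-prone?) for modules of a past project.
modules = [(5, False), (12, False), (30, False),
           (18, True), (25, True), (9, True)]

false_alarms = sum(1 for m, actual in modules if classify(m) and not actual)
misses = sum(1 for m, actual in modules if not classify(m) and actual)
n_not_fault_prone = sum(1 for _, actual in modules if not actual)
n_fault_prone = sum(1 for _, actual in modules if actual)

# Rate of modules wrongly flagged, and rate of fault-prone modules missed.
false_alarm_rate = false_alarms / n_not_fault_prone
miss_rate = misses / n_fault_prone
print(f"false alarm rate = {false_alarm_rate:.2f}, miss rate = {miss_rate:.2f}")
```

Raising the cutoff reduces false alarms at the cost of more misses, so the choice depends on how expensive each kind of error is for the project.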


Project for Chrysler Corp.

Software Reliability Modeling

The Software Reliability Modeling project for the Chrysler Challenge Fund supported improved software quality at Chrysler Corporation. The objectives of this project were as follows: (1) Development of a probabilistic theory for measurement of software reliability at the time of release. Once empirically validated, this will enable establishing targets for reliability during the warranty period. (2) Development of a methodology and tools for identifying fault-prone software modules and for predicting software reliability at various points in the software development life cycle, such as the design, coding, or testing phases. This will enable cost-effective strategies to reduce the risk of poor reliability at release. (3) Transfer of the developed methodologies and tools to industry.


Created by Jason C. Busboom

Last update: 03 May 99 eba
Comments about this web page?
Contact webmaster@cse.fau.edu
URL: http://www.cse.fau.edu/research/ESEL/projects.html