COPING WITH COMPLEX PROGRAMS

Quality control using software metrics

n1 = number of unique operators
N1 = total number of operators
n2 = number of unique operands
N2 = total number of operands

Halstead's entire theoretical framework is based on these four parameters. For example, the "length" of a program (N) is defined as the total number of operators and operands used. Also, the vocabulary (n) of a program is the total number of unique operators and operands.

  
N (Length) = N1 + N2   
an (Vocabulary) = n1 + n2

Software Science Counting Rules

What is an "operator," and what is an "operand" in a program? How do you compute the values of n1, N1, n2, N2 for a program? The best way to illustrate this is through an example. Figure 1 is a program fragment and Figure 2 shows a tally of the operators and operands. Notice that paired items such as {}, ( ) are counted as operators. In some situations, there may be differences in opinion as to what is an operator and what is an operand. The statement goto Label can be legitimately viewed as consisting of the operator goto and the operand Label. You could also treat the whole statement as an operator. Several researchers have shown that minor differences in counting rules have little impact on the effectiveness of Software Science as long as the rules are applied consistently.

The Length Equation and Purity Ratio

Several interesting relations are derived based on the count of operators and operands. The first one is the predicted "length" of a program. Remember, that the actual "length" (N) is defined as N1 + N2 or simply the count of all the operators and operands in a program. Halstead theorized that a well-written program should have a predicted length, N^ (pronounced N-hat) as shown here:

Figure 1: Program fragment 1

  int      atoi(NumString)
  char     NumString[I]
  {
           int          Number;
           int          i;
           i = 1;
           Number = O;
           while(NumString[I] !='\0')
           {
             Number = Number + NumString[I] - '0';
              i ++;
           }
           return Number;
  }/* end
  atoi */

Figure 2: Software science counting rules applied to the program in Figure 1

  Operator     Count      Operand                Count
  ___________________    ______________________________

  =              3       i                         4
  ;              5       Number                    4
  ()             1       NumString                 2
  []             2       '/O'                      1
  {}             2       'O'                       1
  !=             1
  +              1       N2          =            12
  )              1       n2          =             5
  ++             1       n1          =             9
                         n           =            14
  N1      =     17       N           =            29

  N^ = [n1 X log[2](n1)] + [n2 x log[2](n2)]

This equation is called the length equation. If the actual length, N, is different from the predicted length, the program could have any of the following six classes of impurities: canceling of operators, ambiguous operands, synonymous operands, common subexpressions, unnecessary replacements, or unfactored expressions.

Based on N and N^, the purity ratio is defined as N^/N. A purity ratio of 1 would imply few impurities. C programs usually have a purity ratio greater than 1. This means that the actual length, N, is less than the predicted length, N^. This could be because logic can be expressed very succinctly in C.

What is interesting about the length equation is that the first term [n1 x log [2] (n1)] can be treated as a constant. The rationale for this is that the number of unique operators in a programming language is finite, and most programmers have a small subset of operators they regularly use. The second term [n2 x log[2] (n2)] is the one that has a greater impact on the predicted length. If you can compute an accurate estimate of how many operands will be used in a program, you can predict how large a program is going to be.

Program Volume and Estimated Errors

Halstead also theorized that if a program has "n" (n = n1 + n2), unique operators and operands, the number of bits used to represent or encode these as a binary number would be log[2](n). Since there are "N" (N = N1 + N2) such usages of these operators and operands, the total number of bits used to represent the program would be N x log[2](n). This expression is termed the volume (V) of a program.

  V (Volume) = N x log[2](n)

Consider two programs, each with 200 uses of operators and operands (N = 200). The first program uses 16 unique operators and operands, while the second uses 64 unique operators and operands. The volume for the first program is 200 x log[2] (16) = 800, while the volume for the second program is 200 X log[2] (64) = 1200. The volume is a better estimate of program size than a simple count of operators and operands (N), since the value is larger with a larger number of unique operators and operands.

Another way of looking at the program volume is that N total operators and operands will require log[2](n) mental lookups using the best search algorithm. The total number of mental lookups is N x log[2](n), same as the program volume. On the average, people tend to make mistakes every E0 mental comparisons. Work done by psychologists shows that the value of E0 is about 3,000 to 3,200. The number of estimated errors, B^ is

  B^ = V/E0 = [N x log[2](n)]/E0

A programmer with superior ability will have a larger value of E0, and someone with an inferior ability will have a lower value of E0. The programmer with superior ability will make fewer errors. If the actual number of errors is recorded for a number of programs, a project manager can determine the E0 values for the team members.

Program Effort and Estimated Time

You can expect that the effort required to develop a program will be greater for a program with larger program volume. A parameter called "effort" (or E) is defined as

  E = V/L

where V is the volume of a program and L is an estimate of the level of abstraction in a program. The level of abstraction is defined as

  L = V*/V

where V is the volume measure (as previously defined) and the new V* is the volume a program would have if it could be represented as a procedure call within the language. An example will clarify this.

Consider a program to print a report from a data file. A procedure call to this program could be written as

  PrintReport(DataFile)

The volume (V*) of this statement would be 3 x log[2](3) = 4.74. The volume (V*) is called the potential volume. An estimate for L (L^) is often used instead of the equation L = V*/V, which is based on the number of operators and operands. This estimate is given by the following empirical formula:

  L^ = 2/n1 x n2/N2

A good thing about this estimate for the level of abstraction is that it is easily computed by a simple analysis without knowing too much about the internal structure of a program.

What do all the given equations mean? Halstead suggested that the effort (E) represents the number of "mental discriminations" a programmer would have to perform in order to write the program. A psychologist, Stroud, found that humans are capable of making five to twenty mental discriminations per second. This number is called the Stroud Number, S. You now have the following equation:

Estimated time to develop a program,

  T^ = E/S

A common value of "S" for programmers is 18.

Cyclomatic Complexity

  Cyclomatic Number of graph "G,"                      V(G) = e - n + 2

where n is the number of nodes in the graph, and e is the number of edges or lines connecting each node. In Figure 4, n is 4, and e is 4, giving a cyclomatic number of 2.

A simpler way of calculating the cyclomatic number is to add 1 to the sum of all the decision-making statements (if, while, for, and so on). In the example in Figure 3, there is only one if statement, and adding 1 gives 2.

Extended Cyclomatic Complexity is defined as including not just decision-making statements, but also decision-making predicates. Examples of decision-making predicates are the logical expressions in an if statement that are connected by logical and, or. A shortcut method of calculating the Extended Cyclomatic Complexity is

Figure 3: Program fragment 2

  Init( ):                            /* Node A */
  if (ReportOption)
  {
        GenReport(ReportOption);      /* Node B */
  }
  else
  {
        ReportError(ReportOption);    /* Node C */
  }
  Finish( );                          /* Node D */

Sum of the number of decisions,                        ANDs, ORs + 1.

The following if statement has an Extended Cyclomatic Complexity of 5 because there is 1 if statement, 2 AND (&&), 1 OR (| |).

  if (_dbStatus == 0 && IsValid( ) | |
             _dbStatus > 10 && IsExpr())
{     
    return 0;   
}

Uses of Software Metrics

The useful thing about all these metrics is that they actually help develop better code. Here are some examples.

After writing a piece of code, a software developer can compute the software metrics (using a tool, of course,) to provide feedback about potentially troublesome aspects of a program. The software metrics can be compared against a norm that is user-definable. Remedial action, if necessary, can then be taken. This could be a different approach or algorithm that will produce less complex code, breaking the program down to simpler modules. There are applications where even the best solution may be very complex. In this case, software metrics can identify the modules that need better and extra documentation.

A small percentage of modules in a software system has an inordinate amount of errors. Software metrics can identify the more error-prone modules, and greater resources can be allocated to testing them. During the maintenance phase, software metrics can be used as one of the tools to measure the amount of effort and time required to make changes to existing modules.

Software metrics should be viewed as one of the tools (among many) to help manage the process of developing software. However, this tool is not a replacement for common sense.

In some programs, the metrics may not accurately capture complexity and flag the programs as being overly complex or less complex than they actually are. Common sense will aid in determining if the metric results are reasonable. Applying the metrics to one or two individual programs may be disappointing. Software metrics work best when applied to a large group of programs.

Validity of Software Science and Cyclomatic Complexity

While much of the literature in software metrics reports studies supporting the software metrics described in this article, several researchers have expressed doubts about its validity. Shen, Conte, and Dunsmore voice most of the objections raised by researchers. Much of the criticism is directed at certain assumptions made by Halstead and the methodology used by those studies that support Software Science. These objections drive home the point that software metrics are effective in a statistical sense when applied consistently to a large number of programs.

References

Much of the information for this article was taken from Reference 1. References 2 and 3 are classics by the founders of the metrics examined in this article. Reference 4 is a critique on Software Science and gives examples of situations where it does not work.

1. SET Labs Inc., PO Box 03627, Portland, OR 97203, "PC-METRIC."

2. Halstead, M., "Elements of Software Science," Elsevier North Holland, New York 1977.

3. McCabe, T., "A Complexity Measure," IEEE Transactions on Software Engineering (Dec. 1976): 308 - 320.

4. V. Shen, S. Conte, and H. Dunsmore, "Software Science Revisited: A Critical Analysis of the Theory and Its Empirical Support," IEEE Transactions on Software Engineering (March 1983): 155 - 165.