Engineering 101: The Importance of CCM

What is this Acronym?

[Image from http://webdocs.cs.ualberta.ca/~sorenson/cmput401/lectures/Measurement/img009.gif]

The Cyclomatic Complexity Metric (CCM) is an extremely important number for code management. Basically, CCM counts the independent paths through a function or method. In practice, the CCM of a piece of code is a good indicator of how easy that code is to understand and maintain: the higher the number, the harder the code is to work with. For example, consider the pseudocode:

// find the maximum of two numbers, x and y
max = y
if x > y
    max = x
return max

This has a CCM of 2, because there are only two ways you can traverse it. You either execute:

max = y
return max

OR

max = y
max = x // because x > y
return max
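
As a rough illustration, here is the same function in Python, along with a small counter that approximates CCM as one plus the number of branching constructs found by the standard `ast` module. The names `max_of_two` and `estimate_ccm` are made up for this sketch, and the counter only looks at a few node types (boolean operators, comprehensions, and exception handlers are ignored), so treat it as an approximation and not a real metric tool.

```python
import ast
import textwrap

# Source of the example function, kept as text so it can be both parsed and run.
MAX_SOURCE = textwrap.dedent("""
    def max_of_two(x, y):
        # find the maximum of two numbers, x and y
        result = y
        if x > y:
            result = x
        return result
""")

# Node types counted as one decision each (a deliberately short list).
BRANCH_NODES = (ast.If, ast.While, ast.For, ast.IfExp)


def estimate_ccm(source):
    """Approximate CCM as 1 + the number of branching nodes in the source."""
    tree = ast.parse(source)
    decisions = sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))
    return decisions + 1


# Execute the source so the function itself is available for testing.
namespace = {}
exec(MAX_SOURCE, namespace)
max_of_two = namespace["max_of_two"]

print(estimate_ccm(MAX_SOURCE))  # prints 2
print(max_of_two(1, 2))          # path where the if is skipped
print(max_of_two(3, 2))          # path where the if is taken
```

The two calls at the end correspond to the two traversals listed above: one input pair for each path.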

Why Should I Bother With Another Acronym?

Functions with low CCM tend to be easier to understand, test, and maintain. Let’s think about writing a unit test for the previous example: the test only needs to pass in a few combinations of two numbers and make sure that each time the larger number is reported as the max. This is a very easy test to write. I’m not sure how a function to find the maximum of two numbers would need to be updated, but given the simplicity of the function I would feel confident updating this code from the first time I lay eyes on it. Let’s consider another example:

// get two numbers from the user and report the larger one
prompt user for two numbers
x = first user input

while x isn’t actually a number
    keep prompting for a number

y = second user input

while y isn’t actually a number
    keep prompting for a number

max = y

if x > y
    max = x

tell the user the maximum input was max

Strictly speaking, CCM is the number of decision points plus one, so this function scores 4: two while loops and one if. Those three decisions combine into 8 distinct ways to get through the code, though, and a thorough test has to think about all of them:

  1. prompt for numbers, get x and it’s valid, get y and it’s valid, set max to y, and tell the user the max
  2. prompt for numbers, get x and it’s valid, get y and it’s valid, set max to y, set the max to x because x is greater, and tell the user the max
  3. prompt for the numbers, get x and it isn’t valid so keep prompting, get y and it’s valid, set max to y, and tell the user the max.
  4. prompt for the numbers, get x and it isn’t valid so keep prompting, get y and it’s valid, set max to y, set the max to x because x is greater, and tell the user the max.
  5. prompt for the numbers, get x and it’s valid, get y and it isn’t valid so keep prompting, set max to y, and tell the user the max.
  6. prompt for the numbers, get x and it’s valid, get y and it isn’t valid so keep prompting, set max to y, set the max to x because x is greater, and tell the user the max.
  7. prompt for the numbers, get x and it isn’t valid so keep prompting, get y and it isn’t valid so keep prompting, set max to y, and tell the user the max.
  8. prompt for the numbers, get x and it isn’t valid so keep prompting, get y and it isn’t valid so keep prompting, set max to y, set the max to x because x is greater, and tell the user the max.

A CCM of 4 is still a very healthy number. I wouldn’t even think about simplifying a function strictly because of its CCM until it got over 10, and even then I wouldn’t consider it a must. Loosely, you could consider anything under 10 to be very healthy, anything from 10 to 20 to be fairly healthy yet potentially worrisome, 20 to 30 to be unhealthy, and anything above 30 to be extremely unhealthy. Despite this example still falling well within the realm of healthy, we can see how coming up with a test is considerably more involved:

You’d need to supply many different inputs, making sure that invalid entries are rejected and that only two valid numbers are ever accepted. You’d also need to check that the maximum input is selected, and finally that the message output to the user is correctly formatted. This is still very doable, but you can see how adding one or two more variations to the function could make it very difficult to test.
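
To give a feel for that extra effort, here is a minimal Python sketch of the pseudocode above together with one scripted test run. Everything named here (`report_max`, `is_number`, `run_with`, the prompt strings, and the `read`/`write` parameters used to fake the console) is an assumption made for illustration, not code from a real project.

```python
def is_number(text):
    """Return True if text can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def report_max(read=input, write=print):
    """Prompt for two numbers and report the larger one, all in one function."""
    x = read("First number: ")
    while not is_number(x):
        x = read("Please enter a number: ")

    y = read("Second number: ")
    while not is_number(y):
        y = read("Please enter a number: ")

    largest = float(y)
    if float(x) > float(y):
        largest = float(x)

    write(f"The maximum input was {largest}")


def run_with(inputs):
    """Feed scripted inputs to report_max and collect what it writes."""
    pending = iter(inputs)
    written = []
    report_max(read=lambda prompt: next(pending), write=written.append)
    return written


# x is invalid once, y is invalid once, and x ends up greater than y.
print(run_with(["abc", "7", "", "3"]))  # prints ['The maximum input was 7.0']
```

Even this single scenario needs a fake input source, a captured output, and a carefully ordered script of entries, and it covers only one combination of the loops and the comparison.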

Additionally, you can see that what is making this function more complicated is a series of related yet distinct tasks; this function is actually performing three tasks:

  1. Getting two numbers input from the user
  2. Determining the maximum
  3. Informing the user of the maximum
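
To make the separation concrete, here is one possible Python sketch with a function per task. The function names, prompt strings, and the `read`/`write` parameters are hypothetical choices for this example; a real codebase would pick its own.

```python
def read_number(prompt, read=input):
    """Task 1: keep asking until the user supplies something numeric."""
    text = read(prompt)
    while True:
        try:
            return float(text)
        except ValueError:
            text = read("Please enter a number: ")


def max_of_two(x, y):
    """Task 2: determine the maximum."""
    return x if x > y else y


def format_max(value):
    """Task 3: build the message shown to the user."""
    return f"The maximum input was {value}"


def main(read=input, write=print):
    """Glue the three tasks together; contains no branching of its own."""
    x = read_number("First number: ", read)
    y = read_number("Second number: ", read)
    write(format_max(max_of_two(x, y)))


# Each piece can now be checked in isolation.
print(max_of_two(2, 5))   # prints 5
print(format_max(5.0))    # prints The maximum input was 5.0
```

Each function has only one or two paths through it, so the input handling, the comparison, and the message format can each get a short, focused test.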

Many functions with very high CCM develop it organically; you start out with a well-written function, but eventually you need to add more and more to it, and with every additional path you get more complexity, until eventually you wind up with a function that is very difficult for a newcomer to readily understand and maintain.

For this example, and in my experience more generally, you can usually reduce CCM adequately by dividing functionally distinct tasks into different functions. Doing this makes testing and understanding simpler: function flow becomes logical and clear, unit tests are easy to write, and you’re well on your way to coding nirvana. In my next post, I’ll be discussing how Adaptive Computing is applying this knowledge to our development.
