When is a for loop like a do .. while loop?

At Red Lizard Software, we care about providing the most accurate static analysis for your cpu cycle. Therefore, we spend a lot of our time thinking about the nature of false positives (when Goanna gives a warning about completely reasonable code) and how to avoid them.

One class of false positives we have noticed recently happens when you want to warn about an action that must occur on all execution paths. These properties might be expressed as “you must initialise all variables on all paths before accessing their values” for some definitions of initialise and access. A problem with these kinds of requirements appears when the initialisation of a variable is performed within a looping construct, and then access after the loop. This loop is usually designed to execute at least once (thus initialising the variable at least once) and so the programmer knows that the access after the loop is perfectly valid. Goanna has historically not been very good at identifying this false positive and will often warn anyway because there is an execution path that might not initialise the variable, the path where the condition evaluates to false. This is probably a case where the programmer should have used a do .. while loop to convey the desired semantics of the loop, but given that do .. while loops are not as popular as for loops, Goanna needs to be able to deal with this scenario.

There are two steps to making Goanna more intelligent about loops. The first step is identifying when a for or while loop should be represented as a do .. while, and the second is presenting this information to Goannas internal analysis engine.

In order to determine that a loop will execute at least once, it may be simpler to ask the inverse question. When will a loop not execute at least once? A sub question of this is when will we not know if a loop can execute at least once? This is actually much easier to answer because it can be boiled down to a structural condition. If the condition of the loop contains global variable references or function calls, then it is almost impossible to determine if a loop will execute only once. So what is left? Loops that contain only literals and local variable references. Parameters are a trickier issue since each call to the function is potentially different. With additional interprocedural analysis it may be possible to determine the boundaries of function parameters accurately but at present these loops can be ignored as well. The only thing left to do is to determine the state of the variables used in the loop condition right before it is evaluated and then evaluate the condition.

The analysis engine of Goanna works upon what is known as a control flow graph. This graph is created by looking at the source tree and determining which operations happen in which order. So the best way to present this modification of a for loop is through modifications to the control flow graph. Specifically we would like to create a copy of the control flow graph of the loops condition and wire up the rest of the graph such that there is a direct path through this path to the body of the for loop. The graph must also go into this new condition instead of into the old condition in order for the modification to be complete.

After implementing this change we have noticed that there is a small drop in the number of certain types of false positives, specifically in the SPC-uninit-var-some, with no impact on the runtime performance of the Goanna analysis engine. We hope to roll this improvement into the next release of the Goanna static analysis product line.

No Comments

Post a Comment