Final Report - GSoC 2023

Introduction

For my Google Summer of Code project, I worked on the implementation of a backward bounded model checking (BBMC) algorithm in CPAchecker. This project also required an evaluation of the existing support for backward analyses in CPAchecker by relevant CPAs and implementing it where it did not exist yet. PredicateCPA was known to not support backward formula construction in the case of pointer aliasing, so making progress toward this support was also part of the project.

The BBMC algorithm unrolls the CFA of a program backwards, from the error locations to the main entry, up to a loop bound. If the path formula of a target state (corresponding to the entry location) is satisfiable, the property is false, i.e., an error location is reachable. Otherwise, the algorithm tests the path formulas corresponding to the loop heads for satisfiability. If both the entry of the program and the loop heads are unreachable, the property is proved. Else, the loop bound is incremented and the CFA is unrolled further. In section 2 and 3 of the paper Backward Symbolic Execution with Loop Folding by Marek Chalupa and Jan Strejcek, the BBMC algorithm is stated to be theoretically equivalent to k-induction. To evaluate this claim in practice, the final part of the project was to evaluate BBMC on a benchmark set and evaluate its performance compared to k-induction.

Completed Work

The work done is divided into three merge requests:

As the first part of the project, these changes make the exporting of an ARG from a backward analysis possible. This support was important to be able to inspect the ARGs generated by backward analysis to use for debugging and understanding the program structure. This is of benefit to any future use of backward analysis as well.

In this MR, the class for the backward BMC algorithm is implemented together with configuration files to run the algorithm. The algorithm makes use of the LoopBoundCPA, which did not yet support backward analysis, so the LoopBoundTransferRelationBackward was implemented as well to provide support for this. This can be used by other forms of backward analysis requiring LoopBoundCPA as well.

This MR implements support for pointer aliasing for backward formula construction by PredicateCPA. It implements changes in SSA indexing for backward analysis and handles addressed variables. This MR is a work-in-progress, as it does not yet provide full support for all cases of pointers and pointer aliasing. The union structure as well as pointers to struct are not yet supported as they lead to incorrect SSA indexing.

The following example program is not yet supported by backward pointer aliasing due to the use of a union structure:

extern void reach_error();
union Data { int a; int b; };

// BBMC with pointer aliasing returns TRUE
int main() { // Expected result: FALSE
  union Data d;
  d.a = 6;
  d.b = d.a + 1;
  if (d.b == 7) {
    reach_error();
    return -1;
  }
  return 0;
}

The following example program is not yet supported by backward pointer aliasing due to the use of a pointer to a struct:

extern void reach_error();
struct Point { int x; };

// BBMC with pointer aliasing returns TRUE
int main() { // Expected result: FALSE
  struct Point p;
  struct Point *pt = &p;
  pt->x = 2;
  p.x = pt->x - 1;
  if (p.x == 1) {
    reach_error();
    return -1;
  }
  return 0;
}

Evaluation

Handling of pointer aliasing

I evaluated the performance of the implemented BBMC algorithm, once with the handling of pointer aliasing disabled (BBMC NP), and once enabled (BBMC+PA), on a subset of 6362 programs of the SV-COMP benchmark to evaluate the improvement of handling pointer aliasing. This subset of programs was chosen as the remaining ReachSafety test sets often included union and pointers to a struct, which are not yet supported. For comparison, I also tested the BMC and k-induction algorithms on this set of programs. All evaluations were done on revision 44346 of the branch backward-bmc-algorithm.

Result BBMC NP BBMC+PA BMC
Correct True 795 1061 937
Correct False 1190 1150 1575
Incorrect True 15 55 5
Incorrect False 484 8 6

The complete results for this benchmark can be found here: Results 1

The BBMC algorithm without pointer aliasing produces 484 false alarms, which is reduced to just 8 by the handling of pointer aliasing. Handling pointer aliasing also leads to the algorithm an increase of 11% in correct results, from 1985 to 2211. However, the handling of pointer aliasing does lead to an increase of false proofs from 15 to 55. Investigating these programs shows that most have unsupported language features as well, which is likely the cause of these false proofs. This can thus likely be solved by future support for these features.

BMC outperforms both variants of BBMC with 2512 correct results in total. Interestingly however, BBMC+PA does achieve more correct true results than BMC. Especially on some parts of the test set, notably ReachSafety-ECA, BBMC significantly outperforms BMC as BBMC achieves 492 correct results in total, whereas BMC achieved only 254. This suggests that BBMC might complement BMC well on some test sets. Moreover, it may be the case that the performance on BBMC+PA will improve with more supported features, beyond only the programs where it currently returns an incorrect result. This is because incorrect formula construction on programs with unsupported features can also lead to a TIMEOUT result, where the algorithm might have returned a correct result with support for these features.

In general, BMC is also slightly more efficient than BBMC+PA, as can be seen in the following quantile plot: Quantile. This suggests that BBMC may benefit from optimization, although these results may change as more language features are supported for backward formula construction.

BBMC and k-induction on ECA

Both BBMC and k-induction work by unrolling a program up to the same bound and attempting to solve the task. If it cannot be solved, the bound is incremented and the program is unrolled further. I evaluated the performance of BBMC and k-induction to investigate the claim that the algorithms are equivalent, meaning that they are able to solve the same set of programs at the same unrolling bound. This was done on the ReachSafety-ECA test set, which includes programs that have a loop taking in an input, and generate an output based on a long sequence of if-statements. These programs do not include any pointers or pointer aliasing, so runs of BBMC with pointer aliasing enabled and disabled achieved the same results.

Result BBMC KI
Correct True 398 411
Correct False 94 131
Incorrect True 0 0
Incorrect False 0 0

The complete results for this benchmark can be found here: Results 2

BBMC achieved 492 correct programs, while k-induction had 542 correct programs, around 10% more. Although this result seems to indicate that the algorithms perform differently, it is worth noting that on the programs where k-induction achieved a correct result, the BBMC algorithm always hit the time limit before it was able to reach the iteration at which the k-induction algorithm determined the result. Conversely as well, on the 3 programs where BBMC achieved a correct result and k-induction did not, k-induction timed out before reaching the iteration at which BBMC determined the correct result. On the programs that were investigated where both achieved correct results, the algorithms always did so at the same program unrolling iteration. This suggests that the difference in results between the algorithms may be due to different running times on the test set and that the actual strength of the algorithms is equivalent on this test set.

From the scatter plot (Scatter) with the running time for k-induction on the x-axis and that of BBMC on the y-axis, it is clear that the algorithms have a similar running time on most programs and BBMC is often slightly faster. However, there is a smaller group of programs that k-induction is able to solve in 200-300 seconds but BBMC only solves in >400 seconds, which is also clearly visible on the quantile plot (Quantile). On these programs, the speed of BBMC is limited by the SMT solving phase, especially the solving of the path formula of the loop heads. This is likely due to the fact that the backward-constructed path formulas are more difficult to solve.

Future Work

As mentioned, support for pointer aliasing is not yet complete, as there are language features that are not yet supported. This can be done in future work as it will improve the performance of BBMC as well as any other backward analysis. With more support for advanced language features, the algorithm can be benchmarked against BMC and k-induction again to evaluate the improvement. From the comparison of BBMC to k-induction on the ECA test set, it seems that the algorithms may produce equivalent results, only with different running times, where BBMC is often slower. Making improvements to the efficiency of the algorithm could then be of benefit to its performance on programs where it currently gives a TIMEOUT result, but could theoretically be verified by BBMC. The efficiency improvements may come from the SMT solving part of the algorithm, as it seems that this is where the algorithm lags behind k-induction. For this reason, it is worth investigating whether backward-constructed path formulas are more difficult to solve than forward-constructed formulas, and what kind of control flow leads to this increased difficulty.