CPA-SymExec: Efficient Symbolic Execution in CPAchecker

Dirk Beyer and Thomas Lemberger

Article:
Article PDF
Artifact:
Artifact DOI
Demo Video:
YouTube Video
Presentation at ASE'18:
PDF Presentation
Poster at ASE'18:
PDF Poster

Abstract

We present CPA-SymExec, a tool for symbolic execution that is implemented in the open-source, configurable verification framework CPAchecker. Our implementation automatically detects which symbolic facts to track, in order to obtain a small set of constraints that are necessary to decide reachability of a program area of interest. CPA-SymExec is based on abstraction and counterexample-guided abstraction refinement (CEGAR), and uses a constraint-interpolation approach to detect symbolic facts. We show that our implementation can better mitigate the path-explosion problem than symbolic execution without abstraction, by comparing the performance to the state-of-the-art Klee-based symbolic-execution engine Symbiotic and to Klee itself. For the experiments we use two kinds of analysis tasks: one for finding an executable path to a specific location of interest (e.g., if a test vector is desired to show that a certain behavior occurs), and one for confirming that no executable path to a specific location exists (e.g., if it is desired to show that a certain behavior never occurs).
CPA-SymExec is released under the Apache 2 license and available (inclusive source code) at https://cpachecker.sosy-lab.org. A demonstration video is available at https://youtu.be/qoBHtvPKtnw.

Overview

This website consists of the following sections:

Example for CEGAR

(to top)

We will show the effect of CEGAR on the following example program (download):

extern void __VERIFIER_error();
extern unsigned char __VERIFIER_nondet_uchar();

int main() {
  unsigned char a = __VERIFIER_nondet_uchar();
  unsigned char b = __VERIFIER_nondet_uchar();
  unsigned char c = b + 1;
  unsigned char i = 0;
  while (a < 100) {
    if (__VERIFIER_nondet_uchar())
      a++;
    else
      i++;
  }

  if (c == b && i > a)
ERROR:
    __VERIFIER_error();
}

We want to check the specification that method __VERIFIER_error is never called.

Without CEGAR, CPA-SymExec will not terminate and compute an infinite symbolic-execution tree that grows exponentially in size:

Symbolic Execution Tree of CPA-SymExec without CEGAR

But with CEGAR, CPA-SymExec first performs an analysis that tracks no information about the symbolic memory and no path constraints. This creates a finite symbolic execution tree, because CPA-SymExec stops computation as soon as states repeat (cf. our paper on Symbolic Execution with CEGAR).

Though, because of the missing information, CPA-SymExec finds an (infeasible) program path to a call to __VERIFIER_error; the computed symbolic execution tree looks like the following, with the target location where the call to __VERIFIER_error occurs marked with red.

Symbolic Execution Tree of first CEGAR iteration

Through its counterexample-guided refinement, CPA-SymExec derives that it must track the symbolic memory of program variables b and c so that it knows at the final if-condition that b == c is never true. It then restarts analysis and computes the following symbolic-execution tree that does not contain a path to a call to __VERIFIER_error:

Symbolic Execution Tree of second CEGAR iteration

This way, CPA-SymExec can prove that the program fulfills the specification.

Note that i > a is also false for the found error path, because a >= 100 and i = 0. Thus, the refinement procedure could also choose to track the symbolic memory of i and a, instead of b and c. This would lead to an infinite symbolic-execution tree similar to the one of CPA-SymExec without CEGAR.

To make refinement avoid such "bad decisions", we use refinement selection with sliced path prefixes.

Symbolic and Concrete Program Traces

(to top)

If a target path is found, CPA-SymExec produces both a symbolic and an examplary concrete trace.

For this slightly modified program:

extern void __VERIFIER_error();
extern unsigned char __VERIFIER_nondet_uchar();

int main() {
  unsigned char a = __VERIFIER_nondet_uchar();
  unsigned char b = __VERIFIER_nondet_uchar();
  unsigned char c = b + 1;
  unsigned char i = 0;
  while (a < 100) {
    if (__VERIFIER_nondet_uchar())
      a++;
    else
      i++;
  }

  if (c <= b)
ERROR:
    __VERIFIER_error();
}

The following traces are produced:

SymbolicTrace

none:   N1 -{INIT GLOBAL VARS}-> N27
line 1: N27 -{void __VERIFIER_error();}-> N28
line 2: N28 -{unsigned char __VERIFIER_nondet_uchar();}-> N29
lines 4-19: N29 -{int main();}-> N30
none:   N30 -{Function start dummy edge}-> N2
line 5: N2 -{unsigned char a;}-> N3
line 5: N3 -{a = __VERIFIER_nondet_uchar();}-> N4
line 6: N4 -{unsigned char b;}-> N5
line 6: N5 -{b = __VERIFIER_nondet_uchar();}-> N6
line 7: N6 -{unsigned char c = b + 1;}-> N7
line 8: N7 -{unsigned char i = 0;}-> N8
    a == main::a#1;
    c == ((unsigned char)(((signed int)main::b#3) + 1));
    b == main::b#3;
    i == 0U;
line 9: N8 -{while}-> N9
line 9: N9 -{[!(a < 100)]}-> N11
    main::a#1 >= 100;
line 16:    N11 -{[c <= b]}-> N24
    main::a#1 >= 100;
    ((unsigned char)(((signed int)main::b#3) + 1)) <= main::b#3;
line 17:    N24 -{Label: ERROR}-> N25
    main::a#1 >= 100;
    ((unsigned char)(((signed int)main::b#3) + 1)) <= main::b#3;

Concrete Trace

none:   N1 -{INIT GLOBAL VARS}-> N27
line 1: N27 -{void __VERIFIER_error();}-> N28
line 2: N28 -{unsigned char __VERIFIER_nondet_uchar();}-> N29
lines 4-19: N29 -{int main();}-> N30
none:   N30 -{Function start dummy edge}-> N2
line 5: N2 -{unsigned char a;}-> N3
line 5: N3 -{a = __VERIFIER_nondet_uchar();}-> N4
    a == 104U;
line 6: N4 -{unsigned char b;}-> N5
line 6: N5 -{b = __VERIFIER_nondet_uchar();}-> N6
    b == 255U;
line 7: N6 -{unsigned char c = b + 1;}-> N7
    c == 0U;
line 8: N7 -{unsigned char i = 0;}-> N8
    a == 104U;
    c == 0U;
    i == 0U;
    b == 255U;
line 9: N8 -{while}-> N9
line 9: N9 -{[!(a < 100)]}-> N11
    a == 104U;
    a == 104U;
line 16:    N11 -{[c <= b]}-> N24
    c == 0U;
    b == 255U;
    c == 0U;
b == 255U;
line 17:    N24 -{Label: ERROR}-> N25

Experimental Data

(to top)

The experimental data was produced on machines with:

Each experiment run was limited to:

We used 5590 tasks with a reach-safety property from the sv-benchmarks in revision svcomp18. To run CPA-SymExec, we used CPAchecker in version cpachecker-1.7.7-ase18-symExec. Symbiotic was used in version SPIN-18, and Klee was used in revision 86e6f6f8. To run the experiments in a reproducible manner, we used benchexec in version 1.16.

Results

In number of solved tasks, our experiments gave the following results:

CPA-SymExec (no CEGAR) CPA-SymExec (CEGAR) Klee Symbiotic
Correct true 196 2137 446 1201
Correct false 590 545 899 848
Incorrect true 1 0 6 3
Incorrect false 22 22 33 9

Data

All experimental data can be found in folder data_raw of our artifact.
A comparison table of our results can be found here (csv).

Tool HTML CSV Logfiles Raw Results
CPA-SymExec without CEGAR HTML CSV Logfiles XML
CPA-SymExec with CEGAR HTML CSV Logfiles XML
Klee HTML CSV Logfiles XML
Symbiotic HTML CSV Logfiles XML

Visual Analysis

(to top)

Tool Comparison

The following quantile plot compares CPA-SymExec, CPA-SymExec without the use of CEGAR, Klee and Symbiotic with each other.

The x-axis describes the n-th fastest task that each individual tool could solve: point 1 on the x-axis describes for each tool the task that took the least time for that tool, point 2 the task that took the second-least time, and so on.
The y-axis describes the CPU time (in seconds) that analysis took: point ~(2000, 400) for Symbiotic means that the 2000-th fastest task took 400 seconds of CPU time for analysis.

With this, it is possible to see how fast a tool is, and how many tasks it can solve---the further to the right and the further to the bottom, the more tasks the tool could solve and the faster it could solve these.

It is clearly visible that, overall, CPA-SymExec (with CEGAR) is competitive with Klee and Symbiotic both in the number of tasks it can solve, and the time it requires to do so. But CPA-SymExec, Klee and Symbiotic have different capabilities.

The following plots show the CPU time (in seconds) that CPA-SymExec (x-axis) and Klee (y-axis) require for each task. Tasks that contain a feasible call to __VERIFIER_error (false property) are represented by red crosses; tasks that contain no call to __VERIFIER_error (true property) are represented by green circles. The numbers in parantheses below the caption show how many tasks of both properties exist. The plot only shows tasks for which either a correct solution was found or a timeout occurred - tasks for which one of the tools ran into an error are not shown. If a data point is above the line through origin, it means that CPA-SymExec was faster solving that corresponding task. If a data point is below the line through origin, it means that Klee was faster solving that task. The plot also contains supporting lines (in grey) for x * 10 and x * 0.1, to mark the area at which one tool is ten times faster than the other. Tasks close to 1000 on the x-axis depict an analysis timeout for CPA-SymExec, tasks close to 1000 on the y-axis depict an analysis timeout for Klee. Tasks close to point (1000, 1000) can't be solved by either tool.

The plot shows that, for many tasks that both tools can solve, Klee is significantly faster than CPA-SymExec (points below x * 0.1 line) but that there are lots of tasks that only one of the tools can solve, but not the other. (points at upper and right border, close to 1000 seconds).

The following plots give a more detailed comparison, based on the different benchmark categories:

The following plots show the CPU time (in seconds) that CPA-SymExec (x-axis) and Symbiotic (y-axis) require for each task, followed by the detailed plots per category:

The same applies as for Klee: Symbiotic is faster for tasks that both can solve, but the tools can solve a lot of different tasks.

Description of Benchmark Set

(to top)

For our experiments, we used all tasks with a reach-safety property of the sv-benchmarks and without recursion. The benchmark set is categorized according to the features of the verification tasks - the table below gives a quick overview over the used set. The category LDV is special: its verification tasks are created from industrial-scale Linux modules, by the Linux Driver Verification team.

Category Tasks LOC C features
Sum Min Max Avg Median
Arrays 167 7275 14 1161 44 36
False 44 1575 15 57 36 37
True 123 5700 14 1161 46 36
BitVectors 50 10511 13 696 210 39
False 14 2236 13 636 160 32
True 36 8275 15 696 230 47
ControlFlow 94 183904 94 22300 1956 1645
False 42 83038 220 10835 1977 1695
True 52 100866 94 22300 1940 1057
ECA 1149 29685918 344 185053 25836 4269
False 411 11948617 566 185053 29072 4827
True 738 17737301 344 185053 24034 2590
Floats 172 47518 9 1122 276 37
False 31 970 15 154 31 31
True 141 46548 9 1122 330 50
Heap 181 142378 11 4605 787 649
False 71 53834 19 4605 758 661
True 110 88544 11 4576 805 503
Loops 163 10026 14 1647 62 25
False 52 4035 14 1647 78 23
True 111 5991 14 476 54 26
ProductLines 597 1160305 838 3789 1944 1014
False 265 620859 847 3789 2343 2951
True 332 539446 838 3693 1625 979
Sequentialized 273 580463 330 18239 2126 1098
False 170 325198 331 15979 1913 974
True 103 255265 330 18239 2478 1223
Systems_DeviceDriversLinux64 2744 40335619 339 227732 14700 9432
False 355 6116255 1389 85772 17229 13420
True 2389 34219364 339 227732 14324 8288
Total 5590 72163917 9 227732 12909 3522
False 1455 19156617 13 185053 13166 3000
True 4135 53007300 9 227732 12819 3967

Tutorial: Using CPAchecker with Symbolic Execution

(to top)

! CPAchecker can be used without setup from its web client !

In the following, we assume that you use the artifact to run the tools, but all command lines can be transferred to the web client.

For a quick introduction, watch our tutorial video (~4 minutes).

This section gives an introduction into the usage of CPA-SymExec from the command line and explains how to run the experiments cited in our paper.

For more information about development in CPAchecker, see the CPAchecker documentation.

Requirements

We assume that you have:

The system provided by our SoSy-Lab Virtual Machine already meets these requirements.

If you don't have benchexec installed yet, you can install it from the deb-package shipped with the artifact. To do so, run from the artifact's root directory:

dpkg -i debs/python3-tempita_0.5.2-1build1_all.deb
dpkg -i benchexec_1.14-1_all.deb

For more information on benchexec, see the documentation.

Running CPA-SymExec on an Example Program

Verification / Falsification

To run CPA-SymExec on the example program exampleProgram.c to check whether the ERROR-label is reachable, execute from directory cpachecker-1.7, on the command line:

scripts/cpa.sh -symbolicExecution-Cegar ../exampleProgram.c

CPA-SymExec will find a target path and report the verdict FALSE (i.e., it is false that the ERROR-label is unreachable). CPA-SymExec will also create multiple files in the directory output/. The HTML report Counterexample.2.html gives a good overview over most of these files. We suggest to use Chrome or Firefox to open the report. The 2 in its name implies that a first target path (i.e., counterexample) was found that was not feasible.

Test-case Generation

To use CPA-SymExec to generate tests for condition coverage on the example program exampleProgram.c, execute, from directory cpachecker-1.7:

scripts/cpa.sh -testCaseGeneration-symbolicExecution-Cegar ../exampleProgram.c

The generated test cases will be created as output/Test.*.harness.c.

To execute a test, compile it together with the original program and run the product. For example, to compile the test case for test 0 into a executable file testcase, run, from directory cpachecker-1.7:

gcc -o testcase output/Test0.harness.c ../exampleProgram.c

Then, to execute, run ./testcase. (Since the program does not create any output, no behavior will be visible.)

Running the Benchmarks

Running Benchmarks for CPAchecker

To repeat the experiments for CPAchecker, run, from directory cpachecker-1.7:

benchexec --container ../benchmark_defs/cpachecker.xml

Logfiles and the result XMLs will be in cpachecker-1.7/results/.

Running Benchmarks for Klee and Symbiotic

Klee is run from the existing Symbiotic wrapper script, but without any Symbiotic-specific modifications. Thanks to this, the experiments for Klee and Symbiotic use the exact same Klee and LLVM version.

To repeat the experiments for both Klee and Symbiotic, run, from directory symbiotic:

benchexec --container ../benchmark-defs/symbiotic.xml

Logfiles and the result XMLs will be in symbiotic/results/. Result files for Klee are named symbiotic.DATE.results.klee.*.xml, result files for Symbiotic are named symbiotic.DATE.results.symbiotic.*.xml.

Note: If you only want to run experiments for Klee, run, from directory symbiotic: bash benchexec --container -r klee ../benchmark-defs/symbiotic.xml If you only want to run experiments for Symbiotic, run, from directory symbiotic: bash benchexec --container -r symbiotic ../benchmark-defs/symbiotic.xml

Creating Result Tables

To create HTML and CSV-tables for the results, run table-generator RESULT-FILES on the corresponding files. For example, to create tables for all three tools, run:

table-generator \
        cpachecker-1.7/results/cpachecker.????-??-??_????.results.symbolicExecution-Cegar.xml.bz2  \
        symbiotic/results/symbiotic.????-??-??_????.results.klee.xml.bz2 \
        symbiotic/results/symbiotic.????-??-??_????.results.symbiotic.xml.bz2

The program will tell you the name of the generated files on the console output.

You can also create tables for the existing data of our experiments. To do so, run:

table-generator \
        data_raw/cpachecker.????-??-??_????.results.symbolicExecution-Cegar.xml.bz2  \
        data_raw/symbiotic.????-??-??_????.results.klee.xml.bz2 \
        data_raw/symbiotic.????-??-??_????.results.symbiotic.xml.bz2

The program will, again, tell you the name of the generated files on the console output.