Correctness Witnesses:
Exchanging Verification Results between Verifiers

Dirk Beyer, Matthias Dangl, Daniel Dietsch, and Matthias Heizmann

Abstract

Standard verification tools provide a counterexample to witness a specification violation, and, since a few years, such a witness can be validated by an independent validator using an exchangeable witness format. This way, information about the violation can be shared across verification tools and the user can use standard tools to visualize and explore witnesses. This technique is not yet established for the correctness case, where a program fulfills a specification. Even for simple programs, it is often difficult for users to comprehend why a given program is correct, and there is no way to independently check the verification result. We close this gap by complementing our earlier work on violation witnesses with correctness witnesses. While we use an extension of the established common exchange format for violation witnesses, the techniques for producing and validating correctness witnesses are completely different. The overall goal to make proofs available to engineers is probably as old as programming itself, and proof-carrying code was proposed two decades ago — our goal is to make it practical: We consider witnesses as first-class exchangeable objects, stored independently from the source code and checked independently from the verifier that produced them, respecting the important principle of separation of concerns. At any time, the invariants from the correctness witness can be used to reconstruct a correctness proof to establish trust. We extended two state-of-the-art verifiers, CPAchecker and UltimateAutomizer, to produce and validate witnesses, and report that the approach is promising on a large set of verification tasks.

A preprint of our published article is available for you to download.

Supplementary Data

We performed an extensive experimental evaluation to show the potential and feasibility of our approach. We used CPAchecker and UltimateAutomizer to verify tasks from the SV-COMP 2016 benchmark set and applied both tools to validate their own witnesses. For CPAchecker, we used a configuration of k-induction with auxiliary invariant generation. For UltimateAutomizer, we used the standard configuration. We also made an experiment where we re-validated the witnesses produced by the CPAchecker-based validator, which in this case takes the role of a testifier, and therefore, we call the producer 'CPAtestifier'. The following table provides links to the raw results of the experimental evaluation.

Verifier (Witness Producer) Witness Validator (Consumer)
CPAchecker (log files) CPAchecker HTML Table CSV Table Validation Log Files
CPAtestifier (log files) CPAchecker HTML Table CSV Table Validation Log Files
CPAchecker (log files) UltimateAutomizer HTML Table CSV Table Validation Log Files
UltimateAutomizer (log files) CPAchecker HTML Table CSV Table Validation Log Files
UltimateAutomizer (log files) UltimateAutomizer HTML Table CSV Table Validation Log Files

The witnesses produced by the tools are contained in the log file archives of CPAchecker and UltimateAutomizer.

We used version 1.9 of BenchExec to run our experiments. The evaluated tools are available for download:

We also provide a virtual machine image set up for producing and validating witnesses with both tools. Log into the virtual machine as user 'fse' with password 'fse'.

Tutorial

In the following we will explain how our experimental results can be reproduced. A detailed description of our benchmarking configuration can be found in the paper. We presume that you have a machine that meets (at least) the following requirements (other configurations could work as well):

We provide a virtual machine image that is already set up to fulfill all requirements. If you use this virtual machine image by logging in as user 'fse' with password 'fse', you can skip ahead to producing witnesses.

First of all, please choose the root directory $BASE to which we extract all the data and programs that are necessary to reproduce our results. This can be an arbitrary directory on your file system that can be written.

Now you can retrieve copies of the data and programs and extract them to the directory $BASE:

  1. Obtain a copy of the SV-COMP'16 verification tasks:
    git clone https://github.com/dbeyer/sv-benchmarks.git svcomp16
    cd svcomp16
    git checkout tags/svcomp16
    cd ..
  2. Obtain a copy of CPAchecker in version cpachecker-1.6.8-fse16-correctnessWitnesses, for example by downloading and unpacking the archived version of CPAchecker.
  3. Obtain a copy of UltimateAutomizer in revision c3312191, for example by downloading and extracting the archived version of UltimateAutomizer.

The directory subtree of $BASE should now look as follows:

./
├── CPAchecker-1.6.1-svn-22870-unix/
├── svcomp16/
├── UAutomizer-linux/

As a running example, we use the verification task svcomp16/c/loop-acceleration/multivar_true-unreach-call1.i.

Producing Correctness Witnesses with CPAchecker

To produce a witness for the example task with CPAchecker, simply execute the following commands:

  cd CPAchecker-1.6.1-svn-22870-unix/
  scripts/cpa.sh \
    -correctness-witnesses-k-induction \
    -spec ../svcomp16/c/loop-acceleration/ALL.prp \
    ../svcomp16/c/loop-acceleration/multivar_true-unreach-call1.i
  cd ..

The output of CPAchecker should be similar to the following listing:

Running CPAchecker with default heap size (1200M). Specify a larger value with -heap if you have more RAM.
Running CPAchecker with default stack size (1024k). Specify a larger value with -stack if needed.
CPAchecker 1.6.1-svn 22870M (Java HotSpot(TM) 64-Bit Server VM 1.8.0_101) started (CPAchecker.run, INFO)

The following configuration options were specified but are not used:
 properties 
 (CPAchecker.printConfigurationWarnings, WARNING)

Starting analysis ... (CPAchecker.runAlgorithm, INFO)

Using predicate analysis with MathSAT5 version 5.3.12 (fd820dac73f2) (Aug  8 2016 11:12:19, gmp 5.1.3, gcc 5.3.0, 64-bit).
  (Parallel analysis 1:PredicateCPA:PredicateCPA., INFO)

Using predicate analysis with MathSAT5 version 5.3.12 (fd820dac73f2) (Aug  8 2016 11:12:19, gmp 5.1.3, gcc 5.3.0, 64-bit).
  (Parallel analysis 1:InductionStepCase:PredicateCPA:PredicateCPA., INFO)

config/invariantGeneration-no-out-no-typeinfo.properties finished successfully.
  (ParallelAlgorithm.handleFutureResults, INFO)

Analysis was terminated (Parallel analysis 1:ParallelAlgorithm.runParallelAnalysis, INFO)

Stopping analysis ... (CPAchecker.runAlgorithm, INFO)

Verification result: TRUE. No property violation found by chosen configuration.
More details about the verification run can be found in the directory "./output".
Run scripts/report-generator.py to show graphical report.

You will find the correctness witness produced by CPAchecker for the example task
at CPAchecker-1.6.1-svn-22870-unix/output/correctness-witness.graphml.

Producing Correctness Witnesses with UltimateAutomizer

To produce a witness for the example task with UltimateAutomizer, simply execute the following commands:

  cd UAutomizer-linux/
  ./Ultimate.py \
    ../svcomp16/c/loop-acceleration/ALL.prp \
    32bit precise \
    ../svcomp16/c/loop-acceleration/multivar_true-unreach-call1.i
  cd ..

The output of UltimateAutomizer should look similar to the following listing:

Checking for ERROR reachability
Using default analysis
Version c3312191
Calling Ultimate with: ./Ultimate --console "./Automizer.xml"
  "../svcomp16/c/loop-acceleration/multivar_true-unreach-call1.i"
  "--settings" "/hard-disk/dangl/tmp/UAutomizer-linux/svcomp-Reach-32bit-Automizer_Default.epf"
...............................................................................................
...............................................................................................
.....................................................................
Execution finished normally
Writing output log to file Ultimate.log
Result:
TRUE

You will find the correctness witness produced by UltimateAutomizer for the example task
at UAutomizer-linux/witness.graphml.

Validating Correctness Witnesses with CPAchecker

For the validation, we assume that one of the previously obtained witnesses for the example task has been moved to $BASE/correctness-witness.graphml. To validate the correctness witness with CPAchecker, simply execute the following commands:

  cd CPAchecker-1.6.1-svn-22870-unix/
  scripts/cpa.sh \
    -correctness-witness-validation \
    -setprop invariantGeneration.kInduction.invariantsAutomatonFile=../correctness-witness.graphml \
    -spec ../svcomp16/c/loop-acceleration/ALL.prp \
    ../svcomp16/c/loop-acceleration/multivar_true-unreach-call1.i
  cd ..

If the witness is valid, the output of CPAchecker should end in the following lines:

Stopping analysis ... (CPAchecker.runAlgorithm, INFO)

Verification result: TRUE. No property violation found by chosen configuration.
More details about the verification run can be found in the directory "./output".

Because the CPAchecker-based validator is actually even a correctness-witness testifier, the validation will produce another (usually more abstract) correctness witness
in CPAchecker-1.6.1-svn-22870-unix/output/correctness-witness.graphml.

Validating Correctness Witnesses with UltimateAutomizer

For the validation, we assume that one of the previously obtained witnesses for the example task has been moved to $BASE/correctness-witness.graphml. To validate the correctness witness with UltimateAutomizer, simply execute the following commands:

  cd UAutomizer-linux/
  ./Ultimate.py \
    ../svcomp16/c/loop-acceleration/ALL.prp \
    32bit precise \
    ../svcomp16/c/loop-acceleration/multivar_true-unreach-call1.i \
    ../correctness-witness.graphml
  cd ..

A successful validation will result in an output similar to the following:

Checking for ERROR reachability
Using default analysis
Version c3312191
Calling Ultimate with: ./Ultimate --console
  "/hard-disk/dangl/tmp/UAutomizer-linux/AutomizerWitnessValidation.xml"
  "../svcomp16/c/loop-acceleration/multivar_true-unreach-call1.i"
  "../correctness-witness.graphml"
  "--settings" "/hard-disk/dangl/tmp/UAutomizer-linux/svcomp-Reach-32bit-Automizer_Default.epf"
...............................................................................................
...............................................................................................
...............................................................................................
...............................................................................................
.............................................................................................
Execution finished normally
Writing output log to file Ultimate.log
Result:
TRUE

UltimateAutomizer currently supports correctness-witness validation, but not testification, so validating a correctness witness will not produce another witness.