On-The-Fly Decomposition
of Specifications
in Software Model Checking

—Replication Package—

Sven Apel, Dirk Beyer, Vitaly Mordan, Vadim Mutilin, and Andreas Stahlbauer

Abstract

Software model checking has gained considerable momentum in the last decade. Major breakthroughs have increased its efficiency and effectiveness considerably such that this technology is now applicable to industrial-scale software. However, verifying the full formal specification of a software module is still considered too complex, and in practice, sets of properties are verified one by one in isolation.

We propose an approach that takes the full formal specification as input and verifies all properties simultaneously in one verification run. Our verification algorithm monitors itself and detects situations for that the full set of properties is too complex. In such cases, we perform an automatic decomposition of the full set of properties into smaller sets, and the verification is seamlessly continued on the smaller sets. To avoid state-space explosion for large sets of properties, we introduce on-the-fly property weaving. Properties get weaved on the fly, along with the analysis; which properties to weave, or check, can be decided dynamically during the verification process.

We perform an extensive evaluation based on verification tasks that were derived from 4336 Linux kernel modules, and a set of properties that define the correct usage of the Linux API. Checking several properties simultaneously can lead to a significant increase of the efficiency, due to the fact that abstract models share many parts among different properties.

Supplementary Data

We performed an extensive experimental evaluation to illustrate the potential and feasibility of our approach. The central parts of our replication package can be found here:

Artifact	Download
Full set of 4336 Linux 4.0-rc1 kernel modules.	Archive (compressed: 177 MB)
Reduced set of 250 randomly chosen Linux 4.0-rc1 kernel modules with at least two relevant properties.	Archive (compressed: 15 MB)
Set of properties (for on-the-fly weaving with CPAchecker).	Archive (compressed: 7 KB)
Full table with results used to answer RQ1.1 and RQ1.2.	HTML
Relevance of properties	HTML
Benchmark definition files (XML)	Archive (compressed: 8 KB)
CPAchecker in revision 21027	Archive (compressed: 71 MB)
CPAchecker in revision 21027, all benchmark definition files, and all verification tasks.	Archive (compressed: 292 MB)

Specification

Property	Description
LDV_08_1a	Each module that was referenced with module_get must be released by module_put. More Information on kernel.org.
LDV_10_1a	Each memory allocation that gets performed in the context of an interrupt must use the flag GFP_ATOMIC.
LDV_32_1a	The same mutex must not be acquired or released twice in the same process.
LDV_43_1a	Each memory allocation must use the flag GFP_ATOMIC if a spinlock is held.
LDV_68_1a	All resources that were allocated with usb_alloc_urb must be released with usb_free_urb.
LDV_68_1b	Each DMA-consistent buffer that was allocated with usb_alloc_coherent must be released by calling usb_free_coherent.
LDV_77_1a	Each memory allocation in a code region with an active mutex must be peformed with the flag GFP_NOIO.
LDV_101_1a	All structs that were obtained with blk_make_request must be released by calling blk_put_request afterwards. More Information on kernel.org.
LDV_106_1a	The modules gadget, char, and class that were registered with usb_gadget_probe_driver, register_chrdev, and class_register must be unregistered by calling usb_gadget_unregister_driver, unregister_chrdev and class_unregister correspondingly in reverse order of the registration.
LDV_118_1a	Reader-writer spinlocks must be used in the correct order.
LDV_129_1a	An offset argument of a find_bit function must not be greater than the size of the corresponding array.
LDV_132_1a	Each device that was allocated by by usb_get_dev must be released with usb_put_dev.
LDV_134_1a	The probe functions must return a non-zero value in case of a failed call to register_netdev or usb_register.
LDV_147_1a	RCU pointer/list update operations must not be used inside RCU read-side critical sections. More Information on kernel.org.

Replicating the Experiments

Prerequisites (Virtual Machine)

We have configured a virtual machine (based on an Ubuntu 16.04) which includes everything that is needed for replicating our experiments. The machine can be downloaded here. Both the username and the password is "fse" (without the quotes). The contents of the replication package can be found in the $BASE directory /home/fse/fse16-spec-decomposition.

Important: Due to the limitations of the virtual machine (8 GB of RAM, 1 CPU core), the virtual machine ships with benchmark-definition files that specify lower resource requirements.

Prerequisites (Manually)

In the following we will explain how our experiments can be replicated. A detailed description of our benchmarking configuration can be found in the paper. We presume that you have a machine that meets (at least) the following requirements (other configurations could work as well):

Desktop operating system based on Linux 3.x
OpenJDK 1.7 (or compatible)
32 GiB of RAM (less RAM would work in many cases; you would have to adjust some of the benchmark definition files manually)

First of all, please choose the root directory $BASE to which we extract all the data and programs that are necessary to replicate our experiments. This can be an arbitrary directory on your file system that can be written.

Now you can retrieve a copy of our replication package and extract its contents to $BASE: Download the archive (compressed: 292 MB) that includes all files for re-running our experiments, and extract them:

wget http://www.sosy-lab.org/~dbeyer/spec-decomposition/CPAchecker-package.zip
unzip CPAchecker-package.zip
export BASE=`pwd`

The directory sub tree of $BASE should now contain, at least, following elements:

  .
  └── CPAchecker-package
      ├── config
      ├── doc
      ├── experiments
      │   ├── properties
      │   ├── tasks-main0
      │   └── tasks-main0-atleast2-rand250
      ├── lib
      └── scripts

Single Linux Kernel Module

We will now illustrate how to replicate our experiments. Instead of describing the process for the full set of kernel modules tasks, we will discuss the process using the module drivers-isdn-gigaset-bas_gigaset-ko_main0.c

cd $BASE/CPAchecker-package
./scripts/cpa.sh -noout -heap 10000M -skipRecursion -disable-java-assertions \
    -mpa-alg-predicate-abelf \
    -setprop analysis.mpa.partition.time.cpu=900s \
    -setprop analysis.mpa.partition.operator=AllThenNoneOperator \
    -spec ./experiments/properties/LDV_134_1a_encode.spc \
    -spec ./experiments/properties/LDV_43_1a_encode.spc \
    -spec ./experiments/properties/LDV_68_1a_encode.spc \
    ./experiments/tasks-main0/drivers-isdn-gigaset-bas_gigaset-ko_main0.c

The output of the command should look similar to the following:

Running CPAchecker with Java heap of size 10000M.
Using the following resource limits: CPU-time limit of 900s (ResourceLimitChecker.fromConfiguration, INFO)

CPAchecker 1.5-svn (Java HotSpot(TM) 64-Bit Server VM 1.8.0_66) started (CPAchecker.run, INFO)

[...]

Using predicate analysis with SMTInterpol 2.1-224-gfd408f2-comp and JFactory 1.21. (PredicateCPA:PredicateCPA., INFO)

Using refinement for predicate analysis with PredicateAbstractionRefinementStrategy strategy.
    (PredicateCPA:PredicateCPARefiner., INFO)

[...]

Starting analysis ... (CPAchecker.runAlgorithm, INFO)

Checking 3 properties. (MultiPropertyAnalysis.run, INFO)

New partitioning with 1 partitions. (MultiPropertyAnalysis.partition, WARNING)

Partition 1 with 3 elements: [LDV_68_1a_encode.spc, LDV_43_1a_encode.spc,
    LDV_134_1a_encode.spc] (MultiPropertyAnalysis.partition, WARNING)

1 states in reached. (MultiPropertyAnalysis.initReached, WARNING)

1 states in waitlist. (MultiPropertyAnalysis.initReached, WARNING)

Waitlist with 3 active (0 inactive) properties. (MultiPropertyAnalysis.initReached, WARNING)

[...]

Fixpoint with 119112 states reached for: [LDV_43_1a_encode.spc, LDV_134_1a_encode.spc].
    0 properties remain to be checked. (MultiPropertyAnalysis.run, INFO)

Multi-property analysis terminated: 1 violated, 2 satisfied, 0 unknown (MultiPropertyAnalysis.run, WARNING)

Stopping analysis ... (CPAchecker.runAlgorithm, INFO)

Verification result: FALSE. Property violation (LDV_68_1a_encode.spc) found by chosen configuration.
    Number of considered properties: 3
    Number of violated properties: 1
    Number of unknown properties: 0
    Number of satisfied properties: 2
    Number of relevant properties: 3
    Status by property:
        Property LDV_68_1a_encode.spc: FALSE
        Property LDV_134_1a_encode.spc: TRUE
        Property LDV_43_1a_encode.spc: TRUE
More details about the verification run can be found in the directory "./output".

Characteristics of Java and the JVM

Since CPAchecker is written in Java, its performance depends on the JVM and its configuration. In cases where we compare several launches of CPAchecker, or multiple launches of the same code within CPAchecker, we have to consider the behaviour of the just-in-time compiler (JIT). In order to reduce the effect of the JIT as much as possible, we forced the JVM for some experiments to compile most of the byte code already during the startup. Starting CPAchecker with the argument -compile enables this behaviour by configuring the JVM with the arguments -Xcomp -XX:CompileThreshold=100.

Different Decomposition Strategies

The parameter analysis.mpa.partition.operator defines the decomposition strategy. In the following, we provide the configuration options for the strategies that were presented in the paper:

Strategy S₀: No Specification Decomposition

analysis.mpa.partition.operator=AllThenNoneOperator
analysis.mpa.partition.time.cpu=12600s

Strategy S₁: All Simultaneously, Remaining Separately

analysis.mpa.partition.operator=AllThenSepOperator
analysis.mpa.partition.time.cpu=900

Strategy S₂: All Simultaneously+Refinement Limit, Remaining Separately

analysis.mpa.partition.operator=AllThenSepOperator
analysis.mpa.partition.time.cpu=900
analysis.mpa.budget.limit.refinementsTimesMore=2
analysis.mpa.budget.limit.numRefinements=10

Strategy S₃: All Simultaneously, Irrelevant, Relevant Separately

analysis.mpa.partition.operator=AllThenIrrelevanceThenRelevantSepOperator
analysis.mpa.partition.time.cpu=900
analysis.mpa.partition.first.time.cpu=900
analysis.mpa.partition.second.time.cpu=900

Strategy S₄: All Simultaneously+Refinement Limit, Irrelevant, Relevant Separately

analysis.mpa.partition.operator=AllThenIrrelevanceThenRelevantSepOperator
analysis.mpa.partition.time.cpu=900
analysis.mpa.partition.first.time.cpu=900
analysis.mpa.partition.second.time.cpu=900
analysis.mpa.budget.limit.refinementsTimesMore=2
analysis.mpa.budget.limit.numRefinements=10

Running the Benchmark Suite

Our CPAchecker distribution includes a copy of BenchExec, a framework for reliable benchmarking and resource measurement. The script $BASE/CPAchecker-package/scripts/benchmark.py can to be started with a benchmark definition file as an argument; this will start a defined set of verification runs on the local machine.

A benchmark definition file specifies (1) which verification tool to use, (2) the configurations of the tool to benchmark, and the set of verification tasks, that is, (3) the set of properties to verify, and (4) the set of programs to check. As part of our replication package, we provide several of benchmark definition files (ZIP archive):

Benchmark Definition File	Purpose	Download
`bench-relevance-all.xml`	For each of the 4336 Linux kernel modules, identify which properties are relevant. We get the `Number of relevant properties` for each pair of kernel module and property, that is, the property is relevant if the value is equal to 1, and it is irrelevant if it equals 0.	XML
`bench-baseline.xml`	Performance of traditional model checking runs where only one property gets verified in one run of the model checker.	XML
`bench-rq1.xml`	Multi-property verification runs that are configured to check all properties in one partition, that is, we use the decomposition strategy S₀ on 250 kernel modules for that we check all 14 properties.	XML
`bench-rq2.xml`	Evaluation of the decomposition strategies S₁ – S₄ on the 250 kernel modules for that we check all 14 properties.	XML
`bench-rq3-baseline.xml`	Performance of traditional model checking runs where only one property gets verified in one run of the model checker; we now do NOT force the JVM to compile most of the bytecode during its startup.	XML
`bench-rq3a.xml`	Experiment for that we do not force the JVM to compile most of the bytecode during its startup; on the 250 kernel modules for that we check all 14 properties.	XML
`bench-rq3b.xml`	Experiment for that we do not force the JVM to compile most of the bytecode during its startup. To confirm our results and increase their validity, this experiment is on the larger set of verification tasks, that is, we check the 14 properties of all 4336 Linux kernel modules.	XML

If you want to run, for example, the experiments that are defined in bench-baseline.xml, you can do this by running benchmark.py with the benchmark definition file as argument:

cd $BASE/CPAchecker-package
./scripts/benchmark.py ./experiments/bench-baseline.xml

Before you can use CPAcheckers benchmarking infrastructure, and BenchExec, a number of system requirements have to be satisfied: BenchExec System Requirements. It might be sufficient to setup the local Cgroups with the following commands:

sudo mount -t cgroup cgroup /sys/fs/cgroup
sudo chmod o+wt,g+w /sys/fs/cgroup/

Important: The benchmarking script might complain that there is not sufficient memory installed:

Memory banks [0] do not have enough memory for one run, only 16704045056 bytes available.

The benchmark definition files specify the requirements on the hardware; you might have to adjust them on your machine(s):

<benchmark tool="cpachecker" timelimit="1000" hardtimelimit="1100" memlimit="30GB" cpuCores="4">
  <option name="-heap">26000M</option>

On-The-Fly Decompositionof Specificationsin Software Model Checking