We are hiring new student research assistants and tutors. Apply now!
We received 4 Contributor Projects for Google Summer of Code 2023

Teaching at LMU

Student Talks at LMU

Thesis talks are part of our Oberseminar. All students and staff are welcome to participate.

Teaching in Summer 2023



Training (Praktikum):

Teaching in Past Semesters

Information on courses taught in past semesters can be found on a separate page.

Regularly Offered Courses


For all seminars these rules apply.

Important Links

Projects, Bachelor and Master Theses

We offer several topics for theses at the Bachelor and Master level in our research projects (German).

For the MSc in computer science, it is also possible to do one of our proposed projects as practical training (Modulhandbuch, 2.1 P 1) for 6 or 12 ECTS.

When writing a thesis at our chair, follow our thesis guidelines and evaluation criteria and use these technical hints for writing. You can find more important information in our wiki.

Please contact us if you are interested in one of these topics or if you want to propose your own topic that is related to our chair. Please include your transcript in your request so that we can meet and identify suitable topics that fit your qualifications. You can also have a look at our collection of past student reports and theses to get an impression of what a thesis at our chair could look like.

Click on marked topics (▶) to show a more detailed description.
Currently unavailable or finished topics are also shown and may give inspiration for your own ideas.

Formal Verification and Testing

Available topics
Support witness format conversion in Btor2C [1, 2, 3, 4] (Mentors: Nian-Ze Lee, Po-Chun Chien)

Background: Btor2 is a bit-precise language that models word-level sequential circuits, and has been adopted by the Hardware Model Checking Competition. Btor2C is a hardware-to-software translator that converts circuits in Btor2 format to C programs. With Btor2C, one could analyze hardware designs with off-the-shelf software verifiers. If a software verifier finds an error in the program, it generates a violation witness, which records the execution path leading to that error. On the other hand, if a software verifier proves a program correct, it generates a correctness witness containing some program invariants that could reconstruct a correctness proof.

Goal: The goal of this project is to implement a translator that converts witnesses (in GraphML format) produced by software verifiers to Btor2 format, such that one could use hardware verifiers/simulators to validate them.


Requirements:
  • Ability to program in Python (and/or Java/C)
  • Basic knowledge of transition systems and SAT/SMT solving
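A first step toward such a converter could be extracting the assumptions recorded along the edges of a GraphML violation witness. The following Python sketch is only an illustration: the `assumption` data key follows the common witness-format convention, and the tiny inline witness is invented for this example; mapping the extracted assumptions back to Btor2 state bits would be the actual project work.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"

def extract_assumptions(graphml_text):
    """Collect the 'assumption' data entries along the edges of a
    GraphML violation witness, in document order."""
    root = ET.fromstring(graphml_text)
    assumptions = []
    for edge in root.iter(GRAPHML_NS + "edge"):
        for data in edge.findall(GRAPHML_NS + "data"):
            if data.get("key") == "assumption":
                assumptions.append(data.text.strip())
    return assumptions

# A minimal, hand-written witness fragment for illustration only.
example = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph edgedefault="directed">
    <node id="n0"/><node id="n1"/><node id="n2"/>
    <edge source="n0" target="n1"><data key="assumption">x == 1;</data></edge>
    <edge source="n1" target="n2"><data key="assumption">y == 1023;</data></edge>
  </graph>
</graphml>"""

print(extract_assumptions(example))  # ['x == 1;', 'y == 1023;']
```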

Support bit-precise integer representation in CPAchecker for hardware-verification tasks [1, 2] (Mentors: Nian-Ze Lee, Po-Chun Chien)

Background: Btor2 is a bit-precise language that models word-level sequential circuits, and has been adopted by the Hardware Model Checking Competition. Btor2C is a hardware-to-software translator that converts circuits in Btor2 format to C programs. With Btor2C, one could analyze hardware designs with off-the-shelf software verifiers, e.g. CPAchecker. In Btor2C, we encode Btor2 bit-vectors with C integers. A Btor2 bit-vector can have an arbitrary width, thus there might be some spare most-significant bits in the C-integer variable. These spare bits might complicate the analysis and harm the performance of software verifiers.

Goal: The goal of this project is to support bit-precise integer representation in CPAchecker specifically for these hardware-verification tasks. There are several possible ways to achieve this:

  • Define a customized C annotation or integer type such that CPAchecker can infer the precise bit-width
  • Implement a Btor2 frontend in CPAchecker


Requirements:
  • Ability to program in C and Java
  • Basic knowledge of transition systems and SAT/SMT solving
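The spare-bits problem described above can be illustrated with a small sketch: a Btor2 bit-vector of width 3 stored in a wider C integer must be masked after every operation, otherwise the spare most-significant bits carry values the verifier has to reason about.

```python
def bv_add(a, b, width):
    """Model Btor2 bit-vector addition of a given width inside a wider
    integer: the final mask discards the spare high bits that a plain
    C integer variable would otherwise carry along."""
    mask = (1 << width) - 1
    return (a + b) & mask

# A 3-bit counter wraps around at 8.
print(bv_add(0b111, 0b001, 3))  # 0
print(bv_add(0b011, 0b010, 3))  # 5
```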

Automatic Program Repair and Patch Generation [1, 2, 3] (Mentor: Matthias Kettl)

Verifiers are able to tell whether a given task is safe or faulty with respect to a given specification. However, the verdict alone does not provide enough information to immediately fix the bug. To mitigate this drawback, we want to apply automatic program repair to unsafe tasks: given a program that violates its specification, we want to synthesize a patch for the program such that the same faulty behavior can no longer be observed.

Tubes: Identifying Abstract Error Traces (Mentor: Matthias Kettl)

There might be several reasons for reaching a specific error location in a program. We want to identify all classes of inputs that trigger the same bug. They should be as abstract as possible but as precise as necessary. For a given program, we want to find out which restrictions on nondeterministic variables are necessary to reach an error location. For example, we might reach error location E1 if the nondeterministic variable x equals 1 and the nondeterministic variable y is less than 1024. Error location E2 might be reachable for x > 100 and y < 100. Otherwise, the program does not show faulty behavior. In our approach, we are interested in the most abstract restrictions on all inputs for reaching error locations instead of a single test vector like x = 1 & y = 1023 for E1 and/or x = 101 & y = 99 for E2.

Deductive Verification with CPAchecker [1, 2] (Mentor: Martin Spiessl)

CPAchecker is a framework for automatic software verification developed at our chair. As such, it has a broad technology stack that provides everything needed for verifying C programs. It should therefore be possible to extend the framework such that one can perform deductive proofs by adding the right annotations to the C program. Support for parsing the specification language ACSL was recently added to CPAchecker's frontend, and for solving the different verification conditions that arise from user-provided annotations, one could reuse the program analyses that are already part of CPAchecker.

Analysis of a real-world software project: What can be specified and analyzed?
Refinement Selection with Adjustable-Block Encoding [1, 2, 3] (Mentor: Philipp Wendler)
Currently assigned topics
Boost Symbolic Memory Graph-based Explicit Value Analysis with Numerical Abstract Reasoning (Mentor: Daniel Baier)

Explicit value analysis with symbolic memory graphs (SMGs) can analyze programs, including memory operations, quickly and efficiently. However, the value analysis often over-approximates values beyond an already encountered bound. The goal is to combine the existing SMG-based value analysis with the existing numerical abstract domain (APRON) using a composite analysis, reusing the already existing components. This is then implemented in CPAchecker and evaluated against other verification approaches on the SV-COMP benchmark set.

Reverse program synthesis for backward reachability analysis [1] (Mentors: Nian-Ze Lee, Po-Chun Chien)

In most verification tools, the reachability analysis searches for a path from the initial program location to the error location. The analysis can also be done backward, i.e., by trying to reach the initial location from the error location. The goal of this project is to implement a control-flow automaton (CFA) transformer in CPAchecker that reverses program executions by following the transitions backward. One could then apply existing analyses to verify the reversed CFA.


Requirements:
  • Ability to program in Java and understand C code
  • Basic knowledge of control-flow automata
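The edge-flipping part of such a transformer is straightforward to sketch if a CFA is represented as a list of labeled edges. Note that this toy sketch only reverses the graph structure; a real transformer in CPAchecker must also adjust the semantics of the operations on the edges (e.g., assignments become constraints on previous values).

```python
def reverse_cfa(edges, init, error):
    """Reverse every transition of a CFA given as (source, label, target)
    triples; the error location becomes the new entry point and the
    initial location the new target."""
    reversed_edges = [(tgt, label, src) for (src, label, tgt) in edges]
    return reversed_edges, error, init

# A tiny CFA: l0 --x = 0--> l1, a self-loop on l1, and an error edge.
cfa = [("l0", "x = 0", "l1"), ("l1", "x++", "l1"), ("l1", "x > 10", "ERR")]
redges, entry, target = reverse_cfa(cfa, "l0", "ERR")
print(entry, target)  # ERR l0
print(redges)         # [('l1', 'x = 0', 'l0'), ('l1', 'x++', 'l1'), ('ERR', 'x > 10', 'l1')]
```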

Improve array encoding in Btor2C [1, 2] (Mentors: Nian-Ze Lee, Po-Chun Chien)

Background: Btor2 is a bit-precise language that models word-level sequential circuits, and has been adopted by the Hardware Model Checking Competition. Btor2C is a hardware-to-software translator that converts circuits in Btor2 format to C programs. With Btor2C, one could analyze hardware designs with off-the-shelf software analyzers. Currently, on every write of an array, Btor2C creates a copy of it, even when unnecessary (see this issue). This might burden software analyzers and hinder their performance.

Goal: The goal of this project is to alleviate this situation and allocate a new array only when necessary. Additionally, we also want to implement a translator that converts Btor2 array tasks to pure bit-vector tasks by blasting the array operations (read and write) into a chain of ite operations.


Requirements:
  • Ability to program in C (and/or Python)
  • Basic knowledge of transition systems and SAT/SMT solving
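The ite-blasting idea can be sketched concretely: a read at index i over a chain of writes [(i1, v1), ..., (in, vn)] becomes the nested expression ite(i == in, vn, ... ite(i == i1, v1, base[i]) ...), so no array copy is ever materialized. A minimal Python model, evaluating the ite chain directly:

```python
def blast_read(writes, base, i):
    """Evaluate a read at index i over a chain of writes applied to a
    base array, as a nested ite chain; later writes shadow earlier ones."""
    result = base[i]  # innermost case: the unmodified base array
    for idx, val in writes:
        result = val if i == idx else result  # one ite per write
    return result

base = [0, 0, 0, 0]
writes = [(2, 7), (2, 9), (0, 5)]
print(blast_read(writes, base, 2))  # 9 (second write to index 2 wins)
print(blast_read(writes, base, 1))  # 0 (falls through to the base array)
```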

Implement backward bounded model checking in CPAchecker [1] (Mentors: Nian-Ze Lee, Po-Chun Chien, Henrik Wachowitz)

Bounded model checking (BMC) is a well-known software analysis technique that searches for execution paths that could lead from the initial program location to the error location. Backward BMC, on the contrary, follows the execution backward and searches for paths from the error location to the initial location. The goal of the project is to implement such a technique in CPAchecker.


Requirements:
  • Ability to program in Java and understand C code
  • Basic knowledge of SAT/SMT solving and weakest preconditions

Scaling Formal Verification: Parallel Analysis of Functions in CPAchecker [1] (Mentor: Matthias Kettl)

Verifying complex programs with respect to a given specification (e.g., 'check that function reach_error is never called') may be time-consuming. To make verifiers more applicable in industry, we want to tackle the problem with a divide-and-conquer approach. We decompose the verification task into multiple smaller tasks that are then verified in parallel. As soon as one of the analyses reports a violation, we abort all other analyses and report the violation. This will lead to many false alarms, since we do not know with which input parameters functions will be called. Return values of functions are ignored and interpreted as unknown values in a first step. From there we can continue in two directions:

  1. Analyze functions under a user defined condition (e.g., a user restricts the values of parameter x of function foo to the interval [0;100])
  2. We identify each function call automatically and try to infer all parameters.
The goals of this topic are:
  1. The verification verdict should be available faster (in terms of elapsed walltime).
  2. There should be no wrong proofs: in case a program is safe, the decomposition should be a sound over-approximation.
Finally, we will evaluate this approach on the SV-COMP benchmark set. This topic is inspired by the tool Infer.
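The abort-on-first-violation scheme can be sketched with standard Python concurrency primitives. The verdict table below is a stand-in for real analyses; in the actual project, each worker would run a CPAchecker analysis on one decomposed task (one function analyzed in isolation).

```python
import concurrent.futures

# Stand-in verdicts; invented for illustration only.
VERDICTS = {"foo": "safe", "bar": "violation", "baz": "safe"}

def analyze(function_name):
    """Pretend to verify one function in isolation."""
    return function_name, VERDICTS[function_name]

def verify_parallel(functions):
    """Verify the decomposed tasks concurrently; abort the remaining
    analyses as soon as one of them reports a violation."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(analyze, f) for f in functions]
        for fut in concurrent.futures.as_completed(futures):
            name, verdict = fut.result()
            if verdict == "violation":
                for other in futures:
                    other.cancel()  # abort analyses that have not started yet
                return name, verdict
    return None, "safe"

print(verify_parallel(["foo", "bar", "baz"]))  # ('bar', 'violation')
```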

Fault Injection: Generating a Benchmark Set for Automated Fault Repair [1, 2, 3] (Mentor: Matthias Kettl)

The SV-COMP benchmark set is one of the largest benchmark sets of C programs for software verification. For all programs in the set, we know whether the program fulfills certain properties, e.g., that the function 'reach_error' is never called. The benchmarks help to evaluate state-of-the-art verification tools and their effectiveness. For fault-localization techniques, there is no large benchmark set that provides sufficient metadata for each task to perform a large-scale evaluation. The idea is to create such tasks by manually crafting them, injecting faults into safe tasks, or identifying fixes for unsafe tasks.

Fault Localization for Distributed Block Summaries [1, 2] (Mentor: Matthias Kettl)

CPAchecker is a framework for automatic software verification developed at our chair. As such, it has a broad technology stack that provides everything needed for verifying C programs. CPAchecker is able to distribute the analysis of a single task to different workers by decomposing the input program into blocks. Each worker analyzes one block and communicates the results to all other workers via messages to establish the correct verification result. In case we find a violation, fault localization can help to explain the root cause of the error. Several fault-localization techniques already exist in CPAchecker. Unfortunately, some of them may take a long time, since they have to check 2^n possibilities in the worst case. By distributing fault localization to the relevant workers, 'n' shrinks, since each worker operates on a smaller block. We expect a performance boost for fault localization after distributing it.

Reconstruction of Violation and Correctness Witnesses for Distributed Block Summaries [1] (Mentor: Matthias Kettl)

CPAchecker is a framework for automatic software verification developed at our chair. As such, it has a broad technology stack that provides everything needed for verifying C programs. CPAchecker is able to distribute the analysis of a single task to different workers by decomposing the input program into blocks. Each worker analyzes one block and communicates the results to all other workers via messages to establish the correct verification result. In case we find a violation, we want to provide the user with a counterexample that proves the reachability of the violation. Currently, neither the counterexample nor invariants can be synthesized from the messages. However, witnesses are important for improving the trust in verification results.

Hardware-Speed-Independent Analysis Selection Heuristics (Mentor: Martin Spiessl)

Background: Current state-of-the-art verifiers like CPAchecker use some kind of time-based criterion for deciding when to switch to a different analysis. For example, in CPAchecker it might happen that a so-called value analysis is executed for 90 seconds and after that a switch to a different analysis is performed. These time boundaries are optimized to yield the best results in the Competition on Software Verification (SV-COMP), where a certain benchmark set (SV-Benchmarks) is used. SV-COMP is typically executed on our Apollon cluster, whose nodes each have an Intel Xeon E3-1230 v5 CPU, with a CPU-time limit of 900 seconds per verification task. This has implications for reproducibility on other hardware. If someone wants to replicate these experiments on newer (faster) or older (slower) hardware and uses the same 900-second time limit, the results might not be comparable. For example, on a machine that is tenfold slower than our cluster nodes, the switch after 90 seconds described above could stop the value analysis at a point where it has only analyzed 10% as much as on the Apollon nodes.
Goals: 1. One goal is to first prove the existence of the described effect by performing experiments on our benchmarking hardware. Other tools from SV-COMP can also be analysed to determine which tools have the same problem. 2. After that, one could look into ways to make the execution of CPAchecker more independent of time measures like CPU time or walltime. A simple approach could be to run a small micro-benchmark in the beginning to determine the hardware 'speed' and take this into account for time-based decisions (though this does not actually make the analysis independent of time measures and only reduces the effect). This can be done either in the tool or tool-independently as a step before the actual tool is executed. Another (more promising) idea is to introduce alternative metrics for analysis progress and use these to determine some sort of analysis 'fuel' that gets used up by the analysis and depends on factors like the number of analysis rounds, explored states, or similar. This also opens the way for potentially improving the current heuristics, because there might be other time-independent indicators that can better predict whether an analysis will be successful or whether switching the analysis will be beneficial.
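The 'fuel' idea can be sketched in a few lines: instead of '90 seconds of CPU time', an analysis receives a fixed budget that is paid per unit of progress (here, one unit per explored abstract state), so the switching point no longer depends on machine speed. The class and numbers below are purely illustrative.

```python
class FuelBudget:
    """Hardware-independent analysis budget: fuel is consumed by
    analysis progress, not by elapsed time."""

    def __init__(self, fuel):
        self.fuel = fuel

    def consume(self, units=1):
        """Pay for one unit of progress; False means: switch analysis."""
        self.fuel -= units
        return self.fuel > 0

budget = FuelBudget(fuel=1000)
explored = 0
while budget.consume():
    explored += 1  # explore one more abstract state
print(explored)  # 999 states explored before the switch is triggered
```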

Genetic Programming in Software Verification (Mentor: Martin Spiessl)

The idea is to generate verification tasks via genetic programming, i.e., to repeatedly do the following on a set of programs:
1. mutation of the programs' ASTs
2. selection of 'fit' programs by running the generated programs against some verifier and choosing a fitness metric that is determined by the verification results
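The mutate-then-select loop can be sketched on a toy scale, instantiated here with a fault-localization-style fitness (the bug must remain, shorter is fitter). Programs are modeled as plain lists of statements and the verifier run is replaced by a string check; both are stand-ins for the real AST mutation and verifier invocation.

```python
import random

def mutate(program):
    """Mutation step: drop one randomly chosen statement."""
    if len(program) <= 1:
        return program
    i = random.randrange(len(program))
    return program[:i] + program[i + 1:]

def fitness(program):
    """Selection criterion: the bug must still be present (stand-in for
    re-running the verifier); among buggy programs, shorter is fitter."""
    if "BUG" not in program:
        return -10**9
    return -len(program)

random.seed(0)
population = [["a = 1", "b = 2", "BUG", "c = 3", "d = 4"] for _ in range(10)]
for _ in range(50):  # generations of mutate-then-select, with elitism
    mutants = [mutate(p) for p in population]
    population = sorted(population + mutants, key=fitness, reverse=True)[:10]
best = population[0]
print(best)  # the fittest program keeps the bug and sheds unrelated statements
```

Keeping the unmutated parents in the selection pool (elitism) guarantees that the best fitness never degrades between generations.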

This can then be used for a large number of interesting applications:
One very simple application is fault localization: here, the allowed mutations are those that remove program statements, and the fitness is determined by (a) the bug still being present and (b) the length of the program: the shorter, the fitter.
Another application would be to generate structurally simple programs that are still hard to solve, or to generate programs that are easy to solve for verifier/approach A but hard to solve for verifier/approach B.

In the end, the challenging part here (and the reason this has probably not really been done for software verification yet) is to keep the time per round trip low enough to get enough iterations. For that, the idea would be to use bounded model checking or to limit the analysis time. Other tactics like speculative execution or using some static method to predict the fitness might also be worth exploring.
From a technical point of view, the goal is to use BenchExec for executing the mutated programs each round (and thus reduce the walltime for each round).
The language used for implementation will be Python 3, which allows using pycparser as a frontend.

Verification of Timed Automata with BDDs [1] (Mentors: Prof. Dr. Dirk Beyer, Nico Weise)

Timed automata are a model of computation that was invented 30 years ago with the goal of modelling not only the sequence of events but also the times of their occurrence. Timed automata are automata with a finite set of control states and clock variables whose values continuously increase over time. The objective of this thesis topic is to construct a basic reachability analysis for timed models.

Consolidate pointer-aliasing analyses in CPAchecker [1] (Mentor: Philipp Wendler)

CPAchecker has several different implementations of pointer-aliasing analyses, and the current state of what they support is unclear. As a result, they are rarely used, and pointer aliasing can cause wrong results. The goal of this thesis is to create one strong CPA for pointer aliasing (either by merging the existing CPAs or by rewriting them) that supports all necessary use cases (fast and imprecise as well as more precise) and to migrate all existing uses of other pointer CPAs, such that the old code can be deleted (#358). Afterwards, it may be possible to add additional features (e.g., #742) or to extend other components of CPAchecker to make use of the new CPA (e.g., #527, #398).

Support for exceptions in the Java analysis of CPAchecker (Mentor: Philipp Wendler)
Non-termination analysis in CPAchecker (Mentor: Marie-Christine Jakobs)
User-friendly tactics for interactive proofs (Mentor: Gidon Ernst)
Symbolic Automata Learning for Unbounded Systems (Mentor: Gidon Ernst)
Case-studies in verifying Python programs (Mentor: Gidon Ernst)
Loop summarization for array programs (Mentor: Gidon Ernst)
Formal Verification of Interactive Visual Novels (Mentor: Gidon Ernst)
Invariant Generation through Sampling [1] (Mentor: Martin Spiessl)

Finding the right invariants is essential when trying to prove properties about programs with loops. In CPAchecker, we currently have a search-based analysis that generates candidate invariants. Another approach is to try to learn and continuously refine the invariant by training a classifier. This should work especially well if the necessary invariants are linear equalities, as this corresponds to the classification boundary of a linear classifier.
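The classifier idea can be sketched with a plain perceptron: sampled reachable states are the positive class, sampled error states the negative class, and the learned separating line is a candidate linear invariant. The toy samples below are invented; they are separated by x - y >= 0, so the perceptron is guaranteed to converge on them.

```python
def train_perceptron(positives, negatives, epochs=100):
    """Learn weights (w1, w2, b) with w1*x + w2*y + b > 0 on reachable
    states and <= 0 on bad states; the boundary w1*x + w2*y + b = 0 is
    a candidate linear invariant."""
    w1 = w2 = b = 0.0
    samples = [(p, 1) for p in positives] + [(n, -1) for n in negatives]
    for _ in range(epochs):
        for (x, y), label in samples:
            if label * (w1 * x + w2 * y + b) <= 0:  # misclassified: update
                w1 += label * x
                w2 += label * y
                b += label
    return w1, w2, b

# Invented samples from a toy loop where x - y >= 0 holds on all
# reachable states but fails on the sampled error states.
reachable = [(1, 0), (2, 1), (3, 1), (5, 2)]
bad = [(0, 1), (1, 2), (2, 5)]
w1, w2, b = train_perceptron(reachable, bad)
assert all(w1 * x + w2 * y + b > 0 for x, y in reachable)
assert all(w1 * x + w2 * y + b <= 0 for x, y in bad)
print(w1, w2, b)
```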

Finished topics
Program Transformation in CPAchecker: Design and Implementation of a Source-Respecting Translation from Control-Flow Automata to C Code (Mentor: Thomas Lemberger)
Developing a Verifier Based on Parallel Portfolio with CoVeriTeam (Mentor: Sudeep Kanav)

CoVeriTeam is a framework to compose verification tools. We can compose different tools, resulting in a verification tool. This thesis aims to construct a new verifier based on a composition of a few tools that solves more tasks than (most of) the verifiers participating in SV-COMP.
In this work, we aim to develop a verifier that is fast with respect to walltime (not CPU time). This verifier will be a portfolio solver. We will develop a new kind of composition called portfolio in CoVeriTeam. This composition will make use of MPI technology to execute the processes on different machines.
Evaluation: execute the composed verifier on the SV-COMP benchmarks and compare it with the participants of the walltime track of the competition. The implementation will be done in Python.

A New Spin on Verification with Symbolic Execution: Symbolic Execution as Formula-Based Predicate Analysis in CPAchecker (Mentor: Thomas Lemberger)
Taint Analysis for CPAchecker [1] (Mentor: Martin Spiessl)

While taint analysis has many applications in practice (especially for software security), there is currently no implementation of a taint analysis in CPAchecker. Also there is no category in the Competition on Software Verification (SV-COMP) for taint analysis. The idea behind this thesis is to add support for a simple taint analysis to CPAchecker, search or create some benchmark tasks and standardize them in such a way that they can be used as part of SV-COMP in a new category.

Mining a Benchmark Set for Partial Fixes in C programs [1] (Mentor: Thomas Lemberger)
Concurrent Software Verification through Block-based Task Partitioning and Continuous Summary Refinement [1] (Mentor: Thomas Lemberger)
Adjustable Block Analysis: Actor-Based Creation of Block Summaries for Scaling Formal Verification [1] (Mentor: Thomas Lemberger)
Mutation based Automatic Program Repair in CPAchecker [1] (Mentors: Sudeep Kanav, Thomas Lemberger)

Debugging software usually consists of three steps: 1. finding an error in the program, 2. finding the cause of that error (the fault), and 3. fixing that fault. By default, fixing the fault is done manually by a programmer, which often consumes vast amounts of resources and may introduce new bugs. To ease that task, this topic is concerned with automatic program repair. After finding a potential fault in a program, automatic program repair proposes fixes and often checks the validity of these fixes. The goal is to implement and evaluate a technique for automatic program repair of C programs in the state-of-the-art verification framework CPAchecker. The complexity of the technique and the scope of the topic are based on the thesis type (project, Bachelor, or Master). Programming is done in Java.

String analysis in CPAchecker [1, 2] (Mentor: Philipp Wendler)
LTL software model checking (Mentor: Philipp Wendler)
Conversion of counterexamples found by software verifiers into an executable test harness that exposes the bug in the program (prototype exists) [1] (Mentor: Thomas Lemberger)
Correctness-witness validation using predicate analysis (Mentor: Martin Spiessl)

Verifiers take a program and a specification as input and output whether the program fulfills the specification. If the answer is yes, the verifier can output additional information in the form of a correctness witness. This correctness witness can then be used (e.g., by another verifier) to validate that the answer is correct. Our verifier CPAchecker provides many different analyses for verification, but only one of them (called k-induction) is currently able to validate witnesses. An analysis based on predicate abstraction already exists in CPAchecker that should be able to validate witnesses with minor to moderate programming effort. Once this is working, predicate abstraction could also give insights into which information is missing in the witness. At the end, an evaluation of the new witness validation can be performed on an already existing set of representative programs using our verification cluster. This topic is suitable for a bachelor's thesis, but could also be extended for a master's thesis on request.

Heuristic selection of verification analyses (Mentor: Matthias Dangl)
A JavaScript frontend for CPAchecker (Mentor: Philipp Wendler)
Improve analysis of Java programs in CPAchecker (Mentor: Philipp Wendler)
Towards understandability of verification result: visualization of graphs produced by verifiers (Mentor: Philipp Wendler)
Predicate analysis based on splitting states for CPAchecker [1, 2] (Mentor: Philipp Wendler)
Improve predicate analysis based on domain-type of variables in CPAchecker [1, 2] (Mentor: Philipp Wendler)
New Approaches and Visualization for Verification Coverage [1] (Mentor: Philipp Wendler)
Application of Software Verification to OpenBSD Network Modules [1] (Mentor: Thomas Lemberger)
Implementation of an Analysis for Symbolic Memory Graphs (SMG) in CPAchecker [1, 2] (Mentor: Thomas Bunk)
Unify analyses (ValueAnalysis, IntervalAnalysis, SignAnalysis) in CPAchecker by using a common data-wrapper for numerical values (integers, bitvectors, floats, ...), optional: support for symbolic values, BDDs, and Octagons [1] (Mentor: Karlheinz Friedberger)
Define and evaluate different refinement approaches for interval domain or BDD domain (widening, splitting, interpolation, path prefixes) [1, 2, 3] (Mentor: Karlheinz Friedberger)
Separation Logic in CPAchecker (Mentor: Martin Spiessl)
Symbolic Memory Graphs in CPAchecker [1, 2] (Mentor: Karlheinz Friedberger)
Multi-process architecture for BAM-parallel to provide cluster-based software analysis, maybe with central or shared database for reached states (Mentor: Karlheinz Friedberger)
Backwards analysis for CPAchecker, initially BDD- and predicate-based (both partially implemented somewhere), later symbolic and explicit analysis. (Mentor: Karlheinz Friedberger)
Good, bad, ugly interpolants. Measure interpolants and build heuristics. Topic might be related to refinement selection. [1] (Mentor: Philipp Wendler)
Hybrid Testcase Generation with CPAchecker [1] (Mentor: Thomas Lemberger)
A Web Frontend For Visualization of Computation Steps and their Results in CPAchecker [1] (Mentor: Thomas Lemberger)
Parallel BDDs in Java (cf. Sylvan) [1] (Mentor: Karlheinz Friedberger)
Fault Localization in Model Checking. Implementation and Evaluation of Fault-Localization Techniques with Distance Metrics [1] (Mentor: Thomas Lemberger)

There are many different ways to find out whether a program is buggy, for example testing or formal verification. Once we know that a program is buggy, i.e., that it shows some faulty behavior, a developer has to manually debug the program to find the cause of the problem and fix it; this is a difficult and time-consuming process. The aim of this thesis is to help programmers with debugging through the use of distance metrics (cf. Explaining Abstract Counterexamples, 2004).

Fault Localization for Formal Verification. An Implementation and Evaluation of Algorithms based on Error Invariants and UNSAT-cores [1] (Mentor: Thomas Lemberger)

There are many different ways to find out whether a program is buggy, for example testing or formal verification. Once we know that a program is buggy, i.e., that it shows some faulty behavior, a developer has to manually debug the program to find the cause of the problem and fix it; this is a difficult and time-consuming process. The aim of this thesis is to help programmers with debugging by implementing error invariants and bug-finding with unsat cores, techniques based on boolean representations of the faulty program, to (1) automatically find potential causes for faulty behavior and (2) present the found causes.

Test-based Fault Localization in the Context of Formal Verification: Implementation and Evaluation of the Tarantula Algorithm in CPAchecker [1] (Mentor: Thomas Lemberger)

There are many different ways to find out whether a program is buggy, for example testing or formal verification. Once we know that a program is buggy, i.e., that it shows some faulty behavior, a developer has to manually debug the program to find the cause of the problem and fix it; this is a difficult and time-consuming process. The aim of this thesis is to help programmers with debugging by implementing Tarantula, a technique for bug-finding based on test coverage, to (1) automatically find potential causes for faulty behavior and (2) present the found causes.
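The Tarantula score of a statement s is (failed(s)/total_failed) / (passed(s)/total_passed + failed(s)/total_failed): statements executed mostly by failing tests score close to 1. A compact sketch on an invented coverage matrix:

```python
def tarantula(coverage, outcomes):
    """Tarantula suspiciousness per statement, given a coverage matrix
    (statement -> per-test 0/1 execution flags) and test outcomes."""
    total_passed = sum(1 for o in outcomes if o == "pass")
    total_failed = sum(1 for o in outcomes if o == "fail")
    scores = {}
    for stmt, runs in coverage.items():
        passed = sum(1 for hit, o in zip(runs, outcomes) if hit and o == "pass")
        failed = sum(1 for hit, o in zip(runs, outcomes) if hit and o == "fail")
        p = passed / total_passed
        f = failed / total_failed
        scores[stmt] = f / (p + f) if p + f > 0 else 0.0
    return scores

# Three tests (the last one fails); statement 3 is executed only by the
# failing test and gets the highest suspiciousness.
coverage = {1: [1, 1, 1], 2: [1, 1, 0], 3: [0, 0, 1]}
outcomes = ["pass", "pass", "fail"]
print(tarantula(coverage, outcomes))  # {1: 0.5, 2: 0.0, 3: 1.0}
```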

Converting Test Goals to Condition Automata [1] (Mentor: Thomas Lemberger)

Conditional Model Checking and Conditional Testing are two techniques for the cooperation of verification tools. Conditional Testing describes work already done through a set of covered test goals, and Conditional Model Checking describes work already done through condition automata. Because of this discrepancy, the two techniques cannot be combined directly. To bridge this gap, this thesis transforms a set of test goals into a condition automaton to allow easy cooperation between Conditional Testing and Conditional Model Checking.

Implementation and Evaluation of k-Shrinkability [1, 2, 3, 4] (Mentor: Martin Spiessl)
A Language Server and IDE Plugin for CPAchecker [1] (Mentor: Thomas Lemberger)

At the moment, there are two ways to use CPAchecker: locally through the command line, and in the cloud through a web interface. Both options require C developers to leave their IDE every time they want to run CPAchecker, which disrupts their workflow. The goal of this project is to build an IDE plugin that provides a seamless experience for using CPAchecker (e.g., for Eclipse CDT, Visual Studio Code, or mbeddr). We would like the plugin to focus on ease of use and immediate feedback that is easy to comprehend. The plugin should:

  1. allow developers to run CPAchecker on their current project,
  2. provide some options for customization (e.g., whether to run CPAchecker locally or in the cloud), and
  3. provide useful feedback (e.g., if a possibly failing assert statement was found by CPAchecker)

Specifying Loops with Contracts. [1] (Mentor: Gidon Ernst)
Information flow testing for PGP keyservers [1] (Mentor: Gidon Ernst)
Implement Fuzzing in CPAchecker (Mentor: Gidon Ernst)
SMT-based checking and synthesis of formal refinements (Mentor: Gidon Ernst)
Rely/Guarantee Reasoning for Separation Logic in SecC (Mentor: Gidon Ernst)
Loop Contracts for Boogie and Dafny (Mentor: Gidon Ernst)
Frontends for a deductive verifier (Python) (Mentor: Gidon Ernst)
Frontends for a deductive verifier (Boogie) (Mentor: Gidon Ernst)
Guided fuzz testing with stochastic optimization (Mentor: Gidon Ernst)
Interactive verification debugging [1] (Mentor: Gidon Ernst)
Simplification and solving strategies for SMT (Mentor: Gidon Ernst)
Development of an interactive theorem prover for untyped logic (Mentor: Gidon Ernst)
Inference of Invariants over Algebraic Data Types (Mentor: Gidon Ernst)
Test-case generation for protocol implementations via fuzzing (Mentor: Gidon Ernst)
Concolic Testing for Liskov's substitution principle (Mentor: Gidon Ernst)
Studie: PBT Driven Design am Beispiel eines Embedded Software Projekts (Mentor: Gidon Ernst)
Fuzzing Strategies in Legion/SymCC (Mentor: Gidon Ernst)
Experiments with Inference of Loop Contracts in Dafny (Mentor: Gidon Ernst)
A code-complexity analysis on the component level on the example of CPAchecker [1] (Mentor: Thomas Lemberger)
Real-World Requirements for Software Analysis [1, 2, 3, 4, 5, 6] (Mentor: Martin Spiessl)

The National Institute of Standards and Technology (NIST) has published a specification for capabilities that a "Source Code Security Analysis Tool" needs in order to be useful in software security assurance [1][2]. This is accompanied by test suites (subsets of the NIST SARD test suite, e.g. suite 100 and 101 [3]) with programs that have to be checked for certain weaknesses [4][5]. For classification of the weaknesses, CWE-IDs are used.

This now leads to various interesting research questions. A subset of those can be addressed by a project or thesis, depending on the available time and progress.

0. How can the CWE-IDs from the relevant SARD test suites be mapped to the specification that are used in the Competition on Software Verification(SV-COMP)?
Some of the weaknesses, e.g., memory-safety-related ones or undefined behavior like overflows, are already checked for in SV-COMP. For the CWEs that cannot be translated into the specifications currently used in SV-COMP, a critical discussion should be added on how easy or hard it would be to add these and how easy or hard it is to check for those properties. The extent of this discussion can be tuned to the scope of the project/thesis.
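A first sketch of such a mapping could be a small lookup table. The CWE/property pairs below are illustrative guesses, not an official correspondence, and the helper name is made up for this sketch:

```python
# Illustrative, partial mapping from CWE-IDs to SV-COMP property names;
# the pairs below are plausible guesses, not an official correspondence.
CWE_TO_SVCOMP = {
    "CWE-190": "no-overflow",      # integer overflow or wraparound
    "CWE-476": "valid-memsafety",  # NULL pointer dereference
    "CWE-416": "valid-memsafety",  # use after free
    "CWE-401": "valid-memsafety",  # memory leak
    "CWE-134": None,               # uncontrolled format string: no SV-COMP property
}

def svcomp_property(cwe_id):
    """Return the matching SV-COMP property, or None if there is none (yet)."""
    return CWE_TO_SVCOMP.get(cwe_id)
```

The CWEs that map to `None` are exactly the ones for which the critical discussion mentioned above is needed.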

1. How many of the tests in the test suite can already be checked by the automatic software verifiers that participated in the Competition on Software Verification (SV-COMP)?
One of the claims of the SV-COMP is that it shall reflect real-world problems in software verification. As such, it is of interest to determine whether this is the case by applying the participating tools to tasks that are used in industry, like the SARD test suite.
Provided that the license allows this, the corresponding test cases can be extracted from the relevant SARD test suites and converted into tasks for the SV-COMP. A task in the SV-COMP consists of a program and a specification, so ideally the right specification is determined automatically by parsing the information in the XML manifest of the SARD test suite. After the conversion is complete, an extensive evaluation with the tools that participated in the SV-COMP (and optionally other tools, like the ones listed in [6]) shall be performed.

2. Can CPAchecker be extended (e.g., by new analyses) to find these weaknesses and become a Source Code Security Analyzer like many other tools [6]?
Currently, CPAchecker is not really designed to return multiple findings, as in verification it is usually enough to show one specification violation. Some CWEs should already be supported, e.g., memory safety or overflows. Others, however (e.g., "CWE-134: Uncontrolled Format String"), would require a new analysis to be implemented. To reduce the scope of the thesis/project, one can also pick only a subset of the CWEs and implement the corresponding analysis or analyses. Another way to adapt the scope is to only look at some of the requirements. These are divided into mandatory features SCSA-RM-{1,2,3,4,5,6} and optional features SCSA-RO-{1,2,3} [1].

CWE-Ids for Software Analysis in CPAchecker and SV-COMP (subtopic from 'Real-World Requirements for Software Analysis') [1, 2, 3, 4, 5, 6, 7] (Mentor: Karlheinz Friedberger)

The long description of the supertopic can be found under 'Real-World Requirements for Software Analysis'.
This subtopic is a specialization for finding and documenting CWE-IDs and matching them against CPAchecker's analyses. The target is not as broad as the supertopic, but more closely related to our own software project. Plan:

  • Which CWE-IDs are supported by CPAchecker? Can we simply extend CPAchecker for a few more?
  • Can we describe CWE-IDs with a specification like the BLAST query language?
  • Does SV-COMP provide tasks for the found CWE-IDs?
  • Can we provide a nice overview (a webpage?) of those CWE-IDs?
The thesis does not include much code-related work, but focuses more on literature research and documentation.

Predicate-based Analysis of Concurrent Programs [1, 2, 3] (Mentor: Karlheinz Friedberger)

Issue 464

Reliable Benchmarking and Measurements

Currently assigned topics
Extend allocation of CPU cores in BenchExec [1, 2] (Mentor: Philipp Wendler)

BenchExec is a benchmarking framework and assigns a specific set of CPU cores to each benchmark run. The calculation of which cores to use for which run needs to consider low-level details of the hardware architecture (like hyper-threading, NUMA, etc.) in order to avoid interference between runs as far as possible. Our current implementation supports multi-CPU systems with NUMA and hyper-threading, but not some more recent features like hybrid-architecture CPUs or CPUs with split Level 3 caches. The goal of this thesis is to implement a more general core allocation (in Python) that supports all kinds of modern CPUs and additional features, with rigorous tests. There should also be an experimental evaluation of how different allocation strategies and such hardware details influence benchmarking results.
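As a toy illustration of the allocation problem (not BenchExec's actual implementation), the following sketch assigns each run a set of virtual cores such that no two runs share a physical core, which avoids hyper-threading interference. The topology data is made up; on Linux it would come from `/sys/devices/system/cpu/`:

```python
from collections import defaultdict

# Made-up topology: virtual core -> physical core (two hyper-threads each).
SIBLINGS = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}

def allocate(num_runs, cores_per_run):
    """Assign each run a list of virtual cores; runs never share a
    physical core. A run always gets whole physical cores, so it may
    receive slightly more virtual cores than requested."""
    by_physical = defaultdict(list)
    for vcore, pcore in sorted(SIBLINGS.items()):
        by_physical[pcore].append(vcore)
    free = sorted(by_physical)
    allocations = []
    for _ in range(num_runs):
        needed, cores = cores_per_run, []
        while needed > 0:
            if not free:
                raise ValueError("not enough cores")
            pcore = free.pop(0)
            cores.extend(by_physical[pcore])
            needed -= len(by_physical[pcore])
        allocations.append(cores)
    return allocations
```

A real strategy additionally has to respect NUMA nodes, cache domains, and asymmetric (hybrid) cores, which is exactly what makes the thesis topic interesting.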

Finished topics
Use cgroups v2 for BenchExec to improve container isolation [1, 2] (Mentor: Philipp Wendler)

Our benchmarking framework BenchExec uses cgroups in order to limit and measure time and memory usage during benchmarking. The Linux kernel is replacing cgroups with a new API (v2), and we need to support this in BenchExec (#133). Once we have this, we can use cgroup namespaces to better isolate our containers and provide full support for nested containers (#436).
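For illustration, the measurement side of cgroups v2 boils down to reading flat key-value files. This self-contained sketch parses a sample `cpu.stat` string; on a real system the content would be read from `/sys/fs/cgroup/<group>/cpu.stat`:

```python
# Sample content in the format of a cgroup v2 cpu.stat file.
SAMPLE_CPU_STAT = """usage_usec 2542680
user_usec 1828960
system_usec 713720
"""

def parse_cpu_stat(text):
    """Parse cgroup v2 cpu.stat content into a dict of microsecond counters."""
    stats = {}
    for line in text.splitlines():
        key, value = line.split()
        stats[key] = int(value)
    return stats

cpu = parse_cpu_stat(SAMPLE_CPU_STAT)
cpu_seconds = cpu["usage_usec"] / 1_000_000  # total CPU time of the cgroup
```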

Modern architecture and tests for HTML tables of BenchExec [1] (Mentor: Philipp Wendler)
Improve usability of summary tab for HTML tables of BenchExec [1] (Mentor: Philipp Wendler)
Measuring and optimizing energy consumption of verification work (Mentor: Martin Spiessl)

SMT Solvers and Java SMT

Currently assigned topics
JavaSMT: integrate a new solver (with automated tests and build documentation). Possible candidates: Bitwuzla, Vampire, dReal, OpenSMT, STP,... (Mentor: Daniel Baier)

JavaSMT is a framework that provides a common API for several SMT solvers. Adding support for an additional solver is straightforward. However, JavaSMT is written in Java, while most solvers are written in C or C++, which means one usually has to write bindings using the Java Native Interface (JNI). This topic is suited for a bachelor's thesis.

JavaSMT: develop and implement a solver-independent SMT2 String parser/composer (with automated tests and build documentation). (Mentor: Daniel Baier)

JavaSMT is a framework that provides a common API for several SMT solvers. However, there are solvers that do not offer parsing/dumping of formulas in the SMT2 format. The goal of this thesis is the development and implementation of a new, solver-independent layer in JavaSMT that enables users to generate SMT2 formulas in String form using the API, but also to read arbitrary SMT2 formulas into any of the existing solvers' APIs without using the solver's parsing abilities. This topic is suited for a bachelor's thesis.
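To make the composing direction concrete, here is a minimal sketch (not JavaSMT's actual API; the layer would be written in Java) that turns formulas given as nested tuples into SMT-LIB2 strings:

```python
def to_smt2(term):
    """Emit a formula, given as nested tuples like
    ("and", ("=", "x", "y"), (">", "x", "0")), as an S-expression string."""
    if isinstance(term, tuple):
        op, *args = term
        return "(" + " ".join([op] + [to_smt2(a) for a in args]) + ")"
    return str(term)

def script(decls, formula):
    """Compose a complete SMT-LIB2 query from declarations and a formula."""
    lines = [f"(declare-const {name} {sort})" for name, sort in decls]
    lines.append(f"(assert {to_smt2(formula)})")
    lines.append("(check-sat)")
    return "\n".join(lines)
```

The parsing direction is the harder part of the topic, since it requires a full SMT-LIB2 grammar and a translation into the formula API of each solver.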

Interpolants built from an arbitrary SMT solver (develop a layer in JavaSMT to get (tree) interpolants without explicit support from the solver, e.g., from proofs or unsat cores) [1, 2, 3, 4] (Mentor: Daniel Baier)
Finished topics
SMT query caching, slicing, and simplification module for Java SMT [1] (Mentor: Karlheinz Friedberger)
Fuzzing for SMT Solvers on API Level [1, 2] (Mentor: Karlheinz Friedberger)

Formal methods use SMT solvers extensively for deciding formula satisfiability. However, solvers may contain critical bugs that lead to unsound results. We can use tools for mutation-based fuzzing on our own API and the underlying SMT solvers. We can report bugs back to the developers and update JavaSMT with fixed solver libraries or bindings.
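A mutation-based fuzzer on formula level could work roughly like this sketch (formulas again as nested tuples; the operator-swap table and probabilities are illustrative). Each mutant would then be sent to two solvers through the common API and their sat/unsat answers compared:

```python
import random

# Illustrative mutation table: swapping an operator for a related one.
SWAPS = {"and": "or", "or": "and", "<": ">=", ">=": "<"}

def mutate(term, rng):
    """Return a structurally identical copy of term with random operator swaps."""
    if isinstance(term, tuple):
        op, *args = term
        if op in SWAPS and rng.random() < 0.5:
            op = SWAPS[op]  # operator-swap mutation
        return tuple([op] + [mutate(a, rng) for a in args])
    return term

rng = random.Random(42)  # fixed seed for reproducible fuzzing campaigns
seed = ("and", ("<", "x", "0"), (">=", "y", "1"))
mutants = [mutate(seed, rng) for _ in range(5)]
```

Disagreeing answers between two solvers on the same mutant indicate a bug in one of the solvers or in the bindings.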

Implementing a fast SMT Solver based on interval-computation [1] (Mentor: Karlheinz Friedberger)

An SMT formula is a Boolean formula that includes numbers and common arithmetic operations. Since formal verification techniques often use SMT formulas to represent programs and models, the speed of solving SMT formulas is a very important aspect of formal verification. Studies show that SMT formulas created by model checkers or symbolic execution often have a specific structure: the relation between program variables (e.g., "x == y + z") depends on program constraints over these variables that contain only a single variable each, e.g., x > 0, y > 0, and z > 10. This example would be represented as the SMT formula x == y + z ∧ x > 0 ∧ y > 0 ∧ z > 10.

The satisfiability of SMT formulas with this specific structure can be computed through the following decision procedure:
1. For each program variable in the formula, compute the interval(s) of possible values for this variable for which the constraints of this variable are satisfied, and
2. Check whether the relations between program variables can be satisfied with the computed intervals.

The goal of this topic is to implement this procedure and to make it available as a Java library. Work on this in the symbolic execution engine KLEE has shown great potential, leading to speedups of up to 55x compared to the use of existing complete SMT solvers.
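For the example formula above, the two steps can be sketched as follows (closed integer intervals only; a real implementation, and the Java library asked for here, would also support open bounds, rationals, and unions of intervals):

```python
INF = float("inf")

def satisfiable(intervals):
    """Step 2 for the fixed relation x == y + z: check whether the
    sum interval of y and z intersects the interval of x."""
    (xl, xh), (yl, yh), (zl, zh) = intervals["x"], intervals["y"], intervals["z"]
    sum_lo, sum_hi = yl + zl, yh + zh  # interval addition for y + z
    return max(xl, sum_lo) <= min(xh, sum_hi)

# Step 1: intervals derived from x > 0, y > 0, z > 10 (over the integers).
intervals = {"x": (1, INF), "y": (1, INF), "z": (11, INF)}
```

Here step 2 succeeds because y + z can be any value ≥ 12, which intersects the interval of x, so the formula is satisfiable.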

Non-Deterministic Tests (Flaky Tests)

Currently assigned topics
Modern Tests for BenchExec [1] (Mentor: Philipp Wendler)

BenchExec's tests are currently written using the test framework nose, which is no longer maintained and has problems with new Python versions. The goal of this project is to switch the test framework used for BenchExec. As a first step, an appropriate framework needs to be chosen. This should be done systematically based on criteria like ease of use, how common the framework is, whether it is well maintained, etc. Next, the framework needs to be integrated into BenchExec's project structure, and the existing tests need to be adjusted where necessary. This might also include restructuring the tests and generally improving their state. Filling gaps in our test coverage with more unit tests would also be highly welcome.

Data Flow Based Assessment of Unintended Test Result Deviations (Mentors: Stefan Winter, Philipp Wendler)

Background: Unintended Test Result Deviations (UTRD) denote changes in test results that are not due to changes in the code under test. UTRD constitute a major impediment to regression testing (a cornerstone of modern software development), because a change in test results is commonly attributed to a change in the code under test (CUT). If a test verdict changes for other reasons, developers may waste time debugging the CUT, which is not the cause of the changed test result. UTRD has been acknowledged as a significant problem by major software companies and organizations, including Microsoft, Google, and Mozilla.
Goal: The goal of the proposed thesis is to develop a static data-flow analysis to detect whether test outcomes, which are usually based on data comparisons, depend on possibly uncontrolled inputs to the CUT. Such uncontrolled inputs may, for instance, be environment variables, file I/O, usage of random number generators, etc. The analysis is supposed to work on C programs and should be based on the CPAchecker framework.
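As a very rough illustration of the intended analysis (the real one would be a static data-flow analysis over C programs in CPAchecker, not Python), the sketch below propagates taint from uncontrolled input sources through assignments and flags test assertions that compare tainted values. The statement encoding and the source list are made up for this example:

```python
# Hypothetical sources of uncontrolled input.
SOURCES = {"getenv", "rand"}

def analyze(stmts):
    """stmts: list of ('assign', var, deps) or ('assert_eq', var),
    where deps names the variables/functions the right-hand side reads.
    Returns the asserted variables whose value may be uncontrolled."""
    tainted, warnings = set(), []
    for kind, var, *rest in stmts:
        if kind == "assign":
            deps = set(rest[0])
            if tainted & deps or SOURCES & deps:
                tainted.add(var)  # taint flows through the assignment
        elif kind == "assert_eq" and var in tainted:
            warnings.append(var)  # test outcome depends on uncontrolled input
    return warnings

program = [
    ("assign", "seed", ["rand"]),        # uncontrolled input enters here
    ("assign", "result", ["seed", "x"]),
    ("assert_eq", "result"),             # verdict may flip between runs
]
```

Each warning points at a potential source of an unintended test result deviation.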

BDDs, ZDDs and ParallelJBDD

Finished topics
Implement more decision diagrams in ParallelJBDD: CBDDs [1] (Mentor: Stephan Holzner)
Implement more decision diagrams in ParallelJBDD: tagged BDDs [1] (Mentor: Stephan Holzner)

Collaboration with Bundeswehr University

We also offer thesis topics in collaboration with the PATCH lab at Bundeswehr University. These theses will be formally supervised by our chair, but the day-to-day work will be guided by a mentor from the PATCH lab.

The language for the following theses is English.

Finished topics
SV-Comp Benchmarks for Weak Memory Models [1] (Mentor: Sudeep Kanav)

SV-Comp is the International Competition on Software Verification. It serves as a platform for the research community to evaluate tools targeting verification of C and Java programs. A total of 28 tools from 11 different countries participated in SV-Comp’20.

The simplest (and strongest) model of computation for concurrent programs is sequential consistency (SC), i.e., interleaving semantics. However, for performance reasons, mainstream processors perform optimisations that result in more possible executions than those allowed by SC. These additional executions can result in bugs in the program that are hard to detect. All executions that are valid on a certain processor are specified by its memory model.
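The classic store-buffering litmus test makes this concrete: thread 1 runs `x = 1; r1 = y` and thread 2 runs `y = 1; r2 = x`, with x and y initially 0. The sketch below enumerates all SC interleavings and shows that the outcome r1 == 0 and r2 == 0 never occurs, although weaker models such as x86-TSO allow it:

```python
T1 = [("store", "x"), ("load", "y", "r1")]  # thread 1: x = 1; r1 = y
T2 = [("store", "y"), ("load", "x", "r2")]  # thread 2: y = 1; r2 = x

def run(schedule):
    """Execute one interleaving and return the final (r1, r2)."""
    mem, regs = {"x": 0, "y": 0}, {}
    for op in schedule:
        if op[0] == "store":
            mem[op[1]] = 1
        else:  # load
            regs[op[2]] = mem[op[1]]
    return regs["r1"], regs["r2"]

def interleavings(a, b):
    """All merges of a and b that preserve each thread's program order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for tail in interleavings(a[1:], b):
        yield [a[0]] + tail
    for tail in interleavings(a, b[1:]):
        yield [b[0]] + tail

sc_outcomes = {run(s) for s in interleavings(T1, T2)}
```

Under SC only the outcomes (0, 1), (1, 0), and (1, 1) are reachable; observing (0, 0) on real hardware is therefore evidence of a weaker memory model at work.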

We aim to create a new category in SV-Comp dealing with memory models. We currently have a collection of more than twelve thousand benchmarks that were used to validate the formalisation of the memory models of x86, Arm, and Power processors and the Linux kernel. Those benchmarks are written in the litmus format, which is only supported by a small number of tools. The goal of this thesis is to port the benchmarks from the litmus format to C code. Dartagnan is a verification tool with support for memory models that can already parse all such benchmarks. What is required is to implement an exporter from Dartagnan’s intermediate representation of programs to C.

Verification Witnesses: from LLVM to C [1] (Mentor: Sudeep Kanav)

Dartagnan is a software verification tool. It takes a C program, compiles it to LLVM code, and then performs the verification on the LLVM code. In 2020, Dartagnan participated for the first time in SV-Comp, the International Competition on Software Verification. One of the requirements of SV-Comp is to provide verification witnesses: a proof of correctness or a counterexample. Dartagnan can already generate a counterexample witness as a graph. However, this graph refers to the LLVM code, while SV-Comp requires a counterexample witness w.r.t. the input C program. The information mapping the C instructions to the corresponding LLVM code is provided by the transformation from C to LLVM used by Dartagnan; however, it is not currently used. The goal of this thesis is to use this information and convert the counterexamples generated by Dartagnan to witnesses w.r.t. the C code, thus fulfilling SV-Comp's requirements.