The IC3 symbolic model-checking algorithm ----------------------------------------- Symbolic model checking - - - - - - - - - - - - We recall from FSV-I the basic setup of symbolic model checking. We are given a transition system with a finite set of states Q, a binary relation N <= Q x Q, a subset I<=Q of initial states, and a predicate P<=Q of allowed states. We want to know whether all states reachable form I via N are in P. Formally, N*(I) <= P. Here, the operator N(-) on state sets is given (as usual) by N(X) = {s:Q | exists s0:X.N(s0,s)} It yields the set of all states reachable from a given state set, here X, with a single N-step. The notation N*(-) refers to the reflexive, transitive closure of N, i.e., N*(X) = {s:Q | exists i>=0. s:N^i(X)} and it gives the set of states reachable from within X in any number, including zero, of N-steps. If the set of states is given explicitly one solve this with breadth-first search: enumerate the entire set N*(I) and check that it is contained in P. To speed this up, one can check for any enumerated state s':N*(I) whether it is in P. As soon as a non-P state is found the entire search can be called off. In *symbolic model checking* the set of states Q is given implicitly as the set of valuations of certain boolean variables. Likewise, the sets I,P, and the transition relation are given as boolean formulas over these variables (and primed variants in the case of N). One can of course construct from these data an explicit description, but the latter would be of exponential size (in the number of variables). The hope is that by using symbolic methods operating directly on formulas one can obtain a faster solution. Of course, in principle the question is NP hard and thus probably intractable in general. In practice, though, one achieves very good results. Example: a binary semaphore shared between n processes (n a fixed number, e.g. 2) Variables: q_{pz} for z : {sleep, wait, work}, p:{1..n}, and s_w for w:{free,occ} i.e. 3n+2 variables. (Helper)-formula delineating sensible states: sanity = (/\_p \/_z q_{pz}) /\ (/\_{z!=z'} ~q_{pz'} /\ (s_free \/ s_occ) /\ ~(s_free /\ s_occ) sanity': same as sanity but with primed variables. I = sanity /\ /\_p q_{p sleep} /\ s_free P = sanity /\ ~(\/_{p!=p'} q_{p work} /\ q_{p' work}) N = sanity /\ sanity' /\ \/_p (/\_{p1 != p} /\_z q_{p1 z} <=> q_{p1 z}') /\ \/_{(z,w,z',w'):R} q_{pz} /\ q_{pz'}' /\ s_w /\ s_w' where R={(z,w,z',w') | z=sleep /\ z'=wait /\ w=w' \/ z=wait /\ z'=work /\ w=free /\ w'=occ \/ z=work /\ z'=sleep /\ w'=free} Notice that the number of states satisfying "sanity" is already 3^n * 2. The classical method for determining whether N*(I) <= P consists of representing sets of states, i.e. boolean formulas as binary decision diagrams (BDDs) and to iteratively compute the sets S_n = N^n(I) by S_0 = I S_{n+1} = N(S_n) \/ S_n until one finds n0 such that S_{n0+1} = S_n0. One then has N*(I)=S_{n0} and one can check whether S_n0 <= P. BDDs are particularly advantageous here because large parts of S_n repeat as we increase n and so can be reused. Furthermore, BDDs support the oputation of N(X) from X via an efficient dynamic programming algorithm, and, finally, BDDs are unique for a given set and thus the test for equivalence needed in each iteration (to know whether one is finished) can be performed in constant time. For more detail, see the FSV-I material by MH and U Schoepp or the original literature, e.g. Ken McMillan's PhD thesis. Using SAT-Solvers for symbolic model checking - - - - - - - - - - - - - - - - - - - - - - - In the last 10-20 years BDDs have been all but superseded by SAT-solvers because they can be implemented ore efficiently. For this particular application, SAT-Solvers were not so useful because forward images (N(X) from X) are hard to compute with SAT solvers and also because equivalence tests are expensive. In 2011, however, a new method has been found for employing SAT-Solvers successfully and efficiently for the purpose of symbolic model checking. This is the IC3 algorithm invented in 2011 by Aaron Bradley. Aaron R. Bradley, SAT-Based Model Checking without Unrolling, VMCAI 2011. We begin by giving a declarative presentation of IC3 before moving on to its concrete implementation using SAT-Solvers. Backward search as a dual to forward iteration - - - - - - - - - - - - - - - - - - - - - - - - As before, we assume given state sets I,P, and transition relation N. Rather than computing all the states reachable from I via N (and checking that those lie in P) we can also compute the set of states all whose N-descendants go into P (and check that this set contains I). To this end, we define the universal inverse image operator N_A on state sets by N_A(X) = {s | for all s'.N(s,s') => s':X} Exercise: check that N(X) <= Y iff X <= N_A(Y). Now, define the descending chain T_i P = T_0 >= T_1 >= T_2 >= ... by T_0 = P and T_{i+1} = N_A(S_i) & T_i We see that T_i comprises those states s such that whenever one makes up to i N-moves from s one will end up in P. As a result, it suffices to construct the chain of the T_i until either no more progress is made (T_i = T_{i+1}) for some i or I <= T_i fails for some i for this means that after i N-steps it is possible to move out of P starting from a state in I. (Check this!). This is dual to the forward search above in the following sense. If (S_i)_i is the forward chain for I and N (as defined further up) and (T_i)_i is the backward chain for P := Q\I and N then T_i = Q \ S_i holds for all i. This is because N_A(Q\X) = Q \ N(X) holds for all state sets X (prove this). We now give a different, but equivalent, presentation of backward search which brings us closer to IC3. Throughout the algorithm one maintains a sequence of state sets F_1, F_1, ..., F_n known as "frames", where the following invariants hold: P1) I <= F_i for i=1..n P2) F_i <= F_{i+1} holds for i=1..n-1 P3) F_i <= P for i=1..n P4) N(F_i) <= F_{i+1} or equivalently, F_i <= N_A(F_{i+1}) for i=1..n-1 We notice that as soon as equality holds in 1) for some i then N*(I)<=P is proved: Letting Z=Q_i we have I<=Z<=P and N(Z)=Z, so, by induction, N*(I)<=Z<=P. Initially, one puts n=1 and P_1 = P. Now, P2, P3, P4 all hold; P2, P4 vacuously. P1 also holds unless I<=P fails in which case we can safely report failure. Now, the IC3 algorithm increases the number of frames step by step in the following way. Suppose that the invariants P1..P4 hold. One tries to add a new frame F_{n+1} := P If N(F_n) <= P holds then this is ok. If not, then one determines the subset of F_n which causes problems, i.e. which comprises those states that lead outside of P: Let K_n = {s:F_n | exists s'.N(s,s') & s' not in P}. Replacing F_n by F_n'=F_n \ K_n restores P4 for the last frame. But now, of course, P2 and P4 may become violated further down. To restore them, we first remove K_n from all the earlier frames as well. Should this lead to a violation of P1 this furnishes an actual counterexample, for then I contains a state in K_n which leads outside of P in one N-step. This (temporarily) restores P2. In order to restore P4 for the other frames as well, we now continue in the same fashion but one level down. We identify the set K_{n-1} which comprises those states in F_{n-1} that lead in one step outside the new F_n, i.e. F_n \ K_n and remove it from F_{n-1} and all the earlier frames. Should this violate P1 we have again found a counterexample, this time of length two. We continue in this fashion until the properties P1..P4 are restored and the number of frames has been increased. Now, what does this have to do with backward search? Well, we can show by induction that at all times F_{n-i} = T_i where n is the current level and n=0..n-1. Let us write F^n_i for the i-th frame in the n-th level. Initially, we have n=1 and F^1_1 = P and, indeed, F^_1 = F^1_{1-0} = T_0 = P. Now suppose, the equation holds true for some n. We have F^{n+1}_{n+1} = P = T_0 The complement of K_n is N_A(P) and thus, F^{n+1}_n = F^n_n \ K_n = P & N_A(P) = T_1 We continue inductively in this fashion. Once we know that F^{n+1}_k = T_{n+1-k}, we find that the complement of K_k is N_A(T_{n+1-k}) and, using the outside induction F^n_{k-1} = T_{n+1-k}, we then conclude F^{n+1}_{k-1} = F^n_{k-1} \ K_k = T_{n+1-k} & N_A(T_{n+1-k}) = T_{n+2-k} The advantage of this procedure for use with a SAT-Solver is two-fold: First, the all updates to frames are via intersections which can be easily done on the level of clauses whereas the unions implicit in forward iteration are awkward to express with CNFs. For, the update from one level to the next by pushing in a new frame P to be sound, it suffices that the invariants P1-P4 hold. We can thus, in between major iterations heuristically shrink frames so long as the invariants are maintained. We come back to this further down. Now, we explain in some more detail how this pays off concretely when working with a SAT-Solver. IC3 implemented with CNF and SAT-Solver - - - - - - - - - - - - - - - - - - - - We fix boolean variables x1..xn and their primed variants x1'..xn'. A state is represented by a valuation of the variables x1..xn. A set of states ("state set") is represented as a boolean formula, written as a conjunction of clauses (CNF), over the x1..xn. Exercise: write the example sets as CNFs A state validates a state set, written s:A or A(s), if A comes out true under s. We write A <= B to mean that A is a subset of B, i.e., for all s one has s:A implies s:B. A cube is a conjunction of literals, i.e., a state set consisting exclusively of unit clauses. In particular, every singleton state set is a cube. In our example s_free /\ ~q_{1 work} is a cube. Notice that if C is a clause then its negation, written ~C, is a cube, and can therefore be conjoined with a state set A. We write A /\ ~C for that. E.g. when C = x \/ ~y \/ z then ~C = ~x /\ y /\ z and A /\ ~C is obtained from A by adding the clauses ~x, y, z. If A is a set of states we write A' for the substitution A[x1'/x1...xn'/xn] (just as sanity' in the example above). Let B = C1 /\ .. /\ Cn be a state set and C1..Cn its clauses. We have A <= B iff for each i the formula A /\ ~Ci is unsatisfiable. We henceforth assume that CNFs I, P over x1..xn and N over x1..xn, x1'..xn' are given. We remark that it is not a priori obvious that this is possible (without exponential overhead) because we are not allowed to introduce helper variables as usual when converting into CNF. However, in practice, most system specifications turn out to be easily formulated as CNF. As we have seen, IC3 requires, among other things, the repeated computation of N_A(X) or rather the intersection of a given state set with N_A(X). This can be conveniently done with a SAT Solver as we now show. Suppose that X,Y are state sets and that we seek to compute Z = X & N_A(Y). We begin by checking for each clause C in Y whether X, N, -C is satisfiable. If it is is unsatisfiable for all C in Y then we know that X <= N_A(Y) or equivalently N(X) <= Y and thus X & N_A(Y) = X. Nothing to do. If, however, it is satisfiable then we get states s,s' such that X(s), N(s,s'), ~Ci(s') hold. These assignments may contain some "don't care" positions thus effectively manifest themselves as a conjunction of literals. For example, x /\ ~y, meaning x=1, y=0, z=?. As already mentioned, such a conjunction of literals is known as a *cube*. The negation of the cube s is a clause and we can add this clause to X. We notice that doing so really shrinks X since X(s) holds, so the negation ~s cannot be implied by X. We also note that ~s really belongs to N_A(Y) for s leads outside of Y (via s'). Thus, if we continue in this way, we really obtain the desired intersection X & N_A(Y). The good thing is that we did not have to compute N_A(Y) as such but only its intersection with X. There are two additional observations. Termination check by equality of clause sets: If we consider how the frames are actually built, we see that the clause sets of each later frame are subsets of the clause sets of earlier frames. Thus, if we perform a redundancy check before we add a new clause to a frame, we can detect equality, i.e. the case for successful termination simply by comparing clauses without performing any satisfiability checks. Pushing clauses forward. When we have complete a major iteration, i.e. invariants P1-P4 are restored, we can then add clauses from F_i to F_{i+1} so long as this does not violate invariants. Obviously, the only invariant that could possibly be violated is P4. So, if C is a clause in F_i, typically a newly added one, then we can check whether F_i,N,-C is unsatisfiable. If yes, we can add C to F_{i+1}. We can also do this for several clauses at once: if C,D,E have been newly added, then we can express the disjunction of the cubes -C,-D,-E as a CNF L and check that F_i,N,L is unsatisfiable. If yes, we can add C,D,E to F_{i+1}. Of course, it will be too expensive to do this in general, so some heuristic is needed. To sum up, we give here a description of IC3 in pseudo code. /* main() n=1; F_n = P; result = false; done = false; while(!done) { addLevel(); pushForward(); getResult(); } addLevel(){ n++; F_{n+1} = P; for(i=n downto 1){ F_i = F_i & N_A(F_{i+1}); /* using the abovedescribed clausewise procedure */ for (j=1 to i-1) F_j = F_j & F_i; /* by adding all the newly appeared clauses from F_i to F_j */ }} pushForward(){ for(i=1 to n-1){ for(C : F_i) if(F_i,N,~C' is unsatisfiable) /* N(F_i) <= C */ F_{i+1} = F_i & C; } }} getResult(){ for(i=1 to n-1){ if (!(I implies F_1)) {result=false;done=true;} /* Can avoid first test by checking that every clause added to F_1 is implied by I and breaking with result=false if not. */ else if (F_i <=> F_{i+1}) {result=true;done=true;} /* Can replace second test by equality of clause sets */ } To finish up, we carry out a few steps of our running example with two processes in IC3. For simplicity we do not write down the sanity clauses but implicitly assume that all assignments satisfy "sanity". We use the notation q1wk for q_{1 work}, etc. First level n=1: I = {q1sp, q2sp, sf} P = {~q1wk \/ ~q2wk} F_1 = P = {~q1wk \/ ~q2wk} We find that I implies P. Second level n=2: F_2 = P = {~q1wk \/ ~q2wk} F_1 to become P & N_A(P): We consider the SAT-problem P,N,q1wk',q2wk' We have a satisfying assignment q1wk,q2wt,sf,q1wk',q2wk' Thus, we add the clause ~q1wk \/ ~q2wt \/ ~sf to P leading to the SAT Problem P,~q1wk \/ ~q2wt \/ ~sf,N,q1wk',q2wk' This is unsatisfiable and the new F_1 is still implied by I so the next level is complete. F_2 = P F_1 = P,~q1wk \/ ~q2wt \/ ~sf The new clause cannot be pushed forward: s := q1wk,q2sp,sf s':= q1wk,q2wt,sf is a counterexample, since F_1(s),N(s,s') hold, but s' does not satisfy the new clause. Third round n=3. Again, we add F_3 = P. We already know that this will lead to the strengthening of F_2 to P,~q1wk \/ ~q2wt \/ ~sf Now, we need to strengthen F_1 to encompass this strengthening: we consider the SAT-problems (one for each clause in F_2). P,~q1wk \/ ~q2wt \/ ~sf,N,q1wk',q2wk' (unsatisfiable as we know) P,~q1wk \/ ~q2wt \/ ~sf,N,q1wk',q2wt',sf' A satisfying assignment is q1wk,q2sp,sf,q1wk',q2wt',sf' so we add the clause ~q1wk \/ ~q2sp \/ ~sf. It cannot be pushed forward either. To be completed in the tutorial.