General Coq Background

This is a Coq programming task. First, some general guidelines about Coq programming tasks.

Admitting holes (defer less critical/important parts)

Complex Coq programs are often written incrementally. Coq allows us to "admit" helper functions temporarily so that they can be defined later. However, in that case, you need to come up with the type of the helper function. For example, you can admit a helper function to convert a Z to a string, as follows:

Definition Ztostring (z: Z) : string. Admitted. (* TOFIXLATER *)

This mechanism allows you to get the higher-level details right before implementing the low-level obvious details. Do not forget the "TOFIXLATER" comment, as this will be used to find the holes to fill in later.

Sometimes you cannot leave some holes for later because that may jeopardize Coq's termination checker for recursive functions, or the hole may need to be defined simultaneously with what you are currently defining. These issues are explained later in this tutorial.

Error Messages

Use dune build path/to/filename.vo to build path/to/filename.v. Ensure there are no errors in the files you edit. Here are things to try if you run into errors:

Use qualified names to avoid conflicts
Do not make any assumptions about which notation scopes are open. For example, if you mean Z.add 1 2, write (1+2)%Z instead.
Run Coq queries like Check/Locate to investigate errors deeply

Coq Vernacular Queries instead of Hallucinations

If you are not sure about the exact name of a definition/fixpoint/inductive in the standard library or the libraries available to you, use queries to get more information about the available context. Put the query in the .v file at the place having the right context to run the query (e.g. after the defn you want to print) and then build using dune. The build output will show you the query results. Some of the widely used queries are Search/About/Check/Locate. Here are some examples of search:

Search (nat -> nat). (* search the library for items of type nat -> nat *)

Sometimes, you incorrectly issue the above query as:

Search (_ : nat -> nat). (* this is wrong. it returns everything in the current context, regardless of the type *)

Other examples:

Search (nat -> list ?a -> list ?a). (* unification variables can be used to constrain occurrences: e.g. here the 2 occurrences of ?a must match the same term *)

Search "add". (* Search all definitions whose name contains "add" *)

Search (nat -> nat -> nat) "add" -"mul". (* Search all functions of type nat->nat and exclude functions whose names contain "add" but not "mul" *)

About Nat.add (* show documentation for an item *)

Check (Nat.add 1 1). (* check type of a term *)

Locate nat. (* print fully qualified name(s) of all items whose fully qualified name ends in `.nat`. *)

Search: can return too many items unless you chose a fairly discriminative query. However, don't use the Search ... in module syntax which only searches in a particular module: instead, search in all modules by not specifying the in part.

Queries other than Locate need the references to definitions/inductives to be sufficiently qualified depending on the set of Imports. For example, you may need to say A.foo instead of just foo if you havent Imported A. You can can use Locate to figure out the missing qualifications.

Structural Recursion

Beyond the minor surface-level sytax differences, one of the main differences between Coq and other functional programming languages like Ocaml/Haskell is that functions must terminate for every possible argument(s). For example, functions of type nat -> nat cannot recurse forever. To ensure that, Coq follows a very simple approach, which unfortunately can make life harder for programmers. For any Fixpoint definition, Coq must identify one argument that is decreasing in all recursive calls to the function being defined. That argument must be of an Inductive type. A term a of an Inductive type I is a structural subterm of another term b if a is obtained by peeling off one or more constructors of the type I, possibly after computational reductions on open terms.

For example, a function to compute sum of a list of numbers can be defined as:

Fixpoint listsum (l: list nat) : nat :=
  match l with
  | [] => 0
  | h::tl => h + listsum tl
  end.

The recursive call is on tl which is a structural subterm of h::tl which is the first argument (l).

In contrast, the following definition, which is logically equivalent and also terminating, is not allowed by Coq:

Fixpoint listsum (l: list nat) : nat :=
  match l with
  | [] => 0
  | h::tl => h + listsum (rev tl)
  end.
(*
Error:
Recursive definition of listsum is ill-formed.
In environment
listsum : list nat → nat
l : list nat
h : nat
tl : list nat
Recursive call to listsum has principal argument equal to "rev tl" instead of "tl".
Recursive definition is: "λ l : list nat, match l with
                                          | [] => 0
                                          | h :: tl => h + listsum (rev tl)
                                          end".
*)

rev tl is not a structural subterm of h::tl.

Nested Inductives

Constructor arguments of some inductive types can themselves be of Inductive types, e.g. the last (children) argument below of the node constructor:

Inductive Tree (T:Type) : Type :=
| node : T -> list (Tree T) (* children *) -> Tree T
| empty : Tree T.

Arguments node {_} _ _. (* make the T argument implicit *)
Arguments empty {_}. (* make the T argument implicit *)

In such cases, Coq allows us to define nested recursive functions, e.g.

Fixpoint sum (t: Tree nat) : nat :=
  match t with
  | empty => 0
  | node nodeVal children => nodeVal + list_sum (List.map sum children)
  end.

Coq unfolds the definition of List.map, which is itself a structurally recursive function, to observe that sum is only called on subterms of children which itself is a subterm of node nodeVal children. More concretely, Coq expands nodeVal + list_sum (List.map sum children) in the definition of sum to:

nodeVal + list_sum ((fix map (l : list (Tree nat)) : list nat := match l with
                                                                   | [] => []
                                                                   | a :: l0 => sum a :: map l0
                                                                   end) children)

and then observes the only recursive call is on a, which is a subterm of children.

This unfolding of List.map is essential for Coq to figure out that this is valid structural recursion. If List.map were opaque or an Axiom, Coq would not be able to figure that out and Coq would reject that definition. Thus, when leaving out admitted holes in the strategy in the previous section, please be mindful to not leave out holes around the recursive calls.

Well-founded recursion

Coq provides the Program Fixpoint mechanism to workaround the rather restrictive syntactic criteria above. With Program Fixpoint a user can define an arbitrary measure function but then Coq asks the user to prove that all recursive calls are on an argument with a smaller measure. You are already aware of this mechanism but you tend to overuse it. Program Fixpoint is not a primitive construct in Coq. Instead it encodes the user-supplied recursive function into a typicall 100x larger function that actually does structural recursion on proofs of measure decrements. Functions defined using Program Fixpoint often do not reduce well or at all, especially when the proof of measure reduction in recursive call arguments are opaque or have opaque subterms. Program Fixpoint does prove unfolding equational lemmas (propositional equality) but that is not as convenient as reduction (definitional equality). So only Program Fixpoint if a structurally recursive formulation would be much much more complex. Definitely PREFER IMPLEMENTING ADMITTED HOLES if they get in the way of Coq seeing structural recursion, instead of switching to Program Fixpoint.

Mutual Inductives and Fixpoints (recursion)

If you want to define a function that recurses on inductive data, you typically use the Fixpoint keyword. If the inductive type is mutually indutive with other types, often the needed recursion is also mutually recursive. In such cases, you need to define your function using mutual Fixpoints. Below is an exampe:

Inductive Expr : Type :=
| EConst : nat -> Expr
| EAdd : Expr -> Expr -> Expr
| ELet : string -> Com -> Expr -> Expr

with Com : Type :=
| CSkip : Com
| CAssign : string -> Expr -> Com
| CSeq : Com -> Com -> Com
| CIf : Expr -> Com -> Com -> Com.


Definition env := string -> nat.

(* Update environment *)
Definition update (ρ : env) (x : string) (v : nat) : env :=
  fun y => if bool_decide (x = y) then v else ρ y.
  
Fixpoint eval_expr (ρ : env) (e : Expr) : nat :=
  match e with
  | EConst n => n
  | EAdd e1 e2 => eval_expr ρ e1 + eval_expr ρ e2
  | ELet x c e' =>
      let ρ' := eval_com ρ c in
      eval_expr ρ' e'
  end

with eval_com (ρ : env) (c : Com) : env :=
  match c with
  | CSkip => ρ
  | CAssign x e => update ρ x (eval_expr ρ e)
  | CSeq c1 c2 => let ρ' := eval_com ρ c1 in eval_com ρ' c2
  | CIf e c1 c2 =>
      if Nat.eqb (eval_expr ρ e) 0 then eval_com ρ c2 else eval_com ρ c1
  end.

In the above example, eval_expr calls eval_com (ELet case) and eval_com calls eval_expr (CAssign case). Thus, neither of these functions can be defined before the other and they need to be defined together, using mutual recursion. WHILE DECIDING WHAT TO LEAVE ADMITTED FOR LATER, ENSURE THAT WHAT YOU DECIDE TO LEAVE FOR LATER DOESNT NEED TO BE DEFINED MUTUALLY RECURSIVELY WITH WHAT YOU ARE CURRENTLY DEFINING.

Common mistakes

String Escaping

In Coq, string escaping works non-intuitively. You would expect the following to define a string containing just the double quote character.

Definition doubleQuote : string := "\"".

But that is not valid Coq syntax. Instead, the following works:

Definition doubleQuote : string := """".
Compute (length doubleQuote). (* returns 1 *)

If this is confusing, you can just add the above doubleQuote definition and use it when producing strings.

Prop vs bool

Prop is a type of mathematical assertions, e.g. 1=2. bool is a datatype with 2 elements: true and false. Many logics do not distinguish between them but the difference in crucial in constructive logics as some Props may be undecidable. In Gallina functions, which are computable unless you add axioms, you can NOT do a case analysis on a Prop. You can do a case analysis on a bool. If P:Prop is decidable and there is an instance of the Decidable P typeclass in the current context, bool_decide P is a bool which is true iff P.

A common mistake beginners do is to write:

Definition foo (n:nat): nat:= if (0<n) then 1 else 0.

That doesnt typecheck: if is sugar for match and the discriminee must be an element of an Inductive type like bool/nat... But Prop is not an Inductive type. The following is one way to correctly write the above definition:

Definition foo (n:nat): nat:= if bool_decide (0<n) then 1 else 0.

confusing Arguments output of Print on Inductives

Below an example showing the output of the Print query on Inductive types.

Inductive pairnat: Set :=
  pairn (l: nat) (r: nat).
Print pairnat.

(*
Inductive pairnat : Set :=  pairn : Corelib.Init.Datatypes.nat → Corelib.Init.Datatypes.nat → monad.proofs.execproofs.test.pairnat.

Arguments monad.proofs.execproofs.test.pairn (l r)%_nat_scope
 *)

The (l r)%_nat_scope part in may be confusing to you. It should actually be read as:

Arguments monad.proofs.execproofs.test.pairn l%_nat_scope r%_nat_scope

When printing, Coq puts consequetive arguments in a pair of parentheses if they have the same notation scope, so as to only print the notation scope once for all of them. But DO NOT GET CONFUSED into thinking that the contents inside the parentheses are a single (e.g. tuple) argument: they are different arguments. Keep this in mind when generating pattern matching on constructors of inductive types. For example, the following 2 pattern matches are WRONG:

match p with 
| pairn (l,r) =>...
end

match p with 
| pairn lr  => let '(l,r) := lr in ...
end

The correct way to write the pattern match is:

match p with 
| pairn l r  => ...
end

Prefer bool_decide instead of functions like N.leb, Q.leb, N.eqb ...

Given P:Prop, bool_decide P is a bool that is equal to true iff P. bool_decide expands to stdpp.decidable.bool_decide P must have a Decision instance (stdpp.base.Decision). Using this makes the proofs easier as one does ot need to hunt for correctness lemmas for things like N.leb.

aa755/coqprogramming.md

Select an option

No results found