Evaluating Lambda Calculus
July 3, 2013

Evaluating Lambda Calculus in Scala

Java Tools
Java Application Development

In this third post on the untyped lambda calculus, we’ll look at implementing a lambda calculus evaluator in Scala. We already did a parser and the framework for a REPL last time in Parsing Lambda Calculus. Hopefully, the conclusion to this post also has a better answer to the question we asked in What is Lambda Calculus and should you care?

This post will be rather code heavy, and you can also see code for each section in separate commits in a GitHub repo. Some familiarity with Scala is expected to be able to follow the code. In case you don't care for the implementation details, you can skip to final thoughts (or the section on Recursion) when you start seeing embedded code.

Evaluating Lambda Calculus by Variable Substitution

As mentioned in the first post, evaluation in lambda calculus happens by variable substitution -- we replace references to the lambda’s argument variable with whatever the lambda is called with.

The main rule of variable substitution is called β-reduction -- if we have an application with a lambda on the left hand side, we can substitute the right hand side for the argument to the lambda. This kind of reducible expression is called a “redex”, and such a reduction corresponds to a single computation step.

There are several strategies on picking which exact reduction to do next. The main strategies are call-by-name, call-by-value and call-by-need. We will use call-by-value as that is common in modern programming languages. Under call-by-value evaluation, the outermost redexes are reduced, and then only if the right hand side has already been reduced to a value. Values are expressions that cannot be reduced any further. Usually, we consider lambdas and variables to be values, but applications not. Call-by-value evaluates arguments to functions before evaluating the function.

Substitution

Substitution can be trickier than we might initially think as we have to be careful not to change the meaning of any expressions in case of a naming conflict (and naming conflicts usually happen). For example, with just name-based substitution, we would get the following

(λx.(λx.x)) y → λx.y

changing the meaning of the program. A usual approach to solving such conflicts is to rename variables, because programs that only differ in variable names are still equivalent. If you don't care about the implementation details, now is a good point to skip most of the post.

Implementing Evaluation

We decided to try to get by without any renaming of variables. We keep track of a lexical scope of each lambda/variable. Each lambda defines a new scope, which refers to its parent scope. Each variable reference also refers to the scope of the lambda that defines the variable. There is a TOP scope that has no parent. For debugging purposes, we’ll add numeric ID’s to scopes.


object Scope {
  var id = 0
  def nextId = { val i = id; id += 1; i }
  val TOP = new Scope(None, Set())
}
class Scope(val parent: Option[Scope], val boundNames: Set[String]) {
  val id = Scope.nextId

  def closestBinding(name: String): Option[Scope] =
    if (boundNames contains name)
      Some(this)
    else
      parent flatMap (_ closestBinding name)
}


Scope also has a set of names that are bound in it, even though we only have one name in each scope for now. There is also a function for finding the closest lexical scope that binds a given name in the parent hierarchy. Here is the modified Var in the AST with references to Scope (other nodes need not change):


object Var {
  def apply(name: String): Var = Var(name, Scope.TOP)
}
case class Var(name: String, scope: Scope) extends Expr


Note: we now also have to change one line the parser: ident ^^ Var -> ident ^^ Var.apply

We don’t create Scopes during parsing, instead we have a "Binder” that makes a copy of the parsed tree while adding Scopes (the trees are immutable so we need to copy to modify).


class Binder() {
  def apply(term: Expr) = bind(term, Scope.TOP)

  def bind(term: Expr, parent: Scope): Expr = term match {
    case Lambda(arg, body) =>
      val λScope = new Scope(Some(parent), Set(arg.name))
      Lambda(arg.copy(scope = λScope), bind(body, λScope))
    case v @ Var(name, _) =>
      (parent closestBinding name) match {
        case Some(scope) => v.copy(scope = scope)
        case None        => sys.error("Undefined variable: " + name)
      }
    case Apply(fun, arg) =>
      Apply(bind(fun, parent), bind(arg, parent))
  }
}


For a Lambda, we create a new Scope. For a Var we look up the closest Scope binding its name, and set it’s scope to that. For now, we’ll raise an exception (using sys.error) if the variable is not bound.

We also need to invoke Binder in LambdaREPL, and change the PrettyPrinter because it uses Var in pattern matches and we added an argument to Var. We can also temporarily change the variable printing rule to s"$name${scope.id}" in order to see if our Scopes are assigned correctly.

λ> \x.x Parsed: λx1.x1 λ> \x.\x.x Parsed: λx2.(λx3.x3)

Seems correct. Now we can get to actual evaluation. As said, the main rule of evaluation is variable substitution. But when do we apply it? Under call-by-value, we apply it to an application which has a lambda on the left, and a value on the right. So we need a function that knows whether an expression is a value. We’ll consider Lambdas and Vars to be values (they will not be evaluated further if appearing by themselves):


def isValue(term: Expr): Boolean = term match {
  case _: Lambda => true
  case _: Var    => true
  case _         => false
}


Now we can write the pattern for when we should apply substitution:


case Apply(Lambda(argV, body), arg) if isValue(arg) =>
  // substitute arg for all references to argV in body


What if the arg to the lambda is not a value? We will add another pattern, for when the left side is a value, but the right hand side is not:


case Apply(fun, arg) if isValue(fun) =>
  // recursively call evaluation again to evaluate arg to a value


Finally, if the left hand side is not yet a value, we recursively evaluate the left-hand side:


case Apply(fun, arg) =>
  // recursively call evaluation again to evaluate fun to a value


Note that the only things being reduced during evaluation are Applys -- a Lambda or Var by itself is never reduced further (that’s why we say they are values). Combining these patterns and assuming we have an implementation for substitution, we could create a function that takes a single computation step if one can be taken.

However, to fully evaluate a program, we need to keep taking single steps until no more can be taken, so we need a wrapper function, which recursively takes a single step, until it gets a MatchError (in case none of our patterns above matched):


def apply(term: Expr): Expr =
  try {
    val term2 = evalStep(term)
    apply(term2)
  } catch {
    case _: MatchError => term
  }


I created the evaluation code in a new package lambda.eval just to have some separation instead of everything in the same package.


package lambda
package eval
// having two package statements like this means the package is still “lambda.eval”,
// but we automatically import everything from the first package “lambda”.

// When parameter debug is true, print each evaluation step
class Evaluation(debug: Boolean = false) {
  lazy val pretty = new PrettyPrinter()

  def apply(term: Expr): Expr =
    try {
      val term2 = evalStep(term)
      if (debug)
        println(s"step: ${pretty(term)} → ${pretty(term2)}")
      apply(term2)
    } catch {
      case _: MatchError => term
    }

  def evalStep(term: Expr): Expr = term match {
    case Apply(Lambda(argDef, body), arg) if isValue(arg) =>
      new Substitution(argDef, arg)(body)
    case Apply(fun, arg) if isValue(fun) =>
      Apply(fun, evalStep(arg))
    case Apply(fun, arg) =>
      Apply(evalStep(fun), arg)
  }

  def isValue(term: Expr): Boolean = term match {
    case _: Lambda => true
    case _: Var    => true
    case _         => false
  }
}


For substitution we wrote new Substitution(argV, arg)(body), and let’s implement that in a separate class:


class Substitution(argV: Var, replacement: Expr) {
  val binder = new Binder()

  def apply(term: Expr): Expr = term match {
    case Var(argV.name, argV.scope) => binder.bind(replacement, argV.scope.parent.get)
    case Var(_, _)                  => term
    case Apply(fun, arg)            => Apply(apply(fun), apply(arg))
    case Lambda(arg, body)          => Lambda(arg, apply(body))
  }
}


When we apply substitution to the lambda body, we apply it recursively for a Lambda or an Apply, but when we find a Var that is a reference to the same variable we are substituting, then we just replace it with the replacement Expr. We use Binder to create a copy of the replacement, under the parent scope of the lambda which we are eliminating by the substitution -- we need to make copies so that each instance of the replacement would have its own Scopes.

Notes on Variable Renaming

In the code above, we don’t do any variable renaming. It seems that when we have call-by-value evaluation, where the outermost redexes are reduced first, we can’t run into situations where we have a free variable in the replacement.

For example, I don’t think there is a situation where we could be substituting x for y in \y. (\x.y).

In the program (\y. (\x. y)) x the variable x is not bound -- but this is not a valid program because we don’t allow unbound variables in a complete program (at least not yet).

Assume the expression is wrapped in another lambda that actually binds x e.g. \x. (\y. (\x. y)) x. This lambda by itself will never be evaluated because it is already a value. So it must be given an argument to get evaluated and for substitution to take place inside it. If the argument is also named x, then we have the same situation as one step earlier -- we’d have an unbound x and would need to wrap it in another lambda. And so on.

We can give the lambda an argument that is a value, as in (\x. (\y. (\x.y)) x) (\x.x). In this case, the evaluation (and substitution) starts with the outer lambda:

  1. (\x. (\y. (\x.y)) x) (\x.x) matches the first rule of our evalStep function: Apply(Lambda, arg) if isValue(arg) and thus \x.x is substituted for x: (\x. (\y. (\x.y)) x) (\x.x) -> (\y. (\x.y)) (\x.x)
  2. (\y. (\x.y)) (\x.x) matches the same rule, and (\x.x) is substituted for y: (\y. (\x.y)) (\x.x) -> \x.(\x.x)
  3. Now we have only a lambda and no further reduction can be done.

The inner reference to x variable is bound correctly (because of how the bound name lookup works) and it seems to me that there is no way we could have an unbound variable x in the replacement during substitution, when we are limited to call by value evaluation. Please let me know in case I’m horribly wrong about all this.

Putting the E into REPL

Let’s plug the evaluator into our REPL:


val eval = new Evaluation(debug = true)
  ...
  println(pretty(eval(bind(expr))))
  ...


With the debug argument set to true, applying evaluation will print each evaluation step.

λ> (\x. (\y. (\x.y)) x) (\x.x) step: (λx1.((λy2.(λx3.y2)) x1)) (λx4.x4) → (λy2.(λx3.y2)) (λx5.x5) step: (λy2.(λx3.y2)) (λx5.x5) → λx3.(λx6.x6) λx3.(λx6.x6)

Before we go any further, we should fix something we got wrong in the parser last time: we had function applications as right-associative, but we actually want them to be left-associative:

a b c = (a b) c a b c ≠ a (b c)

This means changing around our rules just a bit, the affected lines are printed below:


lazy val expr: P[Expr]         = application | notApp
lazy val notApp                = variable | number | parens | lambda
lazy val application: P[Apply] = expr ~ notApp ^^


With this fix, we now actually have a functioning pure lambda calculus evaluator! Let’s try to evaluate some expressions from the last post. We can turn off the printing of the Scope ID’s (only wanted them to verify that variables had correct scopes), so let’s change the pretty printer’s Var rule back to s"$name".

How about some Church numerals:

λ> \s.\z.s z λs.(λz.(s z))

All the numbers would evaluate to themselves because they are already just a lambda and that is already a value. We could call the succ function to increment a number by one. We'll have to inline the function and its argument in a single expression because our REPL only takes single expressions:

λ> (λn.λs.λz. s (n s z)) (\s.\z.s z) // succ 1 step: (λn.(λs.(λz.(s ((n s) z))))) (λs.(λz.(s z))) → λs.(λz.(s (((λs.(λz.(s z))) s) z))) λs.(λz.(s (((λs.(λz.(s z))) s) z)))

Only a single step of evaluation could be taken here and while the result should be 2, it’s hard to verify just by looking at it that the printed result actually is equivalent to 2. It would be good if we could somehow evaluate this to an actual number.

It would also be great if we could write all the functions such as succ only once, and use them by name as many times as we want.

Another problem at the moment is that if we make a mistake in the code on the semantic level (e.g. use of an unbound variable), our whole REPL will be terminated with an exception. We should strive for better error reporting, and ideally report the exact location of the error.

So, it seems like we have a small TODO list:

  • Report unbound variable errors with exact location
  • Allow saving aliases (definitions) for functions e.g. succ = λn.λs.λz. s (n s z)
  • Turn the Church numerals into human-readable numbers

The code so far is in a single commit “Evaluation”, and we can now tackle the remaining issues separately.

Scala Error Reporting

To get good error messages with positions, we should record positions with each AST node. We can reuse some code from the Scala parser combinator library:


import scala.util.parsing.input.Positional
sealed trait Expr extends Positional


And wrapping the parsers in positioned(...):


lazy val lambda: P[Lambda]     = positioned(("λ" | "\\") ~> variable ~ "." ~ expr ^^
                                 { case v ~ "." ~ e  => Lambda(v, e) })
lazy val application: P[Apply] = positioned(expr ~ notApp ^^
                                 { case left ~ right => Apply(left, right) })
lazy val variable: P[Var]      = positioned(ident ^^ Var.apply)


Besides this, we need a way to actually report errors. It would be nice to do this in a more complete and functional way so that any “phase” of our evaluator could report errors the same way. In our simple case, the only semantic errors we’ll get are about unbound variables from the bind() function. So let’s cheat and just give Binder a mutable list of messages.


import scala.util.parsing.input.Position
case class Message(pos: Position, msg: String)

class Binder() {
  val messages = ListBuffer[Message]()
  ...
    case None =>
      messages += Message(v.pos, "Unbound variable: " + name)
      v
  ...
}


And we can take advantage of the fact that the Position class can already print the position as a caret below the line where the error occurred, in LambdaREPL:


val bound = bind(expr)
if (bind.messages.isEmpty)
  println(pretty(eval(bound)))
else {
  for (m <- bind.messages)
    println(m.pos.longString + m.msg)
  bind.messages.clear()
}


Now, running it with a bad example, we get

λ> \x.y z
\x.y z
   ^Unbound variable: y
\x.y z
     ^Unbound variable: z


The code for this section is found in commit “Error Positions”.

Saving Lambda Calculus 

We are now going add a way to save lambda expressions under an alias, so that they can be used later by referring to them by name. This is not really part of the pure lambda calculus language, so we are not implementing it at the AST level. We’ll add something to the parser, so that it could parse for example

succ = \n.\s.\z. s (n s z); add = \a.\b.\s.\z. a s (b s z);

and turn this into Map("succ" -> Expr, "add" -> Expr)

We will then give this map to the Binder, which will replace any occurrences of these names with the expressions from the map. The main parser additions are below, and we need to add ";" and "=" as delimiters for the lexer.


lazy val defs = repsep(defn, ";") <~ opt(";") ^^ Map().++ lazy val defn = ident ~ "=" ~ expr ^^ { case id ~ "=" ~ t => id -> t }

def definitions(str: String): ParseResult[Map[String, Expr]] = {
  val tokens = new lexical.Scanner(str)
  phrase(defs)(tokens)
}


We’ll add an argument for Binder and a new pattern:


class Binder(defs: Map[String, Expr]) {
  ...
    case v @ Var(name, _) if (defs contains name) =>
      bind(defs(name), parent)
  ...
}


Now what we can do in the REPL is recreate the Binder each time a definition is entered, adding it to the defs map of the previous Binder. So now we have to distinguish definitions from expressions. A simple way is to see if the input contains "=". While we’re at it, the main() method of the LambdaREPL is getting kind of complex by now, so let’s refactor a bit.


def main(args: Array[String]) {
  while (true) {
    val input = readLine("λ> ")
    if (input contains "=")
      handleDef(input)
    else
      handleExpr(input)
  }
}

def handleDef(input: String) =
  parseInput(parser.definitions, input) { defs =>
    println("Defined: " + defs.keys.mkString(", "))
    bind = new Binder(bind.defs ++ defs)
  }

def handleExpr(input: String) =
  parseInput(parser.parse, input) { expr =>
    val bound = bind(expr)
    if (bind.messages.isEmpty)
      println(pretty(eval(bound)))
    else {
      for (m <- bind.messages) println(m.pos.longString + m.msg) bind.messages.clear() } } def parseInput[T](p: String => parser.ParseResult[T], input: String)(success: T => Unit): Unit = {
  import parser.{ Success, NoSuccess }
  p(input) match {
    case Success(res, _)   => success(res)
    case NoSuccess(err, _) => println(err)
  }
}


Try it out:

λ> succ = \n.\s.\z. s (n s z) Defined: succ λ> succ \s.\z.z step: (λn.(λs.(λz.(s ((n s) z))))) (λs.(λz.z)) → λs.(λz.(s (((λs.(λz.z)) s) z))) λs.(λz.(s (((λs.(λz.z)) s) z)))

However, it will be rather annoying if we have to redefine all these basic functions each time we run the REPL, so let’s create a small library as well!


object Library {
  val source = """
    id = \x.x;

    true  = λt. λf. t;
    false = λt. λf. f;
    if    = λc. λt. λe. c t e;
    or    = λa. λb. a true b;
    and   = λa. λb. a b false;

    pair   = λf. λs. λb. b f s;
    first  = λp. p true;
    second = λp. p false;

    succ = λn.λs.λz. s (n s z);
    add  = λa.λb.λs.λz. a s (b s z);
    mul  = λa.λb.λs. a (b s);
    pow  = λa.λb. b a;
  """

  def load() = {
    val parse = new LambdaParser()
    import parse.{ Success, NoSuccess }
    parse.definitions(source) match {
      case Success(lib, _)   => lib
      case NoSuccess(err, _) => println(err); Map[String, Expr]()
    }
  }
}


We define several functions from the first post and some new ones, all inside a multi-line String representing the source code of the library. The load() method parses that String of input and returns the definitions as a Map. So in LambdaREPL, we will initialize the Binder with these definitions:


var bind = new Binder(Library.load())


Now we can use functions such as succ and add whenever we run the REPL.

The code for this section is in commit “Definitions & Library”.


Encoding and Decoding Church Numerals

So now that we have a somewhat functioning REPL, we could try some more interesting functions with numbers, but we have the problems of (1) understanding that λs.(λz.(s (((λs.(λz.z)) s) z))) is the number 1 and (2) writing the numbers, e.g. 3 in the form \s.\z.s (s (s z))).

Clearly, we would like to have some way of turning the Church numeral encoding into human-readable numbers. One part of that is making the parser parse number literals and output the proper Church numeral form. For this, I created a new file Encodings.scala, where we actually do most of the work, and then use that from the parser:


object CNumber {
  def apply(n: Int) = {
    var cn: Expr = Var("z")
    for (i <- 1 to n) cn = Apply(Var("s"), cn) Lambda(Var("s"), Lambda(Var("z"), cn)) } } ... // in LambdaParser lazy val notApp            = variable | number | parens | lambda lazy val number: P[Lambda] = numericLit ^^ { case n => CNumber(n.toInt) }
...


Try it out:

λ> 1 λs.(λz.(s z)) λ> 0 λs.(λz.z)

The other part is actually evaluating a lambda expression representing a Church numeral to a human-readable number. There are several possible solutions, and what I decided on was to introduce some magic variable names S and Z. When we pass S and Z to a Church numeral, it forces the evaluation of that numeral. S and Z are not lexically bound in any Scope, but I decided to create a hack of treating unbound variables starting in uppercase as magic variables of the TOP scope.

For this, we need another case in bind() where a variable is not bound, but has a name starting with an uppercase letter.


...
  case None if name(0).isUpper =>
    v.copy(scope = Scope.TOP)
  case None =>
...


Already this simple hack lets us evaluate some numbers to a bit more readable form:

λ> 2 S Z step: ((λs.(λz.(s (s z)))) S) Z → (λz.(S (S z))) Z step: (λz.(S (S z))) Z → S (S Z) S (S Z)

Now we simply count the S-s in the output and that’s our number. And we can easily make the computer count the S-s.


object CNumber {
  ...
  def unapply(expr: Expr): Option[Int] = expr match {
    case Var("Z", Scope.TOP)             => Some(0)
    case Apply(Var("S", Scope.TOP), arg) => unapply(arg) map (_ + 1)
    case _                               => None
  }
}


The magic unapply method turns CNumber into an extractor object that we can use in pattern matching. Basically it extracts an Int from a church numeral expression (if it can), returning Some(Int) if the “pattern” matched or None if it didn’t. In the PrettyPrinter we can then use the pattern:


case CNumber(i) => i.toString

 

λ> succ 1 S Z ... step: S ((λz.(S z)) 0) → 2 2

Some expressions result in more complex structure that doesn’t evaluate to a simple row of S-s under the current evaluation rules. For example

λ> pow 2 2 S Z ... step: (λz.(S (S z))) ((λz.(S (S z))) 0) → (λz.(S (S z))) (2) (λz.(S (S z))) (2)

Evaluation stops here because 2 (aka S (S Z)) is not a value. We can simply change the isValue function to treat any S Z form as a value:


def isValue(term: Expr): Boolean = term match {
  ...
  case CNumber(_) => true
  ...
}


Now we have

λ> pow 2 2 S Z ... step: (λz.(S (S z))) (2) → 4 4

We can do the same for Church booleans. We only need to implement unapply because our Library already included definitions of true and false, so we don’t need to treat them as literals in the parser. We’ll use magic variable names T and F:


object CBoolean {
  def unapply(expr: Expr): Option[Boolean] = expr match {
    case Var("T", Scope.TOP) => Some(true)
    case Var("F", Scope.TOP) => Some(false)
    case _                   => None
  }
}


An in the PrettyPrinter:


case CBoolean(b) => b.toString


We likely don’t want to remember the names of the magic variables, so we can add library functions that call an encoded value with the magic arguments:

bool   = λb. b T F; // magic values T & F are recognized as booleans number = λn. n S Z; // magic values S & Z are recognized as numbers

Now we can call “number (church-numeral)” or “bool (church-boolean)” to turn them to human-readable values.

λ> number (pow 2 3) 8 λ> bool (and false true) false

The code for this section is in commit “Church encodings”.

Recursion

Now that we can be a bit more comfortable with using numeric functions, maybe we can do something more interesting, like a factorial function. Just as a reminder, factorial of n is the product of all positive integers less than or equal to n: fact(5) = 5 * 4 * 3 * 2 * 1 = 120. A simple implementation of factorial is recursive, so we need some way to do recursion.

But first, let’s add some more functions to the library that we need for a factorial. First we need to test whether a number is zero to handle the case of fact(0) = 1. This one is easy: we give a number two arguments: successor = \x.false, zero = true. If the number has at least one successor, it is not zero, and returns false -- iszero 1 would be (\x.false) true, which evaluates to false.

iszero = λn. n (\x.false) true

You could also write λn. n (and false) true if that makes more sense: we make the number represent n applications of (and false) to true.

We also need a predecessor function, which subtracts 1 from a number. This is somewhat difficult with Church numerals, but one way to do it is with pair encodings:

zz = pair 0 0; ss = \p. pair (second p) (succ (second p)); pred = \n. first (n ss zz);

The first element of the pair will have the number that is 1 less than the actual number (except in the case of zz where both elements are 0).

Now we could write factorial as:

fact = λn. (iszero n) 1 (mul n (fact (pred n)))

If n is zero, return 1 otherwise return n * fact(n - 1). If we try to evaluate this, we will get a StackOverflowError during binding, because we will keep replacing fact into itself infinitely.

For recursion in lambda calculus, there are some known functional combinators. Y-combinator is perhaps the best-known, but cannot be directly applied under call-by-value evaluation. There is a call-by-value Y-combinator, also known as Z-combinator. The combinator tries to find a fixed point of a recursive function. We will just call it fix combinator. It’s definition is this:

fix = λf.(λw.f (λv.w w v)) (λw.f (λv.w w v));

Explaining this combinator is difficult, and I’m going to skip it (read about it elsewhere if you are interested). We can use it in our factorial, and give factorial another argument, which is essentially the function to compute the next step of the factorial:

fact = fix (λf.λn. (iszero n) 1 (mul n (f (pred n))))

However, if we run it, it will diverge and keep evaluating forever or until we run out of resources. This is because both branches to the hidden if-then-else construct in there are always evaluated. We need to delay this evaluation -- make both branches lazy by wrapping them in lambdas. Finally we must pass a dummy argument to the whole if-then-else construct, to force evaluation of the branch that was chosen.

fact = fix (λf.λn. (iszero n) (\d.1) (\d.mul n (f (pred n))) \d.d)

All the variables named d are dummies never used for anything except making the branches lazy. This is now a final definition of factorial we can actually use!

If you try to evaluate number (fact 5) (or a larger number), you might notice that there are a lot of evaluation steps, so maybe we want to turn off the debugging of each step at this point (new Evaluation(debug = false)).

λ> number (fact 5) 120

The code for this section is in commit “Recursion”.

Note on Implementation

Unfortunately, we cannot use this factorial function for very big numbers. In fact, we cannot use any of the numeric functions for large numbers. The Church representation of numbers is extremely inefficient and we are likely to run out of heap or stack space with computations involving numbers larger than a couple of thousand. For this reason, most implementations of lambda calculus would extend it with a concept of primitive values for numbers and booleans, but here we are dealing with the pure lambda calculus only. There are also ways to make the evaluator at least somewhat more efficient, but we focused on ease of implementation, not performance.

The full code for this post is at the GitHub repo, along with some bonus commits, such as REPL commands :quit, :step (for debugging a single evaluation step), :debug off/on.

Final Thoughts

What we have created in the last two posts is a relatively simple REPL for the pure untyped lambda calculus. If you are interested in programming language theory at all, something like this could be a minimal starting point that you can add language features on top of (of course, there are other possible starting points as well).

For example, a next step could be to add integer, boolean, function types and a type checker -- so far our programs were not typechecked at all and it’s possible to write some weird programs. Actually, the Church encodings embrace this weirdness somewhat, for example with the iszero function that directly applies a number to boolean arguments.

Once we add simple types, we could extend the calculus with polymorphism, generic types, dependent types, etc. The Lambda Cube describes these forms of abstraction as orthogonal axes.

You might also do various experiments within the lambda calculus, like “how could I encode objects in such a system?”, or “could I implement monads?”. You could also extend it with “native” functions that perform I/O when applied.

The lambda calculus should be a good starting point for getting into programming language theory (or just creating a small language for fun) exactly because it’s made of so few elements: anonymous functions, variable references, function applications.

Whatever kind of features you might want to add to the language, you have a very simple core to build on top of -- only 3 different kinds of AST nodes to deal with. To be honest, you would likely have to add a few new AST nodes before you start getting to really interesting places.

The computation with lambda calculus is nothing like the underlying computation of C-like languages where the model of programming is very related to the underlying processor models. I think it’s useful to get a change of perspective every once in awhile -- I wonder what kind of computers we would have today if they didn’t build on von Neumann architecture, but had efficiently implemented a model closer to functional programming, which has its roots in lambda calculus.


Looking for additional insights into Lambda calculus?

Our article, Parsing Lambda Calculus in Scala starts at the basics.

Read the Article