The Church Encoding of Simple Algebraic Data Types
Representing data as functions
01 Dec 2021When first learning about the lambda calculus, students are frequently introduced to Church numerals and Churchencoded booleans. These enable the expression of simple data types as pure functions. When I was first learning about them these encodings felt arbitrary, because I did not understand that they were expressions of a more general underlying principle. This post explores this more general form: the Church encoding of algebraic data types. To keep the length of this post manageable, only nonrecursive algebraic data types will be covered. Recursive data types, such as lists and the natural numbers, will be covered in a future post. This post will cover the following topics:
 The expression of booleans and tuples as algebraic data types
 How to encode arbitrary nonrecursive algebraic data types into the lambda calculus
 Some potential benefits of using Church encodings in practical TypeScript development
This post assumes basic knowledge of algebraic data types and the lambda calculus.
Examples will be given in TypeScript with a tiny sprinkling of traditional lambda calculus syntax. The abbreviation “ADT” will be used to mean “algebraic data type”, not “abstract data type”.
Booleans as algebraic data types #
In TypeScript, we use a builtin primitive type for boolean. However, booleans may be expressed as algebraic data types as well. They are one of the simplest ADTs, a choice between two variants which each carry no additional data. Here’s booleans as an ADT in TypeScript:
We can use our ADT booleans in the same way as builtin booleans, with just a bit of added verbosity. For example, let’s declare some basic logical functions on booleans:
These show the basic form of interacting with an ADT: when we interact with the value of an ADT, we inspect its tag, and then perform branching logic based on the tag. An ADTBoolean
has two variants, so when we consume it we will specify logic in two branches. In this case, these branches correspond to True
and False
.
We can also move between our ADT and primitive representations easily:
The ability to losslessly map booleans between their primitive form and their ADT form tells us that these two forms are isomorphic, and contain equivalent information. In many functional languages booleans are represented as ADTs rather than as specialcased, builtin data structure.
Churchencoded booleans #
I think of the connection between ADTs and Church encodings as the connection between what data is on the inside and how data can be used by pure functions from the outside. When we branch on the tag of an ADT we’re checking what the data is. As an alternative, let’s focus on how we use the data, by specifying the contents of the branches without inspecting the data directly. Here’s a type that will let us do just that:
This is a Church encoding of booleans, albeit a slightly nontraditional one. A Church boolean is a function which takes two arguments^{1}, one for “true”, one for “false”, and then returns the one which matches the boolean’s “internal value”. Note that this “internal value” is encapsulated: We can’t inspect a Churchencoded boolean to see whether it’s true or false. We can only apply it to arguments and see how it behaves.
Note that the generic argument A
is on the right side of the =
in the type definition, not on the left. This is important, but unfortunately a full explanation is beyond the scope of this post.^{2}
Here’s some sample functions on Church booleans:
First, we have the functions tru
and fals
, which are our basic building blocks, representing the ADT variants True
and False
. We use these in our implementation of churchAnd
and churchOr
. With these, we can create and combine encoded boolean values and branch on them by providing one value for the “true” branch and one for the “false” branch.
The use of Church booleans to compute our message
value shows that Church booleans are like “an if statement as a function”. Rather than inspecting a boolean in an if
statement and assigning a value in each branch of the if
, we call the Church boolean as a function and pass in the value corresponding to each branch.
Finally we have churchDecodeBoolean
, which, together with churchEncodeBoolean
, shows that our ADTBoolean
type and our Churchencoded booleans are isomorphic. Anything we can do with one, we can do with the other, because we have the ability to move between these two encodings as needed.
We can generalize the approach we took with booleans to more complex types. This gives us rule #1 for Churchencoding ADTs:
To Churchencode an algebraic data type with multiple variants, create a function which accepts an argument for each variant. The function must return the argument corresponding to the variant which the function encodes.
Caveats on my approach #
Previously I described these encodings as “nontraditional”, and here’s why. The standard Church encodings of booleans in the (untyped) lambda calculus are as follows:
true: λt.λf.t
false: λt.λf.f
In TypeScript, these would look like this:
Traditionally, Church booleans are fullycurried functions, as is standard for the lambda calculus. In the direct TypeScript encoding of this style tru
takes two values and returns the first, while fals
takes two values and returns the second. In contrast, I’ve taken the approach of using an object to simulate named arguments. It is trivial to map between these two approaches, and they both serve the purpose of representing data types as functions. However, in practical TypeScript development the use of positional parameters is less explicit and is more likely to cause confusion and maintainability issues. As we’ll see later, these issues get worse as the data types involved get more complex. Because the “named argument” approach still captures the essential aspects of a Church encoding, I consider it valid to refer to it as such.
A lambda calculus purist might also express concern over the application of Church encodings to data types other than functions, like how we called oneIsOdd
with a number and undefined
as arguments. While it is true that such data types do not exist in the most basic form of the lambda calculus, it is common to enrich the lambda calculus with additional data types when putting it to practical use. This does not compromise the calculus or its patterns in any way. It can be useful to treat a functional subset of JavaScript/TypeScript as one such enriched lambda calculus, and this is the approach I take here.
With these considerations out of the way, let’s continue to Churchencoding more complex data types.
Churchencoded tuples #
ADTs consist of sum and product types. We’ve covered the simplest sum type, Booleans. Now we’ll look at one of the simplest product types, the tuple. A tuple (or 2tuple, to be very precise) is the simultaneous combination of two component data types. Here’s what a tuple might look like as an ADT:
The Tuple
type is generic, because it can be used to package any two types, without restriction. We can, for example, create a Tuple
of numbers and strings:
Once you have a tuple, what can you do with it? You can take the values out of it and use them to compute a new value. Here’s a type and function for doing just that:
Here’s some examples comparing the use of ADTbased and Churchencoded tuples.
We use an ADT tuple by looking inside of it and pulling its values out; we use a Church tuple by passing it a function specifying what it should “do” with its contents, and let the tuple itself compute and return the result.
As before, we can decode a Church tuple, to get back an ADT tuple:
This shows us that our types are equivalent. Tuples give us rule #2 for Churchencoding algebraic data types:
To Chuchencode a product type, create a higherorder function which takes a second function as an argument. This second function should accept multiple arguments, one for each field in the product type. The higherorder function must call the provided function with the encoded arguments and return the result.
Putting the pieces together: Slaying a UI antipattern with Church encodings #
We’ve seen the rules for representing sum and product types via Church encodings. In this section we’ll use both of them to Churchencode a more complicated data type, and we’ll see how this can help us solve practical problems. Our example will come from the blog posts How Elm Slays a UI Antipattern and Slaying a UI Antipattern in FantasyLand.
Say that we have a UI which will display a user’s country, represented as a string. Our UI needs to make a request to the backend to retrieve this string, and needs to accurately reflect the state of this request when doing so. There are four states such a request might be in:
 Not yet made
 In progress
 Succeeded with a result
 Failed with an error
We can represent these possibilities with an ADT. Here’s our type:
These ADTs are the sum type of four case types. The “Succeeded” and “Failed” types are product types, which carry additional data.
When we define the UI displaying the user’s country, we need to provide a definition for each of these four cases. If we forget one, we’ve fallen into the antipattern of miscommunicating the state of the request. We could us an if
statement, and check inband which state we’re dealing with, but in TypeScript it’s easy to forget to cover one of the four cases. Instead, we can Churchencode our data type and see how that changes things.
Recall our two rules for Churchencoding ADTs:
 To Churchencode an algebraic data type with multiple variants, create a function which accepts an argument for each variant. The function must return the argument corresponding to the variant which the function encodes.
 To Chuchencode a product type, create a higherorder function which takes a second function as an argument. This second function should accept multiple arguments, one for each field in the product type. The higherorder function must call the provided function on the encoded values and return the result.
Let’s use these to work out the type of our Churchencoded RequestState
ADT.
First, we’ll encode the outer sum type:
Rule #1 tells us the outer shape of this function. notYetSent
and inProgress
are variants of the union type, so they get the generic type A
. However, if we were to use A
as the type of succeeded
or failed
then our type would lose the information that there is additional data stored in the Succeeded
and Failed
ADT variants. We would no longer have a proper encoding, because we wouldn’t be able to turn our Churchencoded function back into an ADT value. To capture this information, we need to use rule #2 to encode these two variants as product types:
Now, we can plug these into the sum type where they belong:
To demonstrate that our encoding carries equivalent information to our original ADT, here are the corresponding encode and decode functions:
In our hypothetical frontend application, we could make our HTTP request method return data in Churchencoded form. This gives the following benefits:
 The Churchencoded form requires that callers provide an implementation for every ADT variant. This is not the case for concrete data types which are consumed via a series of
if
statements.  The Churchencoded form is more succinct, and declaratively maps each variant tag to its corresponding value.
 Using Church encodings prevents regressions when additional variants are added to the represented data type. When adding variants to an ADT which is manually inspected using
if
statements it is possible for existingif
chains to silently ignore the added variant. Adding a variant to a Churchencoded sum type is a breaking change to a function’s type, and will cause the TypeScript compiler to identify every location in the codebase where additional handling is required.  Using Church encodings keeps us in the functional, expressionbased world, which can help readability and type inference (in some cases).
To illustrate these differences, here’s a contrived example of how UI code might handle data in each form:
As with most coding style discussions, there are tradeoffs and contextual factors to consider. That said, in this case (and in many others) I believe that a codebase’s quality is improved by adding the safety and succinctness of the Churchencoded approach.
As an aside, were we using a traditional Church encoding (consisting solely of fullycurried functions and without record types) our type would look like this:
This loses the explicit, declarative mapping from tags to their handlers and would be nearly unusable in practice. While it may be more “conceptually pure”, conceptual purity does not inherently bring clarity or maintainability to a codebase.
Conclusion #
When operating on algebraic data types we can choose to represent these type directly as concrete data or as encapsulated functions using Church encodings. These two approaches are equivalent in expressive power but they can have different effects on code’s clarity and maintainability. When we understand how to move between them, we can select the best approach for each situation.
I’ve only scratched the surface of Church encodings here. In a future post I’ll show how we can Churchencode recursive data types, and explore how Church (and the related Scott) encodings can provide performance optimizations, as well as how they relate to objectoriented programming.
For a deeper dive on this topic, the paper “Programming in the λCalculus: From Church to Scott and Back” by Jan Martin Jansen explores Church encodings (and more) of algebraic data types in the pure lambda calculus.
All code samples in this post are released into the public domain under a Creative Commons 0 license. Code samples and full license text may be found here.

In this case I am taking these two “arguments” as named fields on a single object rather than as fullfledged arguments. This is for readability’s sake, more on this later. ↩

If we had the generic argument on the left side of the
=
, giving us the generic typeChurchBoolean<A>
, then we would have to fix the type of values that a given Church boolean could be used with when that Church boolean was created. This is the difference between a generic type, which is a function, and the type of a generic function. The way we’ve actually implemented it, any value with the concrete typeChurchBoolean
is a generic function, which can be used with different type arguments at different times. The ability to move generic arguments to the right is called “higherrank polymorphism” (or sometimes “rankNTypes”) and is a relatively rare programming language feature. More advanced usages of this feature may be found in the post Existential Quantification in TypeScript, which uses higherorder polymorphism to encode existential types. ↩