By Vincent Driessen
on Tuesday, September 25, 2018

Today, I’m thrilled to publicly announce a new open source project that we’ve been using in production for months: ✨decoders ✨.

Decoders project logo

To get started:

$ npm install decoders

Here’s a quick example of what decoders can do:

import { guard, number, object, optional, string } from 'decoders';

// Define your decoder
const decoder = object({
  name: string,
  age: optional(number),
});

// Build the runtime checker ("guard") once
const verify = guard(decoder);

// Use it
const unsafeData = JSON.parse(request.body);
//                            ^^^^^^^^^^^^ Could be anything!
const data = verify(unsafeData);
//    ^^^^ Guaranteed to be a person!

// Now, Flow/TypeScript will _know_ the type of data!
data.name;   // string
data.age;    // number | void
data.city;
//   ^^^^ 🎉 Type error! Property `city` is missing in `data`.

Why?

When writing JavaScript programs (whether for the server or the browser), one tool that has become indispensable for maintainable code bases is a type checker like Flow or TypeScript. Disclaimer: I’m mainly a Flow user, but everything in this post also applies to TypeScript (also great). Using a static type checker makes changes to large JS code bases possible in ways that weren’t possible before.

One area where Flow (or TypeScript) coverage is typically hard to achieve is dealing with external data. Any form of user input, an HTTP request body, or even the results of a database query are “external” from your app’s perspective. How can we type those things?

For example, say your app wants to do something with data coming in from a POST request with some JSON body:

const data = JSON.parse(request.body);

The type of data here will be “any”. The reason is, of course, that we’re dealing with a static type checker: even though Flow knows that the input to JSON.parse() must be a string, it doesn’t know which string, and the type of JSON.parse()’s return value is determined by the value of that string at runtime. In other words, it could be anything.

For example:

typeof JSON.parse('42');              // number
typeof JSON.parse('"hello"');         // string
typeof JSON.parse('{"name": "Joe"}'); // object

Statically, it’s impossible to know the return type. That’s why Flow can only define this type signature as:

JSON.parse :: (value: string) => any;

Worse still, using these any-typed values may implicitly (and unknowingly) turn off type checking, even for code that’s otherwise type-safe.

For example, suppose you feed an implicitly-any value to a type-safe function:

function greet(name: string): string {
  return 'Hi, ' + name + '!';
}

const data = JSON.parse(request.body);
greet(data.name);

Then Flow will just accept this, because data is any, and thus data.name is any. But of course this isn’t safe! In this example, data cannot and should not be trusted. Flow lets arbitrary values get passed into greet() anyway, despite its type annotation!

Especially in real applications this puts a significant practical cap on Flow’s usefulness. Using any (whether implicit or explicit) is completely unsafe, and should be avoided like the plague.

Decoders to the Rescue

How, then, can we statically type these seemingly dynamic beasts? We can do so if we change our perspective on the problem a little bit.

Rather than trying to let Flow infer the type of a dynamic expression (which is impossible), what if we had a way to specify the type we are expecting, and have an automatic type check injected at runtime that verifies those assumptions? That way, Flow can know, statically, what the runtime type will be.
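
To make that idea concrete, here is a minimal hand-rolled sketch of the pattern in plain JavaScript. This is not the decoders library itself, just an illustration of the underlying idea: a checker (the hypothetical verifyPerson below) that validates the expected shape at runtime and throws otherwise, so anything past the check is guaranteed to have that shape.

```javascript
// Hand-rolled runtime check -- NOT the decoders library, just the idea.
function verifyPerson(value) {
  if (typeof value !== 'object' || value === null) {
    throw new Error('Must be an object');
  }
  if (typeof value.name !== 'string') {
    throw new Error('Field "name" must be a string');
  }
  if (value.age !== undefined && typeof value.age !== 'number') {
    throw new Error('Field "age" must be a number (or absent)');
  }
  // Past this point, the shape is guaranteed at runtime
  return { name: value.name, age: value.age };
}

const person = verifyPerson(JSON.parse('{"name": "Joe", "age": 30}'));
console.log(person.name); // "Joe"
```

Writing such checks by hand quickly gets tedious and repetitive, which is exactly the boilerplate that a library of composable decoders removes.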

As you might have guessed, this is exactly what the decoders library offers.

You can use decoders’ library of composable building blocks that allow you to specify the shape of your expected output value:

import type { Decoder } from 'decoders';
import { guard, number, object, string } from 'decoders';

type Point = {
    x: number,
    y: number,
};

const pointDecoder = object({
    x: number,
    y: number,
});
const asPoint = guard(pointDecoder);

const p1: Point = asPoint({ x: 42, y: 123 });
const p2: Point = asPoint({ x: -3, y: 0, z: 1 });

There are a few interesting pieces to this example.

First of all, you’ll notice the similarity between the Point type, and the structure of the decoder.

Also note that, by wrapping any value in an asPoint() call, Flow will know—statically—that p1 and p2 will be Point instances. And therefore you get full type support in your editor like tab completion, and full Flow type safety like you’re used to elsewhere.

How? Because if the data does not match the decoder’s description of the data, the call to asPoint() will throw a runtime error. This is the unhappy path, for example:

const p3: Point = asPoint({ x: 42 });
//                ^^^^^^^^^^^^^^^^^^ Runtime error: Missing "y" key
const p4: Point = asPoint(123);
//                ^^^^^^^^^^^^ Runtime error: Must be object

Composition

Decoders comes with batteries included: its base decoders are designed as composable building blocks that you can assemble into complex custom decoders.

The simplest decoders are the scalar ones: number, boolean, and string. From there, you can combine them using higher-order decoders like object(), array(), optional(), or nullable() to create more complex types.

For example, starting with a decoder for Points:

const point = object({
  x: number,
  y: number,
});

In terms of types:

point            // Decoder<Point>
array(point)     // Decoder<Array<Point>>
optional(point)  // Decoder<Point | void>
nullable(point)  // Decoder<Point | null>

Decoders also comes with a special regex() decoder, which is like the string decoder but additionally performs a regex match and only accepts string values that match:

const hexcolor = regex(
    /^#[0-9a-f]{6}$/,
    'Must be hex color',  // Shown in error output
);

You can then reuse these new decoders by composing them into a polygon decoder. Notice the reuse of the hexcolor and point decoders here.

const polygon = object({
  fill: hexcolor,
  stroke: optional(hexcolor),
  points: array(point),
});

You can then reuse that complex definition in a list:

const polygons = array(polygon);

You get the point. The final output type this decoder produces will be:

Array<{|
    fill: string,
    stroke: string | void,
    points: Array<{|
        x: number,
        y: number,
    |}>,
|}>;

Notice how the fill and stroke fields here end up as normal strings. Statically, Flow only knows that they are going to be string values, but at runtime, they will only contain hex color values that matched the regex. (Decoders are therefore more expressive than the type system in describing what values are allowed.)

Note: it’s not recommended to go overboard with this feature. Decoders are best kept simple and straightforward, staying close to the values they express, and not performing too much "magic" at runtime.

The best way to discover which decoders are available is to look at the reference docs.

Error messages

Human-readable, helpful error messages are a priority. That’s why decoders always tries to emit very readable error messages at runtime, inlined right into the actual data. An example of such a message:

Decode error:
{
  "firstName": "Vincent",
  "age": "37",
         ^^^^
         Either:
         - Must be undefined
         - Must be number
}
^ Missing key "name"

This is a complex error message, but it’s optimized to be readable to the human eye when printed to a console.

The same error information can also be presented as a list of error messages for outputting in API responses. In this case, the input data isn't echoed back as part of the error message:

[
  'Value at key "age": Either:\n- Must be undefined\n- Must be number',
  'Missing key: "name"'
]

(For those interested, this inline object annotation is performed by debrief.js.)

Decoders vs Guards?

When you have composed your decoder, it’s often useful to turn the outermost decoder into a “guard”. A guard is very much like a decoder, but offers a slightly more convenient API: it’s also callable on unverified inputs, but it throws a runtime error if validation fails. Guards are therefore typically easier to work with: using a guard, you can focus on the happy path and handle any validation errors through normal exception handling.

Invoking a decoder directly on an input value will not throw a runtime error; instead, it returns a so-called “decode result”: a value that represents either an OK value or an Error, which you’ll need to “unpack” to do anything useful with it.

For example, given this decoder definition:

const decoder = object({
  name: string,
  age: optional(number),
});
const verify = guard(decoder);

decoder('invalid data');  // Won't throw
verify('invalid data');   // Throws

If you want to programmatically handle the decode result, you can use a decoder directly and inspect the decode result. If you're just interested in the data and not in handling any decoding errors, use a guard.

In terms of types:

type Decoder<T> = any => DecodeResult<T>;
type Guard<T> = any => T;

// The guard() helper builds a guard for a decoder of the same type
guard: <T>(Decoder<T>) => Guard<T>;

(For those interested, the DecodeResult type is powered by lemons’ Result type.)
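
To illustrate that relationship, here is a hand-rolled sketch in plain JavaScript. It is not the library’s actual implementation, and the names (numberDecoder, makeGuard, and the {ok, value, error} result shape) are made up for illustration: a decoder returns a result object, and a guard is just a wrapper that unpacks the result, or throws.

```javascript
// Hand-rolled sketch of the decoder/guard relationship -- illustrative
// only, not the library's actual implementation.

// A "decoder": returns a result object instead of throwing
const numberDecoder = (blob) =>
  typeof blob === 'number'
    ? { ok: true, value: blob }
    : { ok: false, error: 'Must be number' };

// makeGuard() turns a decoder into a guard: unpack the result, or throw
const makeGuard = (decoder) => (blob) => {
  const result = decoder(blob);
  if (!result.ok) throw new Error(result.error);
  return result.value;
};

const asNumber = makeGuard(numberDecoder);
console.log(asNumber(42));  // 42
// asNumber('hi');          // would throw: Must be number
```

The design choice mirrors the types above: the error-handling strategy (inspect a result vs. throw) is decided once, at the boundary, rather than in every decoder.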

Give it a whirl!

Please try it out and let me know about your experiences.


If you want to get in touch, I'm @nvie on Twitter.