What to expect
This post is a high-level overview of the discriminated union technique used in a lot of functional programming languages like Haskell and OCaml, multi-paradigm languages like TypeScript and Rust, and even object-oriented languages like C++.
I will go over what a discriminated union is, show the TypeScript pattern and then compare that with Rust's enums. As an example I will be creating a WebGL Shader Uniform discriminated union type. Don't worry if you don't have any experience with WebGL or OpenGL; it isn't needed to understand the exercise.
What is a discriminated union?
Also called:
Discriminating Union
Tagged Union
Sum Type
Disjoint Union
and many other names
It is a data structure that is used to hold many different types of values while being easily 'typed' or 'tagged' so that they are discernible from each other. You can also think of it as a mix between an enum and a union from C.
We can split this idea into 2 parts, the tag and the data. The tag is the unique identifier that can be used to infer the type of the data. Different combinations of tags and data types are known as variants and the sum of the variants is our discriminated union.
TypeScript
In TypeScript you can create a type alias from a literal value. This can be a string, number, or boolean literal value. You can even use literal values as types when typing object attributes, or in tuples:
type Zero = 0;
type Status = 'failure';
type Tuple = [Status, false, 'banana'];
type Obj = { status: 'success'; data: Tuple; };
It is not possible to use symbol literals at this point, maybe in the future?
TypeScript: Handbook - Literal Types
Also in TypeScript we can create a type that is a union of literal types. This means that the type could be any of the literals, but only one of the literals at a time.
type Status = 'success' | 'in-progress' | 'failure';
type Bit = 0 | 1;
type IpV4Address = { version: 'v4'; data: [number, number, number, number]; };
type IpV6Address = { version: 'v6'; data: string; };
type IpAddress = IpV4Address | IpV6Address;
const host: IpAddress = { version: 'v4', data: [127, 0, 0, 1], };
The union of strings can be used for type narrowing and to make sure consumers of your code check that the string values are actually valid. In the case above we can check the IP version, 'v4' or 'v6', to figure out whether we are dealing with a v4 or a v6 IP address. That means we can treat the version attribute as our tag. This IpAddress type is the discriminated union pattern that we are aiming for: it is the union, or sum type, of the 2 different IpAddress variants.
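To make the narrowing concrete, here is a minimal sketch of a function that branches on the version tag. The formatIp helper and the loopback value are my own names for illustration, not part of any library.

```typescript
type IpV4Address = { version: 'v4'; data: [number, number, number, number] };
type IpV6Address = { version: 'v6'; data: string };
type IpAddress = IpV4Address | IpV6Address;

// Hypothetical helper: renders either variant as a string.
function formatIp(address: IpAddress): string {
  if (address.version === 'v4') {
    // Narrowed to IpV4Address: data is a tuple of 4 numbers.
    return address.data.join('.');
  }
  // Only IpV6Address is left: data is a string.
  return address.data.toLowerCase();
}

const host: IpAddress = { version: 'v4', data: [127, 0, 0, 1] };
const loopback: IpAddress = { version: 'v6', data: '::1' };

console.log(formatIp(host));     // 127.0.0.1
console.log(formatIp(loopback)); // ::1
```

Inside each branch the type checker only allows operations that make sense for that variant, so calling join on the v6 string or toLowerCase on the v4 tuple would not compile.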
One major downside of JavaScript, and by extension TypeScript, is the lack of number types: in JS, numbers are all typed as float64. It would be nice to have a u8 (a number between 0 and 255), an i32, and so on, or some way to enforce sized numbers. Currently the only way to get them is with typed arrays. Working at the byte level in JS/TS is a pain because of this.
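As a small sketch of what typed arrays give you, the snippet below stores the IP octets in a Uint8Array. Note that the range is enforced at runtime when a value is stored; as far as the type checker is concerned, each element is still just a number.

```typescript
// Uint8Array stores each element as an unsigned 8-bit integer (0 to 255).
const bytes = new Uint8Array([127, 0, 0, 1]);

// Values outside the range wrap around modulo 256 when written.
bytes[3] = 256; // stored as 0
bytes[2] = 257; // stored as 1
bytes[1] = -1;  // stored as 255

// Uint8ClampedArray clamps to the range instead of wrapping.
const clamped = new Uint8ClampedArray([300, -5]);

console.log(Array.from(bytes));   // [127, 255, 1, 0]
console.log(Array.from(clamped)); // [255, 0]
```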
TypeScript: Handbook - Discriminating Unions
Rust enums
Now that we have seen a small demo of discriminated unions in TypeScript, we can look at some of the disadvantages. You have to jump through hoops with the boilerplate, and the type strictness mostly comes from how you implement the pattern rather than from the language itself. That can make the pattern feel like a hack if it is overused or used improperly.
What do I mean by that? Well, Rust has discriminated unions built in, called enums, and they are a great match with the rest of the language's features.
Here is an example from the Rust book, a great resource for getting into Rust.
enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}
let home = IpAddr::V4(127, 0, 0, 1);
If we compare to the TS version, we have less boilerplate in the Rust code. We no longer need some struct to hold both the tag and the data. As you declare your variant, you are declaring the tag. For example the first variant's tag is V4 and the data it has is a tuple of 4 u8s. Another great thing here is the instant encapsulation of the variants in the IpAddr namespace. In TypeScript we had to create a type for each variant as well as a type for the discriminated union itself, and any new variant we add will mean the union will have to be updated as well. If we add an IPV7 at some point, it would mean only 1 line change in Rust, but at least 2 in TypeScript. I know it is only 1 more step, but not having to do that coordination means 1 less thing can go wrong.
I can't stress enough how amazing it is that it comes baked into Rust. It is such a core feature that it is used everywhere. Error handling, null/empty values, and more are handled with enums.
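Rust's standard library models error handling with a Result enum and missing values with an Option enum. To get a feel for the idea using the TypeScript pattern from earlier, here is a rough analogue of Result. The Result type, its ok tag and the parsePort function are all my own illustrative names; this is a sketch, not how Rust implements it.

```typescript
// A rough TypeScript analogue of Rust's Result<T, E>; the tag here is `ok`.
type Result<T, E> =
  | { ok: true; value: T }
  | { ok: false; error: E };

// Hypothetical parser used only for illustration.
function parsePort(input: string): Result<number, string> {
  const port = Number(input);
  if (input.trim() === '' || !Number.isInteger(port) || port < 0 || port > 65535) {
    return { ok: false, error: `invalid port: ${input}` };
  }
  return { ok: true, value: port };
}

const parsed = parsePort('8080');
if (parsed.ok) {
  console.log(parsed.value + 1); // value is a number in this branch
} else {
  console.log(parsed.error);     // error is a string in this branch
}
```

The caller cannot reach value without first checking the tag, which is the same guarantee Rust gives when you match on a Result.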
WebGL Shaders, and Uniforms
The example coming up is going to create a discriminated union type of uniforms for a WebGL shader.
In WebGL and OpenGL, uniforms are a way of passing in data from your application into the graphics card to be used in the shader program. You can pass in all types of data from integers and floats to sized arrays called vectors (think like math vectors), matrices in 2x2, 3x3 and 4x4 and even 2d/3d image data.
WebGL Fundamentals - WebGL Shaders and GLSL
If you don't have any experience with WebGL, OpenGL or shaders, don't worry. Nothing in the example is specific to them; they are just a backdrop for learning the pattern.
TypeScript Example
We will begin with creating type aliases for uniform data. We are going to need a float, a few vectors, a few matrices and a texture type.
We will take these types and then create our uniform variants.
We will take those variants and create our discriminated union.
We will create a helper function to narrow the type of our union to its variant.
We will take a look at using a switch/case to apply some logic to our union based on its variant.
Vectors and Matrices
Graphics applications make heavy use of both vectors and matrices. If you aren't familiar with them then you can think of a vector as an array and a matrix (singular of matrices) as an array of arrays. For our uniform we will need a Float type. Because of the limitations of JavaScript's type system, numbers are 64-bit floating point numbers, or just float64s. That means we can alias our Float type to just the built-in number type.
For our uniform types we will need 3 vector types: vectors of length 2, 3 and 4. We can call these Vec2, Vec3 and Vec4. We can create tuple types using our Float type, so a Vec2 type would look like [Float, Float], and so on for Vec3 and Vec4. We can also use the spread operator in the type definitions for Vec3 and Vec4 if that makes them more legible.
We also need to create our matrix types. We need a 2x2, 3x3 and 4x4 matrix type. Our matrices will actually be flat arrays just like the vectors. A 2x2 matrix will be a tuple of 4 Floats. We can use the spread operator here to shorten the type definition and save you from writing Float 16 times for a 4x4 matrix.
Our last type is just an alias for the HTMLImageElement type that represents an image element in the DOM.
Uniform Data Types
type Float = number;
type Vec2 = [Float, Float];
type Vec3 = [...Vec2, Float];
type Vec4 = [...Vec3, Float];
type Mat2 = [...Vec2, ...Vec2];
type Mat3 = [...Vec3, ...Vec3, ...Vec3];
type Mat4 = [...Vec4, ...Vec4, ...Vec4, ...Vec4];
type Texture = HTMLImageElement;
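Because the matrices are flat tuples, reading a cell means turning a row and column into a single index. Here is a small sketch of that for a 2x2 matrix. The mat2At helper is hypothetical, and it assumes the rows are stored one after another; your graphics code may lay the values out differently.

```typescript
type Float = number;
type Vec2 = [Float, Float];
type Mat2 = [...Vec2, ...Vec2];

// Hypothetical helper: reads the element at (row, col) from a flat 2x2 matrix,
// assuming the rows are stored one after another.
function mat2At(matrix: Mat2, row: 0 | 1, col: 0 | 1): Float {
  return matrix[row * 2 + col];
}

const identity: Mat2 = [1, 0, 0, 1];
const scaled: Mat2 = [2, 0, 0, 3];

console.log(mat2At(identity, 1, 1)); // 1
console.log(mat2At(scaled, 1, 1));   // 3
console.log(mat2At(scaled, 0, 1));   // 0

// The tuple length is part of the type, so this would not compile:
// const broken: Mat2 = [1, 0, 0];
```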
Next we create the individual uniform types, or variants. These will have a tag called type and the data on a field called data. This is just like the IP address example above but the tag attribute is now being called type.
Uniform Variant Types
type FloatUniform = {
  type: 'float';
  data: Float;
};
type Vec2Uniform = {
  type: 'vec2';
  data: Vec2;
};
type Vec3Uniform = {
  type: 'vec3';
  data: Vec3;
};
type Vec4Uniform = {
  type: 'vec4';
  data: Vec4;
};
type Mat2Uniform = {
  type: 'mat2';
  data: Mat2;
};
type Mat3Uniform = {
  type: 'mat3';
  data: Mat3;
};
type Mat4Uniform = {
  type: 'mat4';
  data: Mat4;
};
type TextureUniform = {
  type: 'texture';
  data: Texture;
};
The next thing we need to do is put it all together and create our union type.
Uniform Union Type
type ShaderUniform =
  | FloatUniform
  | Vec2Uniform
  | Vec3Uniform
  | Vec4Uniform
  | Mat2Uniform
  | Mat3Uniform
  | Mat4Uniform
  | TextureUniform;
Discriminated Union Utilities
type UniformTypes = ShaderUniform['type'];
type ExtractUniformType<T extends UniformTypes> = Extract<ShaderUniform, { type: T }>;
Our first utility above is the union of all of the tag values. In our example the value of UniformTypes would be:
// 'float' | 'vec2' | 'vec3' | 'vec4' | 'mat2' | 'mat3' | 'mat4' | 'texture'
This can be extremely useful when you want to create a function that has a parameter that is one of the types, for something like a createUniform function.
The second utility is a generic type that returns the variant type based on a given tag. In this case, ExtractUniformType<'mat2'> would be an alias for the type Mat2Uniform.
// ExtractUniformType<'mat2'> resolves to the same type as Mat2Uniform
TypeScript: Documentation - Utility Types (Extract)
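To show how the two utility types can work together, here is one possible sketch of the createUniform function mentioned above. It uses a trimmed-down union to stay short, and the function body is my own assumption about how such a factory could look.

```typescript
type Float = number;
type Vec2 = [Float, Float];
type FloatUniform = { type: 'float'; data: Float };
type Vec2Uniform = { type: 'vec2'; data: Vec2 };
// Trimmed-down union to keep the sketch short.
type ShaderUniform = FloatUniform | Vec2Uniform;

type UniformTypes = ShaderUniform['type'];
type ExtractUniformType<T extends UniformTypes> = Extract<ShaderUniform, { type: T }>;

// Hypothetical factory: the tag picks the variant, and the type of `data`
// is looked up from that variant.
function createUniform<T extends UniformTypes>(
  type: T,
  data: ExtractUniformType<T>['data'],
): ExtractUniformType<T> {
  // Inside the body the compiler cannot tie the generic T back to a single
  // concrete variant, so the assertion goes through `unknown`.
  return { type, data } as unknown as ExtractUniformType<T>;
}

const brightness = createUniform('float', 0.5); // typed as FloatUniform
const offset = createUniform('vec2', [10, 20]); // typed as Vec2Uniform

console.log(brightness.data + 1); // 1.5
console.log(offset.data[1]);      // 20
```

The assertion is the price of the generic signature. Callers still get full checking: passing a tuple for 'float' or a number for 'vec2' is rejected at compile time.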
The next utility is a function that will return true if the variant is of the given type. This creates what is called a user-defined type guard. We use the ExtractUniformType<T> utility type we just created in the return signature of this function:
function uniformIsType<T extends UniformTypes>(type: T, uniform: ShaderUniform): uniform is ExtractUniformType<T> {
  return uniform.type === type;
}
You can use this to help the type checker narrow the type from our union, ShaderUniform, to a given variant.
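// The assertion keeps x typed as the full ShaderUniform union. With a plain
// `const x: ShaderUniform = ...` annotation the compiler narrows x to
// FloatUniform at the assignment, and the 'mat2' branch would not type-check.
const x = { type: 'float', data: 10 } as ShaderUniform;
if (uniformIsType('mat2', x)) {
  console.log(`x is definitely a mat2!, here is index (1,1): ${x.data[3]}`);
}
if (uniformIsType('float', x)) {
  console.log(`x is a float, add 10: ${x.data + 10}`);
}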
You can also skip the utility function and check against the attribute directly:
const x: ShaderUniform = { type: 'float', data: 10 };
if (x.type === 'float') {
  console.log(`x is a float, add 10: ${x.data + 10}`);
}
TypeScript: Documentation - User Defined Type Guards
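One place where a type guard earns its keep is Array.prototype.filter, which returns a narrowed array type when its callback is a type guard. The sketch below repeats the utilities with a trimmed-down union so that it stands alone; the sample data is made up.

```typescript
type Float = number;
type Vec2 = [Float, Float];
type FloatUniform = { type: 'float'; data: Float };
type Vec2Uniform = { type: 'vec2'; data: Vec2 };
type ShaderUniform = FloatUniform | Vec2Uniform; // trimmed-down union

type UniformTypes = ShaderUniform['type'];
type ExtractUniformType<T extends UniformTypes> = Extract<ShaderUniform, { type: T }>;

function uniformIsType<T extends UniformTypes>(
  type: T,
  uniform: ShaderUniform,
): uniform is ExtractUniformType<T> {
  return uniform.type === type;
}

const uniforms: ShaderUniform[] = [
  { type: 'float', data: 0.5 },
  { type: 'vec2', data: [640, 480] },
  { type: 'float', data: 2 },
];

// The arrow function is itself a type guard, so filter returns FloatUniform[].
const floats = uniforms.filter((u): u is FloatUniform => uniformIsType('float', u));

// Every element is known to carry a number, so no further checks are needed.
const total = floats.reduce((sum, u) => sum + u.data, 0);
console.log(total); // 2.5
```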
Handling Unions with Switch/Case
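function handleUniform(uniform: ShaderUniform): void {
  switch (uniform.type) {
    case 'float': {
      console.log('Handling a float');
      uniform; // narrowed to FloatUniform
      break;
    }
    case 'vec2': {
      console.log('Handling a vec2');
      uniform; // narrowed to Vec2Uniform
      break;
    }
    case 'vec3': {
      console.log('Handling a vec3');
      uniform; // narrowed to Vec3Uniform
      break;
    }
    case 'vec4': {
      console.log('Handling a vec4');
      uniform; // narrowed to Vec4Uniform
      break;
    }
    case 'mat2': {
      console.log('Handling a mat2');
      uniform; // narrowed to Mat2Uniform
      break;
    }
    case 'mat3': {
      console.log('Handling a mat3');
      uniform; // narrowed to Mat3Uniform
      break;
    }
    case 'mat4': {
      console.log('Handling a mat4');
      uniform; // narrowed to Mat4Uniform
      break;
    }
    case 'texture': {
      console.log('Handling a texture');
      uniform; // narrowed to TextureUniform
      break;
    }
  }
}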
Another way of narrowing down the type and/or checking against all of the types is a switch/case. The above shows the switch using the type attribute, and inside of each case the type checker will infer the correct, narrowed type. This is very similar to Rust's match expression, except that here we match only against the tag.
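Unlike Rust's match, a switch will not complain by itself when a variant is missing. A common way to get that check is a default branch that only compiles when every variant has been handled. Here is a minimal sketch with a trimmed-down union; assertNever and componentCount are names I made up for the example.

```typescript
type Float = number;
type Vec2 = [Float, Float];
type Mat2 = [...Vec2, ...Vec2];
type ShaderUniform =
  | { type: 'float'; data: Float }
  | { type: 'vec2'; data: Vec2 }
  | { type: 'mat2'; data: Mat2 };

// Hypothetical helper: only callable once every variant has been handled,
// because nothing except `never` is assignable to its parameter.
function assertNever(value: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(value)}`);
}

// Hypothetical example: how many floats a uniform carries.
function componentCount(uniform: ShaderUniform): number {
  switch (uniform.type) {
    case 'float':
      return 1;
    case 'vec2':
    case 'mat2':
      // Both of these variants hold tuples, so `length` is available.
      return uniform.data.length;
    default:
      // If a variant is added to ShaderUniform without a case above,
      // `uniform` is no longer `never` here and this line stops compiling.
      return assertNever(uniform);
  }
}

console.log(componentCount({ type: 'mat2', data: [1, 0, 0, 1] })); // 4
```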
Closing Remarks
When using this pattern you have to be explicit about how you intend to use it. I would suggest keeping to a single naming strategy: always call your tag attribute the same thing, and always call your data attribute the same thing. You will have to be responsible for juggling the types around and making sure that your union includes all of the variants you intend it to include.
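One way to hold yourself to a single naming strategy is a small generic helper that every variant is built from. This is only a sketch; the Variant helper and describeUniform are my own names, and the union is trimmed down.

```typescript
// Hypothetical helper: every variant gets the same `type` and `data` field names.
type Variant<Tag extends string, Data> = { type: Tag; data: Data };

type Float = number;
type Vec2 = [Float, Float];

type FloatUniform = Variant<'float', Float>;
type Vec2Uniform = Variant<'vec2', Vec2>;
type ShaderUniform = FloatUniform | Vec2Uniform; // trimmed-down union

function describeUniform(uniform: ShaderUniform): string {
  switch (uniform.type) {
    case 'float':
      return `float ${uniform.data}`;
    case 'vec2':
      return `vec2 ${uniform.data.join(', ')}`;
  }
}

console.log(describeUniform({ type: 'vec2', data: [1, 2] })); // vec2 1, 2
```

Narrowing works exactly as before, because Variant<'float', Float> resolves to the same object type we were writing by hand.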
I will also say that this is a very powerful pattern in TypeScript and Rust as well as other languages. I wish it was a first-class citizen just like it is in Rust, but what you can do in TypeScript is still pretty amazing.