2023-04-28
Optional types are types that can hold a value or be empty, and are sometimes referred to as option types. One way to think of optional types is as container types that can hold either zero or one elements. An optional type has a way to check if it is empty or non-empty, and if it is non-empty, a way to extract the value it contains.
Optional types are a common feature of many programming languages, and are typically defined in their respective standard libraries. The following table shows some examples of optional types in various programming languages:
Language | Optional Type | Package |
---|---|---|
Java | Optional | java.util |
C# | Nullable | System |
C++ | std::optional | std |
Rust | Option | std::option |
Haskell | Maybe | base |
Scala | Option | scala |
While the implementation of optional types may differ slightly between programming languages, the underlying concept remains the same. An optional type is created from an (existing) underlying type, and it can hold all values of that type as well as an additional empty value.
Although many programming languages offer optional types, Go is not one of them. There is no optional type in the official standard library or its sub-repositories.
Despite the absence of an official optional type, several third-party packages provide an implementation of optional types in Go. However, none of them have gained widespread popularity. Some of these packages were written before generics were available in Go 1.18 and make use of code generation to provide optional types for common built-in types. The following packages (in alphabetical order) use this approach:
github.com/antihax/optional
github.com/dangreenhalgh/maybe
github.com/keep94/maybe
github.com/markphelps/optional
Other packages were developed more recently and utilize generics to provide optional types for any existing type. The following packages use this approach:
4d63.com/optional
github.com/calebcase/maybe
github.com/dogmatiq/ferrite/maybe
github.com/pmorelli92/maybe
zenhack.net/go/util/maybe
In my experience with the projects I have worked on or read the source code for, the Go community has not widely adopted the use of optional types. I hypothesise several possible reasons for this:
Optional types are cumbersome to implement and use without parametric polymorphism-style generics, instead relying on code generation. Go only began supporting generics in Go 1.18, which was released in March 2022 (about a year ago at the time of writing). The possibilities that generics allow have not yet been fully realized.
Anecdotally, many Go programmers say that they are satisfied with non-generics-based solutions for representing optional values. Some Go programmers believe that using generics for optional types is non-idiomatic.
Go programmers might be suffering from “lack-of-generics Stockholm syndrome”, which has not worn off despite generics now being available. They may think, “we have done fine without generics for a long time, so why do we need them now?”
Go programs need to represent optionality in various contexts, including:
Function and method input parameters: Some input parameters are semantically optional. For example, an optional search parameter.
Function and method results: A function or method may or may not return a value depending on its execution. For example, a lookup in a data store.
Default behavior overrides: Configurable values often have defaults for their behavior, which may be optionally overridden.
Serialization and deserialization: Absent values need to be considered when serializing or deserializing, such as NULL for SQL, missing or null fields for JSON, and optional fields in Protocol Buffers.
Fields in entity model types: Fields that are sometimes optional, such as the “middle name” field of a type representing a person.
This blog post focuses on the last context listed in the previous section – entity types that require optional fields. The other four contexts already have good solutions that don’t involve optional types, so aren’t as interesting to discuss.
So, what is an entity type? The definition may vary depending on the development community, and different terms may refer to the same concept. For example, an entity type may also be called a model. For the purpose of this blog post, an entity type is defined as follows:
An entity type is a type that models some part of the business domain or use case, enabling it to be managed inside a computer program.
There are many things that could be modeled as entity types. For example, a library management program could have entity types for:
Another example is a Customer Relationship Management (CRM) platform, which would require entities for:
In Go, a common way to implement entities is to use a struct. The struct contains fields that represent details about the entity. For example, a book can be represented using the following struct:
type Book struct {
	ID              int       // required
	Title           string    // required
	Author          string    // required
	PublicationDate time.Time // optional
	Description     string    // optional
}
Note that some fields are optional because they are either not relevant to a particular book or their values are unknown.
Another example of a struct that represents an entity is the contact entity from the CRM example:
type Contact struct {
	ID            int    // required
	FirstName     string // required
	LastName      string // required
	Email         string // required
	PhoneNumber   string // optional
	PostalAddress string // optional
}
Again, note that some fields are optional because they are either not relevant to a particular contact or their values are unknown.
Developers need to know whether a field is optional or required for several reasons:
Input validation: Appropriate validation should take place when populating entities via user input. Part of this validation would involve ensuring that required fields are populated.
Proper initialization during testing: Test authors may find it helpful to ensure that all required fields of entity types are populated. To do this, they need to know which fields are required.
Absent optional fields can sometimes be a hidden special case: For example, when searching for books published before a specific date, Book entities without a PublicationDate should not be returned in the result. Depending on how the optional PublicationDate field is represented, this may or may not be an explicit special case.
Proper use of fields: Some business logic relating to fields may differ depending on whether a field is optional or required. For instance, if a physical mail campaign is launched for each Contact in a CRM, mail should only be sent to contacts with an (optional) PostalAddress.
In this section, we will explore various ways to represent optionality without using optional types. Each approach has at least some pitfalls.
In the examples given earlier, the type used to represent required and optional fields is the same, and a comment is used to differentiate between them. The field is left as its zero value (e.g. "" for string) when an optional field is not populated. This approach is not ideal for a few reasons.
Developers may accidentally ignore comments when reading or modifying code. This could result in the omission of a comment indicating whether a field is optional or required when new fields are added. This is a particular risk in codebases that have existing poor hygiene. The compiler does not enforce the correctness of comments.
For some optional fields, the zero value might be a valid present value. For example, when modelling the number of teaspoons of sugar a person prefers in their hot-beverage-of-choice, 0 is a valid value that is distinct from unknown.
Misusing an optional field as if it were a required field is difficult to detect and may introduce subtle bugs. This is more likely to happen when reading from a field. For example, suppose we want to find all Book values that were published before a certain date. An incorrect attempt could look like:
var found []Book
for _, b := range books {
	if b.PublicationDate.Before(cutoff) {
		found = append(found, b)
	}
}
Since the zero value is used for the PublicationDate field when it’s unknown, this erroneous code would include books without a publication date.
Using a pointer is another way to represent optionality instead of relying on the zero value and comments. In the book example, a field is marked as optional by using a pointer, as shown below:
type Book struct {
	ID              int
	Title           string
	Author          string
	PublicationDate *time.Time
	Description     *string
}
This approach uses the Go type system to indicate that the PublicationDate and Description fields are optional. Developers can understand which fields are optional, especially when the “pointers mean optional” convention is followed consistently in the codebase.
When creating a Book value, developers need to provide a pointer to a time.Time value for the PublicationDate field. This reminds developers that the field is optional, as they need to use the & operator to take the address of a time.Time value.
However, there is no hint available when using the PublicationDate field. This is because in Go, the . operator implicitly dereferences pointers. For example, the following code would panic due to an implicit pointer dereference if run with a book without a publication date:
book := getBookFromSomewhere()
if book.PublicationDate.Year() >= time.Now().Year() {
	// Special handling for "new" releases
	// ...
}
Using pointers to represent optionality can also introduce readability and understandability problems in code. It’s often not clear what the use of a pointer actually intends to represent. While it could be an optional field, it could also be used for other purposes. For example, pointers are often used to implement reference semantics, allowing multiple copies of a pointer to refer to the same value. In this case, the field may not be optional, but the pointer is used to ensure updates to the value are seen by all holders of the pointer. Another reason why pointers are commonly used is to avoid the performance penalty of copying exceptionally large structs. The intention may be that a pointer field is non-optional, and the pointer is simply a performance optimization. Determining the reason why a field is a pointer rather than a non-pointer type can be difficult, with comments or conventions the only way to tell.
Sentinel values are a specific kind of value that can be assigned to a field to indicate that the field’s value is absent. This is similar to using the zero value to indicate absence, but more general. This approach is useful when the zero value is a valid value for the field. For example, an int field where the value 0 is a meaningful present value may use -1 as a sentinel value to indicate absence.
Using sentinel values presents the same challenges as using the zero value, but with the added burden of keeping track of which value represents the sentinel.
When working with optional fields in Go, an extra boolean value can be added to indicate whether or not the field is present. For example, the Book entity would be defined like this:
type Book struct {
	ID                     int
	Title                  string
	Author                 string
	PublicationDate        time.Time
	PublicationDatePresent bool
	Description            string
	DescriptionPresent     bool
}
One benefit of this approach is that it is explicit and makes it less likely for developers to accidentally use the field as if it were required. This is especially true for developers using autocomplete functionality that shows completions with common prefixes together. However, it is not foolproof, as copying, pasting, and modifying code can still lead to errors.
On the downside, this method can increase the number of fields in the struct, making the code longer and harder to read. Some developers may also find this approach unattractive.
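Reading such a field might look like the following sketch. The summary helper and the trimmed-down Book struct are hypothetical and only serve to show the flag being consulted before the value.

```go
package main

import "fmt"

// Book is a trimmed-down version of the struct above.
type Book struct {
	Title              string
	Description        string
	DescriptionPresent bool
}

// summary is a hypothetical helper that checks the flag before
// touching the value it guards.
func summary(b Book) string {
	if !b.DescriptionPresent {
		return b.Title + " (no description)"
	}
	return b.Title + ": " + b.Description
}

func main() {
	fmt.Println(summary(Book{Title: "Dune"}))
	fmt.Println(summary(Book{
		Title:              "Emma",
		Description:        "A comedy of manners.",
		DescriptionPresent: true,
	}))
}
```

Note that the two fields can disagree: nothing stops a Description from being set while DescriptionPresent is left false.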
In a previous section, I mentioned that the Go standard library doesn’t have any optional types, but that was not entirely true. The database/sql package does contain several optional types, including sql.NullString.
// NullString represents a string that may be null.
// NullString implements the Scanner interface so
// it can be used as a scan destination:
//
//	var s NullString
//	err := db.QueryRow("SELECT name FROM foo WHERE id=?", id).Scan(&s)
//	...
//	if s.Valid {
//	   // use s.String
//	} else {
//	   // NULL value
//	}
type NullString struct {
	String string
	Valid  bool // Valid is true if String is not NULL
}
While these types are designed to be used in SQL contexts, there’s nothing preventing their use in non-SQL contexts.
The fact that the value must be accessed via the String field reminds developers that they are working with a non-standard type and should be cautious when using it. This can reduce the chances of accidentally using the field as a required field.
However, there are some drawbacks to using these types in non-SQL contexts. Their use can be confusing and blur the lines between database access and other layers of the software application. The database/sql package also only defines eight Null* types, which can be limiting. If new types are needed, they would need to be created manually.
A slice is a container that can hold zero or more elements, and an optional type is a container that can hold zero or one elements. As a result, it’s possible to use a slice to represent an optional type by simply ignoring any elements after the first element.
The Book entity would be defined as follows:
type Book struct {
	ID              int
	Title           string
	Author          string
	PublicationDate []time.Time
	Description     []string
}
However, there are two significant issues with this approach. Firstly, it may be challenging for readers to distinguish between an optional field and a repeated field. The only way to differentiate between them is through either a comment or the plurality of the field’s name (e.g., PublicationDate vs. PublicationDates). Secondly, it’s unclear what should happen if there is more than one element in the slice. Should the program ignore the additional entries? Panic? Return an error? Ideally, illegal states should not be representable in the first place.
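One possible answer to the "more than one element" question is to return an error, as in the sketch below. The generic first helper is hypothetical, and choosing an error over a panic or silent truncation is an assumption made for this example, not a recommendation from any library.

```go
package main

import "fmt"

// first is a hypothetical helper that treats a slice as an optional
// value. It reports whether a value is present and returns an error
// for the illegal state of holding more than one element.
func first[T any](s []T) (T, bool, error) {
	var zero T
	switch len(s) {
	case 0:
		return zero, false, nil
	case 1:
		return s[0], true, nil
	default:
		return zero, false, fmt.Errorf("optional slice holds %d elements, want at most 1", len(s))
	}
}

func main() {
	desc, ok, err := first([]string{"A short description."})
	fmt.Println(desc, ok, err)

	_, _, err = first([]string{"one", "two"})
	fmt.Println(err)
}
```

Every call site now has three outcomes to handle instead of two, which illustrates why a representation that cannot hold the illegal state is preferable.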
An explicit optional type can help to solve many of the problems above. It should do a few different things:
Convey to the reader that it is indeed an optional type. Readers should immediately recognise that the field has a value or doesn’t have a value.
Leave the decision of whether the field has reference or value semantics up to the user. Reference vs. value semantics should be orthogonal to whether the field represents a required or an optional field.
Make it difficult for a developer to accidentally use the field as though it is present when it actually isn’t.
To meet these requirements, we define a simple generic optional type. Here’s a minimal implementation in Go:
package maybe

// M is an immutable type that represents an optional value.
// Its zero value represents the absence of a value.
type M[T any] struct {
	val T
	has bool
}

// Get returns the value stored in M, with a flag
// indicating if it exists (true) or not (false).
func (m M[T]) Get() (T, bool) {
	return m.val, m.has
}

// Just constructs a new M that contains a value.
func Just[T any](val T) M[T] {
	return M[T]{val, true}
}
We’ve named our optional type M, echoing the package name maybe. This is a similar naming style to the testing.T type in the standard library. We chose maybe over the obvious alternative optional because it’s terser.
Using our maybe package, the book entity would be defined as follows:
type Book struct {
	ID              int
	Title           string
	Author          string
	PublicationDate maybe.M[time.Time]
	Description     maybe.M[string]
}
Creating a book would look like:
b := Book{
	ID:              123,
	Title:           "To Kill a Mockingbird",
	Author:          "Harper Lee",
	PublicationDate: maybe.M[time.Time]{},
	Description: maybe.Just("To Kill a Mockingbird" +
		" explores themes of racial injustice and" +
		" coming of age in a small town in Alabama" +
		" during the 1930s."),
}
To populate an optional field, you can use the maybe.Just function, as we did for the book’s description. To leave an optional field absent, you can use the zero value of maybe.M. In the example above, we did this explicitly as maybe.M[time.Time]{} for the publication date, but we could also have done it by simply not specifying the field.
Here’s an example of finding all books before a cutoff date:
var found []Book
for _, b := range books {
	if date, ok := b.PublicationDate.Get(); ok && date.Before(cutoff) {
		found = append(found, b)
	}
}
Because the Get method returns both the time.Time and the boolean indicating whether the value exists, developers are reminded to consider the case when the publication date is unknown. This greatly reduces the likelihood that they will use an absent publication date as though it were present (they would have to ignore the boolean return).
Additional methods can also be implemented to enhance the functionality of the hypothetical package. These methods could include:
A Nothing function with the signature Nothing[T any]() M[T] that generates an M value without any data. Although redundant, as the zero value of M can perform the same function, some developers may prefer using the Nothing function to maintain symmetry with the Just function.
A Must method with the signature Must() T that retrieves the stored value and panics if it does not exist. This defeats the guardrails that the Get method puts in place, but may be useful in some contexts. There is a convention in Go that a method with “must” in its name may panic if preconditions are not met.
A Has method with the signature Has() bool that returns a boolean value indicating whether a value exists.
An Or method with the signature Or(other T) T that returns the contained value if it exists, or the other provided value if it does not.
An OrZero method with the signature OrZero() T that returns the stored value if it exists, or the zero value of the underlying type.
A Map function with the signature Map[T, U any](m M[T], fn func(T) M[U]) M[U] that creates a new M by mapping its contents using a function. This would be more natural to implement as a method, but introducing new type parameters is not permitted in Go methods. Adding Map would make M a monad, which is a concept in functional programming.
An implementation of the fmt.Stringer interface that wraps the inner value.
An implementation of the json.Unmarshaler and json.Marshaler interfaces. These could handle null values as absent values and delegate marshaling and unmarshaling for present values to the wrapped type.
An implementation of the sql.Scanner and sql/driver.Valuer interfaces. These would also delegate SQL interoperability to the wrapped type.
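Several of the additions listed above are small enough to sketch directly. The following is one possible shape for them, written in a single file for convenience rather than as a real maybe package. The signatures follow the list above. The panic message and the "Just(...)" and "Nothing" output format of String are assumptions made for this example.

```go
package main

import "fmt"

type M[T any] struct {
	val T
	has bool
}

func Just[T any](val T) M[T] { return M[T]{val: val, has: true} }

// Nothing returns an absent M; it is equivalent to the zero value.
func Nothing[T any]() M[T] { return M[T]{} }

func (m M[T]) Get() (T, bool) { return m.val, m.has }

// Has reports whether a value is present.
func (m M[T]) Has() bool { return m.has }

// Must returns the value and panics if it is absent.
func (m M[T]) Must() T {
	if !m.has {
		panic("maybe: Must called on an absent value")
	}
	return m.val
}

// Or returns the value if present, otherwise other.
func (m M[T]) Or(other T) T {
	if m.has {
		return m.val
	}
	return other
}

// OrZero returns the value if present, otherwise the zero value of T.
func (m M[T]) OrZero() T {
	var zero T
	return m.Or(zero)
}

// Map applies fn to the value if present. It is a function rather
// than a method because methods cannot declare the extra parameter U.
func Map[T, U any](m M[T], fn func(T) M[U]) M[U] {
	v, ok := m.Get()
	if !ok {
		return Nothing[U]()
	}
	return fn(v)
}

// String implements fmt.Stringer. The output format is an assumption.
func (m M[T]) String() string {
	if !m.has {
		return "Nothing"
	}
	return fmt.Sprintf("Just(%v)", m.val)
}

func main() {
	name := Just("Harper")
	none := Nothing[string]()

	fmt.Println(name.Or("unknown"), none.Or("unknown")) // Harper unknown

	length := func(s string) M[int] { return Just(len(s)) }
	fmt.Println(Map(name, length)) // Just(6)
	fmt.Println(Map(none, length)) // Nothing
}
```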
This package is hypothetical and doesn’t live anywhere yet. If developers want the simplest version, it’s only a dozen lines of code, and so can just be copied directly into a project. Some of the ideas presented as additional methods are already implemented in the optional type libraries listed at the start of this post.
In my opinion, it’s likely that a maybe type will eventually be introduced into the Go sub-repositories, and eventually into the standard library. However, I don’t expect this to happen for several years at least.
In the meantime, there will continue to be many different optional type implementations available, each with its own interpretation of the concept of a maybe or optional type.