When 'when' went wonky
A surprising edge case in Kotlin's exhaustiveness checking.
Using Kotlin’s sealed modifier, you can define something like an algebraic data type:
sealed interface TaskStatus {
    object Pending : TaskStatus
    class Complete(val result: Int) : TaskStatus
    class Failed(val error: String) : TaskStatus
}
A when statement lets us handle each case:
fun printTaskStatus(taskStatus: TaskStatus) {
    when(taskStatus) {
        TaskStatus.Pending -> println("Task is pending")
        is TaskStatus.Complete -> println("Task complete with result: ${taskStatus.result}")
        is TaskStatus.Failed -> println("Task failed with error: ${taskStatus.error}")
    }
}
Notice there’s no else branch.
Because TaskStatus is sealed, the compiler knows all its possible subtypes at compile time.
Since we’ve covered them all, no else branch is necessary.
That’s good, because when we add a new subtype…
  // Represents the status of an asynchronous task.
  sealed interface TaskStatus {
      object Pending : TaskStatus
+     object Skipped : TaskStatus
      class Complete(val result: Int) : TaskStatus
      class Failed(val error: String) : TaskStatus
  }
…the compiler will helpfully remind us to update our when statement:
error: 'when' expression must be exhaustive. Add the 'Skipped' branch or an 'else' branch.
when(taskStatus) {
^^^^
The main appeal here is that we get to model state in a type-safe manner while the compiler enforces exhaustiveness. This pattern is common in functional languages (Haskell) and languages with heavy functional inspiration (Rust).
But something strange can happen if we try to construct an instance of TaskStatus at runtime using Java reflection. Imagine invoking some Java serialization library that uses reflection under the hood:
val status = json.read<TaskStatus>("""{"status": "Pending"}""")
printTaskStatus(status)
Compilation passes, but we fail at runtime:
% kotlinc -include-runtime -d example.jar example.kt
% java -jar example.jar
Exception in thread "main" kotlin.NoWhenBranchMatchedException
at ExampleKt.printTaskStatus(example.kt:15)
at ExampleKt.main(example.kt:33)
at ExampleKt.main(example.kt)
What gives? Isn’t the compiler supposed to prevent this runtime error from happening?
At runtime, we decide whether to take the TaskStatus.Pending branch by evaluating status == TaskStatus.Pending, or equivalently, status.equals(TaskStatus.Pending). Pending doesn’t override equals, so on the JVM this invokes Object.equals, which implements referential equality.
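To make that concrete, here is a rough hand-written equivalent of the when from printTaskStatus. It is a sketch of the semantics, not the exact bytecode the compiler emits, and the function name describeByHand is made up for illustration. It returns the message instead of printing it so the result is easy to inspect.

```kotlin
sealed interface TaskStatus {
    object Pending : TaskStatus
    class Complete(val result: Int) : TaskStatus
    class Failed(val error: String) : TaskStatus
}

// Hypothetical helper: spells out what each `when` branch checks.
fun describeByHand(taskStatus: TaskStatus): String =
    if (taskStatus == TaskStatus.Pending) {
        // Value match: goes through equals(), which Pending inherits from Object.
        "Task is pending"
    } else if (taskStatus is TaskStatus.Complete) {
        // Type match: an instanceof check, followed by a smart cast.
        "Task complete with result: ${taskStatus.result}"
    } else if (taskStatus is TaskStatus.Failed) {
        "Task failed with error: ${taskStatus.error}"
    } else {
        // The fallback we never wrote ourselves; it should be unreachable
        // as long as Pending really is a singleton.
        throw NoWhenBranchMatchedException()
    }

fun main() {
    println(describeByHand(TaskStatus.Pending))
    println(describeByHand(TaskStatus.Complete(42)))
}
```

The first branch is the odd one out: it is the only one whose outcome depends on equals rather than on the runtime type.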
Normally that would be fine. Kotlin objects are represented by singletons, so every TaskStatus.Pending refers to the same instance. But this guarantee is upheld by the Kotlin compiler, and it doesn’t extend to the Java code we call. And indeed, using Java reflection, we can create additional instances of TaskStatus.Pending:
val constructor = TaskStatus.Pending::class.java.getDeclaredConstructor()
constructor.isAccessible = true
val status1 = TaskStatus.Pending
val status2 = constructor.newInstance() as TaskStatus.Pending
println("${status1 == status2}") // false
println("${status1 === status2}") // false
Two distinct Pending instances now exist, and when falls through, leading to the runtime error.
So we have our bug: Kotlin thinks the when statement is exhaustive because it assumes only one instance of TaskStatus.Pending can exist at runtime. Our serialization library breaks that assumption by creating a new instance using reflection.
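If you want to reproduce this without pulling in a serialization library, something like the following should do. The secondPending helper is a stand-in I made up for whatever the library does internally, and describe is printTaskStatus reworked to return its message. I catch the failure generically here; in the run above it surfaced as kotlin.NoWhenBranchMatchedException.

```kotlin
sealed interface TaskStatus {
    object Pending : TaskStatus
    class Complete(val result: Int) : TaskStatus
    class Failed(val error: String) : TaskStatus
}

// Same branches as printTaskStatus, returning the message instead of printing it.
fun describe(taskStatus: TaskStatus): String = when (taskStatus) {
    TaskStatus.Pending -> "Task is pending" // value match via equals()
    is TaskStatus.Complete -> "Task complete with result: ${taskStatus.result}"
    is TaskStatus.Failed -> "Task failed with error: ${taskStatus.error}"
}

// Hypothetical stand-in for a reflection-based library:
// calls Pending's private constructor to create a second instance.
fun secondPending(): TaskStatus {
    val constructor = TaskStatus.Pending::class.java.getDeclaredConstructor()
    constructor.isAccessible = true
    return constructor.newInstance()
}

fun main() {
    // The real singleton matches the first branch.
    println(describe(TaskStatus.Pending))

    // The reflective instance matches no branch, so the call fails.
    val failure = runCatching { describe(secondPending()) }.exceptionOrNull()
    println("Second instance failed with: $failure")
}
```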
How can we fix this? The simplest way is to match on type:
  fun printTaskStatus(taskStatus: TaskStatus) {
      when(taskStatus) {
-         TaskStatus.Pending -> println("Task is pending")
+         is TaskStatus.Pending -> println("Task is pending")
          is TaskStatus.Complete -> println("Task complete with result: ${taskStatus.result}")
          is TaskStatus.Failed -> println("Task failed with error: ${taskStatus.error}")
      }
  }
Now any instance of Pending will trigger the appropriate branch.
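Here is a small self-contained check of that claim. As before, secondPending is a made-up stand-in for a reflection-based library, and describe returns the message rather than printing it.

```kotlin
sealed interface TaskStatus {
    object Pending : TaskStatus
    class Complete(val result: Int) : TaskStatus
    class Failed(val error: String) : TaskStatus
}

// Every branch is a type match now, so equals() is never consulted.
fun describe(taskStatus: TaskStatus): String = when (taskStatus) {
    is TaskStatus.Pending -> "Task is pending"
    is TaskStatus.Complete -> "Task complete with result: ${taskStatus.result}"
    is TaskStatus.Failed -> "Task failed with error: ${taskStatus.error}"
}

// Hypothetical stand-in for a reflection-based library.
fun secondPending(): TaskStatus {
    val constructor = TaskStatus.Pending::class.java.getDeclaredConstructor()
    constructor.isAccessible = true
    return constructor.newInstance()
}

fun main() {
    val rogue = secondPending()
    println(rogue === TaskStatus.Pending) // false: still a distinct instance
    println(describe(rogue))              // Task is pending
}
```

The second instance still exists, and it is still not the singleton. The when just no longer cares.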
But it’s easy to forget to write is, and the compiler won’t complain if we do. A better solution would be to override Pending.equals:
sealed interface TaskStatus {
    object Pending : TaskStatus {
        // `is` already returns false for null, so no separate null check is needed.
        override fun equals(other: Any?) = other is Pending
        // Keep hashCode() consistent with equals(): all instances are equal.
        override fun hashCode() = 0
    }
    class Complete(val result: Int) : TaskStatus
    class Failed(val error: String) : TaskStatus
}
Now, Pending.equals implements structural equality. All instances compare equal.
Our original snippet behaves as expected, even if we use reflection to create additional instances of Pending.
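The hashCode half of the contract matters as soon as these values end up in a hash-based collection. The sketch below assumes a constant hash code (any constant would do, since all instances are equal) and reuses the made-up secondPending helper to stand in for a reflection-based library.

```kotlin
sealed interface TaskStatus {
    object Pending : TaskStatus {
        override fun equals(other: Any?) = other is Pending
        override fun hashCode() = 0 // assumed constant; all instances are equal
    }
    class Complete(val result: Int) : TaskStatus
    class Failed(val error: String) : TaskStatus
}

// The original, value-matching `when`, returning the message instead of printing it.
fun describe(taskStatus: TaskStatus): String = when (taskStatus) {
    TaskStatus.Pending -> "Task is pending"
    is TaskStatus.Complete -> "Task complete with result: ${taskStatus.result}"
    is TaskStatus.Failed -> "Task failed with error: ${taskStatus.error}"
}

// Hypothetical stand-in for a reflection-based library.
fun secondPending(): TaskStatus {
    val constructor = TaskStatus.Pending::class.java.getDeclaredConstructor()
    constructor.isAccessible = true
    return constructor.newInstance()
}

fun main() {
    val rogue = secondPending()
    println(describe(rogue))                         // Task is pending
    println(setOf(TaskStatus.Pending, rogue).size)   // 1: the set sees them as one value
}
```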
In fact, if we apply the data modifier to an object, the compiler will generate appropriate equals() (and hashCode()) methods automatically, avoiding this strange situation entirely. This gotcha is even called out in the documentation:
The equals() function for a data object ensures that all objects that have the type of your data object are considered equal. In most cases, you will only have a single instance of your data object at runtime, since a data object declares a singleton. However, in the edge case where another object of the same type is generated at runtime (for example, by using platform reflection with java.lang.reflect or a JVM serialization library that uses this API under the hood), this ensures that the objects are treated as being equal.
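Applied to our example, that looks something like the sketch below. It assumes a compiler recent enough to support data object (stable as of Kotlin 1.9), and secondPending is again a made-up stand-in for a reflection-based library.

```kotlin
sealed interface TaskStatus {
    // `data` makes the compiler generate equals(), hashCode() and toString().
    data object Pending : TaskStatus
    class Complete(val result: Int) : TaskStatus
    class Failed(val error: String) : TaskStatus
}

// The original, value-matching `when`, returning the message instead of printing it.
fun describe(taskStatus: TaskStatus): String = when (taskStatus) {
    TaskStatus.Pending -> "Task is pending"
    is TaskStatus.Complete -> "Task complete with result: ${taskStatus.result}"
    is TaskStatus.Failed -> "Task failed with error: ${taskStatus.error}"
}

// Hypothetical stand-in for a reflection-based library.
fun secondPending(): TaskStatus {
    val constructor = TaskStatus.Pending::class.java.getDeclaredConstructor()
    constructor.isAccessible = true
    return constructor.newInstance()
}

fun main() {
    val rogue = secondPending()
    println(rogue === TaskStatus.Pending) // false: two instances exist
    println(rogue == TaskStatus.Pending)  // true: generated equals() compares by type
    println(describe(rogue))              // Task is pending
}
```

Note that the documentation advises comparing data objects with == and never with ===, for exactly this reason.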
I bring up this example not because I think it points to some flaw in Kotlin, but rather because it illustrates a trade-off. One reason Kotlin is attractive is that it can provide strong guarantees (exhaustiveness checking, null safety) and remain compatible with the Java ecosystem. As a developer, I get to take advantage of Kotlin’s features without giving up access to my favorite Java libraries, which can be really important in an enterprise context.
But things get weird at the border: the Java code we call is under no obligation to honor Kotlin’s guarantees, and that can lead to surprising results.