Delving into Temporal Java Workflow Testing Environment

Motivation

This article was prompted by inconsistent behavior of the testWorkflowEnvironment.sleep() method, which is supposed to simulate the passage of time in the workflow under test. In some situations the method blocked the test thread for the specified duration; in others it returned immediately. Of course, no one wants the equivalent of Thread.sleep() in their tests.

Method Implementation

The method testWorkflowEnvironment.sleep() eventually calls a method named unlockTimeSkippingWithSleep. That method carries the following comment:

// UnlockTimeSkippingWhileSleep decreases time locking counter by one and increases it back
// once the Test Server Time advances by the duration specified in the request.
//
// This call returns only when the Test Server Time advances by the specified duration.
//
// If it is called when Time Locking Counter is
// - more than 1 and no other unlocks are coming in, rpc call will block for the specified duration, time will not be fast forwarded.
// - 1, it will lead to fast forwarding of the time by the duration specified in the request and quick return of this rpc call.
// - 0 will lead to rpc call failure same way as an unbalanced UnlockTimeSkipping.

Code comments are not an ideal way to document behavior, but this one gives a hint about how the sleep() method behaves in different situations.

The mechanism is not yet fully clear, but we can deduce that whether the call blocks the thread in real time or fast-forwards depends on the time locking counter, that is, on whether something else is still holding time skipping locked on the test server.
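The three cases in the comment can be captured in a toy model. The sketch below is only an illustration of the rule as the comment states it, not Temporal's implementation: the class TimeLockingModel, its methods, and the Outcome enum are hypothetical names, and the model reports what would happen instead of actually sleeping.

```java
import java.time.Duration;

/** Toy model of the time locking counter described in the comment (hypothetical, not Temporal code). */
final class TimeLockingModel {

    /** What a sleep call would do according to the comment. */
    enum Outcome { FAST_FORWARDED, BLOCKED_IN_REAL_TIME }

    private int lockCounter;
    private Duration skipped = Duration.ZERO;

    TimeLockingModel(int initialLockCounter) {
        this.lockCounter = initialLockCounter;
    }

    /** Someone else locks time skipping, so the counter goes up. */
    void lock() {
        lockCounter++;
    }

    /** Models the sleep call: decrease the counter, decide the outcome, increase it back. */
    Outcome unlockWithSleep(Duration duration) {
        if (lockCounter == 0) {
            // Counter is 0: the call fails like an unbalanced unlock.
            throw new IllegalStateException("unbalanced unlock: counter is already 0");
        }
        lockCounter--;
        Outcome outcome;
        if (lockCounter == 0) {
            // Counter was 1: nothing else holds a lock, so time is fast-forwarded.
            skipped = skipped.plus(duration);
            outcome = Outcome.FAST_FORWARDED;
        } else {
            // Counter was more than 1: another lock remains, so the call waits in real time.
            outcome = Outcome.BLOCKED_IN_REAL_TIME;
        }
        lockCounter++;
        return outcome;
    }

    /** Total amount of time that was fast-forwarded so far. */
    Duration skippedTime() {
        return skipped;
    }
}
```

In this model, a fresh instance created with a counter of 1 fast-forwards every sleep, while a single extra lock() that is never released turns every later sleep into a real wait. That is the kind of leftover state a shared environment could carry from one test to the next, although the exact cause in Temporal is not established here.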

If you're using the temporal-spring-boot-starter-alpha provided by Temporal, you know that the test workflow environment is injected and shared across the tests within the same class.

Experimentation

After reaching this conclusion about the behavior of the sleep method, I prepared a small experiment to test the hypothesis. In this repo, I set up a simple workflow with two test methods. Both tests call the sleep method with a duration of 5 seconds. Interestingly, the behavior of the method depended on the order in which the tests ran:

When test_a runs before test_b, test_b waits for the full 5 seconds, while test_a finishes immediately.
When test_b runs before test_a, both tests finish immediately without waiting.

This observation is interesting for two reasons:

We showed that calling the method testWorkflowEnvironment.sleep() can block the test thread for the specified duration. So, if we're testing a workflow that is supposed to execute a task every hour, our tests risk waiting the full hour, depending on a factor we still don't fully understand!
When the tests are run separately, the sleep does not block. The tests are therefore affecting each other, which violates a basic principle of unit testing: tests should be isolated from each other.

Solution

Having concluded that sharing the same instance of the test workflow environment across tests is not safe, the solution is as simple as initializing a new instance for each test. This can be done in the setup method that runs before each test.
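A minimal sketch of such a setup with JUnit 5, assuming the environment is created by hand with TestWorkflowEnvironment.newInstance() rather than injected by the Spring Boot starter. The class name, the task queue name, and the commented-out workflow registration are placeholders, not code from the repo mentioned above.

```java
import io.temporal.testing.TestWorkflowEnvironment;
import io.temporal.worker.Worker;
import java.time.Duration;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

class IsolatedWorkflowTest {

    // Placeholder task queue name for this sketch.
    private static final String TASK_QUEUE = "test-task-queue";

    private TestWorkflowEnvironment testEnv;

    @BeforeEach
    void setUp() {
        // A new environment per test, so no state leaks from a previous test.
        testEnv = TestWorkflowEnvironment.newInstance();
        Worker worker = testEnv.newWorker(TASK_QUEUE);
        // Register the workflow under test here, for example:
        // worker.registerWorkflowImplementationTypes(MyWorkflowImpl.class);
        testEnv.start();
    }

    @AfterEach
    void tearDown() {
        // Shut the environment down so the next test starts clean.
        testEnv.close();
    }

    @Test
    void test_a() {
        testEnv.sleep(Duration.ofSeconds(5));
    }

    @Test
    void test_b() {
        testEnv.sleep(Duration.ofSeconds(5));
    }
}
```

With this layout each test method gets its own environment, so the outcome of sleep() in one test should no longer depend on which test ran before it.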