Async Part 3 - How the C# compiler implements async functions
In the previous article we talked about the semantics of async functions. In this article we will look at how the C# compiler transforms an async function into a state machine. This state machine is used to process the results of the other async functions that are awaited.
What the state machine transform does
The C# compiler defines a new class to represent the state of execution. All variables that are live across await points are spilled into this class. This frees the stack and registers to be used to execute something else while awaiting.
At each await point, the compiler generates code that saves all the live variables. It then generates code that subscribes to the completion notification of the promise it is awaiting. This completion handler will resume executing the method where it left off.
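To make the subscription step concrete, here is a minimal sketch of driving an awaiter by hand. It is not the code the compiler emits. The class and method names are made up for illustration, and it uses TaskCompletionSource (a .NET type that lets us complete a Task manually) as the thing being awaited.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class, not part of the compiler output.
public static class AwaiterDemo
{
    public static int ResumeViaCallback()
    {
        // A task we control, so it is definitely not complete yet.
        var source = new TaskCompletionSource<int>();
        TaskAwaiter<int> awaiter = source.Task.GetAwaiter();

        var resumed = new ManualResetEventSlim();
        int result = 0;

        // Subscribe to the completion notification. The callback plays
        // the role of "resume the method where it left off".
        awaiter.OnCompleted(() =>
        {
            result = awaiter.GetResult();
            resumed.Set();
        });

        // Complete the task, which triggers the callback.
        source.SetResult(42);

        // Wait for the callback in case it was scheduled on another thread.
        resumed.Wait();
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(ResumeViaCallback()); // prints 42
    }
}
```

The same three members, IsCompleted, OnCompleted and GetResult, are what the state machine later in this article calls on its awaiter.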
When the async function reaches a return statement, it will complete the promise that was returned earlier. If an exception is thrown instead, the async function machinery will catch it and store it in the promise.
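A small sketch shows the exception behaviour from the caller's side. FailAsync is a made-up function for illustration. It awaits an already-completed task, so its body runs to the throw synchronously, yet the call itself does not throw.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical demo class for illustration.
public static class FaultedTaskDemo
{
    public static async Task<int> FailAsync()
    {
        // Already complete, so execution continues synchronously.
        await Task.CompletedTask;
        throw new InvalidOperationException("boom");
    }

    public static async Task Main()
    {
        // The call returns normally; the exception is stored in the Task.
        Task<int> task = FailAsync();
        Console.WriteLine(task.IsFaulted); // prints True

        try
        {
            // Awaiting the Task rethrows the stored exception.
            await task;
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message); // prints boom
        }
    }
}
```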
An example of the state machine transform
It’s a bit easier to see in an example. Let’s go back to our AddAsync function:
static async Task<int> GetBiasAsync()
{
    return 42;
}

static async Task<int> AddAsync(int a, int b)
{
    int sum = a + b;
    int c = await GetBiasAsync();
    return sum + c;
}
This time we have defined them as static so we don’t have to worry about spilling the this variable.
The code the compiler generates is a bit complicated, so we will start with a simplified version.
A note if you are not familiar with .NET: you cannot complete a Task directly the way you would resolve a promise object. Instead, a TaskCompletionSource exposes a Task object and gives you the methods to move it to a completed or faulted state.
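As a quick illustration of that split between the Task and the object that completes it, here is a minimal sketch. The class name is made up for the example.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical demo class for illustration.
public static class TaskCompletionSourceDemo
{
    public static void Main()
    {
        // The source is the "write" side; source.Task is the "read" side.
        var source = new TaskCompletionSource<int>();
        Task<int> task = source.Task;
        Console.WriteLine(task.IsCompleted); // prints False

        // Move the Task to the completed state.
        source.SetResult(42);
        Console.WriteLine(task.IsCompleted); // prints True
        Console.WriteLine(task.Result);      // prints 42

        // Move a second Task to the faulted state instead.
        var failed = new TaskCompletionSource<int>();
        failed.SetException(new InvalidOperationException("boom"));
        Console.WriteLine(failed.Task.IsFaulted); // prints True
    }
}
```

The simplified state machine that follows uses the same two calls, SetResult and SetException, to finish the Task it hands back to the caller.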
// Namespaces needed by the types used below.
using System.Runtime.CompilerServices; // TaskAwaiter<T>
using System.Threading.Tasks;          // Task<T>, TaskCompletionSource<T>

class AddAsyncStateMachine
{
    // spilled variables
    int sum;
    public int a;
    public int b;

    // generated variables
    TaskCompletionSource<int> source = new();
    int state;
    TaskAwaiter<int> taskAwaiter;

    // Expose the Task
    public Task<int> Task => source.Task;

    public void MoveNext()
    {
        try
        {
            while (true)
            {
                switch (state)
                {
                    case 0:
                        // Do the calculation. We store the value into
                        // a variable defined on the state machine object.
                        // That way we can reference the value after we resume.
                        sum = a + b;
                        // Set the state so that the next time we are run
                        // we will run the next step.
                        state = 1;
                        // Start processing the async function.
                        taskAwaiter = GetBiasAsync().GetAwaiter();
                        // See if the task has already completed.
                        if (!taskAwaiter.IsCompleted)
                        {
                            // Register a callback, so that our state machine
                            // resumes executing after the task completes.
                            taskAwaiter.OnCompleted(MoveNext);
                            return;
                        }
                        else
                        {
                            // If the task was already done, continue executing
                            // without yielding to the scheduler.
                        }
                        break;
                    case 1:
                        // Record that we are in a terminal state.
                        state = -1;
                        // Get the result from the awaiter.
                        // This will throw if there was an error.
                        int c = taskAwaiter.GetResult();
                        // Transition the Task to the completed state.
                        source.SetResult(c + sum);
                        break;
                    default:
                        // We are in a terminal state; nothing to do.
                        return;
                }
            }
        }
        catch (Exception ex)
        {
            // If anything goes wrong, transition the Task
            // to the faulted state.
            source.SetException(ex);
        }
    }
}
static Task<int> AddAsync(int a, int b)
{
    var sm = new AddAsyncStateMachine()
    {
        a = a,
        b = b,
    };
    // start the task running
    sm.MoveNext();
    // return the Task so that the caller can await it
    // and be notified when it completes.
    return sm.Task;
}
If you want to see the actual code generated by the C# compiler, see this example on SharpLab. There are some subtleties in the generated code that differ from my idealized sketch above.
One property of C#'s implementation of async functions that you can see in this example is that the body of the async function begins executing before the Task is returned to the caller. This is in contrast to some other systems like Rust, where the executor is responsible for polling the future returned by the async function to drive forward progress.
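We can observe this eager start with a small experiment. The sketch below uses made-up names (EagerStartDemo, WorkAsync, gate) and records the order of events in a list: the body logs its first entry before the caller even receives the Task.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical demo class for illustration.
public static class EagerStartDemo
{
    public static readonly List<string> Log = new();

    public static async Task<int> WorkAsync(Task gate)
    {
        // Runs synchronously, inside the caller's call to WorkAsync.
        Log.Add("body started");
        // The gate is not complete yet, so this is where we suspend
        // and the Task is handed back to the caller.
        await gate;
        Log.Add("body resumed");
        return 42;
    }

    public static void Main()
    {
        var gate = new TaskCompletionSource<bool>();

        Task<int> task = WorkAsync(gate.Task);
        Log.Add("caller got task");

        // Open the gate so the body can finish.
        gate.SetResult(true);

        Console.WriteLine(task.Result);            // prints 42
        Console.WriteLine(string.Join(", ", Log)); // prints body started, caller got task, body resumed
    }
}
```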