
I'm really pleased to announce that my latest Pluralsight course has been released. In Durable Functions Fundamentals I show you everything you need to get started with developing and debugging durable workflows locally, how to implement patterns such as fan-out fan-in and waiting for human interaction, and how to deploy and monitor your Durable Functions online.

To celebrate the launch here's a quick rundown of my top 10 reasons why you should use Durable Functions for your serverless workflows:

1. Express your workflows in code

With Durable Functions, you get to define your workflows in code. This means one place to look to see the big picture of what the whole workflow does, rather than it being spread across multiple functions. Currently Durable Functions orchestrators have to be written in C#, but JavaScript support is in beta so you can define your workflows in the language you're most comfortable with.
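To give a flavour of what that looks like, here's a minimal sketch of a "function chaining" orchestrator in C# (the activity names like "TakePayment" are just placeholders for illustration):

[FunctionName("ProcessOrderOrchestrator")]
public static async Task ProcessOrderOrchestrator(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    var order = ctx.GetInput<string>();
    // the orchestrator just describes the order of the steps;
    // each step is implemented as a separate activity function
    var paymentRef = await ctx.CallActivityAsync<string>("TakePayment", order);
    var shipmentRef = await ctx.CallActivityAsync<string>("ShipOrder", paymentRef);
    await ctx.CallActivityAsync("SendConfirmationEmail", shipmentRef);
}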

2. Retry activities

Another great feature of Durable Functions is the support for retrying activities with back-off. That used to be awkward to implement in regular Azure Functions, but with Durable Functions it's trivially easy to add retries to both individual activity functions and to sub-orchestrations, giving your workflows much greater resilience to transient errors.
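As a rough sketch (assuming a hypothetical "SendEmail" activity), adding a retry policy is just a matter of swapping CallActivityAsync for CallActivityWithRetryAsync:

// make up to three attempts, waiting 10 seconds before the first retry
// and doubling the delay each time (names and values are illustrative)
var retryOptions = new RetryOptions(
    firstRetryInterval: TimeSpan.FromSeconds(10),
    maxNumberOfAttempts: 3)
{
    BackoffCoefficient = 2
};
await ctx.CallActivityWithRetryAsync("SendEmail", retryOptions, recipient);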

3. Run activities in parallel

In a typical multi-step workflow there are probably some activities that can be performed in parallel, but without a framework like Durable Functions, implementing the "fan-in" part of a "fan-out fan-in" pattern is complex and risks introducing race conditions.

With Durable Functions, running activities (or whole sub-orchestrations) in parallel is easy to accomplish, and combined with the power of Azure Functions to scale out, brings the potential for dramatic speedup in the end-to-end time your workflows take to complete.
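Here's a sketch of the fan-out fan-in pattern, assuming a "GetWorkItems" activity that returns a batch of items and a "ProcessItem" activity that handles a single item:

[FunctionName("FanOutFanInOrchestrator")]
public static async Task<int> FanOutFanInOrchestrator(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    var workItems = await ctx.CallActivityAsync<string[]>("GetWorkItems", null);

    // fan out: start all the activities without awaiting them individually
    var tasks = workItems.Select(item => ctx.CallActivityAsync<int>("ProcessItem", item));

    // fan in: wait for them all to finish, then aggregate the results
    var results = await Task.WhenAll(tasks);
    return results.Sum();
}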

4. Timeout workflows

Sometimes in a workflow, you're waiting for some kind of external event - maybe for a human to respond, or for an external system to send you a message, but you want to time out if you don't receive the event within a certain time period. Durable Functions makes this pattern straightforward to implement, allowing you to detect that an orchestration has got stuck, and take some kind of mitigating action to get it moving again, or to alert a system administrator.
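Here's a sketch of what that looks like inside an orchestrator, assuming an "ApprovalEvent" external event and a couple of illustrative activity names:

using (var cts = new CancellationTokenSource())
{
    var approvalTask = ctx.WaitForExternalEvent<bool>("ApprovalEvent");
    var timeoutTask = ctx.CreateTimer(ctx.CurrentUtcDateTime.AddHours(24), cts.Token);

    if (await Task.WhenAny(approvalTask, timeoutTask) == approvalTask)
    {
        cts.Cancel(); // cancel the durable timer so the orchestration can complete
        await ctx.CallActivityAsync("ProcessApproval", approvalTask.Result);
    }
    else
    {
        // nothing arrived within 24 hours - escalate instead
        await ctx.CallActivityAsync("EscalateToAdministrator", null);
    }
}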

5. State management for free

Workflows inherently have state associated with them - you need to know where you've got to in the workflow in order to decide what the next step is whenever an activity completes. Durable Functions transparently manages workflow state for you, meaning you can implement complex workflows without needing your own database at all. Of course, if you do have a database, you may still want to update it during your workflows, to track the state of your business entities, but you don't need to manage the state for the orchestrations themselves.

6. Check on workflow progress with REST API

If you've ever built out a workflow with regular Azure Functions, you'll know it can be a real pain to work out where in the pipeline you currently are. This is especially important for troubleshooting if a workflow has got stuck. How far through did it get before failing?

With Durable Functions you can use the query status API to find out whether an orchestration is still running. The query status API includes a showHistory flag to request the history of the workflow, letting you see exactly where it got to before it got stuck or failed.

Even better, there is now a SetCustomStatus API allowing you to store an arbitrary JSON object at any point in your workflow representing its current status. This is a great tool for diagnosing why an orchestration is taking longer than expected to complete.
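For example, an orchestrator might update its custom status as it moves between stages (the shape of the status object is entirely up to you):

// appears as customStatus in the query status API response
ctx.SetCustomStatus(new { stage = "AwaitingApproval", since = ctx.CurrentUtcDateTime });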

7. Cancel workflows

If you've built a workflow out of regular Azure Functions chained together with queue messages, then cancelling it is not going to be easy. But with Durable Functions, the REST API includes a cancellation method making it really straightforward to abandon an in-progress workflow.
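As well as calling the REST endpoint, you can terminate an orchestration from code using DurableOrchestrationClient - here's a sketch (the route and reason text are just for illustration):

[FunctionName("CancelWorkflow")]
public static Task CancelWorkflow(
    [HttpTrigger(AuthorizationLevel.Function, "post", Route = "cancel/{instanceId}")] HttpRequestMessage req,
    string instanceId,
    [OrchestrationClient] DurableOrchestrationClient client)
{
    // marks the orchestration instance as terminated, with the reason recorded against it
    return client.TerminateAsync(instanceId, "Cancelled by user");
}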

8. Serverless pricing model

Just because your workflows run for days at a time doesn't mean you need to pay for days of compute. In fact, in many long-running workflows, most of the time is spent just waiting around. Because Durable Functions is built on top of Azure Functions, you get all the benefits of a serverless pricing model. You only pay for the time your functions are actually running, and your orchestrator function invocations will all be extremely quick as they simply wake up, decide what the next step in the workflow is, and go straight back to sleep.

9. Versioning made easier

One of the hardest problems of implementing workflows is how to deal with versioning. If I make a breaking change to the workflow, what happens to in-flight orchestrations when I perform an upgrade? Durable Functions doesn't have a magic bullet to solve the problem, but it does provide several workable strategies for dealing with this issue. Currently I'm leaning towards just making a V2 version of my orchestrator functions and retiring the V1 orchestrator later once all old workflows have finished, but you can pick the versioning strategy that works best for you.
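As a sketch of that side-by-side approach (orchestrator and activity names here are made up), the V1 orchestrator stays deployed just long enough for in-flight instances to drain, while anything that starts new workflows is pointed at the V2 name:

[FunctionName("OnboardingOrchestrator")]      // V1 - kept only so in-flight instances can finish
public static async Task OnboardingV1(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    var employee = ctx.GetInput<string>();
    await ctx.CallActivityAsync("CreateAccounts", employee);
}

[FunctionName("OnboardingOrchestrator_V2")]   // V2 - new instances are started against this name
public static async Task OnboardingV2(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    var employee = ctx.GetInput<string>();
    await ctx.CallActivityAsync("CreateAccounts", employee);
    await ctx.CallActivityAsync("OrderLaptop", employee); // breaking change: extra step added
}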

10. Develop and test locally

Finally, it's possible to develop and test your Durable workflows locally. You can get the full local debugging experience of stepping through orchestrator and activity functions, as well as examining the contents of your "task hub" (which can also be local if you are using the Azure Storage Emulator) using Storage Explorer.

When you do publish your workflows to Azure, then the Application Insights integration gives you access to rich and powerful querying capabilities on your Function App telemetry and logs.

Summary

If you're currently building workflows out of a series of Azure Functions triggering each other, then Durable Functions is a no-brainer. It really is a game-changer that makes development and management of your serverless workflows much easier. Do give it a try and if you're a Pluralsight subscriber then my new Durable Functions Fundamentals course will teach you the key concepts and provide lots of examples of the sorts of workflows you can build with Durable Functions.

Want to learn more about how easy it is to get up and running with Durable Functions? Be sure to check out my Pluralsight course Azure Durable Functions Fundamentals.


I was really pleased to see Durable Functions go GA yesterday, and it continues to pick up some great new features, such as the ability to write orchestrator functions in JavaScript (still in preview). If you've not tried Durable Functions yet, it really is a game-changer, giving you a much better way to manage multiple functions that form a workflow, and greatly simplifying the implementation of complex workflows such as fan-out fan-in (map-reduce) and waiting for human interaction.

Sub-Orchestrations

In this post I want to highlight an interesting feature of Durable Functions called "sub-orchestrations". In Durable Functions an "orchestrator" function describes the order of the steps in your workflow, and "activity" functions are used to implement each of those steps.

With sub-orchestrations, an orchestrator function calls into another orchestrator function, allowing you to make workflows that are themselves built up of other workflows.

Why sub-orchestrations?

But why would you want to do this? When I first read about sub-orchestrations, I didn't think they would be a particularly important feature, but the more workflows I have built, the more benefits I can see to using them.

So here's a quick run-through of some of the reasons why I think you should consider using them once an orchestrator function grows to call more than about four or five activities.

1. Clean code

Real-world workflows consist of multiple steps, and tend to grow in complexity over time. If you're trying to ensure that each of your activity functions has a "single responsibility" (which you should be), then you're likely to end up with a lot of them, resulting in a long and complex orchestrator function.

Also, the strict "orchestrator function constraints" in Durable Functions, which stipulate that your orchestrator functions must be deterministic, tend to increase the number of activity functions in use, as you need to create an activity function whenever you need to perform a non-deterministic task such as fetching a value from a database or configuration.

Using sub-orchestrations allows you to logically group together smaller sections of your workflow, which makes for much easier to read and understand code than one giant function consisting of numerous activities.

Here's a very simple code example showing how an orchestrator function might run three sub-orchestrations in sequence:

public static async Task MultiStageOrchestrator(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    string input = ctx.GetInput<string>();

    var output1 = await ctx.CallSubOrchestratorAsync<string>("Stage1", input);
    var output2 = await ctx.CallSubOrchestratorAsync<string>("Stage2", output1);
    await ctx.CallSubOrchestratorAsync("Stage3", output2);
}

2. Error-handling and retrying

I wrote recently about how great Durable Functions is for handling errors. It allows you to handle errors for the workflow as a whole, or for individual functions. But if you have a large and complex workflow made up of a few smaller workflows implemented as sub-orchestrations, then it becomes possible for each sub-orchestration to manage its own exception handling with any clean-up that is appropriate just for that part of the overall workflow.
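As a sketch, a sub-orchestrator can wrap its own activities in a try/catch and perform clean-up that only relates to its part of the workflow (the activity names here are made up):

[FunctionName("Stage2")]
public static async Task<string> Stage2(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    var input = ctx.GetInput<string>();
    try
    {
        var uploaded = await ctx.CallActivityAsync<string>("UploadFile", input);
        return await ctx.CallActivityAsync<string>("NotifyUploaded", uploaded);
    }
    catch (Exception)
    {
        // clean-up that only this stage of the overall workflow cares about
        await ctx.CallActivityAsync("DeleteUploadedFile", input);
        throw; // let the parent orchestrator decide what happens next
    }
}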

Sub-orchestrations can be retried with back-offs, in exactly the same way that activity functions can. This is very powerful, as retrying a series of activities without the use of sub-orchestrations would be complex to implement.

So here we can see how the second stage in our example above could be set up to make up to four attempts, with a five-second delay between retries:

var output2 = await ctx.CallSubOrchestratorWithRetryAsync<string>("Stage2", 
                         new RetryOptions(TimeSpan.FromSeconds(5),4), output1);

3. Run sub-orchestrations in parallel

Another thing you might notice if you take the trouble to break a long and complex workflow up into sub-orchestrations is that some of them could be run in parallel as they don't depend on each other's output. Just like you can run activities in parallel, implementing a fan-out fan-in pattern, you can do exactly the same with sub-orchestrations, kicking off several different sub-orchestrations (or several instances of the same sub-orchestration) and then waiting for them all to complete with Task.WhenAll.

Running orchestrations in parallel opens the door for performance boosts that would be too much of a pain to implement without the benefit of sub-orchestrations.

In this example, two sub-orchestrations ("stage1" and "stage2") are started in parallel, then we wait for both to complete, and use their outputs in a call to a third stage.

public static async Task ParallelSubOrchestrations(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    string input = ctx.GetInput<string>();

    var stage1Task = ctx.CallSubOrchestratorAsync<string>("Stage1", input);
    var stage2Task = ctx.CallSubOrchestratorAsync<string>("Stage2", input);

    await Task.WhenAll(stage1Task, stage2Task);

    await ctx.CallSubOrchestratorAsync("Stage3", Tuple.Create(stage1Task.Result, stage2Task.Result));
}

4. Reuse across workflows

If you break complex workflows up into sub-orchestrations, you may find that the same sub-orchestration can be reused by multiple different orchestrators. This eliminates duplication, or the need to create orchestrators containing complex branching code. If there is some shared logic used by two different workflows, put it into a sub-orchestration that they can both make use of.

In this simple example, "workflow1" and "workflow2" share a common "SharedStage" orchestrator but also perform different tasks before or after.

public static async Task Workflow1(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    string input = ctx.GetInput<string>();
    var output1 = await ctx.CallSubOrchestratorAsync<string>("StageA", input);
    var output2 = await ctx.CallSubOrchestratorAsync<string>("SharedStage", output1);
    await ctx.CallSubOrchestratorAsync("StageB", output2);
}

public static async Task Workflow2(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    string input = ctx.GetInput<string>();
    var output1 = await ctx.CallSubOrchestratorAsync<string>("StageD", input);
    var output2 = await ctx.CallSubOrchestratorAsync<string>("SharedStage", output1);
    await ctx.CallSubOrchestratorAsync("StageE", output2);
}

5. Simplified event sourcing history

Behind the scenes, Durable Functions uses an "event sourcing" approach to storing the history of orchestrations. Every time an activity completes, the orchestrator wakes up and must "replay" through all the prior events that have happened in this orchestration to reconstruct the current state of the workflow and work out what to do next.

The longer and more complex an orchestrator is, the more event sourcing steps must be stored and replayed, making debugging a pain (if you are using breakpoints in your orchestrators), and possibly impacting performance.

However, if sub-orchestrations are used, the event sourcing history for the parent orchestration can replace the entire call to a sub-orchestration with its serialized JSON output, greatly reducing the number of overall events stored against the parent orchestration. This leads to the final benefit I want to mention.

6. Safer upgrades

One possible gotcha with Durable Functions is what happens when you upgrade your code while orchestrations are in progress. You need to be very careful if you do this as any breaking changes to your orchestrator (such as re-ordering functions, or changing the input or output format of activities) will result in things going wrong when your event sourcing history that was generated by a previous version of the orchestrator function is replayed against the new one.

Sub-orchestrations can't save you from this entirely, but they can provide some protection: if breaking changes can be isolated to a single sub-orchestration, a larger workflow may be able to recover even if one of its sub-orchestration steps fails.

Summary

Durable Functions sub-orchestrations allow you to break large and complex workflows into more granular pieces, which open the door to retries, better error handling, reuse, and parallel execution. They also make for easier to read code, and could help make upgrades more reliable.

Want to learn more about how easy it is to get up and running with Durable Functions? Be sure to check out my Pluralsight course Azure Durable Functions Fundamentals.


I wrote recently about why you should use Azure Durable Functions to implement your serverless workflows rather than just manually chaining together a bunch of functions with queues. There was great news recently that Durable Functions is now in "release candidate", and in this post I want to explore in a bit more detail how it can greatly improve your error handling within workflows.

Unhandled Exceptions

First of all, a quick reminder about how Durable Functions works. You create an "orchestrator function", which defines your workflow, and then create multiple "activity functions", one for each step in that workflow. The orchestrator can call these activities either in sequence or in parallel.

If an unhandled exception is thrown by an activity function, it will propagate up to the orchestrator function. This is brilliant as it means the orchestrator can make intelligent decisions about what should happen to the workflow when an activity fails. This might involve triggering a clean-up activity, or retrying, or maybe the workflow can carry on regardless.

Of course if the orchestrator function doesn't catch these exceptions itself, then the orchestration will terminate. However, even in this case, we'll get some useful information from the Durable Functions runtime. If we query an orchestration that has failed using the Durable Functions REST API we'll see a runtimeStatus of Failed and in the output we'll get information about which activity function the exception occurred in, and the error message.

So in this example, my Activity2 activity function threw an unhandled exception that was also unhandled by the orchestrator function, resulting in the orchestration ending. Here's the output from the Durable Functions REST API showing the orchestration status:

{
    runtimeStatus: "Failed",
    input: "hello",
    output: "Orchestrator function 'ExceptionHandlingExample' failed: The activity function 'Activity2' failed: \"Failure in Activity 2\". See the function execution logs for additional details.",
    createdTime: "2018-04-30T11:48:28Z",
    lastUpdatedTime: "2018-04-30T11:48:31Z"
}

Catching Exceptions in Activity Functions

Of course, you don't need to let exceptions propagate from activity functions all the way through to the orchestrator. In some cases it might make sense to catch your exceptions in the activity function.

One example is if the activity function needs to perform some cleanup of its own in the case of failure - perhaps deleting a file from blob storage. But it might also be to simply send some more useful information back to the orchestrator so it can decide what to do next.

Here's an example activity function that returns an anonymous object with a Success flag plus some additional information depending on whether the function succeeded or not. Obviously you could return a strongly typed custom DTO instead. The orchestrator function can check the Success flag and use it to make a decision on whether the workflow can continue or not.

[FunctionName("Activity2")]
public static async Task<object> Activity2(
    [ActivityTrigger] string input,
    TraceWriter log)
{
    try
    {
        var myOutputData = await DoSomething(input);
        return new 
        {
            Success = true,
            Result = myOutputData
        };
    }
    catch (Exception e)
    {
        // optionally do some cleanup work ...
        DoCleanup();
        return new 
        {
            Success = false,
            ErrorMessage = e.Message
        };
    }
}

Catching Exceptions in Orchestrator Functions

The great thing about orchestrator functions being able to handle exceptions thrown from activity functions is that it allows you to centralize the error handling for the workflow as a whole. In the catch block you can call a cleanup activity function, and then either re-throw the exception to fail the orchestration, or you might prefer to let the orchestration complete "successfully", and just report the problem via some other mechanism.

Here's an example orchestrator function with a single cleanup activity that it runs whichever of the three activity functions the problem was found in.

[FunctionName("ExceptionHandlingOrchestrator")]
public static async Task<string> ExceptionHandlingOrchestrator(
    [OrchestrationTrigger] DurableOrchestrationContext ctx,
    TraceWriter log)
{
    var inputData = ctx.GetInput<string>();
    try
    {
        var a1 = await ctx.CallActivityAsync<string>("Activity1", inputData);
        var a2 = await ctx.CallActivityAsync<ActivityResult>("Activity2", a1);
        var a3 = await ctx.CallActivityAsync<string>("Activity3", a2);
        return a3;
    }
    catch (Exception)
    {
        await ctx.CallActivityAsync<string>("CleanupActivity", inputData);
        // optionally rethrow the exception to fail the orchestration
        throw;
    }
}

Retrying Activities

Another brilliant thing about using Durable Functions for your workflows is that it includes support for retries. Again, at first glance that might not seem like something that's too difficult to implement with regular Azure Functions. You could just write a retry loop in your function code.

But what if you want to delay between retries? That's much more of a pain, as you pay for the total duration your Azure Functions run for, so you don't want to waste time sleeping. And Azure Functions on the consumption plan are limited to five minutes of execution time anyway. So you end up needing to send yourself a future scheduled message. That's something I have implemented with Azure Functions in the past (see my randomly scheduled tweets example), but it's a bit cumbersome.

Thankfully, with Durable Functions, we can simply specify when we call an activity (or a sub-orchestration) that we want to retry a certain number of times, and customise the back-off strategy, thanks to the CallActivityWithRetryAsync method and the RetryOptions class.

In this simple example, we'll call Activity1 with a maximum of four attempts and a five-second delay before each retry.

var a1 = await ctx.CallActivityWithRetryAsync<string>("Activity1", 
               new RetryOptions(TimeSpan.FromSeconds(5),4), inputData);

Even better, we can intelligently decide which exceptions we want to retry. This is important as in cloud deployed applications some exceptions will be due to "transient" problems that might be resolved by simply retrying, but others are not worth retrying.

When an activity function throws an exception, it will appear in the orchestrator as a FunctionFailedException, but the inner exception will contain the exception thrown from the activity function. However, currently the type of that inner exception seems to be just System.Exception rather than the actual type (e.g. InvalidOperationException) that was thrown, so if you're making retry decisions based on this exception you might have to rely on its Message, although the actual exception type can be seen if you call ToString.

Here's a very simple example of only retrying if the inner exception message exactly matches a specific string:

var a1 = await ctx.CallActivityWithRetryAsync<string>("Activity1", 
    new RetryOptions(TimeSpan.FromSeconds(5),4)
    {
        Handle = ex => ex.InnerException.Message == "oops"
    }, 
    inputData);

Summary

Durable Functions not only makes it much easier to define your workflows, but also to handle the errors that occur within them. Whether you want to respond to exceptions by retrying with back-off, or by performing a cleanup operation, or even by continuing regardless, Durable Functions makes this much easier to implement than trying to do the same thing with regular Azure Functions chained together by queue messages.

Want to learn more about how easy it is to get up and running with Durable Functions? Be sure to check out my Pluralsight course Azure Durable Functions Fundamentals.