I was really pleased to see that Durable Functions went GA yesterday, and that it continues to pick up great new features, such as the ability to write orchestrator functions in JavaScript (still in preview). If you've not tried Durable Functions yet, it really is a game-changer, giving you a much better way to manage multiple functions that form a workflow, and greatly simplifying the implementation of complex patterns such as fan-out fan-in (map-reduce) and waiting for human interaction.

Sub-Orchestrations

In this post I want to highlight an interesting feature of Durable Functions called "sub-orchestrations". In Durable Functions an "orchestrator" function describes the order of the steps in your workflow, and "activity" functions are used to implement each of those steps.

With sub-orchestrations, an orchestrator function calls into another orchestrator function, allowing you to make workflows that are themselves built up of other workflows.

Why sub-orchestrations?

But why would you want to do this? When I first read about sub-orchestrations, I didn't initially think they would be a particularly important feature, but the more workflows I have built, the more benefits I can see for using them.

So here's a quick run-through of some of the reasons why I think you should consider using them once an orchestrator function grows to call more than about four or five activities.

1. Clean code

Real-world workflows consist of multiple steps, and tend to grow in complexity over time. If you're trying to ensure that each of your activity functions has a "single responsibility" (which you should be), then you're likely to end up with a lot of them, resulting in a long and complex orchestrator function.

Also the strict "orchestrator function constraints" in Durable Functions, which stipulate that your orchestrator functions must be deterministic, have a tendency to increase the number of activity functions in use, as you need to create an activity function each time you need to perform any non-deterministic task such as fetching a value from a database or config.
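To make this concrete, here's a hedged sketch (the function and setting names are invented for illustration) of how a non-deterministic task like reading configuration gets pushed into an activity function:

```csharp
// Activities are not replayed, so non-deterministic work is safe here
[FunctionName("GetMaxRetries")]
public static int GetMaxRetries([ActivityTrigger] string settingName)
{
    return int.Parse(Environment.GetEnvironmentVariable(settingName) ?? "3");
}
```

The orchestrator then calls await ctx.CallActivityAsync<int>("GetMaxRetries", "MaxRetries") rather than reading the environment variable itself, which would break determinism on replay.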

Using sub-orchestrations allows you to logically group together smaller sections of your workflow, which makes for much easier to read and understand code than one giant function consisting of numerous activities.

Here's a very simple code example showing how an orchestrator function might run three sub-orchestrations in sequence:

public static async Task MultiStageOrchestrator(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    string input = ctx.GetInput<string>();

    var output1 = await ctx.CallSubOrchestratorAsync<string>("Stage1", input);
    var output2 = await ctx.CallSubOrchestratorAsync<string>("Stage2", output1);
    await ctx.CallSubOrchestratorAsync("Stage3", output2);
}

2. Error-handling and retrying

I wrote recently about how great Durable Functions is for handling errors. It allows you to handle errors for the workflow as a whole, or for individual functions. But if you have a large and complex workflow made up of a few smaller workflows implemented as sub-orchestrations, then it becomes possible for each sub-orchestration to manage its own exception handling with any clean-up that is appropriate just for that part of the overall workflow.
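As a hedged sketch (the stage and clean-up activity names are invented), a sub-orchestrator handling its own failures might look like this:

```csharp
[FunctionName("Stage2")]
public static async Task<string> Stage2(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    var input = ctx.GetInput<string>();
    try
    {
        return await ctx.CallActivityAsync<string>("Transform", input);
    }
    catch (FunctionFailedException)
    {
        // Clean up anything this stage created, then rethrow so the
        // parent orchestration can apply its own error handling
        await ctx.CallActivityAsync("Stage2Cleanup", input);
        throw;
    }
}
```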

Sub-orchestrations can be retried with back-offs, in exactly the same way that activity functions can. This is very powerful, as retrying a series of activities without the use of sub-orchestrations would be complex to implement.

So here we can see how the second stage in our example above could be set up to retry up to four times, with a back-off of five seconds:

var output2 = await ctx.CallSubOrchestratorWithRetryAsync<string>("Stage2",
    new RetryOptions(TimeSpan.FromSeconds(5), 4), output1);

3. Run sub-orchestrations in parallel

Another thing you might notice if you take the trouble to break a long and complex workflow up into sub-orchestrations is that some of them could be run in parallel as they don't depend on each other's output. Just like you can run activities in parallel, implementing a fan-out fan-in pattern, you can do exactly the same with sub-orchestrations, kicking off several different sub-orchestrations (or several instances of the same sub-orchestration) and then waiting for them all to complete with Task.WhenAll.

Running orchestrations in parallel opens the door for performance boosts that would be too much of a pain to implement without the benefit of sub-orchestrations.

In this example, two sub-orchestrations ("stage1" and "stage2") are started in parallel, then we wait for both to complete, and use their outputs in a call to a third stage.

public static async Task ParallelSubOrchestrations(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    string input = ctx.GetInput<string>();

    var stage1Task = ctx.CallSubOrchestratorAsync<string>("Stage1", input);
    var stage2Task = ctx.CallSubOrchestratorAsync<string>("Stage2", input);

    await Task.WhenAll(stage1Task, stage2Task);

    await ctx.CallSubOrchestratorAsync("Stage3", Tuple.Create(stage1Task.Result, stage2Task.Result));
}

4. Reuse across workflows

If you break complex workflows up into sub-orchestrations, you may find that the same sub-orchestration can be re-used by multiple different orchestrators. This eliminates duplication, or the need to make orchestrators that contain complex branching code. If there is some shared logic used by two different workflows, put it into a sub-orchestration that they can both make use of.

In this simple example, "Workflow1" and "Workflow2" share a common "SharedStage" orchestrator, but each performs different tasks before and after it.

public static async Task Workflow1(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    string input = ctx.GetInput<string>();
    var output1 = await ctx.CallSubOrchestratorAsync<string>("StageA", input);
    var output2 = await ctx.CallSubOrchestratorAsync<string>("SharedStage", output1);
    await ctx.CallSubOrchestratorAsync("StageB", output2);
}

public static async Task Workflow2(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    string input = ctx.GetInput<string>();
    var output1 = await ctx.CallSubOrchestratorAsync<string>("StageD", input);
    var output2 = await ctx.CallSubOrchestratorAsync<string>("SharedStage", output1);
    await ctx.CallSubOrchestratorAsync("StageE", output2);
}

5. Simplified event sourcing history

Behind the scenes, Durable Functions uses an "event sourcing" approach to storing the history of orchestrations. Every time an activity completes, the orchestrator wakes up and must "replay" through all the prior events that have happened in this orchestration to reconstruct the current state of the workflow and work out what to do next.

The longer and more complex an orchestrator is, the more event sourcing steps must be stored and replayed, making debugging a pain (if you are using breakpoints in your orchestrators), and possibly impacting performance.
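As an aside, one small mitigation while debugging is the IsReplaying flag on the orchestration context. As a hedged sketch (assuming your orchestrator takes a TraceWriter or similar log parameter), you can use it to avoid emitting the same log message on every replay:

```csharp
// Only log the first time this line executes, not on each replay
if (!ctx.IsReplaying)
{
    log.Info("About to call Stage1");
}
```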

However, if sub-orchestrations are used, the event sourcing history for the parent orchestration can replace the entire call to a sub-orchestration with its serialized JSON output, greatly reducing the number of overall events stored against the parent orchestration. This leads to the final benefit I want to mention.

6. Safer upgrades

One possible gotcha with Durable Functions is what happens when you upgrade your code while orchestrations are in progress. You need to be very careful if you do this as any breaking changes to your orchestrator (such as re-ordering functions, or changing the input or output format of activities) will result in things going wrong when your event sourcing history that was generated by a previous version of the orchestrator function is replayed against the new one.

Sub-orchestrations can't save you from this entirely, but they can provide some protection: if breaking changes can be isolated to a single sub-orchestration, a larger workflow may be able to recover even if one of its sub-orchestration steps fails.

Summary

Durable Functions sub-orchestrations allow you to break large and complex workflows into more granular pieces, which opens the door to retries, better error handling, reuse, and parallel execution. They also make for easier-to-read code, and can help make upgrades more reliable.

Want to learn more about how easy it is to get up and running with Durable Functions? Be sure to check out my Pluralsight course Azure Durable Functions Fundamentals.

Comments

Comment by Alexandre

Thanks for your articles, they really help me to better understand Durable Functions. I have a question on the matter: I need to do a "do while" loop in a sub-orchestrator function where I am calling an activity until the activity has nothing more to return to me. Do you think it is fine to do such a thing, as from what I understood the sub-orchestrator function will replay the while loop each time?

Alexandre
Comment by Mark Heath

That's a good question. I guess I'd want to know how many times you expect it to go round the loop? If it's typically under 20 then sure, no problem, but if you get into the hundreds or thousands of iterations, then I'd suggest re-thinking the workflow to use something like ContinueAsNew.

Mark Heath
Comment by Alexandre

I expect to go round the loop around 300 or 350 times max. The activity executed in the loop calls an API with pagination, so I am requesting a different page each time. I don't know exactly how such a case could work with ContinueAsNew.

Alexandre
Comment by Mark Heath

OK, there would be a small perf hit but hopefully not too great. It's hard to suggest changes without seeing more of the code, but one option would be to use a sub-orchestration that can do, let's say, up to 50 pages. Then the top-level orchestrator keeps calling that, so it only needs to call it 6 or 7 times. That would be a way of preventing very large history replays. But without doing timing tests, I don't know whether that optimization would even be worth attempting.

Mark Heath
Comment by Alexandre

Thanks for your answer. I am already in the context of a sub-orchestrator because this is part of a bigger workflow, so I prefer to avoid nesting too many sub-orchestrators, which would make my code harder to understand.
So I think I will start with my big loop, and if I encounter some performance issues I will try what you suggested.

Alexandre
Comment by Andrew Moreno

Hey Mark - are you aware of any standard patterns being used around how best to cancel sub-orchestrations when a parent is cancelled? From my research I have seen this is a feature request, but it doesn't exist yet. So is it completely up to me to keep track of any sub-orchestration ids so I can manually terminate them?

Andrew Moreno
Comment by Mark Heath

Yes, there's a few strategies you could use. Depending on what the sub-orchestration is doing - you might not care and be fine for it to carry on to completion. If you really do need to cancel sub-orchestrations, then I'd use the custom state of the parent orchestration to store the ids of the sub-orchestrations that are active. You can also specify the ids of the sub-orchestrations when you create them (they are just strings), so you might be able to use a convention so that sub-orchestrations have predictable ids that you could just attempt to cancel. Obviously that depends on how many sub-orchestrations there might be.

Mark Heath
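To illustrate the convention-based approach Mark describes, here's a hedged sketch (the id convention and stage name are invented) of giving a sub-orchestration a predictable instance id derived from its parent's id, so it can later be terminated without tracking it separately:

```csharp
// In the parent orchestrator: derive the child's id from our own
var subOrchestrationId = $"{ctx.InstanceId}-stage1";
await ctx.CallSubOrchestratorAsync<string>("Stage1", subOrchestrationId, input);

// Elsewhere, given a DurableOrchestrationClient 'client', the same
// convention lets us attempt to terminate the child:
// await client.TerminateAsync($"{parentId}-stage1", "parent was cancelled");
```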
Comment by Andrew Moreno

Thanks for that blazing fast response! ;) Ok, yeah that pretty much confirms some of the ideas our team has been tossing around. We're trying to look for patterns on how best to perform compensating transactions when the parent and/or sub-orchestration(s) are terminated. The thought we have right now is to maybe hook into the lifecycle events that are mentioned here, and have functions subscribed to those events that would take the compensating action. Is this overkill?

Andrew Moreno
Comment by Andrew Moreno

Another thought was to maybe not terminate it, but instead call RaiseEvent on the orchestration indicating it is being terminated, and use your approach from the 'Awaiting Multiple Events' article.

Andrew Moreno
Comment by Mark Heath

That really depends on what your activities are actually doing. If it's critically important that they get "undone" with compensating actions then that might be a reasonable approach. If you are initiating the cancel yourself, you could also trigger a cleanup action at the same time, so you wouldn't necessarily need to use the lifecycle events (I haven't tried them myself yet).

Mark Heath
Comment by Tanuj Narula

Hello Alexandre, we have similar needs for our use case. Were you able to achieve the "do while" loop in a sub-orchestrator the way you explained above? If yes, could you share a high-level sample of it?

Tanuj Narula
Comment by Alexandre

In my sub-orchestrator I have this kind of code :

while (doesPageRemains)
{
    var pagedResults = await context.CallActivityAsync<List<Result>>(nameof(GetPagedResults), page);
    doesPageRemains = HasPageMaxNumberOfResults(pagedResults);
    results.AddRange(pagedResults);
    page++;
}

In my activity function GetPagedResults, I am querying a specific page on the API. The private method HasPageMaxNumberOfResults tells me if it's the last page.
But having thought about this, if I had to implement it again or work again on this piece of code, instead of using a loop I think I would have used a recursive pattern and ContinueAsNew as Mark Heath suggested.

Alexandre
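For readers wondering what the ContinueAsNew alternative might look like, here's a hedged sketch (the function name is invented, GetPagedResults and HasPageMaxNumberOfResults follow Alexandre's example, and StoreResults is an assumed activity, since accumulated results must be persisted outside the orchestration when it restarts itself):

```csharp
[FunctionName("PagedFetcher")]
public static async Task PagedFetcher(
    [OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    var page = ctx.GetInput<int>();
    var pagedResults = await ctx.CallActivityAsync<List<Result>>(nameof(GetPagedResults), page);
    // Persist this page's results somewhere durable before restarting
    await ctx.CallActivityAsync("StoreResults", pagedResults);
    if (HasPageMaxNumberOfResults(pagedResults))
    {
        // Restart the orchestration with the next page and an empty history
        ctx.ContinueAsNew(page + 1);
    }
}
```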
Comment by Drew Forsberg

Just finished your Pluralsight course on durable functions and found it very helpful--thanks! I've got a question about sub-orchestrations. Is there any reason not to call a sub-orchestration from within a sub-orchestration? Imagine a complex workflow with an orchestration that calls multiple sub-orchestrations. One of the sub-orchestrations has multiple complex sub-tasks itself that could fit into their own sub-orchestrations. And those sub-tasks might in turn call several activities.

Drew Forsberg
Comment by Mark Heath

I can't see any reason why not. It will work just fine and I think makes sense in very complex workflows. Glad to hear you enjoyed the course.

Mark Heath
Comment by zzzxtreme

I have this peculiar problem. This is the code in my orchestrator:

a = await context.CallActivityAsync(Activity1);
do
{
    x = await context.CallActivityAsync(Activity2);
    await context.CreateTimer(1 minute);
} while (a > x);

What I want is for Activity2 to fetch from my database server while a > x. That can be done (forced) by giving it a different input such as a random guid. But if a > x is false, I want the next replay of Activity2 to get from the table storage history.
Any suggestions?

zzzxtreme