Containerized Activities in Durable Workflows - Part 1
In this series I want to show how we can build a serverless workflow using Azure Durable Functions, but implement some of the activities in that workflow using containers with Azure Container Instances. This is something I promised to write about a long time ago but it turned out to be fairly complex to implement. I've finally got enough working to be able to share a basic demo app (special thanks to Noel Bundick for some awesome sample code that got me through some sticky bits).
In this first post I'll explain what the motivation for this is, and some of the technologies we'll be using. Then over the coming days I'll explain in more depth how each part of the demo application works.
Current table of contents:
- Part 1 (this post): Introduction - What are we building and why do we need it?
- Part 2: Creating infrastructure with the Azure CLI
- Part 3: Creating an Event Grid Subscription
- Part 4: Creating Container Instances with the Azure .NET SDK
- Part 5: Creating ACI Instances in a Durable Functions Workflow
What are we building?
The demo app we will build (the code is available on GitHub at markheath/durable-functions-aci) shows how we can have a step in an Azure Durable Functions workflow that is implemented as an ACI container. Basically, when the workflow starts, it calls an "activity function" that uses the Azure .NET SDK to create a new ACI "container group" running our container.
Next, our durable workflow needs to wait for that container to finish executing. Unfortunately, we currently have to do that by polling, but I've included code that listens for Event Grid events so in the future we might be able to simplify this part of the code.
Finally, once the container has finished its task, we want to delete the ACI container group. I don't think that ACI container groups are charged while they are stopped, but I'm not 100% sure, so I'd rather be on the safe side!
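To make the create, poll, delete sequence above concrete, here is a minimal sketch in Python rather than the C# the demo app uses. Everything in it is an assumption for illustration: `FakeContainerService`, `run_container_step`, and the "Running" / "Terminated" state names are hypothetical stand-ins, not the Azure SDK or the Durable Functions API. It only shows the shape of the flow, with the cleanup in a `finally` block so the container group is deleted even if polling gives up.

```python
import time
from dataclasses import dataclass, field


@dataclass
class FakeContainerService:
    """Hypothetical in-memory stand-in for ACI; not a real Azure client."""
    polls_until_done: int = 3
    groups: dict = field(default_factory=dict)

    def create_group(self, name: str) -> None:
        # Track how many more polls will report "Running" for this group.
        self.groups[name] = self.polls_until_done

    def get_state(self, name: str) -> str:
        remaining = self.groups[name]
        if remaining <= 0:
            return "Terminated"
        self.groups[name] = remaining - 1
        return "Running"

    def delete_group(self, name: str) -> None:
        del self.groups[name]


def run_container_step(service, name, poll_interval=0.01, max_polls=100):
    """Create a container group, poll until it finishes, then delete it.

    Returns the number of polls it took to see the terminated state.
    """
    service.create_group(name)
    try:
        for attempt in range(1, max_polls + 1):
            if service.get_state(name) == "Terminated":
                return attempt
            time.sleep(poll_interval)  # wait before asking again
        raise TimeoutError(f"{name} did not finish after {max_polls} polls")
    finally:
        # Always clean up, whether the container finished or we gave up.
        service.delete_group(name)


if __name__ == "__main__":
    service = FakeContainerService(polls_until_done=3)
    polls = run_container_step(service, "demo-group")
    print(f"finished after {polls} polls; groups left: {list(service.groups)}")
```

In a real Durable Functions orchestrator the wait between polls would be a durable timer rather than a blocking sleep, and the create, status check and delete calls would each live in their own activity function.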
One of my ideas is that maybe in the future, the code in this application could be converted into a generic extension for Durable Functions to greatly simplify the work involved in implementing workflow steps as container instances.
Why do we need it?
Azure Durable Functions makes it really easy to create serverless workflows, but sometimes the steps in the workflow cannot be straightforwardly implemented as "activity functions" in the Azure Function App itself. This might be because it is a long-running process (on the Consumption plan, Azure Functions are limited to 5 minutes of execution time by default, and 10 minutes at most), or because it requires custom software to be installed that cannot easily run on Azure App Service, or because we want to mount an Azure File Share (which Azure Functions does not currently support).
By using Azure Container Instances to implement these long-running custom activities, we can still get the serverless benefits of paying only for the compute we need (i.e. avoiding having Virtual Machines on standby waiting to implement these tasks), and benefit from automatic scaleout - we can simply create as many ACI instances as we want (within the constraints of the ACI service) to manage demand. But we also get additional benefits of containers - running whatever code we like in a sandboxed environment, and the ability to specify exactly how much memory and CPU we require for the task at hand.
The example scenario that motivated this is my need to perform custom media transcoding on demand. In my job I often need to transcode and process video files in a variety of obscure CCTV formats, which require custom software that can't be installed onto App Service. Some of the media files are extremely large (multiple GB) and so the transcoding process can take several hours to complete (9 hours is the longest one so far). Also, there can be sudden influxes of very large amounts of media, which all need to be transcoded as quickly as possible, so I need rapid scaleout. The ability to mount Azure File Shares is also useful as often the same file goes through multiple processing stages, each of which might be performed by a different container.
What will we use?
We're going to be integrating several of my favourite Azure technologies in this demo, and I've created Pluralsight courses about quite a few of them. Here are the bits and pieces we'll be using, including links to my courses if you're interested in diving in a bit deeper.
- Azure Functions for serverless web API
- Azure Durable Functions for serverless workflows
- Azure Container Instances for containerized activities
- Azure Event Grid to notify us of resources getting created or deleted
  - I'm hoping that in the future Event Grid will support ACI as an Event Source so we can easily detect when a container instance has stopped running
- System Assigned Azure App Service Managed Identities to grant our function app permission to create ACI container groups
- Azure CLI to automate the creation of the resources we need
- Azure Files to create file shares we can mount as volumes on containers for easy file sharing
- Azure Fluent .NET SDK to allow us to create and manage ACI container groups from C# code
Read part 2 here
Hi Mark, thanks for this very useful post. This is going to save us a lot of time, we're looking to use some containerised tasks in Data Factory. Unfortunately there isn't a good native activity for this, so this function is exactly what we need.
Sam Harvey