
Six years ago I blogged about how to calculate the total size of blobs in an Azure blob storage container.

Since that post, there have been two(!) new versions of the Azure Blob Storage C# SDK; this post uses the latest, Azure.Storage.Blobs. I also wanted to break the storage down by access tier.

Here's the code. You just need to provide your connection string and the name of the container:

// requires the Azure.Storage.Blobs NuGet package
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

var service = new BlobServiceClient(myConnectionString);
var containerName = "mycontainer";
var containerClient = service.GetBlobContainerClient(containerName);

var sizes = new Dictionary<AccessTier, long>
{
    { AccessTier.Archive, 0 },
    { AccessTier.Hot, 0 },
    { AccessTier.Cool, 0 }
};

var files = 0;

await foreach (var blob in containerClient.GetBlobsAsync())
{
    files++;
    var tier = blob.Properties.AccessTier;
    if (tier.HasValue)
    {
        // TryGetValue copes with any tier we didn't pre-seed above
        sizes.TryGetValue(tier.Value, out var total);
        sizes[tier.Value] = total + (blob.Properties.ContentLength ?? 0);
    }
}

Console.WriteLine($"{files} files");
foreach (var kvp in sizes)
{
    Console.WriteLine($"{kvp.Key}: {SizeSuffix(kvp.Value)}");
}
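The SizeSuffix method isn't part of the SDK; it's just a helper for pretty-printing byte counts. In case you don't have one to hand, here's a minimal sketch of a possible implementation:

```csharp
using System.Globalization;

static string SizeSuffix(long value)
{
    string[] suffixes = { "bytes", "KB", "MB", "GB", "TB", "PB" };
    if (value < 0) return "-" + SizeSuffix(-value);
    var size = (decimal)value;
    var i = 0;
    while (size >= 1024 && i < suffixes.Length - 1)
    {
        size /= 1024;
        i++;
    }
    // "0.#" keeps at most one decimal place (e.g. 1536 => "1.5 KB")
    return string.Format(CultureInfo.InvariantCulture, "{0:0.#} {1}", size, suffixes[i]);
}
```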

A few things of note.

First of all, we can make use of the really nice async enumerable support in C# 8 with the await foreach construct. This has been something I've wanted for a long time, and it's great to see libraries making use of this.
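If you haven't used async streams before, here's a minimal, self-contained illustration of the pattern, with nothing Azure-specific in it. Any IAsyncEnumerable&lt;T&gt; can be consumed with await foreach, and AsyncPageable&lt;BlobItem&gt; (which GetBlobsAsync returns) implements that interface:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// an async iterator: produces values over time, a bit like pages from a REST API
static async IAsyncEnumerable<int> GetNumbersAsync()
{
    for (var i = 1; i <= 3; i++)
    {
        await Task.Delay(10); // simulate an async fetch
        yield return i;
    }
}

// await foreach consumes it one item at a time, awaiting as needed
await foreach (var n in GetNumbersAsync())
{
    Console.WriteLine(n); // prints 1, then 2, then 3
}
```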

Second, this can be quite slow to run on a very large container. It took over three hours for one container with nearly 20 million blobs, so it's not something you want to run more often than you have to.
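As an aside, if you only care about part of a container, GetBlobsAsync takes an optional prefix parameter, which restricts the listing server-side and so reduces the work. The "2024/" here is just a hypothetical virtual folder name:

```csharp
// only enumerate blobs whose names start with the given prefix
await foreach (var blob in containerClient.GetBlobsAsync(prefix: "2024/"))
{
    // same per-blob logic as above
}
```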

As always in the cloud, you should also consider the cost implications. I think I'm right in saying that the only endpoint this code calls is the List Blobs operation, which retrieves pages of blobs that already include the size and access tier information. Although storage operations are pretty cheap, it can add up when the total number of operations runs into the millions.
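For a rough sense of how many List Blobs calls the enumeration makes, you can walk the underlying pages explicitly with AsPages. A single List Blobs call returns at most 5,000 blobs, so a container with 20 million blobs needs around 4,000 list operations:

```csharp
// count the List Blobs REST calls made while enumerating the container
var listOperations = 0;
await foreach (var page in containerClient.GetBlobsAsync().AsPages(pageSizeHint: 5000))
{
    listOperations++; // one REST call per page of up to 5,000 blobs
}
Console.WriteLine($"{listOperations} list operations");
```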
