Let's Talk About Blob Storage Container Role Assignments
A colleague of mine reached out with a question about Terraform. This happens to me from time to time at Microsoft. Sometimes it's somebody I know, sometimes it's somebody totally new, and that is totally OK. This time it was a colleague from my city of residence, Columbus, Ohio, and it's always fun when I get to interact with colleagues in the local area.
He was trying to get Role-Based Access Control (RBAC) working on a Storage Account, but at the container level. In Azure, we assign Role Definitions to Principals in order to grant access to a set of actions that can be performed within a given scope. The scope aligns with the Azure Resource Management topology, starting at the apex, the Tenant, and running down through Management Group, Subscription, Resource Group, Resource, and even to the sub-Resource level. This is where an Azure Storage Container resides.
A Brief History of Azure Storage
An Azure Storage Container is an oddly worded term that is a remnant of the “Windows Azure” era, an era firmly placed before Docker, containers, and Kubernetes were widely used. Container was analogous to the AWS term Bucket; we couldn't really call it a bucket back then, could we? The explosion of Docker and containerization aside, it's not a terrible name, just an unlucky one. I say all this because unless you know Azure Storage vernacular it probably isn't super clear that an Azure Storage Container is a construct that works with exactly one type of Azure Storage: Blob Storage. Blob Storage has top-level containers that hold a flat list of blobs. Blob names can contain slash characters (e.g., /) in order to give the perception of hierarchy.
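As a quick illustration, here is a hedged sketch of a blob whose slash-delimited name renders as a folder tree in tools like the Azure Portal, even though the container stores it in a flat list. All names here are hypothetical:

```hcl
# Hypothetical names, for illustration only. The blob is stored flat in the
# container, but its slash-delimited name is presented as the "path"
# reports/2024/summary.csv by tooling that simulates a folder hierarchy.
resource "azurerm_storage_blob" "report" {
  name                   = "reports/2024/summary.csv"
  storage_account_name   = "stexample"
  storage_container_name = "stuff"
  type                   = "Block"
  source                 = "summary.csv"
}
```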
Control Plane vs. Data Plane
Azure Storage is one of those services that has what is called a Data Plane. A Data Plane is essentially another operational layer, at the service level, that allows you to provision service-instance-specific resources. It is contrasted with the Azure Control Plane (or just, the Control Plane) because the Data Plane is hosted on the instance of the Azure service itself, not as part of the Azure Resource Management Control Plane (the primary REST API that the Azure Portal and tooling all talk to). This means that each Azure Storage Account has its own Data Plane. This can be a little bit confusing because we sometimes refer to this collectively as just “the Data Plane,” but I assure you, there is one for each Storage Account. As a result, each Storage Account's Data Plane is hosted at the same level as the service itself, which in the case of Azure Storage means regionally. It is therefore susceptible to regional outages, but the hurdle you will most likely run into is disabling public endpoints on your Storage Account. That essentially disables access to the Data Plane unless you have network line-of-sight to the Storage Account via a private network.
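To make that hurdle concrete, here is a hedged sketch of a Storage Account with its public endpoint disabled, using the `public_network_access_enabled` argument from the azurerm provider. The names and region are hypothetical:

```hcl
# Hedged sketch: with public network access disabled, Data Plane calls (blob
# reads and writes, container listing, etc.) from the public internet are
# refused. You would then need a private endpoint or other private network
# line-of-sight to reach the account's Data Plane.
resource "azurerm_storage_account" "locked_down" {
  name                          = "stlockeddownexample" # hypothetical name
  resource_group_name           = "rg-example"          # hypothetical
  location                      = "eastus2"
  account_tier                  = "Standard"
  account_replication_type      = "LRS"
  public_network_access_enabled = false
}
```

Note that Control Plane operations (creating or tagging the account, assigning roles) still work; it is only the Data Plane that goes dark from public networks.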
Sometimes it's not so obvious which resources are Control Plane vs. Data Plane because the azurerm
provider is designed to abstract this away from us. However, we need to be aware of its existence, otherwise we will run into issues that cause confusion and delay. Each supported storage type (e.g., Blob, Table, Queue) has its own Data Plane resources. In the case of Blob Storage that includes Blob Storage Containers (or just “Containers”) and Blobs. So how can we grant access to them?
Access Control
Because Azure Storage is such an old and foundational service, it has evolved significantly over time. In the good old days, it didn't support blob-level or even container-level access controls. All we could do was set a container to be either fully public, fully private, or something in between where you could technically access a blob if you knew its name. We could also use SAS tokens, which did grant access to specific resources but were intended for short-term or temporary access, not long-lived, persistent access based on identity. Long-lived identity-based access was later introduced with Hierarchical Namespace (HNS), but even this isn't done using Azure Role Assignments; it's done with Access Control Lists (ACLs), another Data Plane resource type available only on specific kinds of Storage Accounts, mainly used in data analytics workloads.
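For context, here is a hedged sketch of that SAS-token style of access using the azurerm provider's Storage Account SAS data source, scoped to blob objects and valid for a single day. It assumes a Storage Account resource named `azurerm_storage_account.main` declared elsewhere, and the exact set of required permission flags varies by provider version:

```hcl
# Hedged sketch of a short-lived, account-level SAS token. Note this is a Data
# Plane access control: it grants access by possession of the token, not by
# identity, and it expires.
data "azurerm_storage_account_sas" "temp_access" {
  connection_string = azurerm_storage_account.main.primary_connection_string
  https_only        = true

  resource_types {
    service   = false
    container = false
    object    = true # blob (object) level access only
  }

  services {
    blob  = true
    queue = false
    table = false
    file  = false
  }

  # Short-lived by design: SAS tokens are for temporary access, not
  # long-lived identity-based access.
  start  = "2024-01-01T00:00:00Z"
  expiry = "2024-01-02T00:00:00Z"

  permissions {
    read    = true
    write   = false
    delete  = false
    list    = false
    add     = false
    create  = false
    update  = false
    process = false
    tag     = false
    filter  = false
  }
}
```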
All of these operations are Data Plane access controls, be it container accessibility, SAS tokens, or ACLs. Azure Control Plane access control is done using Azure RBAC via Role Assignments. Luckily, nowadays, we can scope Role Assignments to the Blob Storage Container in addition to the Storage Account, giving us more granular access control via the Control Plane. Let's look at how we do that.
The Role Assignment
My colleague already knew which Role Definition he wanted to use and he knew generally how the scope
and principal_id
attributes worked on the azurerm_role_assignment
resource.
resource "azurerm_role_assignment" "foo" {
  scope                = "???"
  role_definition_name = "Storage Blob Data Contributor"
  principal_id         = "???"
}
The principal_id
was set to some Entra ID user. The scope
on a Role Assignment is set to an Azure Resource ID which looks like this:
/subscriptions/{subscription-id}/resourceGroups/{resource-group-name}/providers/Microsoft.Storage/storageAccounts/{storage-account-name}
Great! So why was he looking for help? Well, he declared his Storage Account and his container. He was able to create a Role Assignment scoped to the Storage Account using the Storage Account's Resource ID. Pretty much all resources in the azurerm
Terraform provider have an output attribute called id
. This attribute, 99% of the time, contains the partial-URL Resource ID that we saw above.
resource "azurerm_storage_account" "main" {
  name                     = "st${random_string.suffix.result}"
  resource_group_name      = azurerm_resource_group.main.name
  location                 = azurerm_resource_group.main.location
  account_tier             = "Standard"
  account_replication_type = "GRS"
}
resource "azurerm_storage_container" "my_cool_container" {
  name                  = "stuff"
  storage_account_name  = azurerm_storage_account.main.name
  container_access_type = "private"
}
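The account-scoped assignment he already had working can be sketched like this; purely for illustration, the principal here is the current caller's object ID pulled from the azurerm client config data source, rather than his Entra ID user:

```hcl
# Hedged sketch: grants the signed-in principal blob data access across the
# entire Storage Account, because the scope is the account's full Resource ID.
data "azurerm_client_config" "current" {}

resource "azurerm_role_assignment" "account_wide" {
  scope                = azurerm_storage_account.main.id
  role_definition_name = "Storage Blob Data Contributor"
  principal_id         = data.azurerm_client_config.current.object_id
}
```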
By convention, you would expect the following to work.
resource "azurerm_role_assignment" "foo" {
  scope                = azurerm_storage_container.my_cool_container.id
  role_definition_name = "Storage Blob Data Contributor"
  principal_id         = "???"
}
However, when he attempted to use the id
from the azurerm_storage_container
resource, he ran into issues: Terraform was throwing errors saying it wasn't a properly formatted Resource ID. How could this be?!
Well, it all goes back to history. Hence my long-winded introduction to Azure Storage and its long, long history, which has led us to this place. As the Data Plane grew and evolved, and the way we could control access to Azure Storage changed, the Terraform provider had to be updated and maintained along the way. When the azurerm
Terraform provider was originally authored, Blob Storage Containers were strictly Data Plane resources; they didn't even have an Azure Control Plane Resource ID like they do today. That meant the authors of the Terraform provider had to decide whether to break everybody using this resource by changing the behavior of the id
attribute of the azurerm_storage_container
resource, or to add a new output attribute to hold the newly added Azure Resource ID. They chose the latter. As a result, in order to scope an Azure Role Assignment to a Blob Storage Container, we need to use an attribute called resource_manager_id
.
resource "azurerm_role_assignment" "foo" {
  scope                = azurerm_storage_container.my_cool_container.resource_manager_id
  role_definition_name = "Storage Blob Data Contributor"
  principal_id         = "???"
}
Hope this helps!
Until Next Time–Happy Azure Terraforming!!!