Service endpoints and private endpoints hands-on: including Azure backbone, storage account firewall, DNS, VNET and NSGs
Storage accounts play an important role in a medallion architecture for establishing an enterprise data lake. They act as a centralized repository, enabling seamless data exchange between producers and consumers. This setup enables consumers to perform data science tasks and build machine learning (ML) models. Moreover, consumers can use the data for Retrieval Augmented Generation (RAG), which lets them interact with company data through Large Language Models (LLMs) like ChatGPT.
Highly sensitive data is usually stored in the storage account. Defense-in-depth measures must be in place before data scientists and ML pipelines can access the data. To achieve defense in depth, multiple measures must be in place, such as 1) advanced threat protection to detect malware, 2) authentication using Microsoft Entra, 3) authorization for fine-grained access control, 4) an audit trail to monitor access, 5) data exfiltration prevention, 6) encryption, and last but not least 7) network access control using service endpoints or private endpoints.
This article focuses on network access control of the storage account. In the next chapter, the different concepts of storage account network access are explained (demystified). After that, a hands-on comparison is made between service endpoints and private endpoints. Finally, a conclusion is drawn.
A typical scenario is that a virtual machine needs network access to a storage account. This virtual machine often acts as a Spark cluster to analyze data from the storage account. The image below provides an overview of the available network access controls.
The components in the image can be described as follows:
Azure global network (backbone): Traffic between two regions always goes over the Azure backbone (unless the customer forces it not to), see also Microsoft global network - Azure | Microsoft Learn. This holds regardless of which firewall rule is used in the storage account and regardless of whether service endpoints or private endpoints are used.
Azure storage firewalls: Firewall rules can restrict or disable public access. Common rules include whitelisting a VNET/subnet, public IP addresses, system-assigned managed identities as resource instances, or allowing trusted services. When a VNET/subnet is whitelisted, the Azure Storage account identifies the traffic's origin and its private IP address. However, the storage account itself is not integrated into the VNET/subnet; private endpoints are needed for that purpose.
Public DNS storage account: Storage accounts always have a public DNS entry that can be reached via network tooling, see also Azure Storage Account - Public Access Disabled - but still some level of connectivity - Microsoft Q&A. That is, even when public access is disabled in the storage account firewall, the public DNS entry remains.
Virtual Network (VNET): The network in which virtual machines are deployed. While a storage account is never deployed inside a VNET, the VNET can be whitelisted in the Azure storage firewall. Alternatively, a private endpoint can be created in the VNET for secure, private connectivity.
Service endpoints: When whitelisting a VNET/subnet in the storage account firewall, the service endpoint must be turned on for the VNET/subnet. The service endpoint must be Microsoft.Storage when the VNET and storage account are in the same region, or Microsoft.Storage.Global when the VNET and storage account are in different regions. Note that "service endpoints" is also used as an overarching term, encompassing both the whitelisting of a VNET/subnet in the Azure storage firewall and the enabling of the service endpoint on the VNET/subnet.
Private endpoints: A private endpoint places a network interface card (NIC) for the storage account inside the VNET where the virtual machine runs. This assigns the storage account a private IP address, making it part of the VNET.
Private DNS storage account: Inside a VNET, a private DNS zone can be created in which the storage account DNS name resolves to the private endpoint. This makes sure that the virtual machine can still connect to the URL of the storage account, and that this URL resolves to a private IP address rather than a public address.
Network Security Group (NSG): Deploy an NSG to limit inbound and outbound access of the VNET where the virtual machine runs. This can help prevent data exfiltration. However, an NSG works only with IP addresses or tags, not with URLs. For more advanced data exfiltration protection, use an Azure Firewall. For simplicity, this article omits that and uses an NSG to block outbound traffic.
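The public and private DNS entries described above can be checked from a client by resolving the storage account name and looking at the kind of address that comes back. The snippet below is a minimal sketch using only the Python standard library. The account name `mystorageaccount` is hypothetical, and `resolve_and_classify` needs DNS access, so its result depends on where it runs (inside a VNET with a private DNS zone, or elsewhere).

```python
import ipaddress
import socket


def storage_blob_fqdn(account_name: str) -> str:
    """Return the public blob endpoint name of a storage account."""
    return f"{account_name}.blob.core.windows.net"


def classify_ip(ip: str) -> str:
    """Return 'private' for private-range addresses (for example the NIC of a
    private endpoint inside a VNET) and 'public' for everything else."""
    return "private" if ipaddress.ip_address(ip).is_private else "public"


def resolve_and_classify(hostname: str) -> dict:
    """Resolve a hostname with the local resolver and classify each address.
    Needs DNS access; not called in the demo below."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    addresses = sorted({info[4][0] for info in infos})
    return {ip: classify_ip(ip) for ip in addresses}


if __name__ == "__main__":
    # Hypothetical account name and sample addresses, for illustration only.
    print(storage_blob_fqdn("mystorageaccount"))
    print("10.1.0.5 ->", classify_ip("10.1.0.5"))
    print("20.60.1.1 ->", classify_ip("20.60.1.1"))
```

Calling `resolve_and_classify(storage_blob_fqdn("mystorageaccount"))` from the virtual machine would be expected to return a private address when a private endpoint and private DNS zone are in place, and a public address otherwise.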
In the next chapter, service endpoints and private endpoints are discussed.
The chapter begins by exploring the scenario of unrestricted network access. Then the details of service endpoints and private endpoints are discussed with practical examples.
3.1 Not limiting network access — public access enabled
Suppose the following scenario, in which a virtual machine and a storage account are created. The firewall of the storage account has public access enabled, see image below.
Using this configuration, the virtual machine can access the storage account over the network. Since the virtual machine is also deployed in Azure, traffic goes over the Azure backbone and is accepted, see image below.
Enterprises typically establish firewall rules to limit network access. This involves either disabling public access or allowing only selected networks and whitelisting specific ones. The image below illustrates public access being disabled and traffic being blocked by the firewall.
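The three firewall settings discussed so far (public access enabled, selected networks, public access disabled) can be captured in a toy model. The sketch below is not the Azure API. `StorageFirewall`, its field names, and the subnet name `vnet-spark/subnet-vm` are all hypothetical and only mirror the behaviour described in this article.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StorageFirewall:
    """Toy model of a storage account firewall; not the Azure API."""
    public_access: str = "enabled"  # "enabled", "selected" or "disabled"
    allowed_ip_ranges: list = field(default_factory=list)  # whitelisted public CIDR ranges
    allowed_subnets: set = field(default_factory=set)      # whitelisted VNET/subnet names

    def allows(self, source_ip: str, source_subnet: Optional[str] = None) -> bool:
        """Decide whether traffic arriving at the public endpoint is accepted."""
        if self.public_access == "enabled":
            return True
        if self.public_access == "disabled":
            return False
        # "selected": accept whitelisted subnets or whitelisted IP ranges.
        if source_subnet is not None and source_subnet in self.allowed_subnets:
            return True
        ip = ipaddress.ip_address(source_ip)
        return any(ip in ipaddress.ip_network(cidr) for cidr in self.allowed_ip_ranges)


if __name__ == "__main__":
    vm_ip, vm_subnet = "10.0.0.4", "vnet-spark/subnet-vm"  # hypothetical values
    scenarios = (
        StorageFirewall("enabled"),
        StorageFirewall("disabled"),
        StorageFirewall("selected", allowed_subnets={vm_subnet}),
    )
    for firewall in scenarios:
        print(firewall.public_access, firewall.allows(vm_ip, vm_subnet))
```

In this model, `source_subnet` stands for the subnet identity that the storage account only sees when a service endpoint is enabled on that subnet.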
In the next paragraph, service endpoints and selected-network firewall rules are used to grant network access to the storage account again.
3.2 Limiting network access via Service endpoints
To give the VNET of the virtual machine access to the storage account, activate the service endpoint on the VNET. Use Microsoft.Storage within the same region or Microsoft.Storage.Global across regions. Next, whitelist the VNET/subnet in the storage account firewall. Traffic is then accepted again, see also image below.
Traffic is now accepted. When the VNET/subnet is removed from the Azure storage account firewall or public access is disabled, traffic is blocked again.
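The two configuration steps (enable the service endpoint on the subnet, then whitelist the subnet in the storage firewall) can be sketched as ARM-style property fragments. This is an illustrative sketch, not a complete template. The property names (`serviceEndpoints`, `networkAcls`, `virtualNetworkRules`) follow the ARM resource schemas as understood here and should be checked against the official reference. The subnet ID contains placeholders, and the function names are hypothetical.

```python
import json


def service_endpoint_name(vnet_region: str, storage_region: str) -> str:
    """Pick the service endpoint: same region vs. cross region."""
    same_region = vnet_region.strip().lower() == storage_region.strip().lower()
    return "Microsoft.Storage" if same_region else "Microsoft.Storage.Global"


def subnet_fragment(vnet_region: str, storage_region: str) -> dict:
    """Subnet properties that turn on the service endpoint (step 1)."""
    endpoint = service_endpoint_name(vnet_region, storage_region)
    return {"properties": {"serviceEndpoints": [{"service": endpoint}]}}


def storage_fragment(subnet_id: str) -> dict:
    """Storage account properties that whitelist the subnet (step 2)."""
    return {
        "properties": {
            "networkAcls": {
                "defaultAction": "Deny",  # only selected networks are accepted
                "virtualNetworkRules": [{"id": subnet_id, "action": "Allow"}],
            }
        }
    }


if __name__ == "__main__":
    # Placeholder resource ID; replace the bracketed parts with real values.
    subnet_id = (
        "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
        "/providers/Microsoft.Network/virtualNetworks/<vnet>/subnets/<subnet>"
    )
    print(json.dumps(subnet_fragment("westeurope", "northeurope"), indent=2))
    print(json.dumps(storage_fragment(subnet_id), indent=2))
```

Removing the entry from `virtualNetworkRules` corresponds to removing the VNET/subnet from the firewall, after which traffic is blocked again.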
If an NSG is used to block public outbound IPs in the VNET of the virtual machine, traffic is blocked again as well. This is because the public DNS of the storage account is used, see also image below.
In that case, private endpoints must be used to make sure that traffic does not leave the VNET. This is discussed in the next chapter.
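The effect of an outbound NSG can be illustrated with a small rule evaluator. This is a simplified sketch, not the real NSG engine. It matches only on destination CIDR prefixes (service tags, ports, and protocols are left out), the VNET address space `10.0.0.0/16` and the sample IP addresses are hypothetical, and the fallback for unmatched traffic is an assumption of this model.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class OutboundRule:
    priority: int     # lower number is evaluated first
    destination: str  # CIDR prefix, or "*" for any destination
    access: str       # "Allow" or "Deny"


def evaluate(rules, destination_ip: str) -> str:
    """Return the access decision of the first matching rule by priority."""
    ip = ipaddress.ip_address(destination_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.destination == "*" or ip in ipaddress.ip_network(rule.destination):
            return rule.access
    return "Deny"  # assumption of this sketch: unmatched traffic is denied


# Allow traffic that stays inside the (hypothetical) VNET, deny everything else.
RULES = [
    OutboundRule(priority=100, destination="10.0.0.0/16", access="Allow"),
    OutboundRule(priority=200, destination="*", access="Deny"),
]

if __name__ == "__main__":
    # Public IP of the storage account (service endpoint path): blocked.
    print("public storage IP:", evaluate(RULES, "20.60.1.1"))
    # Private IP inside the VNET (private endpoint path): allowed.
    print("private endpoint IP:", evaluate(RULES, "10.0.1.5"))
```

With these rules, a destination that resolves to a public address is denied, while a destination inside the VNET address space is still reachable.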
3.3 Limiting access via Private endpoints
To reestablish network access for the virtual machine to the storage account, use a private endpoint. This creates a network interface card (NIC) for the storage account within the VNET of the virtual machine, ensuring that traffic stays within the VNET. The image below provides further illustration.
Again, an NSG can be used to block all traffic, see image below.
This is, however, counterintuitive, since a private endpoint is first created in the VNET and traffic is then blocked by an NSG in the same VNET.
Enterprises always require network rules to limit network access to their storage accounts. In this blog post, both service endpoints and private endpoints are considered as ways to limit access.
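Both service endpoints and private endpoints share two properties discussed above: traffic stays on the Azure backbone, and the public DNS entry of the storage account remains.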
For service endpoints, the following holds:
- The service endpoint must be enabled on the VNET/subnet, and the VNET/subnet must be whitelisted in the Azure storage account firewall.
- Traffic leaves the VNET of the virtual machine that connects to the storage account. As noted above, the traffic still stays on the Azure backbone.
For private endpoints, the following holds:
- Public access can be disabled in the Azure Storage firewall. As noted above, the public DNS entry of the storage account remains.
- Traffic does not leave the VNET in which the virtual machine runs.
There are many other things to consider when choosing between service endpoints and private endpoints: costs, migration effort (service endpoints have been available longer than private endpoints), networking complexity when using private endpoints, limited service endpoint support in newer Azure services, and a hard limit of 200 private endpoints per storage account.
However, if it is a hard requirement ("must have") that 1) traffic never leaves the VNET/subnet of the virtual machine, or 2) no firewall rules may be created in the Azure storage firewall and it must be fully locked down, then service endpoints are not feasible.
In other scenarios, both solutions can be considered, and the best fit should be determined based on the specific requirements of each scenario.