How to get Tenant ID from Subscription ID in Azure using MSAL

This is a series of blog posts:

  • Part 1: using PowerShell
  • Part 2: using ADAL
  • Part 3: using MSAL

First you need to install the AAD client NuGet package. Note that this is MSAL, the modern and recommended way to communicate with AAD.

<PackageReference Include="Microsoft.Identity.Client" Version="4.36.1" />

Then use one of its helper methods:

using System;
using System.Net.Http;
using System.Text.RegularExpressions;
using Microsoft.Identity.Client;

// `subscription` and `cancellationToken` are assumed to be defined by the caller.
var hostName = "management.azure.com";
var apiVersion = "2020-08-01";
var requestUrl = $"https://{hostName}/subscriptions/{subscription}?api-version={apiVersion}";
var httpClient = new HttpClient();
// The call is unauthenticated on purpose: the response carries the WWW-Authenticate header.
var response = await httpClient.GetAsync(requestUrl, cancellationToken);

var authenticationParameters = WwwAuthenticateParameters.CreateFromResponseHeaders(response.Headers);

// Capture the last path segment of the authority, tolerating a trailing slash.
var authorizationHeaderRegex = new Regex(@"https://.+/([^/]+)/?$", RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.IgnoreCase);
var match = authorizationHeaderRegex.Match(authenticationParameters.Authority);
var tenantString = match.Success ? match.Groups[1].Value : null;

if (!Guid.TryParse(tenantString, out var tenantId))
{
    throw new InvalidOperationException($"Received tenant id '{tenantString}' is not a valid GUID");
}

Console.WriteLine(tenantId);

This helper isn’t async and lets you write less code. You still need to parse the tenant id out of the authorization URI, though.

You can find the code here: https://dotnetfiddle.net/M7paDG.
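The parsing step could also be done without a regular expression. Below is a minimal sketch, assuming the authority always has the shape https://<host>/<tenant-guid> with an optional trailing slash. TryGetTenantId is a hypothetical helper name of mine, not part of MSAL, and the GUID is the sample tenant id from this series.

```csharp
using System;

// Hypothetical helper (not part of MSAL): extracts the tenant GUID from an
// authority URI such as "https://login.windows.net/<tenant-guid>".
// Returns null when the input is not an absolute URI or the path is not a GUID.
static Guid? TryGetTenantId(string authority)
{
    if (!Uri.TryCreate(authority, UriKind.Absolute, out var uri))
    {
        return null;
    }

    // AbsolutePath is "/<tenant>" or "/<tenant>/"; trim the slashes before parsing.
    var tenantSegment = uri.AbsolutePath.Trim('/');
    return Guid.TryParse(tenantSegment, out var tenantId) ? tenantId : (Guid?)null;
}

Console.WriteLine(TryGetTenantId("https://login.windows.net/e0a3d130-92db-4546-9813-45dd621f8379/"));
```

Returning null instead of throwing leaves it to the caller to decide whether a non-GUID authority (for example one ending in "common") is an error.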

Posted in Programming

How to get Tenant ID from Subscription ID in Azure using ADAL

This is a series of blog posts:

  • Part 1: using PowerShell
  • Part 2: using ADAL
  • Part 3: using MSAL

In the previous part we did this using a script; this time we’ll do it using C#.

First you need to install the AAD client NuGet package. Note that this is ADAL, which is now legacy and in maintenance mode.

<PackageReference Include="Microsoft.IdentityModel.Clients.ActiveDirectory" Version="5.2.9" />

Then use one of its helper methods:

using System;
using System.Net.Http;
using System.Text.RegularExpressions;
using Microsoft.IdentityModel.Clients.ActiveDirectory;

// `subscription` and `cancellationToken` are assumed to be defined by the caller.
var hostName = "management.azure.com";
var apiVersion = "2020-08-01";
var requestUrl = $"https://{hostName}/subscriptions/{subscription}?api-version={apiVersion}";
var httpClient = new HttpClient();
// The call is unauthenticated on purpose: the response carries the WWW-Authenticate header.
var response = await httpClient.GetAsync(requestUrl, cancellationToken);

var authenticationParameters = await AuthenticationParameters.CreateFromUnauthorizedResponseAsync(response);

// Capture the last path segment of the authority, tolerating a trailing slash.
var authorizationHeaderRegex = new Regex(@"https://.+/([^/]+)/?$", RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.IgnoreCase);
var match = authorizationHeaderRegex.Match(authenticationParameters.Authority);
var tenantString = match.Success ? match.Groups[1].Value : null;

if (!Guid.TryParse(tenantString, out var tenantId))
{
    throw new InvalidOperationException($"Received tenant id '{tenantString}' is not a valid GUID");
}

Console.WriteLine(tenantId);

You can find the code here: https://dotnetfiddle.net/M7paDG.

One of the drawbacks is that the helper method is async without a real need to be: underneath it calls another async helper which reads the content of the response but then it doesn’t use the content.

So you can write a little more code yourself and avoid the penalty of making the parsing step async:

// `subscription` and `cancellationToken` are assumed to be defined by the caller.
var hostName = "management.azure.com";
var apiVersion = "2020-08-01";
var requestUrl = $"https://{hostName}/subscriptions/{subscription}?api-version={apiVersion}";
var httpClient = new HttpClient();
var response = await httpClient.GetAsync(requestUrl, cancellationToken);

// Synchronous overload: parses the header value directly instead of reading the response body.
var authenticationParameters = AuthenticationParameters.CreateFromResponseAuthenticateHeader(response.Headers.WwwAuthenticate.ToString());

// Capture the last path segment of the authority, tolerating a trailing slash.
var authorizationHeaderRegex = new Regex(@"https://.+/([^/]+)/?$", RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.IgnoreCase);
var match = authorizationHeaderRegex.Match(authenticationParameters.Authority);
var tenantString = match.Success ? match.Groups[1].Value : null;

if (!Guid.TryParse(tenantString, out var tenantId))
{
    throw new InvalidOperationException($"Received tenant id '{tenantString}' is not a valid GUID");
}

Console.WriteLine(tenantId);

You can find the code here: https://dotnetfiddle.net/kagSAK.

Posted in Programming

How to get Tenant ID from Subscription ID in Azure using PowerShell

This is a series of blog posts:

  • Part 1: using PowerShell
  • Part 2: using ADAL
  • Part 3: using MSAL

In order to do this, you’ll need to:

  1. Call the Azure Resource Manager API for the subscription without authentication
  2. Inspect the WWW-Authenticate header of the response
  3. Parse the tenant id out of the authorization URI

Here’s how to do this using PowerShell:

$hostName = 'management.azure.com'
$apiVersion = '2020-08-01'
$response = try { Invoke-RestMethod -Method GET "https://$hostName/subscriptions/$subscription/?api-version=$apiVersion" } catch [System.Net.WebException] { $_.Exception.Response }
$header = $response.Headers['WWW-Authenticate']
$match = $header | Select-String -Pattern 'Bearer authorization_uri="https://.+/(.+?)"'
$tenantId = $match.Matches[0].Groups[1].Value
$tenantId 

The code above can be wrapped into a neat function:

function GetTenantIdFromSubscriptionId($subscription, $url = 'management.azure.com', $apiVersion = '2020-08-01') {
  try { Invoke-RestMethod -Method GET "https://$url/subscriptions/$subscription/?api-version=$apiVersion" } catch [System.Net.WebException] {
    ($_.Exception.Response.Headers['WWW-Authenticate'] | Select-String -Pattern 'Bearer authorization_uri="https://.+/(.+?)"').Matches[0].Groups[1].Value
  }
}

When called with c3c0a359-4420-4f84-8925-f642e2717296, it outputs e0a3d130-92db-4546-9813-45dd621f8379.

That’s it, folks!

Posted in Programming

Carnation Anapa Winery, vol 3, day 153: corking

Today I’m bottling my wine. I had a 6-gallon carboy that went down to about 5 gallons during the initial testing.

In the first batch I bottled 10 bottles. Each contains about 15g of water in which I dissolved about 0.8g of potassium metabisulfite total. In the second batch, 9 more.

Posted in Winemaking

Following circular nested profile path identified

If you’re getting the following error:

Circular nested profile definitions are not allowed. Following circular nested profile path identified: example.trafficmanager.net -> example.trafficmanager.net.

then you very likely have an ARM template like this:

{
  "type": "Microsoft.Network/trafficManagerProfiles/nestedEndpoints",
  "apiVersion": "[variables('tmApiVersion')]",
  "name": "[concat(variables('tmName'), '/', parameters('location'))]",
  "properties": {
    "endpointStatus": "Enabled",
    "targetResourceId": "[resourceId('Microsoft.Network/trafficManagerProfiles', variables('tmName'))]",
    "weight": 1,
    "minChildEndpoints": 1,
    "geoMapping": [
      "GEO-NA"
    ]
  }
}

This means you created a Traffic Manager profile with Geographic traffic routing whose nested endpoint points back at the profile itself. Hence the error.

Posted in Infrastructure

How to get secret from Key Vault using PowerShell and Managed Identity

First you need to acquire a token using Managed Identity by calling the local IMDS endpoint:

$audience = 'https://vault.azure.net'
$token = Invoke-RestMethod -Method GET -Uri "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=$audience" -Headers @{ 'Metadata' = 'true' }

Note that the audience must match the service you’re calling; for example, it is different from the one used when calling ARM.
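To make the audience point concrete, here is a small C# sketch that builds the same IMDS token URL for two audiences. BuildImdsTokenUrl is a hypothetical helper of mine, and escaping the resource value is my addition (the PowerShell above passes it as-is). The ARM audience shown is the commonly used https://management.azure.com/ resource URI.

```csharp
using System;

// Hypothetical helper: builds the IMDS token request URL for a given audience.
// The endpoint and api-version are the ones used in the PowerShell snippet above.
static string BuildImdsTokenUrl(string audience) =>
    "http://169.254.169.254/metadata/identity/oauth2/token" +
    "?api-version=2018-02-01&resource=" + Uri.EscapeDataString(audience);

// Key Vault and ARM need different audiences, hence different tokens.
Console.WriteLine(BuildImdsTokenUrl("https://vault.azure.net"));
Console.WriteLine(BuildImdsTokenUrl("https://management.azure.com/"));
```

A token requested for one audience is not expected to be accepted by the other service, so the audience has to be chosen per call.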

Then call Key Vault REST API to get the secret:

$secret = "https://$vaultName.vault.azure.net/secrets/$secretName/?api-version=7.0"
$auth = "$($token.token_type) $($token.access_token)"
Invoke-RestMethod -Method GET -Uri $secret -Headers @{ 'Authorization' = $auth }

That’s it, folks!

Posted in Programming

Reliable and scalable infrastructure: Traffic

This is a series of posts:

  1. Introduction
  2. Principles
  3. Layers
  4. Traffic (this post)
  5. Secrets

Now you have multiple environments, each consisting of multiple data centers, each consisting of multiple scale units. How do you wire them all up together to be well prepared for a disaster?

There are various kinds of services (stateless and stateful), and so are the patterns of traffic (inbound and outbound) they serve. I’m lucky enough to work with mostly stateless services that serve inbound traffic. That is, there is no state per se, and the data to be processed is the HTTP requests coming from users over the internet. Namely, I work on an ARM resource provider (RP) for Azure Device Update (ADU). Below I’ll explain how to use Azure Traffic Manager (ATM, or colloquially just TM) to route traffic to this kind of service. Other kinds might require a different model.

The model I’m proposing here is rooted in two aspects described earlier:

  • Each data center has multiple scale units
  • Each data center has its failover pair

First, the reliability of a data center. TM works just fine routing traffic to a single scale unit in a data center while being ready for a second one to be stood up and added to the rotation. This is thanks to the probes that run periodically (the interval is easily configurable), check each endpoint, and mark it active or not. The priority mode suits this option best: the first endpoint would have priority 10 and the second would have 20. The numbers are arbitrary, but you get the idea. Endpoints with a higher priority number kick in only when those with a lower number are down.

Normally, if you have just one cluster up and running, the second endpoint will always be inactive and traffic will always be served by the cluster behind the first endpoint. In an emergency, if you have to delete that cluster, you create another one. Its DNS name and IP are known in advance and already preconfigured on the TM profile. This way you won’t need to change anything and it’ll start serving traffic immediately.
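The priority behavior described above can be modeled in a few lines of C#. This is a simplified sketch of the selection rule only, not how Traffic Manager is implemented. PickEndpoint and the endpoint names are hypothetical, and the Healthy flag stands in for the result of the periodic probes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified model of priority routing: among the endpoints whose probe
// succeeded, pick the one with the lowest priority number.
// Returns null when no endpoint is healthy.
static string PickEndpoint(IEnumerable<(string Name, int Priority, bool Healthy)> endpoints) =>
    endpoints
        .Where(e => e.Healthy)
        .OrderBy(e => e.Priority)
        .Select(e => e.Name)
        .FirstOrDefault();

// The first cluster was deleted, so its endpoint is unhealthy and the
// preconfigured second endpoint takes over.
var endpoints = new[]
{
    (Name: "cluster-1", Priority: 10, Healthy: false),
    (Name: "cluster-2", Priority: 20, Healthy: true),
};
Console.WriteLine(PickEndpoint(endpoints));
```

Because the second endpoint is already part of the profile, failing over requires no configuration change, only a healthy probe result.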

Another option is to have two endpoints and two clusters always up, running, and serving traffic. It’s needed when technical limitations or other considerations mean that one cluster is not enough. In this case the weighted mode with the same weight for both clusters works well.

You’ve secured the reliability of a single region: one cluster goes down, another takes its place and continues to serve traffic. Now let’s shift the focus and see what happens if not just one cluster but the whole region goes down. This is less likely to happen unless a really bad deployment takes place, more likely your own than not.

Azure has grouped regions into so-called failover pairs. This means that, by contract, there won’t be a deployment to both regions simultaneously and at least one will stay healthy. For you, that means you can have another TM profile with two endpoints in the priority mode:

  1. The first endpoint is in Region A, e.g. West US with DNS westus.service.example.com
  2. The second endpoint is in Region B, e.g. East US with DNS eastus.service.example.com

If Region A is completely down, which would happen only when all clusters in that region became unavailable for some reason, only then will traffic be routed to Region B. This has its own complications, such as increased latency and increased load on what’s now a single region with doubled traffic, which again increases latency. But serving customers slowly is better than not serving them at all.

Posted in Infrastructure

How to assign permissions for user-assigned managed identity on multiple subscriptions in bulk

First get the subscriptions you want to assign permissions on:

$subs = Get-AzSubscription |? { $_.Name.Contains("NorthAmerica") }

Then get the identity you want to assign permissions for (its client id is used in the next step):

$id = Get-AzUserAssignedIdentity -ResourceGroupName my-shared-prod-westus2 `
                                 -Name my-shared-prod-westus2-id

Now perform the actual permissions assignment:

$subs |% { New-AzRoleAssignment -Scope "/subscriptions/$($_.Id)" `
                                -RoleDefinitionName "Contributor" `
                                -ApplicationId $id.ClientId }

That’s it, folks!

Posted in Programming

Reliable and scalable infrastructure: Layers

This is a series of posts:

  1. Introduction
  2. Principles
  3. Layers (this post)
  4. Traffic
  5. Secrets

When designing your service’s infrastructure, you need to remember that your deployment (or scale, more below) unit can go down at any point in time for any period of time. And it doesn’t matter what the underlying technology is, whether it’s a Service Fabric cluster, a Kubernetes cluster, or a WebForms application running on Azure Websites, aka App Service.

Usually a deployment is to blame, whether it was you or your upstream dependency. Behind a deployment usually there was a change. Then there was a mistake. And then a human being.

A maxim I learned in college (I’m paraphrasing here from Russian, though) says:

Any bug you find is at least the next-to-last one.

Because human engineers tend to make mistakes while making changes, there will always be one more bug out there.

What can’t you do? Change human nature. What can you do, though? Prepare yourself and your service’s infrastructure for failure.

Let’s consider two scenarios when your deployment has failed:

  • It has failed and the service is now in an unrecoverable state, so you have to delete everything in order to start from scratch. For example, subsequent deployments fail with 500 because an upstream dependency fails.
  • It has failed and the service is in an unrecoverable state, but you cannot delete everything in order to start from scratch because something blocks you. For example, a security incident has occurred and the security team asks you not to touch anything. Or the service team needs time to investigate the reasons for the failure and asks you not to change anything.

What do you do in either case? The answer lies in how you should have modeled your infrastructure to be better prepared.

Let’s divide infrastructure into multiple layers, each with its own role, lifecycle, and security and compliance boundaries. Often each layer also corresponds to its own set of secrets (certificates, mostly) that are shared downwards but isolated upwards.

  • Cross-cloud
  • Cloud
  • Global
  • Environment
  • Data center
  • Scale unit

Let’s describe and explain each of them. The terminology is mine and might diverge from similar but more widely accepted terms in the industry. I’m happy to adjust it based on feedback.

Cross-cloud. Super global across all clouds. Everything that happens over the public Internet. The best example would be public DNS and email. Even sovereign (national) clouds use both the public Internet and DNS, unless we’re talking about air-gapped solutions.

Cloud. Super global within a cloud and across its environments. Same as above, but different clouds are now isolated from each other. However, there is still no isolation between environments. It should be used relatively rarely and not be considered a permanent solution unless it’s strictly necessary or anything else is impossible. But even so you should immediately start seeking a way to escape it. An example would be a secret for an external monitoring mechanism when all environments and endpoints are monitored by a single external service.

Global. Considering the existence of the prior two layers, it’s not universally global. But it divides the plane into two principal parts that provide the minimum necessary separation: production and pre-production. An example would be a secret for an AAD application, which has Prod and PPE versions. Or the root DNS zone service.example.com.

Environment. Separated from one another by various physical boundaries, share nothing in common. For example, the Integration environment uses DNS zone int.service.example.com while the Test environment uses test.service.example.com.

Data center. In other words, a region in a cloud. Represents all the resources and secrets that are necessary to serve traffic (or do other work) in a particular geographical location but are not part of a scale unit (see below). This means that these resources and secrets will be created before a scale unit is created and will continue to exist if a scale unit is deleted. Each environment would consist of one or more such data centers. They can be further grouped into pairs or subdivided into availability zones. The candidate resource types would be Key Vaults (you don’t want to recreate secrets every time), Managed Identities (for the same reason), IPs (created once, they act as static), regional DNS records (e.g. westus2.int.service.example.com), and the Traffic Manager profiles these DNS records are CNAMEs to.

Scale unit. The smallest unit of deployment. The on-prem analogue would be a server; in the cloud it’s a VM scale set, a Service Fabric cluster, a Kubernetes cluster, etc. It groups all the resources needed to create such a cluster. These resources should be deleted and recreated all together if something goes wrong. Each data center would consist of one or more such scale units. The reasons for creating more than one would be scalability, when one cluster is not enough to sustain the load, and reliability, when one goes down and you cannot fail traffic over out of the region.
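One way to see how the layers nest is through the DNS names used as examples above. The C# sketch below derives them from the layer labels. BuildDnsName is a hypothetical helper and the naming scheme is only the one implied by the examples in this post, not a prescribed convention.

```csharp
using System;
using System.Linq;

// Hypothetical helper: joins the non-empty layer labels, most specific first,
// so each lower layer extends the name of the layer above it.
static string BuildDnsName(string rootZone, string environment = null, string region = null) =>
    string.Join(".", new[] { region, environment, rootZone }
        .Where(label => !string.IsNullOrEmpty(label)));

Console.WriteLine(BuildDnsName("service.example.com"));                   // Global
Console.WriteLine(BuildDnsName("service.example.com", "int"));            // Environment
Console.WriteLine(BuildDnsName("service.example.com", "int", "westus2")); // Data center
```

A scale unit does not get its own label in this scheme because its endpoints sit behind the regional record through Traffic Manager.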

To be continued…

Posted in Infrastructure

Reliable and scalable infrastructure: Principles

This is a series of posts:

  1. Introduction
  2. Principles (this post)
  3. Layers
  4. Traffic
  5. Secrets

First and foremost, you have to treat your service’s infrastructure as you treat your service’s code. In other words, as infrastructure-as-code. This may include techniques that are now common in general engineering processes, such as:

  • Gated build. Each change is built and verified. If this is an ARM template, you can run Test-AzResourceGroupDeployment.
  • Gated deployment. Each change is not just synthetically validated for syntax correctness but actually deployed to a test cluster, alongside the basic infrastructure services if possible, which together helps ensure the changes are valid and functional.
  • Continuous Integration (CI). Each change is immediately merged into the main branch and a ready-for-production build is produced.
  • Continuous Delivery (CD). Each build is immediately deployed to an early test environment and the appropriate tests are performed. Then to another environment, then another.
  • Safe Deployment Practices (SDP). Each build is not deployed to all available environments simultaneously but instead is slowly rolled out across environments and regions. They are grouped by kind (prod, pre-prod), geography (North America, Europe, Asia), type of customers (internal, partners, public), and so on.
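The staged rollout idea can be sketched in C# as a loop that stops at the first failing stage. RollOut, the stage names, and the deployAndVerify callback are all hypothetical. A real pipeline would also wait between stages and watch health signals, which this sketch omits.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical rollout loop: deploys stage by stage, in order, and stops at the
// first failure so later stages never receive a bad build.
// Returns the stages that were deployed and verified successfully.
static List<string> RollOut(IEnumerable<string> stages, Func<string, bool> deployAndVerify)
{
    var completed = new List<string>();
    foreach (var stage in stages)
    {
        if (!deployAndVerify(stage))
        {
            break;
        }
        completed.Add(stage);
    }
    return completed;
}

// Stages ordered from least to most critical; the grouping here is illustrative.
var stages = new[] { "pre-prod / internal", "prod / North America", "prod / Europe", "prod / Asia" };

// Simulate a failed verification in Europe: Asia is never touched.
var done = RollOut(stages, stage => stage != "prod / Europe");
Console.WriteLine(string.Join(", ", done));
```

Ordering the stages from least to most critical is what limits the blast radius of a bad build.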

You may refer to the Build and Deployment section of the Twelve-Factor App for more ideas on what the CI/CD process for both your services and infrastructure should look like.

Employing these and other techniques will help you to achieve multiple goals:

  • Increase the confidence in the changes
  • Increase the overall quality of the infrastructure by decreasing the number of errors slipping into production
  • Allow you to catch issues early in the rollout
  • Decrease the overall time-to-production, the total time it takes for a new feature or a fix to reach the target environment

To be continued…

Posted in Infrastructure