Most Entra ID outages I get paged for trace back to one thing: a client secret that expired, leaked into a log, or got copy-pasted into a fourth pipeline. This is the soup-to-nuts build I use for a confidential OIDC client — the app registration, the authorization code flow, least-privilege Graph scopes — and then the part that actually matters, replacing every secret and certificate with workload identity federation.
1. App registration anatomy: application vs service principal
The single most common confusion in Entra ID is treating the application and the service principal as one object. They are not.
- The application object (/applications) is the global definition: redirect URIs, exposed scopes, the requested permissions, the federated credentials. It lives in your home tenant and is referenced by appId (the client ID).
- The service principal (/servicePrincipals) is the local instance of that application in a tenant. Role assignments, admin consent grants, and sign-in policy attach here, keyed by the object ID of the SP.
When you register an app in your own tenant, Entra creates both. Multi-tenant apps get one application object in the home tenant and a service principal in every tenant that consents.
# Create the application object. This returns appId (client ID) and id (object ID).
az ad app create \
--display-name "kv-oidc-confidential-client" \
--sign-in-audience AzureADMyOrg
APP_ID=$(az ad app list --display-name "kv-oidc-confidential-client" --query "[0].appId" -o tsv)
# Materialize the service principal in this tenant.
az ad sp create --id "$APP_ID"
Pick sign-in-audience deliberately. AzureADMyOrg is single-tenant. AzureADMultipleOrgs and AzureADandPersonalMicrosoftAccount widen who can sign in and change token validation rules downstream. Default to single-tenant unless you have a real multi-tenant requirement.
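To see the application/service principal split for yourself, a small sketch like the one below prints both object IDs for one appId. It assumes the az CLI is installed and logged in; show_object_ids is a helper name I made up for illustration.

```shell
# Print the object IDs of the application and of its service principal.
# Both are looked up by the same appId, yet the two object IDs differ.
show_object_ids() {
  local app_id="$1" app_oid sp_oid
  app_oid=$(az ad app show --id "$app_id" --query id -o tsv) || return 1
  sp_oid=$(az ad sp show --id "$app_id" --query id -o tsv) || return 1
  printf 'application object id:       %s\n' "$app_oid"
  printf 'service principal object id: %s\n' "$sp_oid"
}
```

Call it as show_object_ids "$APP_ID". The second value is the one role assignments and Conditional Access policy attach to.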
Redirect URIs and the manifest
Redirect URIs are typed by platform. A confidential web app uses the web platform; SPAs use spa (which forces PKCE and forbids secrets). Set the web redirect URI and require HTTPS — Entra rejects plain http redirect URIs except for localhost.
az ad app update --id "$APP_ID" \
--web-redirect-uris "https://app.kloudvin.com/auth/callback"
The old “manifest” you edited by hand in the portal is just the JSON representation of the application object, now aligned with the Microsoft Graph schema. Prefer Graph or az ad app update over manual manifest edits; the manifest is easy to corrupt and offers no validation.
2. The OIDC authorization code flow with PKCE
For a confidential client doing interactive user sign-in, the authorization code flow with PKCE is the only flow you should ship. Implicit flow is dead; do not enable access or ID tokens on the implicit grant.
The flow, end to end:
- App generates a code_verifier (random 43-128 chars) and derives code_challenge = BASE64URL(SHA256(code_verifier)).
- Browser is redirected to /authorize with response_type=code, code_challenge, and code_challenge_method=S256.
- User authenticates; Entra returns an authorization code to the redirect URI.
- Backend exchanges the code at /token, presenting the code_verifier and its client credential.
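The first step of that flow is easy to get subtly wrong (padding, the wrong base64 alphabet), so here is a minimal sketch of it. It assumes openssl is on the PATH; the function names are mine, not part of any SDK.

```shell
# Base64url-encode stdin without padding, as PKCE (RFC 7636) requires.
b64url() {
  openssl base64 -A | tr '+/' '-_' | tr -d '='
}

# 32 random bytes encode to a 43-character verifier, the minimum allowed length.
pkce_verifier() {
  openssl rand 32 | b64url
}

# code_challenge = BASE64URL(SHA256(code_verifier)) for the S256 method.
pkce_challenge() {
  printf '%s' "$1" | openssl dgst -sha256 -binary | b64url
}
```

Typical use is CODE_VERIFIER=$(pkce_verifier) followed by CODE_CHALLENGE=$(pkce_challenge "$CODE_VERIFIER"). Keep the verifier server-side in the user's session; only the challenge goes to the browser.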
The authorization request:
GET https://login.microsoftonline.com/{tenant}/oauth2/v2.0/authorize
?client_id={APP_ID}
&response_type=code
&redirect_uri=https://app.kloudvin.com/auth/callback
&response_mode=query
&scope=openid%20profile%20offline_access%20User.Read
&code_challenge={code_challenge}
&code_challenge_method=S256
&state={opaque_state}
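If you assemble that request by hand rather than through a library, the query values need percent-encoding. The sketch below builds the same URL from its parts; urlencode and authorize_url are illustrative helpers of my own, and the encoder assumes ASCII input.

```shell
# Percent-encode a string for use in a query parameter (ASCII input assumed).
urlencode() {
  local s out rest c
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    case $c in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
    s=$rest
  done
  printf '%s' "$out"
}

# authorize_url TENANT CLIENT_ID REDIRECT_URI SCOPE CODE_CHALLENGE STATE
authorize_url() {
  printf 'https://login.microsoftonline.com/%s/oauth2/v2.0/authorize' "$1"
  printf '?client_id=%s&response_type=code' "$(urlencode "$2")"
  printf '&redirect_uri=%s&response_mode=query' "$(urlencode "$3")"
  printf '&scope=%s' "$(urlencode "$4")"
  printf '&code_challenge=%s&code_challenge_method=S256' "$5"
  printf '&state=%s\n' "$(urlencode "$6")"
}
```

The code_challenge is passed through as-is because base64url output is already URL-safe. Generate state randomly per request and compare it on the callback before you touch the code.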
The token exchange. Note this is the one place a confidential client authenticates itself — and the part we eliminate later:
curl -s -X POST \
"https://login.microsoftonline.com/${TENANT_ID}/oauth2/v2.0/token" \
-d "client_id=${APP_ID}" \
-d "grant_type=authorization_code" \
-d "code=${AUTH_CODE}" \
-d "redirect_uri=https://app.kloudvin.com/auth/callback" \
-d "code_verifier=${CODE_VERIFIER}" \
-d "scope=openid profile offline_access User.Read" \
-d "client_secret=${CLIENT_SECRET}"
PKCE protects the authorization code in transit; the client credential proves the caller is the registered confidential client. You want both. Request offline_access only if you genuinely need a refresh token.
3. Delegated vs application permissions and least privilege
Two permission models, and conflating them is a recurring security finding in reviews I run.
| | Delegated | Application |
|---|---|---|
| Acts as | Signed-in user (app + user) | The app itself, no user |
| Token via | Auth code / OBO flow | Client credentials flow |
| Effective access | Intersection of app perms and user’s rights | Exactly what’s granted, tenant-wide |
| Consent | User or admin | Admin only |
The trap with application permissions is that they ignore the user entirely. User.Read.All (application) reads every user in the tenant. Grant the narrowest scope that works, and prefer delegated where a user is present.
# Microsoft Graph resource appId is well-known and constant.
GRAPH_APP_ID="00000003-0000-0000-c000-000000000000"
# Add the delegated User.Read scope (the id below is the stable id for User.Read).
az ad app permission add --id "$APP_ID" \
--api "$GRAPH_APP_ID" \
--api-permissions e1fe6dd8-ba31-4d61-89e7-88639da4683d=Scope
# Grant admin consent (writes the grant onto the service principal).
az ad app permission admin-consent --id "$APP_ID"
=Scope denotes a delegated permission; =Role denotes an application permission. Get this suffix wrong and you grant the wrong class of access. Always confirm the resulting grant on the SP, not just the request on the app.
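One cheap guard against a =Scope/=Role slip is to list what the application object actually requests and flag anything typed Role. This is a sketch, assuming a logged-in az CLI; requested_app_roles is a hypothetical helper name.

```shell
# Print the permission ids this application requests as application
# permissions (type "Role"). Empty output means only delegated scopes.
requested_app_roles() {
  az ad app show --id "$1" \
    --query "requiredResourceAccess[].resourceAccess[].[id, type]" -o tsv |
    awk -F'\t' '$2 == "Role" { print $1 }'
}
```

Run requested_app_roles "$APP_ID" after every permission change. This only inspects the request on the application object, so it complements, rather than replaces, checking the grant on the service principal.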
4. Why client secrets and certificates are the problem
A confidential client needs to prove who it is. The classic options:
- Client secrets: a password. They expire (Entra now caps new secrets at 24 months), they end up in CI variables, .env files, and tickets, and rotation is a manual, error-prone choreography across every consumer.
- Certificates: better, because you can keep the private key in a KMS/HSM. But you still own a key with a lifecycle: issuance, distribution, rotation, revocation, and a CA to trust.
Both create the same systemic failures at scale: rotation toil, credential sprawl across pipelines, and standing leakage risk — a long-lived secret in a build log is exploitable for months.
The answer for machine-to-machine and CI scenarios is to stop storing a credential at all. With workload identity federation, an external identity provider you already trust (GitHub Actions, a Kubernetes cluster, AWS, GCP, or any OIDC IdP) mints a short-lived token, and Entra exchanges it for an Entra access token. No secret on the app. Nothing to rotate.
5. Configuring workload identity federation
You attach a federated identity credential (FIC) to the application object. It declares a trust: “I will accept tokens from this issuer, with this subject, for this audience.” Up to 20 FICs per application.
The three fields that define trust:
- issuer: the external IdP’s OIDC issuer URL (must serve /.well-known/openid-configuration).
- subject: the exact sub claim Entra requires in the incoming token. This is the scoping lever.
- audiences: what the external token’s aud must be. For Entra, api://AzureADTokenExchange.
az ad app federated-credential create --id "$APP_ID" --parameters '{
"name": "github-main-deploy",
"issuer": "https://token.actions.githubusercontent.com",
"subject": "repo:kloudvin/platform:ref:refs/heads/main",
"audiences": ["api://AzureADTokenExchange"]
}'
The subject is matched exactly, not by prefix. repo:kloudvin/platform:ref:refs/heads/main does not match a PR or a different branch. That is the feature: it pins the credential to one trust boundary.
For Terraform, the same object with the azuread provider:
resource "azuread_application_federated_identity_credential" "gha_main" {
application_id = azuread_application.oidc_client.id
display_name = "github-main-deploy"
description = "GitHub Actions, main branch"
issuer = "https://token.actions.githubusercontent.com"
subject = "repo:kloudvin/platform:ref:refs/heads/main"
audiences = ["api://AzureADTokenExchange"]
}
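Because the subject is compared as an exact string, it helps to see how differently GitHub shapes the sub claim depending on what triggered the run. The sketch below reproduces the common shapes and applies the same exact-match rule; gh_subject and subject_matches are illustrative names of my own, not anything Entra or GitHub ships.

```shell
# Build the sub claim GitHub Actions emits for a given trigger context.
# gh_subject ORG/REPO KIND [VALUE]
gh_subject() {
  local repo="$1" kind="$2" value="${3:-}"
  case "$kind" in
    branch)       printf 'repo:%s:ref:refs/heads/%s\n' "$repo" "$value" ;;
    tag)          printf 'repo:%s:ref:refs/tags/%s\n' "$repo" "$value" ;;
    environment)  printf 'repo:%s:environment:%s\n' "$repo" "$value" ;;
    pull_request) printf 'repo:%s:pull_request\n' "$repo" ;;
    *) echo "unknown kind: $kind" >&2; return 1 ;;
  esac
}

# The subject configured on the federated credential above.
FIC_SUBJECT="repo:kloudvin/platform:ref:refs/heads/main"

# Exact string comparison: no prefix or wildcard matching.
subject_matches() {
  [ "$1" = "$FIC_SUBJECT" ]
}
```

For example, subject_matches "$(gh_subject kloudvin/platform branch main)" succeeds, while the environment and pull_request variants of the same repo fail. A job that targets a GitHub environment gets the environment form, even when it runs from main.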
6. Wiring an external token to the federated credential
The exchange uses the OAuth 2.0 client credentials grant with a client_assertion — but the assertion is the external IdP’s token, not something you sign.
GitHub Actions
Grant the workflow the id-token: write permission, then let the official login action fetch the OIDC token and exchange it:
permissions:
id-token: write
contents: read
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: azure/login@v2
with:
client-id: ${{ vars.AZURE_CLIENT_ID }}
tenant-id: ${{ vars.AZURE_TENANT_ID }}
subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}
- run: az account show
No client-secret. The action requests a GitHub OIDC token with aud=api://AzureADTokenExchange and POSTs it to Entra. The raw exchange, if you ever do it by hand:
curl -s -X POST \
"https://login.microsoftonline.com/${TENANT_ID}/oauth2/v2.0/token" \
-d "client_id=${APP_ID}" \
-d "grant_type=client_credentials" \
-d "scope=https://graph.microsoft.com/.default" \
-d "client_assertion_type=urn:ietf:params:oauth:client-assertion-type:jwt-bearer" \
-d "client_assertion=${GITHUB_OIDC_TOKEN}"
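For completeness, this is roughly where GITHUB_OIDC_TOKEN would come from in a hand-rolled job. It is a sketch that assumes it runs inside a workflow with id-token: write, where the runner sets ACTIONS_ID_TOKEN_REQUEST_URL and ACTIONS_ID_TOKEN_REQUEST_TOKEN; the function name is mine, and the sed extraction stands in for a proper JSON parser such as jq.

```shell
# Ask the GitHub runner for an OIDC token with the audience Entra expects.
# The endpoint answers with JSON whose "value" field holds the JWT.
github_oidc_token() {
  local audience="${1:-api://AzureADTokenExchange}"
  curl -sS -H "Authorization: bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}" \
    "${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=${audience}" |
    sed -n 's/.*"value" *: *"\([^"]*\)".*/\1/p'
}
```

Then GITHUB_OIDC_TOKEN=$(github_oidc_token) feeds the exchange. In practice I would still let azure/login do this; the sketch is mainly useful when debugging what the runner actually mints.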
Kubernetes
For a pod, trust the cluster’s service-account-issuer and pin the subject to the namespace and service account:
az ad app federated-credential create --id "$APP_ID" --parameters '{
"name": "aks-payments-sa",
"issuer": "https://oidc.example-cluster.k8s.local/",
"subject": "system:serviceaccount:payments:checkout-sa",
"audiences": ["api://AzureADTokenExchange"]
}'
On AKS, the Workload Identity webhook projects a token and sets the client/tenant env vars when you annotate the service account:
apiVersion: v1
kind: ServiceAccount
metadata:
name: checkout-sa
namespace: payments
annotations:
azure.workload.identity/client-id: "00000000-0000-0000-0000-000000000000"
Any external IdP
The pattern is identical: take that IdP’s issuer from its discovery document, decode a sample token to read the exact sub, set audiences to api://AzureADTokenExchange, and create the FIC.
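Decoding a sample token needs nothing more than base64. The sketch below prints the payload JSON of a JWT so you can copy iss, sub, and aud verbatim. It assumes a base64 binary that accepts -d (GNU coreutils or recent macOS), and it does not verify the signature, so use it for inspection only.

```shell
# Print the decoded payload (claims) of a JWT. Inspection only: no
# signature verification happens here.
jwt_payload() {
  local seg
  # Take the middle segment and map base64url back to standard base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the padding that base64url strips.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}
```

Pipe the output through jq -r .sub if jq is available, and paste the result into the FIC's subject unchanged.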
7. Validating tokens: issuer, audience, and signing keys
If you are the resource server receiving Entra access tokens, validate them properly. A JWT you do not validate is a header you trust blindly.
Pull the OIDC discovery document for the v2.0 endpoint:
curl -s "https://login.microsoftonline.com/${TENANT_ID}/v2.0/.well-known/openid-configuration" | jq '{issuer, jwks_uri, token_endpoint}'
Then enforce, in this order:
- Signature: fetch keys from jwks_uri, match the token’s kid, verify the RS256 signature. Cache JWKS and refresh on an unknown kid; keys roll.
- iss: must equal the issuer from discovery (e.g. https://login.microsoftonline.com/{tenantId}/v2.0). For multi-tenant, validate the tenant against an allowlist, not just the issuer template.
- aud: must equal your API’s app ID URI or client ID. Rejecting on audience is what stops a token minted for another app from being replayed at yours.
- exp / nbf: honor expiry and not-before, with minimal clock skew.
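The claim checks after the signature step are plain comparisons, and spelling them out makes the order concrete. This sketch covers only iss, aud, exp, and nbf, and assumes the signature has already been verified by a real JWT library and the claims extracted. EXPECTED_ISSUER, EXPECTED_AUDIENCE, and CLOCK_SKEW_SECONDS are names I chose for illustration.

```shell
# check_claims ISS AUD EXP NBF
# Claims must come from a token whose signature was ALREADY verified.
check_claims() {
  local iss="$1" aud="$2" exp="$3" nbf="$4"
  local now skew="${CLOCK_SKEW_SECONDS:-60}"
  now=$(date +%s)
  [ "$iss" = "${EXPECTED_ISSUER:?set EXPECTED_ISSUER}" ] ||
    { echo "rejected: unexpected issuer" >&2; return 1; }
  [ "$aud" = "${EXPECTED_AUDIENCE:?set EXPECTED_AUDIENCE}" ] ||
    { echo "rejected: token was minted for another audience" >&2; return 1; }
  [ "$now" -lt $((exp + skew)) ] ||
    { echo "rejected: token expired" >&2; return 1; }
  [ "$now" -ge $((nbf - skew)) ] ||
    { echo "rejected: token not yet valid" >&2; return 1; }
}
```

In a real service this logic lives inside your JWT middleware's validation options; the point of the sketch is that every check is an equality or a bounded time comparison, with no "close enough" matching.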
Validate access tokens only if they were issued for your API. Graph access tokens are opaque to you and not meant for your validation. ID tokens are for authenticating the user to the client, not for authorizing API calls.
8. Hardening
- Token lifetime: access tokens default to ~60-90 minutes (variable). Do not stretch them. Short lifetimes plus federation mean a stolen token is useless fast.
- Conditional access for workload identities: you can target service principals with CA policy and restrict sign-in to known IP ranges, so federated tokens are only honored from your CI egress or cluster egress. This is the backstop if a subject pin is ever too broad.
- Audit logging: ship Entra sign-in logs (including the service principal sign-in category) and audit logs to Log Analytics/Sentinel. Alert on FIC create/update on sensitive apps and on any new credential added to a high-privilege application.
- No standing credentials: once federation is live, delete every secret and certificate on the app so a fallback path cannot be abused.
# List then remove any leftover secrets.
az ad app credential list --id "$APP_ID" -o table
az ad app credential delete --id "$APP_ID" --key-id "<keyId>"
Enterprise scenario
A platform team I worked with moved every CI deploy to GitHub OIDC and felt done — until a regional monorepo split broke a third of their pipelines overnight with AADSTS700213: No matching federated identity record found for presented assertion subject. The root cause: their FICs pinned subject to repo:org/platform:ref:refs/heads/main, but the new repos pushed deploys through GitHub environments, which change the sub to repo:org/<repo>:environment:prod. Worse, they were hitting the 20-FIC-per-application ceiling because every repo had been getting its own credential.
The fix was to stop enumerating subjects and start matching them with a flexible federated identity credential (FIC) using a claims-matching expression, which lets one credential cover a whole class of subjects:
az ad app federated-credential create --id "$APP_ID" --parameters '{
"name": "gha-prod-environments",
"issuer": "https://token.actions.githubusercontent.com",
"claimsMatchingExpression": {
"value": "claims['"'"'sub'"'"'] matches '"'"'repo:contoso/*:environment:prod'"'"'",
"languageVersion": 1
},
"audiences": ["api://AzureADTokenExchange"]
}'
Two non-obvious constraints bit them. First, claimsMatchingExpression and a literal subject are mutually exclusive on the same FIC — you pick one. Second, flexible FICs require the repos to live under the same GitHub org as the issuer namespace; a fork in another org silently won’t match. They paired the wildcard with a Conditional Access policy scoped to the service principal and GitHub’s egress ranges, so even a matched-but-rogue subject couldn’t redeem a token. One credential, a hard org+environment boundary, and the FIC-count problem disappeared.
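Before widening trust with a wildcard, it is worth dry-running the pattern against subjects you expect to accept and reject. The sketch below uses shell glob matching as a stand-in for Entra's matches operator. That is an approximation I am assuming is close enough for * wildcards, not a reimplementation of Entra's evaluator, and flex_match is a made-up helper name.

```shell
# flex_match SUBJECT PATTERN
# Approximates a wildcard subject match with shell globbing (* = any run
# of characters). Entra's own evaluator remains the source of truth.
flex_match() {
  local sub="$1" pattern="$2"
  case "$sub" in
    $pattern) return 0 ;;   # unquoted on purpose so * acts as a wildcard
    *) return 1 ;;
  esac
}
```

With the pattern repo:contoso/*:environment:prod, a prod-environment deploy from any contoso repo passes, while a branch-ref subject or a repo in another org does not. Keep a list of must-reject subjects next to the pattern and rerun the check whenever the expression changes.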
Verify
Confirm the build is correct before declaring victory.
# 1. Federated credentials are present and scoped as intended.
az ad app federated-credential list --id "$APP_ID" \
--query "[].{name:name, issuer:issuer, subject:subject, aud:audiences[0]}" -o table
# 2. No client secrets or certs remain (expect empty output).
az ad app credential list --id "$APP_ID" -o table
# 3. Admin consent is recorded on the service principal, not just requested.
az ad app permission list-grants --id "$APP_ID" -o table
# 4. End to end: run the pipeline/pod and confirm a token is issued with no secret.
az account show --query "{tenant:tenantId, user:user.name}" -o table
In CI, the decisive check is a green azure/login@v2 step with no client-secret set anywhere in the repo or environment.
Checklist
Pitfalls
- Subject mismatch is the #1 federation failure. AADSTS70021: No matching federated identity record found almost always means the incoming token’s sub does not byte-for-byte match the FIC subject. Decode the actual token and copy the claim verbatim; GitHub environments, tags, and PRs all produce different subjects.
- Wrong audience. The external token must carry aud=api://AzureADTokenExchange, not your app ID URI. The login action handles this; hand-rolled exchanges often get it wrong.
- =Scope vs =Role mix-ups silently grant the wrong permission class. Verify the grant on the SP.
- App vs SP confusion when assigning roles: RBAC and CA attach to the service principal’s object ID, not the application’s.
- Forgetting to delete the old secret leaves a credential someone can still use. Federation is only as strong as the absence of a fallback.
Next steps: extend the same FIC pattern to every CI system and cluster touching the app, then write a recurring audit that fails if any application in a privileged group has a passwordCredentials or keyCredentials entry. Once you have proven you can run production sign-in and machine-to-machine auth with zero stored secrets, that audit becomes a hard gate.
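A first cut of that audit can be a few lines of shell. The sketch below is deliberately simpler than the gate described above: it checks every application in the tenant rather than only a privileged group, assumes a logged-in az CLI with permission to read applications, and uses a function name of my own.

```shell
# Fail (exit status 1) if any application still carries a client secret
# (passwordCredentials) or a certificate (keyCredentials).
audit_stored_credentials() {
  local offenders
  offenders=$(az ad app list --all \
    --query "[?passwordCredentials[0] || keyCredentials[0]].[appId, displayName]" \
    -o tsv) || return 2
  if [ -n "$offenders" ]; then
    printf 'Applications with stored credentials:\n%s\n' "$offenders" >&2
    return 1
  fi
}
```

Wire audit_stored_credentials into a scheduled pipeline and let a non-zero exit fail the run. Narrowing it to a privileged set would mean filtering the list first, for example by a naming convention or tag of your choosing.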