Every org with more than a handful of repos eventually drowns in copy-pasted CI YAML: the same Node setup, the same actions/checkout, the same brittle deploy block duplicated 200 times with subtle drift. This guide shows how to build a versioned, governed GitHub Actions platform that hundreds of teams consume without you becoming a bottleneck.
1. The copy-paste pipeline problem
When each repo owns a full copy of its .github/workflows/ci.yml, you have no leverage. A CVE in a third-party action means 200 pull requests. A new mandatory SBOM step means 200 more. Worse, every copy diverges, so “our CI” stops meaning anything concrete.
A workflow platform solves four things: deduplication (one source of truth per pipeline shape), governance (you can mandate a step org-wide), safe change (semantic versioning so consumers opt into upgrades), and least-privilege auth (centralized OIDC instead of long-lived secrets sprayed across repos).
2. Choosing the right abstraction
GitHub gives you three building blocks. Picking the wrong one is the most common early mistake.
| Abstraction | What it is | Use when |
|---|---|---|
| Composite action | A bundle of steps that runs inside a job | You want to reuse a sequence of steps (setup, cache, login) within a caller’s job |
| Reusable workflow | An entire workflow called via workflow_call | You want to own whole jobs: build, test, deploy, with their own runners and permissions |
| Starter workflow | A template copied into a repo once | You want a starting point teams then own and edit themselves |
The mental model: composite actions are functions you call from a step; reusable workflows are jobs you call from a workflow. Starter workflows are scaffolding you hand off and forget. For a governed platform you mostly want reusable workflows (to own the pipeline) plus composite actions (to share step-level logic inside them).
Callout: A reusable workflow can call composite actions, but a composite action cannot call a reusable workflow. Compose downward, not upward.
3. A versioned org-level .github repo
GitHub treats a repo literally named .github in your org as a special home for org defaults. Create one and lay it out so reusable workflows and shared actions live together.
gh repo create my-org/.github --private --clone
cd .github
mkdir -p .github/workflows
mkdir -p actions/setup-node-build
mkdir -p workflow-templates
git checkout -b main
Note the distinction: files under .github/workflows/ in this repo are the reusable workflows other repos call. Files under workflow-templates/ are starter workflows surfaced in the org’s “New workflow” UI. They are not the same thing.
A reusable workflow declares a workflow_call trigger:
# .github/workflows/node-ci.yml
name: node-ci
on:
  workflow_call:
    inputs:
      node-version:
        type: string
        default: "20"
      run-lint:
        type: boolean
        default: true
    secrets:
      NPM_TOKEN:
        required: false
    outputs:
      image-tag:
        description: "Built image tag"
        value: ${{ jobs.build.outputs.image-tag }}
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    outputs:
      image-tag: ${{ steps.meta.outputs.tag }}
    steps:
      - uses: actions/checkout@v4
      - uses: my-org/.github/actions/setup-node-build@v1
        with:
          node-version: ${{ inputs.node-version }}
      - if: ${{ inputs.run-lint }}
        run: npm run lint
      - run: npm test
      - id: meta
        run: echo "tag=sha-${GITHUB_SHA::12}" >> "$GITHUB_OUTPUT"
The composite action it references:
# actions/setup-node-build/action.yml
name: "Setup Node and build deps"
description: "Checkout-agnostic Node setup with cache"
inputs:
  node-version:
    description: "Node major version"
    required: true
runs:
  using: "composite"
  steps:
    - uses: actions/setup-node@v4
      with:
        node-version: ${{ inputs.node-version }}
        cache: "npm"
    - run: npm ci
      shell: bash
Callout: Every run step in a composite action must declare shell:. This is the single most common composite-action failure, and the error message is not obvious.
Semantic tags and a deprecation policy
Consumers should pin to a moving major tag (@v1) that you advance. Alongside it, publish immutable patch tags (@v1.4.2) for teams that want to freeze. Maintain the major tag as a sliding pointer:
git tag -a v1.4.2 -m "node-ci: add SBOM step"
git tag -fa v1 -m "advance v1 -> v1.4.2"
git push origin v1.4.2
git push origin v1 --force
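The four commands above can be wrapped in a small helper so the sliding major is derived from the patch tag instead of typed by hand. This is only a sketch under stated assumptions: the function names major_tag_for and release_tag are hypothetical, tags are assumed to follow the vMAJOR.MINOR.PATCH shape used in this guide, and the remote is assumed to be called origin.

```shell
# Print the sliding major tag for a full version tag, e.g. v1.4.2 -> v1.
# The pattern check is deliberately loose; tighten it if your tag names vary.
major_tag_for() {
  case "$1" in
    v[0-9]*.[0-9]*.[0-9]*) printf '%s\n' "${1%%.*}" ;;
    *) echo "expected a tag like v1.4.2, got: $1" >&2; return 1 ;;
  esac
}

# Create the immutable patch tag, then move the major tag onto the same commit.
release_tag() {
  version="$1"
  message="$2"
  major="$(major_tag_for "$version")" || return 1

  # Patch tags are meant to be immutable, so refuse to overwrite an existing one.
  if git rev-parse -q --verify "refs/tags/$version" >/dev/null; then
    echo "tag $version already exists" >&2
    return 1
  fi

  git tag -a "$version" -m "$message" &&
    git tag -fa "$major" -m "advance $major -> $version" &&
    git push origin "$version" &&
    git push origin "$major" --force
}

# Usage (run from the platform repo, at the commit you want to ship):
#   release_tag v1.4.2 "node-ci: add SBOM step"
```

Refusing to overwrite an existing patch tag is a design choice that keeps the "immutable" promise honest; only the major tag is ever force-moved.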
Publish a written policy: major tags get 90 days of support after the next major ships; breaking changes only land on a new major; deprecations are announced via a pinned discussion and an annotation emitted from the workflow itself:
- run: echo "::warning::node-ci v1 is deprecated; migrate to v2 by 2026-09-01"
4. Inputs, secrets, and outputs across workflow_call
The boundary is strict and that is a feature. A called workflow sees only what the caller explicitly passes.
# consumer repo: .github/workflows/ci.yml
name: ci
on: [push, pull_request]
jobs:
  ci:
    uses: my-org/.github/.github/workflows/node-ci.yml@v1
    with:
      node-version: "20"
    secrets:
      NPM_TOKEN: ${{ secrets.NPM_TOKEN }}
    permissions:
      contents: read
Three rules that bite people:
- Secrets are not inherited automatically. Either name each one, or use secrets: inherit to forward all of the caller’s secrets (use sparingly; it widens blast radius).
- ${{ secrets.* }} and ${{ env.* }} cannot be used in the uses: line, so the workflow reference itself cannot be dynamic.
- Permissions only narrow on the way down. The caller’s permissions: block sets the ceiling for the GITHUB_TOKEN, and the called workflow can reduce that grant but never expand it.
Outputs flow back through the outputs: map you declared, and downstream jobs read them via needs:
  deploy:
    needs: ci
    runs-on: ubuntu-latest
    steps:
      - run: echo "Deploying ${{ needs.ci.outputs.image-tag }}"
5. Enforcing standards with rulesets and CODEOWNERS
A platform nobody is required to use is just a library. GitHub required workflows (configured at the org level via repository rulesets) let you force a reusable workflow to run on every PR in scope, even if the target repo has no workflow file of its own.
In Org Settings -> Rules -> Rulesets, create a branch ruleset targeting your default branches that:
- Requires the platform CI workflow as a status check.
- Requires pull requests and at least one approving review.
- Blocks force-pushes and deletions on protected branches.
Gate changes to the platform repo itself with CODEOWNERS so only the platform team can alter shared workflows:
# .github/CODEOWNERS
/.github/workflows/ @my-org/platform-team
/actions/ @my-org/platform-team
Callout: Required workflows run with the consumer repo’s context and token. Keep them fast and side-effect-free, because they execute on every single PR across the org.
6. Keyless auth with OIDC to Azure and AWS
Stop storing cloud credentials in repo secrets. With OIDC, GitHub mints a short-lived token per run, and the cloud exchanges it for temporary credentials scoped to that repo and branch. Any job using it needs id-token: write.
Azure via a federated credential on an app registration:
az ad app federated-credential create \
--id "$APP_OBJECT_ID" \
--parameters '{
"name": "github-main",
"issuer": "https://token.actions.githubusercontent.com",
"subject": "repo:my-org/my-service:ref:refs/heads/main",
"audiences": ["api://AzureADTokenExchange"]
}'
  deploy-azure:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: azure/login@v2
        with:
          client-id: ${{ vars.AZURE_CLIENT_ID }}
          tenant-id: ${{ vars.AZURE_TENANT_ID }}
          subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}
      - run: az group list -o table
AWS via an IAM OIDC identity provider and a role whose trust policy pins the sub claim:
{
"Effect": "Allow",
"Principal": { "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com" },
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": { "token.actions.githubusercontent.com:aud": "sts.amazonaws.com" },
"StringLike": { "token.actions.githubusercontent.com:sub": "repo:my-org/my-service:ref:refs/heads/main" }
}
}
- uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::123456789012:role/gha-deploy
    aws-region: us-east-1
The win for a platform: centralize this in the shared deploy workflow once. Each consumer only supplies its own client ID or role ARN as a repo variable, and the federation subject/sub condition enforces that repo X can only assume role X.
Callout: Scope the subject/sub claim as tightly as you can. repo:my-org/* trusts the entire org; pin to a specific repo, branch, or environment: instead.
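To scope a subject tightly you need to know the exact string GitHub will present. The sketch below prints the default subject for the common cases, assuming the org has not customized its subject claim template. The function name oidc_subject and its argument order are hypothetical. Note that when a job declares an environment, the environment form is the one issued, regardless of the branch that triggered it.

```shell
# Print the default OIDC subject (sub) GitHub issues for a given kind of run.
#   oidc_subject OWNER/REPO KIND [NAME]
oidc_subject() {
  repo="$1"
  kind="$2"
  name="${3:-}"
  case "$kind" in
    branch)       printf 'repo:%s:ref:refs/heads/%s\n' "$repo" "$name" ;;
    tag)          printf 'repo:%s:ref:refs/tags/%s\n' "$repo" "$name" ;;
    environment)  printf 'repo:%s:environment:%s\n' "$repo" "$name" ;;
    pull_request) printf 'repo:%s:pull_request\n' "$repo" ;;
    *) echo "unknown kind: $kind" >&2; return 1 ;;
  esac
}

# Usage:
#   oidc_subject my-org/my-service branch main
#   oidc_subject my-org/my-service environment production
```

The printed value is what goes into the Azure federated credential's subject field or the AWS trust policy's sub condition.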
7. Versioning, testing, and releasing
Test workflows locally before tagging. act runs jobs in Docker against a chosen event:
act pull_request -j build --container-architecture linux/amd64
act does not perfectly emulate workflow_call chaining or OIDC token minting, so back it with a real integration smoke test: a throwaway consumer repo that pins @main, runs the full pipeline against live runners on a schedule, and fails loudly on drift.
Lint the YAML in the platform repo’s own CI:
npm install -g @action-validator/cli
action-validator .github/workflows/node-ci.yml
Cut releases deterministically. Conventional commits plus a release step that advances both the patch tag and the sliding major keeps the contract honest. Treat the major-tag move as the actual “ship” event, since that is what most consumers track.
Enterprise scenario
A fintech platform team rolled out a shared node-ci.yml to ~180 repos pinned at @v1. Weeks later, deploys to a regulated workload started failing intermittently with AssumeRoleWithWebIdentity errors, but only on PRs from forks and on release/* branches. The trust policy pinned sub to repo:org/svc:ref:refs/heads/main, so anything off main got no credentials, and the failure surfaced inside the consumer’s job context, making it look like a per-repo problem rather than a platform one.
The real gotcha: they had assumed id-token: write and a single sub condition covered every trigger. It did not. The OIDC sub claim format differs by trigger, branch vs. tag vs. environment, and fork PRs intentionally receive a read-only token with no id-token write capability at all, by GitHub design.
The fix was to stop matching on branch refs and key the trust on the deployment environment instead, which GitHub stamps into the claim and which forks can never assume:
{
"Condition": {
"StringEquals": {
"token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
"token.actions.githubusercontent.com:sub": "repo:org/svc:environment:production"
}
}
}
Consumers then gated the deploy job with environment: production, which also forced required-reviewer approval before any token was minted. Fork PRs cleanly skipped deploy instead of erroring. One trust-policy claim, scoped to an environment rather than a ref, removed an entire class of confusing cross-repo failures.
Verify
# 1. Confirm the reusable workflow resolves and a run was created
gh workflow list --repo my-org/my-service
gh run list --repo my-org/my-service --workflow ci --limit 3
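# 2. Inspect the latest run; confirm the called workflow + OIDC job executed
RUN_ID=$(gh run list --repo my-org/my-service --workflow ci --limit 1 --json databaseId --jq '.[0].databaseId')
gh run view "$RUN_ID" --repo my-org/my-service --log | grep -E "node-ci|id-token|AssumeRole"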
# 3. Confirm the required workflow is enforced by the ruleset
gh api repos/my-org/my-service/rules/branches/main \
--jq '.[].type'
# 4. Verify the major tag points where you expect
git ls-remote --tags https://github.com/my-org/.github v1
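A green run, a non-empty rules list including the required workflow, and a v1 tag that resolves to the same commit as your latest patch tag mean the platform is wired correctly. Because both tags are annotated, compare the commits they point to rather than the tag objects’ own SHAs.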
Rollout strategy: migrating 50+ repos safely
Never flip the whole org at once. Stage it:
- Pilot (3-5 repos): the platform team’s own repos consume @v1. Shake out edge cases where it costs you, not other teams.
- Opt-in wave: announce, document, and let willing teams migrate. Provide a one-PR migration that deletes their old YAML and adds the uses: call. Automate it with a script that opens PRs in bulk via gh.
- Required wave: enable the required-workflow ruleset on increasing scopes (by team, then org-wide). Run it in a non-blocking mode first if you can, watching failure rates before making it a hard gate.
- Cleanup: delete starter-workflow leftovers and dead secrets once OIDC is universal.
Keep both the old and new path working during each wave. The moment migration becomes all-or-nothing, teams stop trusting the platform.
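The bulk migration in the opt-in wave could be scripted roughly as follows. Treat this as a sketch, not a finished tool: the function names, the branch name platform/adopt-node-ci, and the assumption that each repo's old pipeline lives at .github/workflows/ci.yml are all hypothetical, and the optional NPM_TOKEN secret is left out of the generated file for brevity.

```shell
# Print the minimal caller workflow a consumer repo needs.
render_caller_workflow() {
  node_version="${1:-20}"
  cat <<EOF
name: ci
on: [push, pull_request]
jobs:
  ci:
    uses: my-org/.github/.github/workflows/node-ci.yml@v1
    with:
      node-version: "${node_version}"
    permissions:
      contents: read
EOF
}

# Clone one repo, replace its pipeline with the caller workflow, open a PR.
open_migration_pr() {
  repo="$1"   # e.g. my-org/my-service
  workdir="$(mktemp -d)"
  gh repo clone "$repo" "$workdir" -- --depth 1 || return 1
  (
    set -e   # abort this repo's migration on the first failing command
    cd "$workdir"
    git checkout -b platform/adopt-node-ci
    mkdir -p .github/workflows
    render_caller_workflow 20 > .github/workflows/ci.yml
    git add .github/workflows/ci.yml
    git commit -m "ci: adopt shared node-ci@v1"
    git push -u origin platform/adopt-node-ci
    gh pr create --title "ci: adopt shared node-ci@v1" \
      --body "Replaces the local pipeline with the platform workflow."
  )
}

# Usage, with repos.txt holding one OWNER/REPO per line (hypothetical file):
#   while read -r repo; do open_migration_pr "$repo"; done < repos.txt
```

Keeping the rendering step separate from the git and gh steps makes it easy to review the generated YAML before any PR is opened.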
Pitfalls
- Pinning to @main in consumers: any platform commit can break every repo instantly. Pin to a major tag; reserve @main for your smoke test.
- secrets: inherit everywhere: convenient, but it hands every forwarded secret to the called workflow. Name secrets explicitly for sensitive ones.
- Forgetting id-token: write: OIDC silently fails to mint a token and you fall back to nothing. The job error is cryptic.
- Reusable-workflow nesting limits: GitHub caps how deep workflow_call can chain (a small number of levels). Deep composition hits a hard wall, so keep the call tree shallow.
- Missing shell: in composite steps: the action refuses to run. Always set it.
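Two of these pitfalls are easy to catch mechanically. The sketch below greps a checkout's workflow directory for references pinned to @main and for secrets: inherit. The function name audit_workflows and the output prefixes are hypothetical, and a plain-text grep like this is a rough filter, not a YAML-aware linter.

```shell
# Flag branch-pinned references and blanket secret forwarding.
#   audit_workflows [DIR]   (DIR defaults to .github/workflows)
audit_workflows() {
  dir="${1:-.github/workflows}"

  # Any uses: reference that ends in @main instead of a version tag.
  grep -rnE 'uses:[[:space:]]*[^[:space:]]+@main([[:space:]]|$)' "$dir" |
    sed 's/^/pinned-to-main: /'

  # Any job that forwards every caller secret to the called workflow.
  grep -rnE '^[[:space:]]*secrets:[[:space:]]*inherit[[:space:]]*$' "$dir" |
    sed 's/^/secrets-inherit: /'

  return 0
}

# Usage (from the root of a consumer repo checkout):
#   audit_workflows
```

Each finding is printed with its file and line number, so the output can be pasted straight into a migration PR or an issue.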
Build the contract first, version it like an API, and let teams upgrade on their own schedule. That is the difference between a platform people adopt and a mandate they route around.