Building a Scalable Jenkins Pipeline Platform with Shared Libraries and JCasC

Every Jenkins estate decays the same way: each team copies a Jenkinsfile from the repo next door, mutates it, and within a year you have four hundred snowflakes and no way to roll out a CVE patch without a four-hundred-PR campaign. The fix is a platform, not a template. One versioned shared library owns the pipeline logic, repos call a single entrypoint, and the controller is rebuildable from code. This guide builds that platform end to end: library layout, custom DSL, versioning, self-onboarding folders, JCasC, ephemeral Kubernetes agents, secrets, and unit tests for the Groovy itself.

1. Structure the shared library: vars, src, resources

A Jenkins shared library is a Git repo with a fixed, magic directory layout. Jenkins only recognizes three top-level directories, and each has a distinct role:

pipeline-library/
  vars/                       # global variables -> custom DSL steps
    standardPipeline.groovy   # exposes step standardPipeline(...)
    standardPipeline.txt      # help text shown in Snippet Generator
    dockerBuild.groovy
    notifySlack.groovy
  src/                        # Groovy classes on the classpath (com.kloudvin.ci.*)
    com/kloudvin/ci/
      BuildConfig.groovy
      Semver.groovy
  resources/                  # non-Groovy files, loaded with libraryResource
    com/kloudvin/ci/
      pod-templates/jnlp-maven.yaml
      sonar-project.properties.tmpl

The contract that matters: every .groovy file in vars/ becomes a global step named after the file. vars/standardPipeline.groovy defines a call() method, and pipelines invoke it as standardPipeline { ... }. That is the entire trick behind a custom DSL.

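// vars/standardPipeline.groovy
def call(Closure body) {
    // Run the Jenkinsfile's closure against a map, so `appName = 'x'` lands in config.appName
    Map config = [:]
    body.resolveStrategy = Closure.DELEGATE_FIRST
    body.delegate = config
    body()
    def cfg = new com.kloudvin.ci.BuildConfig(config)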
    pipeline {
        agent {
            kubernetes {
                yaml libraryResource("com/kloudvin/ci/pod-templates/jnlp-maven.yaml")
            }
        }
        options {
            timeout(time: cfg.timeoutMinutes, unit: 'MINUTES')
            buildDiscarder(logRotator(numToKeepStr: '30'))
            disableConcurrentBuilds()
        }
        stages {
            stage('Build')   { steps { container('maven') { sh 'mvn -B clean package -DskipTests' } } }
            stage('Test')    { steps { container('maven') { sh 'mvn -B test' } } }
            stage('Scan')    { when { expression { cfg.scanEnabled } }
                               steps { sonarScan(cfg) } }
            stage('Publish') { when { branch 'main' }
                               steps { dockerBuild(cfg) } }
        }
        post {
            always  { junit testResults: '**/surefire-reports/*.xml', allowEmptyResults: true }
            failure { notifySlack(status: 'FAILED', config: cfg) }
        }
    }
}
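
The Jenkinsfile in the next section passes a closure of bare assignments, and the step ends up with a map. A minimal sketch of that handoff in plain Groovy, runnable outside Jenkins; collectConfig is a hypothetical helper name, not part of Jenkins or of the library above:

```groovy
// Plain-Groovy sketch of the closure-to-map idiom; no Jenkins required.
// collectConfig is a hypothetical name used only for this illustration.
Map collectConfig(Closure body) {
    Map config = [:]
    // Resolve bare names against the map first, so `appName = 'x'`
    // becomes config.appName = 'x' rather than a script variable.
    body.resolveStrategy = Closure.DELEGATE_FIRST
    body.delegate = config
    body()
    return config
}

def config = collectConfig({
    appName        = 'payments-api'
    timeoutMinutes = 45
    scanEnabled    = true
})

assert config == [appName: 'payments-api', timeoutMinutes: 45, scanEnabled: true]
```

Because the delegate is an ordinary map, a typo such as timeoutMinuts is accepted silently; that is one reason to pass the map through a validating class rather than read keys from it directly.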

Keep vars/ files thin. They are orchestration glue; anything with real logic (parsing, version math, config validation) belongs in src/ as a unit-testable class. A vars file that grows past ~80 lines is a src class trying to escape.

The BuildConfig class lives in src/ and gives you a typed, validated config object instead of a bag of untyped map keys:

// src/com/kloudvin/ci/BuildConfig.groovy
package com.kloudvin.ci

class BuildConfig implements Serializable {
    String  appName
    String  registry      = 'registry.kloudvin.internal'
    Integer timeoutMinutes = 30
    Boolean scanEnabled    = true

    BuildConfig(Map cfg) {
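        // Plain guard instead of an inline closure: constructors are a poor place for closures in pipeline code
        if (!cfg.appName) throw new IllegalArgumentException('appName is required')
        this.appName = cfg.appName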
        if (cfg.registry)       this.registry       = cfg.registry
        if (cfg.timeoutMinutes) this.timeoutMinutes = cfg.timeoutMinutes as Integer
        if (cfg.scanEnabled != null) this.scanEnabled = cfg.scanEnabled
    }
}

implements Serializable is not optional. Pipeline state is persisted to disk across restarts and resumed; any object that survives across a sh step or stage boundary must serialize. Forgetting this throws NotSerializableException at the worst possible moment.
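
The serialization requirement can be checked long before a controller restart exposes it. A minimal sketch, assuming plain java.io serialization as a stand-in for whatever marshaller Jenkins uses internally; SafeConfig, UnsafeConfig and canSerialize are hypothetical names:

```groovy
import java.io.ByteArrayOutputStream
import java.io.NotSerializableException
import java.io.ObjectOutputStream

// Hypothetical stand-ins for library classes; only the first is Serializable.
class SafeConfig implements Serializable {
    String appName
}

class UnsafeConfig {
    String appName
}

// Try to write the object to an in-memory stream and report whether it worked.
boolean canSerialize(Object value) {
    try {
        new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(value)
        return true
    } catch (NotSerializableException ignored) {
        return false
    }
}

assert canSerialize(new SafeConfig(appName: 'payments-api'))
assert !canSerialize(new UnsafeConfig(appName: 'payments-api'))
```

A check like this fits naturally into the library's unit tests, one assertion per src/ class that pipelines hold on to.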

2. Write custom DSL steps and a one-line Jenkinsfile

The whole point is that consuming repos do not author pipeline logic. Their Jenkinsfile declares intent and nothing else:

// A consuming repo's entire Jenkinsfile
@Library('kloudvin-pipeline@v3') _

standardPipeline {
    appName        = 'payments-api'
    timeoutMinutes = 45
    scanEnabled    = true
}

That trailing _ after the @Library annotation is required when no import statement follows: a Groovy annotation must attach to something, and _ is the conventional placeholder symbol. Every supporting step is its own vars/ file, composed by standardPipeline:

// vars/sonarScan.groovy
def call(com.kloudvin.ci.BuildConfig cfg) {
    withSonarQubeEnv('kloudvin-sonar') {
        container('maven') {
            sh "mvn -B sonar:sonar -Dsonar.projectKey=${cfg.appName}"
        }
    }
    timeout(time: 10, unit: 'MINUTES') {
        // waitForQualityGate aborts the build if the quality gate fails
        waitForQualityGate abortPipeline: true
    }
}

// vars/dockerBuild.groovy
def call(com.kloudvin.ci.BuildConfig cfg) {
    def tag = "${cfg.registry}/${cfg.appName}:${env.GIT_COMMIT.take(12)}"
    container('kaniko') {
        sh """
            /kaniko/executor \
              --context=`pwd` \
              --dockerfile=Dockerfile \
              --destination=${tag} \
              --cache=true
        """
    }
}

This composition is the source of the platform’s power. Patch dockerBuild.groovy once – add a scan, switch the builder, change the registry – and every repo on that library version gets it on the next build, zero PRs to product repos.
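
In keeping with the rule that vars/ files stay thin, the tag string assembled inside dockerBuild could live in a small src/ class where it is easy to test. A minimal sketch; ImageRef and its SHA check are assumptions made for illustration, not part of the library shown above:

```groovy
// Hypothetical helper for src/com/kloudvin/ci/; mirrors the tag built in dockerBuild.
class ImageRef implements Serializable {
    // Builds registry/app:shortsha, rejecting anything that is not a hex commit SHA.
    static String build(String registry, String appName, String commit) {
        if (!(commit ==~ /[0-9a-f]{12,40}/)) {
            throw new IllegalArgumentException("not a commit SHA: ${commit}")
        }
        return "${registry}/${appName}:${commit.take(12)}"
    }
}

def ref = ImageRef.build('registry.kloudvin.internal', 'payments-api',
                         '9fceb02d0ae598e95dc970b74767f19372d61af8')
assert ref == 'registry.kloudvin.internal/payments-api:9fceb02d0ae5'
```

With a helper like this, dockerBuild would shrink to the container and sh calls, and a malformed GIT_COMMIT would fail with a clear message instead of producing an odd tag.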

3. Version the library: tags, trusted vs untrusted

A platform you cannot version is a platform you cannot change safely. The @Library('name@ref') annotation pins to any Git ref: a branch, a tag, or a commit SHA. Use semantic tags and let teams opt into a major line:

Reference style   | Example                  | When to use
Floating branch   | @Library('lib@main')     | Internal platform repos only; you accept breakage
Pinned major tag  | @Library('lib@v3')       | Default for product repos; the moving tag tracks v3.x
Exact tag         | @Library('lib@v3.4.1')   | Repos that must freeze, e.g. during a compliance window

Make v3 a moving tag you re-point to the latest v3.x release, so consumers pin to a major line and pick up backward-compatible fixes automatically:

# Cut a patch and advance the major-line pointer
git tag -a v3.4.2 -m "fix: kaniko cache key"
git tag -f v3 v3.4.2          # move the v3 alias forward
git push origin v3.4.2
git push -f origin v3         # consumers on @v3 get this on next build
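
Deciding which release the v3 alias should point at is version math of the kind that belongs in src/. The layout in section 1 lists a Semver.groovy without showing it, so the class below is a hypothetical shape, and latestOnMajor is an illustrative helper rather than anything the library is known to contain:

```groovy
// Hypothetical shape for src/com/kloudvin/ci/Semver.groovy
class Semver implements Serializable, Comparable<Semver> {
    final int major
    final int minor
    final int patch

    Semver(int major, int minor, int patch) {
        this.major = major
        this.minor = minor
        this.patch = patch
    }

    // Accepts 'v3.4.2' or '3.4.2'; returns null for anything else, such as the 'v3' alias.
    static Semver parse(String tag) {
        def m = tag =~ /^v?(\d+)\.(\d+)\.(\d+)$/
        if (!m.matches()) {
            return null
        }
        return new Semver(m.group(1).toInteger(), m.group(2).toInteger(), m.group(3).toInteger())
    }

    // Compare numerically, field by field; 0 is falsy, so ?: falls through on ties.
    int compareTo(Semver other) {
        major <=> other.major ?: minor <=> other.minor ?: patch <=> other.patch
    }

    String toString() { "v${major}.${minor}.${patch}" }
}

// Highest release on a major line, i.e. the tag the moving alias should follow.
String latestOnMajor(List<String> tags, int major) {
    def best = tags.collect { Semver.parse(it) }.findAll { it != null && it.major == major }.max()
    return best?.toString()
}

def tags = ['v2.9.0', 'v3.4.1', 'v3.4.2', 'v3.10.0', 'v3', 'v4.0.0-rc1']
assert latestOnMajor(tags, 3) == 'v3.10.0'   // numeric ordering, not lexical
assert latestOnMajor(tags, 5) == null
```

The numeric comparison is the point: a plain string sort would rank v3.4.2 above v3.10.0 and leave the alias on a stale release.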

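The security model is the second axis. Libraries an admin configures globally in Manage Jenkins run as trusted: they execute outside the Groovy sandbox and may call internal Jenkins APIs and @Grab dependencies. Folder-level libraries, and libraries a Jenkinsfile loads dynamically through the library step with its own retriever, are untrusted and run inside the sandbox. The rule for a platform:

Configure the golden library as a global trusted library in JCasC with implicit load off. Keep “allow default version to be overridden” on if repos pin refs as in the table above: with it off, Jenkins rejects any @Library('name@ref') that names a version, and every build is forced onto the default version. An override can only select a ref inside the configured library repository, never a fork, so the remaining risk is a repo parked on an old, unpatched tag. Treat anything a repo can self-declare as untrusted and sandboxed.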

4. Template multibranch and organization folders for self-onboarding

You do not want to click “New Item” four hundred times. An Organization Folder (GitHub or Bitbucket) scans an org, and for every repo containing a Jenkinsfile it auto-creates a multibranch project – branches and PRs included. Onboarding a repo becomes “add a Jenkinsfile,” nothing more.

Define it in code through the Job DSL seed job so the folder itself is reproducible:

// jobs/seed-org-folder.groovy  (Job DSL)
organizationFolder('kloudvin-services') {
    description('Auto-onboards every repo with a Jenkinsfile')
    organizations {
        github {
            repoOwner('kloudvin')
            apiUri('https://api.github.com')
            credentialsId('github-app-kloudvin')
            traits {
                gitHubBranchDiscovery { strategyId(1) }       // branches
                gitHubPullRequestDiscovery { strategyId(1) }  // PRs from origin
            }
        }
    }
    projectFactories {
        workflowMultiBranchProjectFactory { scriptPath('Jenkinsfile') }
    }
    orphanedItemStrategy {
        discardOldItems { daysToKeep(7); numToKeep(20) }
    }
    triggers { periodicFolderTrigger { interval('1d') } }
}

The GitHub App credential (github-app-kloudvin) matters at scale: a personal token shares one rate-limit bucket across the whole estate and starves at a few hundred repos, while a GitHub App gets per-installation limits and finer scopes.

5. Manage the controller with JCasC and seed jobs

Configuration-as-Code (the configuration-as-code plugin) renders the controller’s entire configuration from YAML, replacing point-and-click setup. Set CASC_JENKINS_CONFIG to a file, directory, or URL; Jenkins applies it on boot and on a reload from Manage Jenkins -> Configuration as Code -> Reload.

# jenkins.yaml -- the controller, as code
jenkins:
  systemMessage: "KloudVin CI -- managed by JCasC. Do not configure by hand."
  numExecutors: 0                     # controller runs no builds; agents only
  authorizationStrategy:
    roleBased:
      roles:
        global:
          - name: "admin"
            permissions: ["Overall/Administer"]
            assignments: ["platform-team"]
  clouds:
    - kubernetes:
        name: "k8s"
        serverUrl: "https://kubernetes.default"
        namespace: "jenkins-agents"
        jenkinsUrl: "http://jenkins.jenkins.svc:8080"
        containerCapStr: "50"

unclassified:
  globalLibraries:
    libraries:
      - name: "kloudvin-pipeline"
        defaultVersion: "v3"
        implicit: false
        allowVersionOverride: true    # required for @v3 / @v3.4.1 pins; false forces defaultVersion
        retriever:
          modernSCM:
            scm:
              git:
                remote: "https://github.com/kloudvin/pipeline-library.git"
                credentialsId: "github-app-kloudvin"

jobs:
  # Job DSL seed script, evaluated by JCasC on boot and on every reload.
  # The path is a hypothetical mount point for the config repo in the controller pod.
  - file: /var/jenkins_home/casc/jobs/seed-org-folder.groovy

Bootstrap order is the subtle part: JCasC configures the controller and hands the seed script to Job DSL, which creates the org folders. No workspace or executor is involved, which matters with numExecutors set to 0. Keep jenkins.yaml and jobs/ in one repo, mount it into the controller pod, and the entire Jenkins is a git revert away from any prior state.

6. Ephemeral agents on Kubernetes: pod template and resource tuning

A pool of static agents accrues state between builds and bills you while idle. The Kubernetes plugin instead launches one pod per build and deletes it on completion. The pod template, referenced earlier via libraryResource, defines the containers a build can enter with container('name') { ... }:

# resources/com/kloudvin/ci/pod-templates/jnlp-maven.yaml
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: maven
      image: maven:3.9-eclipse-temurin-21
      command: ["sleep"]
      args: ["infinity"]
      resources:
        requests: { cpu: "500m", memory: "1Gi" }
        limits:   { cpu: "2",    memory: "2Gi" }
    - name: kaniko
      image: gcr.io/kaniko-project/executor:v1.23.2-debug
      command: ["sleep"]
      args: ["infinity"]
      resources:
        requests: { cpu: "500m", memory: "1Gi" }
        limits:   { cpu: "1",    memory: "2Gi" }
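
The requests in this template, multiplied by the containerCapStr of 50 from the JCasC cloud block, give a floor for how much schedulable capacity the agent namespace needs. A rough sketch of that arithmetic, assuming the cap counts concurrent agent pods and ignoring the jnlp container the plugin adds, whose size the template does not state; capacityAtCap is a hypothetical helper:

```groovy
// Sum the per-container requests of one pod and scale by the pod cap.
Map capacityAtCap(Map<String, Map> requests, int podCap) {
    int cpuMillis = requests.values().sum { it.cpuMillis } as int
    int memoryMiB = requests.values().sum { it.memoryMiB } as int
    return [
        cpuPerPodMillis: cpuMillis,
        memPerPodMiB   : memoryMiB,
        cpuCoresAtCap  : (cpuMillis * podCap).intdiv(1000),
        memGiBAtCap    : (memoryMiB * podCap).intdiv(1024),
    ]
}

// Requests copied from jnlp-maven.yaml: 500m CPU and 1Gi memory per container.
def requests = [
    maven : [cpuMillis: 500, memoryMiB: 1024],
    kaniko: [cpuMillis: 500, memoryMiB: 1024],
]
def totals = capacityAtCap(requests, 50)

assert totals.cpuPerPodMillis == 1000 && totals.memPerPodMiB == 2048
assert totals.cpuCoresAtCap == 50 && totals.memGiBAtCap == 100
```

So a full queue asks the scheduler for about 50 CPU and 100 GiB in requests alone; if the namespace quota or node pool cannot supply that, pods sit Pending well before the cap is reached.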

Tuning notes from production:

7. Secure secrets: credentials binding and external Vault

Secrets never live in a Jenkinsfile or a library file. They live in the credentials store, and the platform binds them into the build environment only for the steps that need them, masked in the log. Wrap this in a vars/ step so consumers cannot fumble the binding:

// vars/withRegistryCreds.groovy
def call(Closure body) {
    withCredentials([usernamePassword(
        credentialsId: 'registry-push',
        usernameVariable: 'REG_USER',
        passwordVariable: 'REG_PASS')]) {
        body()   // $REG_USER / $REG_PASS exist only here and are masked in logs
    }
}

For anything beyond low-stakes secrets, do not store them in Jenkins at all: broker them from HashiCorp Vault so rotation happens outside the CI system and Jenkins holds only short-lived leases. The HashiCorp Vault plugin authenticates from the controller (AppRole or Kubernetes auth) and exposes the requested secrets as environment variables for the duration of the block:

stage('Deploy') {
    steps {
        withVault(configuration: [vaultUrl: 'https://vault.kloudvin.internal',
                                  vaultCredentialId: 'vault-approle',
                                  engineVersion: 2],
                  // KV v2: give the logical path; the plugin inserts data/ itself
                  vaultSecrets: [[ path: 'secret/ci/payments',
                                   secretValues: [[envVar: 'DB_PASSWORD', vaultKey: 'db_password']] ]]) {
            sh 'deploy --db-pass "$DB_PASSWORD"'
        }
    }
}

Prefer Vault’s Kubernetes auth method over a long-lived AppRole secret-id: the controller pod’s ServiceAccount token becomes the Vault login, so there is no static credential to leak. Pair it with short-TTL leases so a leaked secret buys an attacker minutes, not months.

8. Test the pipeline code with Jenkins Pipeline Unit

Pipeline logic is code, and untested code in vars/ fails in production at 2am. The JenkinsPipelineUnit framework mocks the pipeline DSL so you can unit-test vars/ steps and src/ classes on plain JVM CI – no Jenkins required. Wire it into a Gradle/Maven build that runs on every library PR:

// test/com/kloudvin/ci/StandardPipelineSpec.groovy
import com.lesfurets.jenkins.unit.BasePipelineTest
import org.junit.Before
import org.junit.Test
import static org.junit.Assert.assertEquals

class StandardPipelineSpec extends BasePipelineTest {

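    @Before void setUp() {
        super.setUp()
        // register mocks for every DSL step the library calls
        helper.registerAllowedMethod('sh', [String]) { _ -> }
        helper.registerAllowedMethod('libraryResource', [String]) { 'apiVersion: v1' }
        helper.registerAllowedMethod('container', [String, Closure]) { _, c -> c() }
        helper.registerAllowedMethod('withSonarQubeEnv', [String, Closure]) { _, c -> c() }
        helper.registerAllowedMethod('timeout', [Map, Closure]) { _, c -> c() }
        helper.registerAllowedMethod('waitForQualityGate', [Map]) { _ -> }
    }

    @Test void buildConfigRejectsMissingAppName() {
        try {
            new com.kloudvin.ci.BuildConfig([:])
            assert false : 'expected IllegalArgumentException'
        } catch (IllegalArgumentException e) {
            assertEquals('appName is required', e.message)
        }
    }

    @Test void sonarScanRunsTheSonarGoal() {
        def script = loadScript('vars/sonarScan.groovy')
        script.call(new com.kloudvin.ci.BuildConfig([appName: 'payments-api']))
        // the recorded call stack should contain the Maven sonar invocation
        assert helper.callStack.any {
            it.methodName == 'sh' && it.args[0].toString().contains('sonar:sonar')
        }
        assertJobStatusSuccess()
    }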
}

Run it in the library repo’s own pipeline so a bad commit never reaches a moving tag:

./gradlew test           # JenkinsPipelineUnit specs, fast, no Jenkins

This closes the loop: the library that everything depends on is itself gated by tests before its tag moves.

Verify

Confirm each layer is wired correctly before declaring the platform live:

# 1. JCasC parsed cleanly (no boot errors, config visible)
curl -s -u "$JENKINS_USER:$JENKINS_TOKEN" \
  https://jenkins.kloudvin.internal/configuration-as-code/ | grep -q "Reload"

# 2. The global library is registered (libraries live under Manage Jenkins -> System)
curl -s -u "$JENKINS_USER:$JENKINS_TOKEN" \
  "https://jenkins.kloudvin.internal/manage/configure" | grep -q "kloudvin-pipeline"

# 3. Library unit tests are green
./gradlew test --console=plain

Enterprise scenario

A platform team running ~600 microservice repos on a single Jenkins controller hit a hard wall during a Log4Shell-class incident. The vulnerable logging dependency was baked into the Docker build step that every team had copied into its own Jenkinsfile. There was no central step to patch – the “fix” was a 600-repo PR campaign that would have taken weeks while the window stayed open.

The constraint: they could not break in-flight releases, and several regulated repos were frozen under a change-control window and legally could not take the new behavior until their next window. A flag-day forced upgrade was off the table.

They solved it by collapsing the Docker logic into a single dockerBuild step in the shared library and switching every repo to the one-line standardPipeline entrypoint pinned to a moving major tag. The patched builder shipped behind that tag, so unfrozen repos picked it up on their next build automatically – no PRs. Frozen repos stayed safe by pinning an exact tag until their window opened:

// Frozen, change-controlled repos -- pinned to an exact patch, opt in later
@Library('kloudvin-pipeline@v3.4.1') _
standardPipeline { appName = 'ledger-core' }

Crucially, the library was registered once, globally, through JCasC, so a pin could only name a ref inside the platform team’s own repository, never a fork, and the platform team could see every repo’s effective version and drive the laggards. What would have been a multi-week, multi-hundred-PR scramble became a single tagged library release plus a short list of frozen repos to track. That is the entire economic argument for the platform.

Checklist
