Building a Vendor-Neutral Feature Flag Platform with OpenFeature and flagd

Most feature-flag adoptions start as a SaaS line item and end as a lock-in problem. The SDK is proprietary, evaluation semantics are undocumented, and three years later you have 400 call sites that all import LDClient. Migrating means a rewrite. OpenFeature breaks that coupling: it is a CNCF Incubating specification that standardizes the evaluation API so application code never names a vendor, and providers plug in underneath. Pair it with flagd – the project’s own reference flag daemon – and you get a fully open stack you can run yourself, with a clean migration path to a managed provider if you ever want one.

This article builds that platform end to end: the spec, the flagd deployment topology, real targeting rules with fractional rollouts, SDK wiring with context propagation, hooks for telemetry and audit, a live provider swap, flag governance, and deterministic CI testing.

1. The OpenFeature spec: API, providers, hooks, context

OpenFeature defines four moving parts: the evaluation API, providers, hooks, and the evaluation context. The whole value proposition is that your code only touches the first.

The flow: your code calls the API with a flag key, a default value, and context. The hooks run, the provider resolves, and you get a value – never null. Defaults are mandatory and returned on any error, so a flag backend outage degrades to a known-safe value rather than an exception.

import { OpenFeature } from '@openfeature/server-sdk';

const client = OpenFeature.getClient();

// The default (false) is what you get if flagd is down, the flag is
// missing, or the type mismatches. Failure is always graceful.
const newCheckout = await client.getBooleanValue('new-checkout', false, {
  targetingKey: user.id,
  plan: user.plan,
  region: user.region,
});

Note that the API never mentions flagd. That is the entire point.
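To make the "never null" guarantee concrete, here is a minimal sketch of the evaluation contract. It is a model for illustration, not the SDK's implementation: SketchProvider, Resolution, and evaluateBoolean are hypothetical names, and the real SDK reports richer error details.

```typescript
// Illustrative model only: these types are NOT the OpenFeature SDK's.
type Resolution<T> = { value: T; reason: string };

interface SketchProvider {
  // May throw (backend down, flag missing) or return a wrongly typed value.
  resolve(flagKey: string): unknown;
}

function evaluateBoolean(
  provider: SketchProvider,
  flagKey: string,
  defaultValue: boolean,
): Resolution<boolean> {
  try {
    const raw = provider.resolve(flagKey);
    if (typeof raw !== 'boolean') {
      // Type mismatch: fall back rather than hand the caller a surprise.
      return { value: defaultValue, reason: 'ERROR' };
    }
    return { value: raw, reason: 'STATIC' };
  } catch {
    // Any provider failure degrades to the caller-supplied default.
    return { value: defaultValue, reason: 'ERROR' };
  }
}

const healthy: SketchProvider = { resolve: () => true };
const down: SketchProvider = {
  resolve: () => {
    throw new Error('connection refused');
  },
};
const wrongType: SketchProvider = { resolve: () => 'yes' };
```

Calling evaluateBoolean(down, 'new-checkout', false) yields the default with reason ERROR instead of propagating the exception, which is the behavior the call site above relies on.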

2. Deploying flagd: sidecar vs. centralized, and the sync source

flagd is a daemon that reads flag definitions from one or more sync sources and serves evaluations over a gRPC (and HTTP) interface on port 8013. It holds flags in memory and re-evaluates on every call, so reads are sub-millisecond.

You have two topologies:

Topology                      | Latency           | Blast radius   | Operational cost
Sidecar (one flagd per pod)   | Lowest (loopback) | Per-pod        | N containers to schedule
Centralized (a flagd Service) | Network hop       | Shared service | One Deployment to run

Start centralized for simplicity; move latency-critical services to sidecars later. The provider interface is identical, so it is a connection-string change.
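One way to keep the topology switch a pure configuration change is to resolve the endpoint from the environment at boot. The sketch below is an assumption-laden illustration: flagdEndpoint is a hypothetical helper, and the FLAGD_HOST and FLAGD_PORT variable names are chosen for this example rather than taken from the article.

```typescript
// Hypothetical helper: picks the flagd endpoint from environment variables.
// FLAGD_HOST / FLAGD_PORT are assumed names for this sketch.
interface FlagdEndpoint {
  host: string;
  port: number;
}

function flagdEndpoint(env: Record<string, string | undefined>): FlagdEndpoint {
  // Default to the centralized Service; a sidecar deployment overrides the host.
  const host = env.FLAGD_HOST ?? 'flagd.platform-flags.svc';
  const port = Number(env.FLAGD_PORT ?? '8013');
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`Invalid FLAGD_PORT: ${env.FLAGD_PORT}`);
  }
  return { host, port };
}

// Centralized: no overrides set.
const centralized = flagdEndpoint({});
// Sidecar: the pod spec sets FLAGD_HOST=localhost.
const sidecar = flagdEndpoint({ FLAGD_HOST: 'localhost' });
```

The resulting object can be passed straight to the provider constructor, so moving a service from the shared Deployment to a sidecar touches only its pod spec.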

The sync source is where flagd reads flags from. The common choices are a local file, an HTTP(S) endpoint, a gRPC sync stream, and a Kubernetes custom resource.

For a GitOps shop, file sync backed by a ConfigMap is the pragmatic default. Here is a centralized Deployment that reads a mounted file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: flagd
  namespace: platform-flags
spec:
  replicas: 2
  selector:
    matchLabels: { app: flagd }
  template:
    metadata:
      labels: { app: flagd }
    spec:
      containers:
        - name: flagd
          image: ghcr.io/open-feature/flagd:v0.12.8
          args:
            - start
            - --uri
            - file:/etc/flagd/flags.json
          ports:
            - { name: grpc, containerPort: 8013 }
            - { name: ofrep, containerPort: 8016 } # OFREP HTTP
            - { name: metrics, containerPort: 8014 }
          volumeMounts:
            - { name: flags, mountPath: /etc/flagd, readOnly: true }
      volumes:
        - name: flags
          configMap: { name: flagd-flags }
---
apiVersion: v1
kind: Service
metadata:
  name: flagd
  namespace: platform-flags
spec:
  selector: { app: flagd }
  ports:
    - { name: grpc, port: 8013, targetPort: 8013 }
    - { name: ofrep, port: 8016, targetPort: 8016 }

flagd watches the mounted file and hot-reloads on change. With a ConfigMap, the kubelet propagates updates within its sync period (typically up to ~60s), and flagd picks them up automatically – no restart, no redeploy.

3. Authoring targeting rules: segments, rollouts, fractional bucketing

flagd flags are plain JSON validated against a published schema. Each flag has a state, a set of variants, a defaultVariant, and an optional targeting rule. Targeting uses JsonLogic plus flagd-specific operators.

Three building blocks cover most needs:

  1. Segments – ordered if branches on context, returning a variant.
  2. Percentage rollouts – the fractional operator splits traffic by weight.
  3. Stable bucketing – fractional hashes a key so the same user always lands in the same bucket.

{
  "flags": {
    "new-checkout": {
      "state": "ENABLED",
      "variants": { "on": true, "off": false },
      "defaultVariant": "off",
      "targeting": {
        "if": [
          { "in": ["beta", { "var": "groups" }] }, "on",
          {
            "fractional": [
              { "cat": [{ "var": "$flagd.flagKey" }, { "var": "targetingKey" }] },
              ["on", 20],
              ["off", 80]
            ]
          }
        ]
      }
    }
  }
}

Two things are deliberate here.

First, users whose groups attribute contains beta always get on, short-circuiting the rollout. The if branches evaluate top-down, so the first match wins.

Second, the fractional operator’s first argument is the bucketing expression. The weights ["on", 20] and ["off", 80] are relative and need not sum to 100 – flagd normalizes them. flagd hashes the expression’s value with MurmurHash3, maps the result onto the [0, 100) range, and assigns the variant whose weight band contains it. By concatenating (cat) the flag key with the targetingKey, a given user gets an independent but stable assignment per flag: stable so they do not flicker between page loads, independent so they are not correlated across unrelated flags. If you omit the first argument, flagd falls back to this same concatenation of flag key and targetingKey; it is spelled out here to make the bucketing explicit. Bucketing on the targetingKey alone would correlate assignment across every flag, which you rarely want.

To advance a rollout, bump 20 to 50 to 100 in Git. Because bucketing is a stable hash of the same key and on is listed first, its band only grows from the low end of the range, so everyone already in the on 20% stays on as you widen. The rollout is monotonic, with no user flipped back off.
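The stability and monotonicity arguments can be checked with a small model of weighted bucketing. This is a sketch under stated assumptions: FNV-1a stands in for flagd's MurmurHash3, so the assignments here will not match flagd's, and fractional, keyFor, and onAt are hypothetical helpers written for this example.

```typescript
// Illustrative only: FNV-1a stands in for flagd's MurmurHash3, so the
// buckets computed here will NOT match flagd's actual assignments.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

type Weighted = [string, number]; // [variant, relative weight]

function fractional(bucketKey: string, weights: Weighted[]): string {
  const total = weights.reduce((sum, [, w]) => sum + w, 0);
  // Map the hash onto [0, 100), then walk the cumulative, normalized weights.
  const point = (fnv1a(bucketKey) / 0x100000000) * 100;
  let upper = 0;
  for (const [variant, weight] of weights) {
    upper += (weight / total) * 100;
    if (point < upper) return variant;
  }
  return weights[weights.length - 1][0]; // guards against float rounding
}

// Bucket on flag key + targeting key, as the "cat" expression does.
const keyFor = (flagKey: string, user: string): string => `${flagKey}${user}`;

const users = Array.from({ length: 1000 }, (_, i) => `user-${i}`);

// Set of users who receive "on" when the rollout is at `pct` percent.
const onAt = (pct: number): Set<string> =>
  new Set(
    users.filter(
      (u) =>
        fractional(keyFor('new-checkout', u), [
          ['on', pct],
          ['off', 100 - pct],
        ]) === 'on',
    ),
  );

const at20 = onAt(20);
const at50 = onAt(50);
// Users who were "on" at 20% but are no longer "on" at 50%.
const flippedOff = Array.from(at20).filter((u) => !at50.has(u));
```

In this model flippedOff is empty by construction: a hash point below the 20 boundary is necessarily below the 50 boundary, provided on stays first in the weight list. Reordering the variants between releases would break that property.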

4. Wiring the SDK with context propagation

The default value is your circuit breaker; provide one on every call. The bigger architectural concern is context propagation – targeting context must follow the request through every layer without being threaded as an argument.

OpenFeature solves this with the transaction context propagator. You set the request’s identity once in middleware, and any evaluation deeper in the call stack inherits it.

import { OpenFeature, AsyncLocalStorageTransactionContextPropagator }
  from '@openfeature/server-sdk';
import { FlagdProvider } from '@openfeature/flagd-provider';

OpenFeature.setTransactionContextPropagator(
  new AsyncLocalStorageTransactionContextPropagator(),
);
await OpenFeature.setProviderAndWait(
  new FlagdProvider({ host: 'flagd.platform-flags.svc', port: 8013 }),
);

// Express middleware: identity set once, inherited everywhere downstream.
app.use((req, _res, next) => {
  OpenFeature.setTransactionContext(
    {
      targetingKey: req.user.id,
      plan: req.user.plan,
      region: req.headers['x-region'] as string,
      groups: req.user.groups,
    },
    () => next(),
  );
});

Now a deeply nested service evaluates flags without ever receiving a user object:

// Three layers down. No user plumbed in -- context is ambient.
async function priceCart(items: Item[]): Promise<number> {
  const client = OpenFeature.getClient();
  const dynamicPricing = await client.getBooleanValue('dynamic-pricing', false);
  return dynamicPricing ? priceDynamically(items) : priceStatically(items);
}

setProviderAndWait blocks until flagd is connected, so you never serve traffic with an unready provider. The same model carries over to the client tier: @openfeature/web-sdk mirrors the evaluation API in the browser and can resolve the same flagd flags via OFREP.
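The ambient-context mechanism is easier to trust once you see the Node primitive underneath it. The sketch below uses AsyncLocalStorage from node:async_hooks directly, which is what the propagator's name suggests it builds on; SketchContext, withContext, and currentTargetingKey are hypothetical names for this illustration, not SDK APIs.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical context shape for this sketch.
interface SketchContext {
  targetingKey: string;
  plan: string;
}

const contextStore = new AsyncLocalStorage<SketchContext>();

// "Middleware": binds the context for everything called inside `fn`.
function withContext<T>(ctx: SketchContext, fn: () => T): T {
  return contextStore.run(ctx, fn);
}

// "Three layers down": reads the ambient context, no parameter needed.
function currentTargetingKey(): string | undefined {
  return contextStore.getStore()?.targetingKey;
}

// Intermediate layers pass nothing along.
function handler(): string | undefined {
  return service();
}
function service(): string | undefined {
  return currentTargetingKey();
}

// inside is 'user-123'; outside is undefined because no context is bound.
const inside = withContext({ targetingKey: 'user-123', plan: 'pro' }, handler);
const outside = currentTargetingKey();
```

The store is scoped to the callback and the asynchronous work started within it, so concurrent requests do not see each other's identity.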

5. Hooks for telemetry, audit, and tracing

Hooks are where a platform team adds cross-cutting behavior without touching a single feature call site. A hook implements any of before, after, error, finally. Register globally and it runs on every evaluation, everywhere.

This hook emits an OpenTelemetry span event on each resolution and logs an audit record – enough to answer “which variant did user X get for flag Y at time T,” which is exactly the question that surfaces in an incident review.

import { Hook, HookContext, EvaluationDetails, FlagValue }
  from '@openfeature/server-sdk';
import { trace } from '@opentelemetry/api';

export const telemetryHook: Hook = {
  after(ctx: HookContext, details: EvaluationDetails<FlagValue>) {
    const span = trace.getActiveSpan();
    span?.addEvent('feature_flag', {
      // Semantic-convention attribute keys for feature flags.
      'feature_flag.key': ctx.flagKey,
      'feature_flag.provider.name': ctx.providerMetadata.name,
      'feature_flag.result.variant': details.variant ?? 'unknown',
      'feature_flag.result.reason': details.reason ?? 'unknown',
    });
    auditLog.write({
      ts: Date.now(),
      flag: ctx.flagKey,
      targetingKey: ctx.context.targetingKey,
      variant: details.variant,
      reason: details.reason,
    });
  },
  error(ctx: HookContext, err: Error) {
    trace.getActiveSpan()?.recordException(err);
    metrics.increment('flag_eval_error', { flag: ctx.flagKey });
  },
};

// Registered once at boot; every getXxxValue in the codebase is now traced.
OpenFeature.addHooks(telemetryHook);

The attribute keys above follow the OpenTelemetry feature-flag semantic conventions, so any OTel-aware backend renders them as a first-class flag dimension. OpenFeature also ships a maintained @openfeature/open-telemetry-hooks package if you would rather not hand-roll it. Either way, telemetry is now a property of the platform, not a thing 40 teams each remember to add.
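To show how the four hook stages wrap a resolution, here is a simplified model of the lifecycle. It is a sketch, not the SDK's implementation: SketchHook, evaluateWithHooks, and recorder are hypothetical, and it assumes the ordering described in the OpenFeature specification, where before hooks run in registration order and the remaining stages run in reverse.

```typescript
// Illustrative model only; the real SDK's Hook interface is richer.
interface SketchHook {
  name: string;
  before?(flagKey: string): void;
  after?(flagKey: string, value: boolean): void;
  error?(flagKey: string, err: unknown): void;
  finally?(flagKey: string): void;
}

// Records the order in which hook stages fire.
const trail: string[] = [];

function evaluateWithHooks(
  hooks: SketchHook[],
  flagKey: string,
  defaultValue: boolean,
  resolve: (key: string) => boolean,
): boolean {
  const reversed = [...hooks].reverse();
  try {
    for (const h of hooks) h.before?.(flagKey);
    const value = resolve(flagKey);
    for (const h of reversed) h.after?.(flagKey, value);
    return value;
  } catch (err) {
    // A failure in resolution (or in a hook) degrades to the default.
    for (const h of reversed) h.error?.(flagKey, err);
    return defaultValue;
  } finally {
    // Runs on both the success and the error path.
    for (const h of reversed) h.finally?.(flagKey);
  }
}

// Builds a hook that appends each stage it sees to the trail.
const recorder = (name: string): SketchHook => ({
  name,
  before: () => trail.push(`${name}:before`),
  after: () => trail.push(`${name}:after`),
  error: () => trail.push(`${name}:error`),
  finally: () => trail.push(`${name}:finally`),
});
```

With hooks a and b registered and a resolver that succeeds, the trail reads a:before, b:before, b:after, a:after, b:finally, a:finally. With a resolver that throws, after is skipped, error fires, and the caller still receives the default.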

6. Swapping providers with zero code changes

The migration test: change the backend, recompile, ship – without editing a single evaluation call. Because the provider is the only vendor-aware seam, this holds.

// The three snippets below are alternatives; register exactly one at boot.
// flagd today.
await OpenFeature.setProviderAndWait(
  new FlagdProvider({ host: 'flagd.platform-flags.svc', port: 8013 }),
);

// LaunchDarkly tomorrow -- via the OpenFeature LD provider. Same client,
// same getBooleanValue calls, same hooks. Only this line changes.
import { LaunchDarklyProvider } from '@launchdarkly/openfeature-node-server';
await OpenFeature.setProviderAndWait(
  new LaunchDarklyProvider(process.env.LD_SDK_KEY!),
);

// GO Feature Flag instead -- also a drop-in provider.
import { GoFeatureFlagProvider } from '@openfeature/go-feature-flag-provider';
await OpenFeature.setProviderAndWait(
  new GoFeatureFlagProvider({ endpoint: 'https://gofeatureflag.internal' }),
);

OpenFeature also supports named providers bound to domains, so you can migrate incrementally – route the payments domain to LaunchDarkly while everything else stays on flagd:

OpenFeature.setProvider('payments', new LaunchDarklyProvider(key));
const paymentsClient = OpenFeature.getClient('payments'); // bound to LD
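Domain binding can be pictured as a lookup with a default fallback. The sketch below models that behavior only; ProviderRegistry and NamedBackend are hypothetical types for illustration and say nothing about how the SDK stores its providers.

```typescript
// Illustrative model of domain binding; not the SDK's internals.
interface NamedBackend {
  name: string;
}

class ProviderRegistry {
  private readonly byDomain = new Map<string, NamedBackend>();
  private readonly defaultProvider: NamedBackend;

  constructor(defaultProvider: NamedBackend) {
    this.defaultProvider = defaultProvider;
  }

  // Bind a specific backend to one domain.
  bind(domain: string, provider: NamedBackend): void {
    this.byDomain.set(domain, provider);
  }

  // A domain without its own binding falls back to the default provider.
  providerFor(domain?: string): NamedBackend {
    if (domain === undefined) return this.defaultProvider;
    return this.byDomain.get(domain) ?? this.defaultProvider;
  }
}

const registry = new ProviderRegistry({ name: 'flagd' });
registry.bind('payments', { name: 'launchdarkly' });
```

Here registry.providerFor('payments') returns the LaunchDarkly backend while any other domain, or no domain at all, resolves to flagd. That fallback is what lets a migration proceed one domain at a time.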

The one caveat worth stating plainly: the API contract is portable, but rule authoring is not. flagd’s JsonLogic targeting and LaunchDarkly’s rule builder are different surfaces. The SDK swap is free; you still re-create targeting rules in the new backend. That is a config migration, not a code migration – and it is the difference between a sprint and a quarter.

7. Governance: lifecycle, ownership, stale-flag cleanup

Flags are debt the moment they merge. Without governance you accrue hundreds of permanently-on flags whose removal nobody dares attempt. Encode ownership and intent in the flag definition itself.

flagd’s flag schema accepts a free-form metadata object on each flag, so record ownership and intent there:

{
  "flags": {
    "new-checkout": {
      "state": "ENABLED",
      "variants": { "on": true, "off": false },
      "defaultVariant": "off",
      "metadata": {
        "owner": "team-checkout",
        "jiraEpic": "CHK-1184",
        "type": "release",
        "createdAt": "2026-05-02",
        "expiresAt": "2026-07-01"
      }
    }
  }
}

Then enforce it in CI. A scheduled job opens a ticket (or fails the build) for any release flag past expiresAt:

#!/usr/bin/env bash
# stale-flag-check.sh -- fail if any release flag is past its expiry.
set -euo pipefail
today=$(date +%F)

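# ISO-8601 dates (YYYY-MM-DD) sort lexicographically, so a string compare is safe.
# Passing the date with --arg avoids splicing shell text into the jq program.
jq -r --arg today "$today" '
  .flags | to_entries[]
  | select(.value.metadata.type == "release")
  | select(.value.metadata.expiresAt < $today)
  | "STALE: \(.key) owner=\(.value.metadata.owner) expired=\(.value.metadata.expiresAt)"
' flags.json | tee stale.txt

if [ -s stale.txt ]; then
  echo "Stale release flags found" >&2
  exit 1
fi
echo "No stale flags"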

Distinguish flag types: release flags are temporary and must expire; ops (kill switches) and experiment flags are longer-lived. Only release flags should trip the staleness gate – a permanent kill switch is a feature, not debt. Pair this with a linter that flags evaluation call sites whose flag key no longer exists in flags.json, catching the inverse rot: dead code branching on a deleted flag.
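The call-site linter can start as something very small. The following is a minimal sketch, assuming flag keys are passed as string literals: findUnknownFlagKeys and the EVAL_CALL pattern are hypothetical, and a regex is a deliberate simplification where a production linter would more likely walk the AST.

```typescript
// Hypothetical linter core: finds evaluation calls whose flag key is not
// defined. Only literal keys are detected; computed keys are ignored.
const EVAL_CALL =
  /\.get(?:Boolean|String|Number|Object)(?:Value|Details)\(\s*['"]([^'"]+)['"]/g;

function findUnknownFlagKeys(source: string, definedKeys: string[]): string[] {
  const defined = new Set(definedKeys);
  const unknown = new Set<string>();
  // Fresh RegExp per call so lastIndex state never leaks between scans.
  const pattern = new RegExp(EVAL_CALL.source, 'g');
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const key = match[1];
    if (!defined.has(key)) unknown.add(key);
  }
  return Array.from(unknown).sort();
}

// Sample input: two keys exist in the flag file, one was deleted.
const sampleSource = `
  const a = await client.getBooleanValue('new-checkout', false);
  const b = await client.getStringValue("banner-text", 'hi');
  const c = await client.getBooleanValue('dynamic-pricing', false);
`;
const flagsJson = { flags: { 'new-checkout': {}, 'dynamic-pricing': {} } };

// unknownKeys is ['banner-text']
const unknownKeys = findUnknownFlagKeys(sampleSource, Object.keys(flagsJson.flags));
```

Run over the repository in CI, a non-empty result fails the build in the same way the staleness gate does.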

Verify

Confirm the stack end to end before trusting it in production.

# 1. flagd is healthy and serving.
kubectl -n platform-flags get pods -l app=flagd
kubectl -n platform-flags port-forward svc/flagd 8016:8016 &

# 2. Resolve a flag over the OFREP HTTP endpoint -- expect a variant + reason.
curl -s -X POST localhost:8016/ofrep/v1/evaluate/flags/new-checkout \
  -H 'Content-Type: application/json' \
  -d '{"context":{"targetingKey":"user-123","groups":["beta"]}}' | jq .

# 3. Prove stable bucketing: same key, repeated calls, identical variant.
for i in 1 2 3; do
  curl -s -X POST localhost:8016/ofrep/v1/evaluate/flags/new-checkout \
    -H 'Content-Type: application/json' \
    -d '{"context":{"targetingKey":"user-987"}}' | jq -r .variant
done   # -> three identical lines

# 4. Hot reload works: edit the ConfigMap, confirm flagd picks it up.
kubectl -n platform-flags logs -l app=flagd | grep -i "configuration updated"

Expected: step 2 returns a JSON body that includes "value": true, "variant": "on", and "reason": "TARGETING_MATCH" for the beta user; step 3 prints the same variant three times (stable hash); step 4 shows a reload log line with no pod restart.

Implementing deterministic CI tests

Flag-driven branches must be testable without a live flagd. OpenFeature ships an in-memory provider for exactly this – you assert both branches deterministically, with zero network.

import { OpenFeature, InMemoryProvider } from '@openfeature/server-sdk';

describe('checkout flow', () => {
  it('uses the new path when the flag is on', async () => {
    await OpenFeature.setProviderAndWait(new InMemoryProvider({
      'new-checkout': {
        disabled: false,
        variants: { on: true, off: false },
        defaultVariant: 'on',       // force ON for this test
      },
    }));
    expect(await runCheckout(cart)).toEqual(expectedNewBehavior);
  });

  it('falls back to the old path when off', async () => {
    await OpenFeature.setProviderAndWait(new InMemoryProvider({
      'new-checkout': {
        disabled: false,
        variants: { on: true, off: false },
        defaultVariant: 'off',      // force OFF
      },
    }));
    expect(await runCheckout(cart)).toEqual(expectedOldBehavior);
  });
});

Because the same evaluation API resolves against the in-memory provider, the code under test is byte-identical to production – only the provider differs. Run both branches in CI on every PR and a half-rolled-out flag can never hide a broken code path.

Enterprise scenario

A payments platform team at a mid-size fintech ran LaunchDarkly across ~60 services. After an acquisition, a data-residency mandate landed: EU customer evaluations could not transit a US-hosted SaaS, and the audit trail had to live in their own SIEM. Ripping out LaunchDarkly meant touching every call site – months of regression risk on a payments path – so the proposal kept stalling.

The constraint that broke the deadlock: they did not need to leave LaunchDarkly everywhere, only for EU traffic, and only without a code rewrite.

They adopted OpenFeature as a refactor – mechanically replacing ldClient.variation(...) with client.getBooleanValue(...), no behavior change – then selected the provider by deployment region at startup. EU services bound to a self-hosted flagd; everything else stayed on the LaunchDarkly provider during a phased cutover. The audit requirement was satisfied by a single global hook streaming every resolution to the SIEM, identical across both providers.

// Region decides the backend; the 60 services' evaluation code is untouched.
const region = process.env.DEPLOY_REGION;
const provider = region === 'eu'
  ? new FlagdProvider({ host: 'flagd.eu.internal', port: 8013 })
  : new LaunchDarklyProvider(process.env.LD_SDK_KEY!);

await OpenFeature.setProviderAndWait(provider);
OpenFeature.addHooks(siemAuditHook); // same audit trail, both backends

The migration shipped in six weeks instead of two quarters. The decisive insight: once the vendor lives behind the OpenFeature seam, “which backend” becomes a deployment variable, and residency, audit, and cost become operational knobs rather than rewrites.
