
EDI and B2B Integration Platform on Azure Logic Apps

A national grocery distributor — the kind that moves 40,000 SKUs a day from regional DCs to 1,800 supermarket doors — gets a quiet ultimatum from its three largest retail customers: adopt our EDI program or lose the shelf. The mega-retailers will only transact purchase orders, advance ship notices, and invoices as electronic data interchange documents, over AS2, with a signed functional acknowledgement returned inside the agreed service window — or they assess chargebacks per late or malformed message that erase the margin on the order. The distributor’s incumbent setup is a 14-year-old on-prem VAN gateway running a fragile FTP-and-parse cron job that exactly one contractor understands, and it dropped 600 invoices over a long weekend last quarter when a partner rotated a certificate. The mandate from the CIO is blunt: a modern, auditable, partner-scalable EDI platform that never silently drops a document and proves delivery. This article is the reference architecture for building that on Azure — Logic Apps Standard, an Integration Account, AS2, Service Bus, and a Confluent bridge into analytics — designed so an operations lead can sleep through a partner’s certificate rotation.

The pressures here are different from a typical app integration, and naming them shapes every decision. Trading partners are sovereign: each defines its own document versions, segment quirks, identifiers, and SLAs, and you cannot make them change. Acknowledgements are contractual: a 997 (X12) or CONTRL (EDIFACT) functional ack is not telemetry — it is the legal proof that you received and could parse their document, and its absence triggers money changing hands. Volume is spiky: an ordinary Tuesday and the Monday after a holiday weekend differ by an order of magnitude, and back-pressure must not lose a message. And auditability is non-negotiable: when a partner disputes an invoice, you need the exact bytes received, the decode result, the ack returned, and the timestamps, six years later. EDI is an unglamorous, decades-old standard, but it still moves a huge fraction of global B2B commerce precisely because it is rigid, acknowledged, and auditable — and those are the properties you must preserve while modernizing the plumbing underneath.

Why not the obvious shortcuts

Three shortcuts will be proposed on this project, and each fails in a way worth naming up front.

Keep the VAN and bolt on scripts. A value-added network hides the transport but charges per kilo-character, gives you no control plane, and leaves you parsing raw EDI in brittle custom code: the exact fragility that dropped 600 invoices. You inherit the VAN’s outages and add your own.

Build a custom parser microservice. Hand-rolling X12 and EDIFACT decoding looks tractable until you meet the long tail: version drift (4010 vs 5010), partner-specific segment terminators, repeating loops, and the validation rules that decide whether to send a rejecting ack. You would be reimplementing a mature standard library, badly, forever.

Use a generic iPaaS with a community EDI connector. It demos well and dies on the specifics: AS2 MDN handling, certificate lifecycle, partner-agreement isolation, and the durable replay you need when a downstream system is down. The connector handles the happy path; EDI is all edge cases.

Azure’s answer is to treat EDI as a first-class protocol. Logic Apps Standard gives you durable, stateful workflows with a real runtime you control; the Integration Account holds the X12/EDIFACT schemas, the trading-partner definitions, and the agreements that encode exactly how each partner’s documents are received, validated, and acknowledged. You configure the standard instead of coding it, and you get the acknowledgement and audit semantics built in. Around that core you place durable buffering and a streaming bridge so the rigid, synchronous world of partner transport is decoupled from your own systems’ availability.

Architecture overview

[Figure: architecture diagram of the EDI and B2B Integration Platform on Azure Logic Apps]

The platform runs two coupled but independently-scaled paths: an inbound path that receives a partner’s documents, decodes and validates them, returns the contractual acknowledgement, and hands clean business data to internal systems; and an outbound path that takes internal events (an invoice is ready, an order is confirmed), encodes them into the partner’s exact EDI dialect, transmits over AS2, and reconciles the MDN and functional ack that come back. Holding the two apart — and remembering that the acknowledgement of an inbound document is itself a small outbound flow — is the key to reasoning about this system.

The defining property of the topology is durable decoupling: between the partner-facing transport and your internal systems sits Azure Service Bus, so a slow or down ERP never causes you to drop or reject a partner’s document. You accept it, acknowledge it, and buffer it. The partner’s SLA is satisfied at the edge; your back-office processes the business message on its own clock.

Inbound path, following the data flow:

  1. A trading partner POSTs an AS2 message to your endpoint. Traffic first hits Akamai at the edge for TLS termination, global anycast, IP allow-listing of known partner ranges, and WAF protection — partner endpoints are a public attack surface, and you want bot and flood mitigation in front of them before anything reaches Azure.
  2. The request lands on Azure API Management (APIM) in internal VNet mode, the single front door for partner traffic. APIM does mutual-TLS client-certificate checks where a partner uses them, rate-limits per partner, and routes to the AS2 receive workflow. It is also your one audit point for “who connected, when.”
  3. The AS2 decode action in a Logic Apps Standard workflow verifies the message signature against the partner’s public certificate, decrypts with your private key, checks for replay via the message-id, and synchronously returns a signed MDN (Message Disposition Notification) — the transport-level “I received your bytes intact” receipt the partner’s system is waiting on.
  4. The decoded EDI payload flows into the X12 / EDIFACT decode action, which resolves the correct agreement in the Integration Account from the sender/receiver qualifiers and IDs, validates the interchange against the partner’s schema and the agreed rules (envelope, structure, data-element), splits batched transaction sets, and — critically — generates the functional acknowledgement (997 / CONTRL) reporting accept or reject per transaction set. That ack is queued for outbound return to the partner.
  5. Each well-formed business document (an 850 purchase order, an 856 ASN) is published as a message to a Service Bus topic, partitioned by document type, with the original interchange archived to Blob Storage (immutable, the legal record of exact bytes received).
  6. Subscriber workflows — or the ERP’s own connector — pull from Service Bus, map the EDI structure into the internal canonical model, and post into SAP / Dynamics. If the ERP is down, the message waits in the subscription; nothing is lost, and the partner was already acknowledged.
  7. In parallel, a Confluent (managed Apache Kafka) bridge consumes the same business events and streams them into the analytics estate — Snowflake, a lakehouse, real-time dashboards — so supply-chain and finance teams see order and shipment flow live without touching the transactional path.
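The accept-or-reject reporting in step 4 can be made concrete with a minimal sketch of a 997 body. It assumes X12 4010-style segments with `*` and `~` delimiters; `SetResult` and `build_997` are hypothetical names for illustration only. In the platform itself the X12 decode action generates this acknowledgement from the agreement settings, so the code shows the shape of what goes back to the partner rather than something you would deploy.

```python
from dataclasses import dataclass


@dataclass
class SetResult:
    """Validation outcome for one received transaction set (hypothetical type)."""
    set_id: str          # transaction set identifier, e.g. "850"
    control_number: str  # ST02 of the received transaction set
    accepted: bool


def build_997(functional_id: str, group_control: str,
              results: list, ack_control: str = "0001") -> str:
    """Build the ST..SE body of a 997 for one received functional group."""
    if not results:
        raise ValueError("a 997 must report on at least one transaction set")
    segments = [
        f"ST*997*{ack_control}",
        f"AK1*{functional_id}*{group_control}",   # which group is being acknowledged
    ]
    for r in results:
        segments.append(f"AK2*{r.set_id}*{r.control_number}")
        segments.append(f"AK5*{'A' if r.accepted else 'R'}")  # A = accepted, R = rejected
    accepted = sum(r.accepted for r in results)
    if accepted == len(results):
        group_status = "A"      # every set accepted
    elif accepted == 0:
        group_status = "R"      # every set rejected
    else:
        group_status = "P"      # partially accepted
    # AK9: group status, sets included, sets received, sets accepted
    segments.append(f"AK9*{group_status}*{len(results)}*{len(results)}*{accepted}")
    # SE01 counts every segment from ST through SE inclusive.
    segments.append(f"SE*{len(segments) + 1}*{ack_control}")
    return "~".join(segments) + "~"


ack = build_997("PO", "17", [
    SetResult("850", "0001", True),
    SetResult("850", "0002", False),
])
print(ack)
```

One purchase order passing and one failing yields a group status of `P`, which is the case that matters commercially: the partner learns exactly which transaction set to correct and resend.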

Outbound path, the mirror image: an internal system emits “invoice ready” onto Service Bus → a workflow looks up the partner agreement, runs the internal-to-EDI map to produce a compliant 810 invoice in the partner’s exact version, X12/EDIFACT encodes it with correct envelopes and control numbers, AS2 encodes (signs and encrypts) it, and transmits to the partner’s endpoint → the platform then waits for and reconciles the partner’s returned MDN and 997, marking the document delivered-and-accepted, or escalating if either is missing inside the SLA window.

Component breakdown

Component | Service / tool | Role in the platform | Key configuration choices
--- | --- | --- | ---
Edge | Akamai | TLS, anycast, partner IP allow-listing, WAF for the public AS2 endpoint | Allow-list partner source ranges; WAF rules tuned for AS2 POST patterns
Front door | Azure API Management | Single partner ingress: mTLS, per-partner rate limit, routing, audit | Client-certificate validation; rate-limit-by-key on partner id; internal VNet mode
Transport | AS2 (Logic Apps action) | Signed/encrypted message exchange with synchronous MDN receipts | Sign + encrypt; require signed MDN; message-id replay check
EDI engine | Integration Account | Schemas, partners, agreements, maps for X12 + EDIFACT | Per-partner agreement; version pinning; validation level per element
Orchestration | Logic Apps Standard | Durable stateful decode/validate/ack and encode/transmit workflows | Stateful runs; WS plan for VNet + predictable scale; per-action retry policy
Buffering | Azure Service Bus | Decouples partner transport from internal systems; back-pressure | Topics by doc type; dead-letter queues; duplicate detection; sessions for ordering
Streaming bridge | Confluent (managed Kafka) | Fan business events to analytics/lakehouse off the hot path | Topic-per-doc-type; schema registry; consumer groups per downstream
Archive / audit | Blob Storage (immutable) | Exact received/sent interchange bytes, 6-year legal record | Immutable WORM policy; lifecycle to cool/archive; per-partner container
ERP | SAP / Dynamics | System of record consuming canonical business data | Connector or queue-triggered post; idempotency key on document control number
Identity / SSO | Okta + Microsoft Entra ID | Operator/admin SSO (Okta) federated to Entra for Azure RBAC | OIDC federation; conditional access; group claims to APIM portal
Secrets & certs | HashiCorp Vault | Partner certificates, AS2 keys, ERP credentials, signing keys | PKI engine for cert issuance/rotation; dynamic ERP creds; short leases
CSPM / posture | Wiz + Wiz Code | Cloud posture, exposure of the partner endpoint, IaC scanning | Agentless scan of Storage/Service Bus; Wiz Code gates Terraform PRs
Runtime security | CrowdStrike Falcon | Runtime threat detection on Logic Apps hosting and any VMs | Sensor on the App Service plan hosts / self-hosted nodes; SOC feed
Observability | Dynatrace / Datadog | End-to-end document tracing, SLA timers, ack-reconciliation metrics | Distributed trace per interchange; SLA-breach alerting; queue-depth dashboards
ITSM | ServiceNow | Partner onboarding approvals, change control, incident tickets | Change gate for new agreements; auto-incident on missing ack or DLQ growth
CI / IaC | GitHub Actions / Jenkins + Terraform / Ansible | Build/test/deploy workflows and infra; config drift control | OIDC to Azure; Terraform for resources; Ansible for host/agent config
Delivery | Argo CD | GitOps sync of Logic Apps + agreement config to environments | Git as source of truth; automated promotion dev → test → prod

A few of these choices deserve the why, because they are where EDI projects go wrong.

Why agreements, not code. The single highest-leverage concept in this architecture is the trading-partner agreement in the Integration Account. An agreement binds a host partner (you) and a guest partner to a protocol (X12 or EDIFACT) and pins everything that matters per partner: schema versions, envelope settings, which acknowledgements to generate and whether to expect them back, the validation strictness, and the AS2 transport settings. When the mega-retailer says “we’re moving 850s from 4010 to 5010,” you publish a new schema and update one agreement — you do not touch a workflow or redeploy code. New partner onboarding becomes “create the partner, import their schema, define the agreement,” which is a configuration task an integration analyst can do, not a developer sprint.
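A small sketch may help show why an agreement behaves like data rather than code. It reads the routing fields from the fixed-width X12 ISA segment and looks up an agreement keyed on the sender and receiver qualifier/ID pairs. The `Agreement` shape, the partner IDs, and the helper names are all hypothetical; the Integration Account stores agreements in its own format and performs this resolution for you.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Agreement:
    """Per-partner settings (hypothetical shape, not the Integration Account schema)."""
    name: str
    schema_version: str   # ISA12 value, e.g. "00401" for X12 4010
    send_997: bool


def parse_isa(interchange: str) -> dict:
    """Read routing fields from the fixed-width, 106-character ISA segment."""
    if len(interchange) < 106 or not interchange.startswith("ISA"):
        raise ValueError("not an X12 interchange")
    element_sep = interchange[3]              # the delimiter is declared by position
    fields = interchange[:105].split(element_sep)
    if len(fields) != 17:                     # "ISA" plus ISA01..ISA16
        raise ValueError("malformed ISA segment")
    return {
        "sender": (fields[5], fields[6].strip()),     # ISA05 qualifier, ISA06 id
        "receiver": (fields[7], fields[8].strip()),   # ISA07 qualifier, ISA08 id
        "version": fields[12],                        # ISA12
        "control_number": fields[13],                 # ISA13
    }


def resolve_agreement(envelope: dict, agreements: dict) -> Agreement:
    """Find the agreement for this sender/receiver pair and check the pinned version."""
    key = (envelope["sender"], envelope["receiver"])
    if key not in agreements:
        raise LookupError(f"no agreement for {key}")
    agreement = agreements[key]
    if envelope["version"] != agreement.schema_version:
        raise ValueError(f"version {envelope['version']} is not pinned in {agreement.name}")
    return agreement


def sample_isa(sender_id: str, receiver_id: str,
               version: str = "00401", control: str = "000000905") -> str:
    """Build an illustrative ISA segment with correctly padded fields."""
    parts = ["ISA", "00", " " * 10, "00", " " * 10,
             "ZZ", sender_id.ljust(15), "ZZ", receiver_id.ljust(15),
             "240102", "0930", "U", version, control, "0", "P", ">"]
    return "*".join(parts) + "~"


key = (("ZZ", "RETAILER001"), ("ZZ", "GROCERDIST01"))
AGREEMENTS = {key: Agreement("retailer-850-inbound", "00401", True)}

envelope = parse_isa(sample_isa("RETAILER001", "GROCERDIST01"))
agreement = resolve_agreement(envelope, AGREEMENTS)
print(agreement.name, envelope["control_number"])

# Moving the partner to 5010 is a data change; the two functions above are untouched.
AGREEMENTS[key] = replace(AGREEMENTS[key], schema_version="00501")
```

The last line is the point of the example: the version move replaces one record, and an interchange still arriving as 4010 after the switch would be refused by the version check rather than parsed against the wrong schema.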

Why Service Bus in the middle is non-negotiable. Without durable buffering, the availability of your ERP becomes the availability of your partner-facing SLA — and the moment SAP takes a maintenance window, you start rejecting documents you were contractually obliged to accept. Service Bus inverts this: you accept and acknowledge at the edge, the business message waits in a topic subscription, and a slow downstream simply drains slower. Dead-letter queues catch poison messages (a document that maps cleanly but the ERP rejects) for human triage instead of an infinite retry loop, and duplicate detection plus an idempotency key on the document control number means a partner’s retransmission does not create a duplicate invoice.
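The interplay of idempotency key and dead-lettering can be sketched in memory. This is an illustration under assumptions, not Service Bus client code: `Message`, `Subscriber`, and `fake_erp` are hypothetical, and in the real service the broker tracks delivery counts and moves messages to the dead-letter queue itself.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Message:
    partner_id: str
    control_number: str   # document control number, used as the idempotency key
    body: str
    delivery_count: int = 0


@dataclass
class Subscriber:
    """In-memory stand-in for a subscription consumer (hypothetical)."""
    post_to_erp: Callable
    max_deliveries: int = 3
    posted_keys: set = field(default_factory=set)
    dead_letter: list = field(default_factory=list)

    def handle(self, msg: Message) -> str:
        key = (msg.partner_id, msg.control_number)
        if key in self.posted_keys:
            return "duplicate-skipped"       # retransmission: do not post twice
        msg.delivery_count += 1
        try:
            self.post_to_erp(msg)
        except Exception:
            if msg.delivery_count >= self.max_deliveries:
                self.dead_letter.append(msg)  # park for human triage
                return "dead-lettered"
            return "retry"
        self.posted_keys.add(key)
        return "posted"


def fake_erp(msg: Message) -> None:
    """Pretend ERP that rejects any document whose body contains 'BAD'."""
    if "BAD" in msg.body:
        raise ValueError("ERP rejected document")


sub = Subscriber(post_to_erp=fake_erp)
first = sub.handle(Message("RETAILER001", "000000905", "810 invoice"))
again = sub.handle(Message("RETAILER001", "000000905", "810 invoice"))
poison = Message("RETAILER001", "000000906", "BAD")
attempts = [sub.handle(poison) for _ in range(3)]
print(first, again, attempts)
```

The retransmitted invoice is skipped because its key was already posted, and the poison document stops retrying after the third delivery and waits in the dead-letter list for a person to look at it.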

Why a Kafka bridge and a queue, not one or the other. Service Bus and Confluent are doing different jobs and the instinct to pick one is wrong. Service Bus is the transactional, ordered, acknowledged path into systems of record — it cares about exactly-once-ish delivery of a specific invoice to SAP. Confluent is the high-throughput fan-out path for analytics, where many independent consumers (Snowflake loader, real-time dashboard, fraud model) replay the same stream of business events at their own pace, days back if needed, with a schema registry keeping producers and consumers compatible. Putting analytics consumers on the transactional queue would couple your DC dashboard’s hiccups to your ERP delivery; keeping the ERP off Kafka avoids reinventing dead-lettering and sessions. Each tool does what it is good at.
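The difference in delivery semantics is easier to see in a toy model. The two classes below are hypothetical in-memory stand-ins, not Service Bus or Kafka clients: the queue hands each message to one consumer and then forgets it, while the log keeps every event and lets each consumer group track and rewind its own offset.

```python
from collections import deque


class WorkQueue:
    """Queue semantics: each message goes to one consumer, then it is gone."""

    def __init__(self):
        self._items = deque()

    def send(self, msg):
        self._items.append(msg)

    def receive(self):
        return self._items.popleft() if self._items else None


class EventLog:
    """Log semantics: events stay; each consumer group keeps its own offset."""

    def __init__(self):
        self._events = []
        self._offsets = {}

    def append(self, event):
        self._events.append(event)

    def poll(self, group, max_events=10):
        start = self._offsets.get(group, 0)
        batch = self._events[start:start + max_events]
        self._offsets[group] = start + len(batch)
        return batch

    def seek(self, group, offset):
        self._offsets[group] = offset    # replay from an earlier position


queue, log = WorkQueue(), EventLog()
for doc in ["850-0001", "856-0002", "810-0003"]:
    queue.send(doc)
    log.append(doc)

erp_got = [queue.receive() for _ in range(3)]
second_reader = queue.receive()              # nothing left for anyone else

dashboard = log.poll("dashboard")
loader = log.poll("snowflake-loader")        # independent of the dashboard
log.seek("dashboard", 0)
replayed = log.poll("dashboard")             # same events again
print(erp_got, second_reader, dashboard, loader, replayed)
```

A second analytics consumer on the queue would have stolen documents from the ERP; on the log it simply reads at its own pace, which is the coupling argument in the paragraph above.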

Implementation guidance

Provision with Terraform, and stand up the Integration Account and network first. Order matters: the Integration Account is the dependency the workflows bind to, and Logic Apps Standard needs its VNet integration in place before private endpoints will resolve.

  1. A VNet with subnets for Logic Apps (delegated), the private endpoints, and APIM (its own delegated subnet).
  2. Private endpoints for Service Bus, Blob Storage, and Key Vault/Vault access, each with public_network_access disabled, plus the linked private DNS zones — forgetting a zone link is the classic silent-hang failure.
  3. The Integration Account (Standard tier for production volume), then the schemas, partners, and agreements loaded into it.
  4. The Logic Apps Standard site on a Workflow Standard (WS) plan for VNet integration and predictable scaling, linked to the Integration Account, with a system-assigned managed identity.
  5. APIM in internal mode, with Akamai pointed at its private origin and partner certificates loaded.

A minimal Terraform shape for the core pieces communicates the intent:

resource "azurerm_logic_app_integration_account" "edi" {
  name                = "ia-edi-grocer-prod-cin"
  resource_group_name = azurerm_resource_group.edi.name
  location            = "centralindia"
  sku_name            = "Standard"   # production volume + agreements
}

resource "azurerm_servicebus_namespace" "edi" {
  name                          = "sb-edi-grocer-prod-cin"
  resource_group_name           = azurerm_resource_group.edi.name
  location                      = "centralindia"
  sku                           = "Premium"   # VNet, predictable throughput
  capacity                      = 2           # messaging units; Premium needs an explicit capacity
  public_network_access_enabled = false
  premium_messaging_partitions  = 2           # cannot be changed after creation
}

The pipeline that applies this runs in GitHub Actions (or Jenkins where a team already lives there), authenticating to Azure via OIDC federation so no service-principal secret is stored — the platform team has had a credential leak before and intends never to repeat it. Terraform provisions the Azure resources; Ansible configures any self-hosted gateway or agent VMs and keeps their AS2/cert config consistent. Workflow and agreement definitions are promoted dev → test → prod by Argo CD doing GitOps sync from the repo, so the running configuration always matches Git and a bad change is reverted by reverting a commit.

Identity: federate the operators, kill the keys. Human access to the platform — the integration analysts who onboard partners, the ops team who watch the dashboards — flows Okta → Entra: they sign in with the company’s Okta credentials and conditional-access policies, Okta federates to Microsoft Entra ID over OIDC, and the Entra token carries the group claims that gate the APIM developer portal and the Azure RBAC roles. The Logic Apps runtime authenticates to Service Bus, Storage, and Key Vault with its managed identity, scoped to least privilege (Service Bus Data Sender/Receiver on specific entities, Storage Blob Data Contributor on the archive container only). The secrets that cannot be managed identities — partner AS2 certificates and private keys, ERP service credentials, signing keys — live in HashiCorp Vault, whose PKI engine also issues and rotates the certificates on a schedule, so the certificate-rotation event that dropped 600 invoices last quarter becomes an automated, alerted, zero-downtime operation instead of a 2 a.m. fire.

Acknowledgement reconciliation is the feature, not a detail. The thing that distinguishes a real EDI platform from a demo is that it tracks the lifecycle of every document to its acknowledgement. For each outbound interchange, record the control number, the expected MDN, and the expected 997, then run a timer: if either is missing when the partner’s SLA window closes, raise a ServiceNow incident and alert ops, because that gap is a chargeback waiting to happen. For inbound, ensure your generated 997 actually transmitted and the MDN was returned. Model this as a small state machine per document; the Logic Apps stateful run history plus a status record in a table gives you the audit trail a partner dispute demands.
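The per-document state machine described above can be sketched as follows. The type and state names are hypothetical, the SLA is a plain timedelta, and persistence, timers, and the ServiceNow call are left out; the sketch only shows how the states and the breach check relate.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class OutboundDocument:
    """Tracking record for one outbound interchange (hypothetical shape)."""
    partner_id: str
    control_number: str
    sent_at: datetime
    sla: timedelta
    mdn_at: Optional[datetime] = None        # transport receipt
    ack_at: Optional[datetime] = None        # 997 functional acknowledgement
    ack_accepted: Optional[bool] = None

    def state(self, now: datetime) -> str:
        if self.ack_at is not None and self.ack_accepted is False:
            return "rejected"
        if self.mdn_at is not None and self.ack_at is not None:
            return "delivered-and-accepted"
        if now > self.sent_at + self.sla:
            return "sla-breached"
        return "awaiting-ack" if self.mdn_at is not None else "awaiting-mdn"


def missing_receipts(doc: OutboundDocument) -> list:
    missing = []
    if doc.mdn_at is None:
        missing.append("MDN")
    if doc.ack_at is None:
        missing.append("997")
    return missing


def find_breaches(docs: list, now: datetime) -> list:
    """Return (control number, missing receipts) for every document past its SLA."""
    return [(d.control_number, missing_receipts(d))
            for d in docs if d.state(now) == "sla-breached"]


sent = datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)
window = timedelta(hours=4)
docs = [
    OutboundDocument("RETAILER001", "000000905", sent, window,
                     mdn_at=sent + timedelta(minutes=1),
                     ack_at=sent + timedelta(minutes=30), ack_accepted=True),
    OutboundDocument("RETAILER001", "000000906", sent, window,
                     mdn_at=sent + timedelta(minutes=1)),
]
breaches = find_breaches(docs, now=sent + timedelta(hours=5))
print(breaches)
```

Each entry returned by `find_breaches` is what would become an incident: the control number identifies the document, and the list of missing receipts tells ops whether the gap is at the transport layer or the functional layer.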

Enterprise considerations

Security & Zero Trust. Partner endpoints are internet-facing by necessity, which makes layered defense mandatory: Akamai WAF and partner IP allow-listing at the edge; APIM enforcing mutual TLS so only partners presenting a valid client certificate connect; AS2 itself signing and encrypting every message so payloads are protected end-to-end and non-repudiation holds. Inside the boundary, Service Bus, Storage, and Key Vault sit behind private endpoints with public access disabled, and the Logic Apps runtime reaches them over the VNet only. Wiz runs continuous CSPM, alerting the moment the partner endpoint or the archive storage drifts toward public exposure or an over-broad access policy, while Wiz Code scans the Terraform in pull requests so a misconfiguration is caught before it deploys. CrowdStrike Falcon sensors provide runtime threat detection on the hosting plan and any self-hosted gateway VMs, feeding the SOC. Azure Policy denies any Service Bus or Storage resource created with public network access, and Wiz independently verifies the policy is actually holding. A guardrail breach — a DLQ that crosses a threshold, a certificate nearing expiry, a partner connecting from an unexpected range — auto-raises a ServiceNow ticket so security has an actionable record, not just a log line.

Cost optimization. EDI volume is spiky, so engineer to pay for steady-state and burst gracefully.

Lever | Mechanism | Typical effect
--- | --- | ---
Plan sizing | WS1 baseline for steady volume; scale out workers on queue depth | Avoid paying peak for a Tuesday’s load
Buffer, don’t over-provision | Service Bus absorbs holiday spikes; downstream drains at its rate | Caps ERP-side compute sizing
Archive tiering | Immutable Blob to cool then archive after the dispute window | Slashes 6-year retention cost
Confluent right-sizing | Cluster sized to analytics throughput, not transactional peak | Streaming cost tracks real consumer demand
Retire the VAN | Replace per-kilo-character VAN fees with flat Azure infra | Removes a per-message variable cost entirely

The hidden win is dropping the legacy VAN’s per-character billing — at high invoice volume that variable fee often dwarfs the Azure infrastructure cost, and AS2 over your own endpoint removes it.

Scalability. Each tier scales on its own signal. Logic Apps Standard scales workers horizontally on workflow concurrency and queue trigger depth. Service Bus Premium scales with messaging partitions and namespaces, and its subscriptions naturally absorb back-pressure so a 10× holiday spike queues rather than fails. Confluent scales by adding partitions and brokers as analytics throughput grows, independent of the transactional path. The Integration Account itself is configuration, not a throughput bottleneck — but watch agreement and schema counts on the tier limits as the partner roster grows. The natural ceiling is downstream ERP ingestion rate, which is exactly why the buffer exists: you scale the partner-facing edge freely and let the back office consume at a safe pace.
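For capacity planning, the relationship between queue depth and worker count can be worked as simple arithmetic. The function below is a hypothetical back-of-envelope helper, not how Logic Apps Standard decides to scale (the platform manages that); the drain rate, target window, and worker cap are assumed inputs, with the cap standing in for what the downstream ERP can safely ingest.

```python
import math


def desired_workers(queue_depth: int, drain_rate_per_worker: float,
                    target_drain_seconds: float,
                    min_workers: int = 1, max_workers: int = 20) -> int:
    """Workers needed to drain the backlog within the target window, clamped."""
    if drain_rate_per_worker <= 0 or target_drain_seconds <= 0:
        raise ValueError("rate and target window must be positive")
    per_worker_capacity = drain_rate_per_worker * target_drain_seconds
    needed = math.ceil(queue_depth / per_worker_capacity)
    return max(min_workers, min(max_workers, needed))


# Assumed figures: 5 documents/second per worker, drain within 5 minutes.
tuesday = desired_workers(500, 5, 300)
post_holiday = desired_workers(12_000, 5, 300)
extreme = desired_workers(50_000, 5, 300)
print(tuesday, post_holiday, extreme)
```

When the computed need exceeds the cap, the backlog takes longer than the target window to drain, which is acceptable here because the partner was already acknowledged at the edge.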

Failure modes, and what each one looks like. Name them before they page you.

Reliability & DR (RTO/RPO). Decide the numbers per tier. Service Bus Premium supports geo-disaster-recovery pairing for namespace failover (the pairing replicates entities and configuration, not the messages in flight); the Blob archive is geo-redundant, making it the durable source of truth from which state can be rebuilt. Logic Apps Standard and the Integration Account are deployed identically in a paired region via the same Terraform, with APIM/Akamai steering partner traffic to the standby on failover. Because every inbound document is acknowledged only after it is safely buffered and archived, RPO approaches zero for received documents: you never lose a partner message you said you received. A pragmatic target for this platform: RTO 30 minutes, RPO near-zero for acknowledged documents, with in-flight unacknowledged interchanges retransmitted by the partner, since a sender that never receives an MDN typically retries under its own AS2 retry settings (this is partner-side behavior, so confirm it per partner rather than assuming it). Test the failover, including certificate availability in the standby region, because a DR site without the AS2 keys is not a DR site.

Observability. Instrument an end-to-end trace per interchange in Dynatrace (or Datadog where the team standardizes there): one trace spanning AS2 decode → EDI validate → ack generated → Service Bus publish → ERP post, with timing and the document control number on every hop, so a stuck invoice is one search away. Emit the metrics the business actually feels — ack turnaround time against each partner’s SLA, dead-letter-queue depth, decode/validation failure rate per partner, outbound 997 reconciliation lag, and archive write success. Alert on SLA-window breaches and DLQ growth, routing both to ServiceNow. New trading partners and agreement changes pass through a ServiceNow change approval before going live, giving the business a documented gate and a rollback path.

Governance. Pin schema versions explicitly per agreement (the partner’s exact 4010 or 5010), and promote a new version through test before flipping the agreement — never let a partner’s “we changed our format” reach production unvalidated. Keep agreements, schemas, maps, and workflow definitions in version control, reviewed and revertable, with Argo CD ensuring the deployed state matches Git. Retain every interchange — bytes received, bytes sent, ack, timestamps — in immutable Blob for the contractual retention period, since a dispute six years out needs the exact evidence. Apply Azure Policy to deny public network access and require diagnostic settings on every relevant resource, with Wiz as the independent check that the controls are real.
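What "retain the exact bytes" means in practice can be sketched as an archive record keyed by a content hash. The path layout, field names, and `archive_record` helper are assumptions for illustration; the actual write to immutable Blob Storage and its retention policy are outside the sketch.

```python
import hashlib
from datetime import datetime, timezone


def archive_record(partner_id: str, control_number: str, direction: str,
                   raw: bytes, at: datetime) -> dict:
    """Describe one archived interchange; the hash proves the bytes are unchanged."""
    if direction not in ("received", "sent"):
        raise ValueError("direction must be 'received' or 'sent'")
    digest = hashlib.sha256(raw).hexdigest()
    # Hypothetical layout: partner / date / direction / control number + hash prefix.
    blob_path = (f"{partner_id.lower()}/{at:%Y/%m/%d}/{direction}/"
                 f"{control_number}-{digest[:12]}.edi")
    return {
        "blob_path": blob_path,
        "sha256": digest,
        "size_bytes": len(raw),
        "timestamp": at.isoformat(),
    }


raw_bytes = b"ISA*00*          *00*          *ZZ*RETAILER001    ~"
record = archive_record("RETAILER001", "000000905", "received", raw_bytes,
                        datetime(2024, 1, 2, 9, 30, tzinfo=timezone.utc))
print(record["blob_path"])
```

Storing the digest alongside the status record means a dispute years later can be answered by re-hashing the archived blob and comparing, without trusting any intermediate system's copy.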

Explicit tradeoffs

Accept these or do not build it. A first-class EDI platform has real moving parts you do not get for free: an Integration Account to curate, per-partner agreements to maintain, a buffering layer to operate, and a streaming bridge to run. The acknowledgement-reconciliation machinery — the timers, the state tracking, the escalation — is genuine engineering you cannot skip, because it is the contractual product. The durable decoupling that protects your SLA costs you eventual-consistency: a partner is acknowledged before the ERP has processed the business message, so “received” and “posted to SAP” are two distinct states your ops team must understand and monitor. The private-networking posture that the security team requires costs setup complexity — delegated subnets, private DNS zones, internal-mode APIM — and the price of forgetting one piece is a silent hang, not a clear error. And running two messaging systems, Service Bus and Confluent, is more to operate than one; the justification is that they serve genuinely different jobs, not architectural fashion.

The alternatives, and when they win. If you have only a handful of low-volume partners and no streaming-analytics need, a managed VAN or a SaaS EDI provider may be cheaper than operating this platform — you trade control and per-message cost for someone else’s operations team. If your “B2B” is really modern API integration with partners who can speak REST and JSON, skip EDI entirely and build an API platform; EDI earns its complexity only when partners mandate the standard. If you need EDI but live primarily on AWS or GCP, the equivalent pattern is a managed EDI/transformation service plus a durable queue and a Kafka bridge — the shape here (transport → standardized decode/agreement → buffer → systems-of-record + streaming fan-out) transfers across clouds even though the named services change. Choose this Azure-native build when partners mandate EDI, volume is real and spiky, acknowledgements are contractual, and you want the agreement-as-configuration model that lets analysts, not developers, onboard the next retailer.

The shape of the win

For the distributor, the payoff is not “we do EDI now” — they already did. It is that when the mega-retailer rotates a certificate at midnight, Vault rotates the counterpart automatically and not a single invoice drops; that when an invoice is disputed two years later, an analyst pulls the exact received bytes, the decode result, the returned 997, and the timestamps in one search; that a holiday-weekend spike queues in Service Bus and drains safely into SAP instead of bouncing documents the contract said must be accepted; and that onboarding the next demanding retailer is an agreement an integration analyst configures in a day, not a developer project measured in sprints. That last property is what turns the platform from a cost center into a growth enabler — every new partner the sales team wins is a configuration, not a rebuild. Everything upstream — the AS2 transport, the Integration Account agreements, the Service Bus buffer, the Confluent bridge, the Vault-managed certificates, the Wiz posture scanning, the Dynatrace per-document trace — exists so that the distributor keeps its shelf space, proves every delivery, and never again loses 600 invoices to a certificate nobody was watching.


Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.
