February 2026

Your Service Catalog Will Become a Mess. Here's How to Prevent It.

Four layers, clear boundaries, no mystery failures

Everyone wants self-service cloud. Developers want to spin up resources without waiting on tickets. Finance wants to track costs. Security wants guardrails. Platform teams want to stop being a bottleneck.

The problem is that most implementations turn into a mess.

The service catalog becomes a tangled web of workflows, scripts, and manual steps. Adding a new offering takes weeks. When something breaks, nobody knows which layer failed. Actual costs drift away from what was requested. The CMDB is perpetually out of sync.

We've found that the key to avoiding this is strict separation of concerns. Not one monolithic workflow that does everything, but four distinct layers with clear responsibilities and failure boundaries.

The architecture: four layers with clear boundaries

A well-designed self-service provisioning system separates into four layers:

Layer                            Responsibility
Request & Approval               Who can ask for what, and who approves it
Integration & Pipeline Trigger   Translating approved requests into automation
IaC Execution                    Actually deploying the infrastructure
Post-Deployment Integrity        Confirming success and maintaining traceability

Each layer has its own failure modes and its own controls. When something goes wrong, you know exactly where to look. When you add new capabilities, you know exactly what to change.

Layered Self-Service Cloud Provisioning

Separation of concerns for scalable, maintainable infrastructure automation

1. Request & Approval (ServiceNow / Service Portal)
   Catalog: browse offerings. Form: collect inputs. Approval: route by policy.
   Failure handling: incomplete request → validation blocks it; no approval → stays in queue.
   Guarantee: automation only triggers on valid, approved requests.

2. Integration & Trigger (API / Webhook)
   REST call: pass parameters. Pipeline: Azure DevOps / GitHub.
   Failure handling: API timeout → retry and fail visibly; invalid parameters → reject and write back.
   Guarantee: failures are visible immediately, with no downstream impact.

3. IaC Execution (Terraform / Bicep)
   Validate: plan / what-if. Cost check: against approved budget. Apply: deploy resources.
   Failure handling: policy violation → caught in plan; quota exceeded → fail fast; partial deploy → state tracked, re-run safe.
   Guarantee: idempotent execution, no orphaned resources.

4. Post-Deployment (Integrity & Traceability)
   Callback: status to ServiceNow. CMDB: record created. Tags: cost attribution.
   Failure handling: callback fails → retry and alert; CMDB drift → reconciliation job; missing tags → Azure Policy enforces.
   Guarantee: request-to-cost traceability maintained.

Ownership at a glance: Layer 1 owns who can request and who approves. Layer 2 owns API communication and parameter passing. Layer 3 owns what's built and the security baselines. Layer 4 owns traceability and reconciliation.

End-to-end traceability: Request ID → Approval Record → Pipeline Run → Deployed Resources → Cost Data

Layer 1: Who can ask for what

This is where users interact with the system. In most organizations, this means ServiceNow, but the principles apply to any service management platform.

What this layer controls:

  • What offerings are available to which users
  • What information is required for each request type
  • What approval chain a request follows
  • What cost thresholds trigger additional review

What this layer does NOT control:

  • How the infrastructure is built
  • What security baselines apply
  • Where resources are deployed

The request layer is about policy and process, not technology. A user selects a VM size, enters a project code, and submits. The workflow validates required fields, routes to the appropriate approver based on cost or data classification, and only when fully approved, hands off to the next layer.

The key principle: a request doesn't trigger automation unless it's valid and approved. Bad data gets caught here, not three layers downstream.
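To make the routing idea concrete, here is a minimal Python sketch of the Layer 1 logic. The field names (`vm_size`, `project_code`, `estimated_monthly_cost`, `data_classification`) and the thresholds are hypothetical; in practice this logic lives in the service management platform's workflow engine, not in application code.

```python
# Layer 1 sketch: validate first, then route by policy. All field names and
# thresholds below are illustrative assumptions, not a ServiceNow schema.

REQUIRED_FIELDS = {"vm_size", "project_code", "cost_center", "environment"}

def validate(request: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the request is valid."""
    return [f"missing field: {f}" for f in REQUIRED_FIELDS if not request.get(f)]

def route_approver(request: dict) -> str:
    """Route by policy: cost and data classification decide the approval chain."""
    if request.get("estimated_monthly_cost", 0) > 1000:
        return "finance-review"      # cost threshold triggers additional review
    if request.get("data_classification") == "confidential":
        return "security-review"     # sensitive data takes a stricter chain
    return "line-manager"            # default approval chain

request = {"vm_size": "D4s_v5", "project_code": "PRJ-042",
           "cost_center": "CC-7", "environment": "dev",
           "estimated_monthly_cost": 180}

assert validate(request) == []       # invalid requests never reach automation
print(route_approver(request))       # → line-manager
```

The point of the sketch is the ordering: validation gates the request before routing, and routing completes before anything downstream is triggered.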

Layer 2: Fail visibly, not silently

This layer bridges the service management platform and the automation system. An approved request becomes a pipeline run with parameters.

Typically this is a REST call from ServiceNow to Azure DevOps or GitHub Actions, passing the request parameters as pipeline variables.

The key principle: failures here are visible and contained. If the pipeline can't start, the request system knows immediately. Nothing partially deploys. Nothing silently fails.
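A sketch of what "fail visibly" looks like in code. The payload shape follows the Azure DevOps pipeline runs REST endpoint (`POST .../_apis/pipelines/{id}/runs` with `templateParameters`), but the function names and the retry policy are illustrative assumptions, not a prescribed integration.

```python
# Layer 2 sketch: map request fields to pipeline parameters, then trigger
# with bounded retries. Failures surface to the request system; nothing
# fails silently. `send` stands in for the actual HTTP POST.
import time

def build_run_payload(request: dict) -> dict:
    """Map approved request fields onto pipeline template parameters."""
    return {"templateParameters": {
        "vmSize": request["vm_size"],
        "projectCode": request["project_code"],
        "costCenter": request["cost_center"],
    }}

def trigger_with_retry(send, payload: dict, attempts: int = 3, delay: float = 1.0):
    """Call `send` with retries on timeout; raise visibly once retries are exhausted."""
    last_error = None
    for _ in range(attempts):
        try:
            return send(payload)         # success: pipeline run started
        except TimeoutError as exc:      # transient failure: retry
            last_error = exc
            time.sleep(delay)
    # Exhausted retries: surface the failure so it can be written back to the request.
    raise RuntimeError(f"pipeline trigger failed, write back to request: {last_error}")
```

Because nothing has deployed yet at this layer, a failed trigger is cheap: the request is simply marked failed and can be retried once the integration is healthy.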

Layer 3: Opinionated templates, no exceptions

This is where infrastructure actually gets built. Terraform or Bicep templates execute against the cloud provider.

The IaC templates are opinionated by design. A "Standard VM" pattern includes encryption, diagnostic settings, and required tags. The requester doesn't decide whether to enable encryption. The pattern enforces it.

Pre-Deployment Validation

Before any resources are created, the pipeline runs validation:

  1. Terraform plan / Bicep what-if: Shows exactly what will be created or changed
  2. Policy compliance check: Confirms the planned resources won't violate Azure Policy
  3. Cost estimation: Tools like Infracost compare projected cost against approved budget

If any check fails, the pipeline stops. Nothing deploys. The request is marked as failed with a clear explanation.

The key principle: idempotent, fail-fast execution. If something goes wrong, you can re-run safely. State is tracked. Partial deployments don't leave orphaned resources.
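The gate itself can be sketched in a few lines. The three check functions below are hypothetical stand-ins for `terraform plan` / `bicep what-if`, an Azure Policy evaluation, and a cost tool like Infracost; what the sketch shows is the fail-fast ordering, not the real integrations.

```python
# Layer 3 sketch: run checks in order and stop at the first failure, so
# nothing deploys on a bad plan. Check implementations are illustrative.

def run_gate(checks: list) -> tuple[bool, str]:
    """Run (name, check) pairs in order; stop at the first failure."""
    for name, check in checks:
        ok, detail = check()
        if not ok:
            return False, f"{name} failed: {detail}"   # fail fast, clear reason
    return True, "all checks passed, safe to apply"

def plan_check():      # stand-in for `terraform plan` / `bicep what-if`
    return True, "2 to add, 0 to change, 0 to destroy"

def policy_check():    # stand-in for an Azure Policy compliance evaluation
    return True, "no violations in planned resources"

def cost_check(projected=210, approved_budget=250):   # stand-in for Infracost
    return projected <= approved_budget, \
        f"projected ${projected}/mo vs approved ${approved_budget}/mo"

ok, reason = run_gate([("plan", plan_check),
                       ("policy", policy_check),
                       ("cost", lambda: cost_check())])
```

Only when `run_gate` returns success does the pipeline proceed to apply; a failure is reported back with the name of the check that blocked it.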

Layer 4: Every dollar traces back

Deployment isn't the end. This layer confirms success and maintains traceability over time.

The key principle: callbacks happen on success only. The CMDB doesn't get a record for a half-deployed resource. Cost attribution doesn't start until the resource is confirmed.

The Traceability Chain

When this layer works correctly, you have an unbroken chain:

Request ID → Approval Record → Pipeline Run → Deployed Resources → Cost Data

Every dollar of cloud spend traces back to a request, an approver, and a cost center. No orphaned resources. No mystery charges.
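As a sketch, the chain can be captured in a single record written only after deployment is confirmed. The field names (`request_id`, `approval_record`, `pipeline_run`, `cost_tags`) are illustrative, not a ServiceNow or CMDB schema.

```python
# Layer 4 sketch: one record linking every deployed resource back to its
# request, approver, pipeline run, and cost attribution. Written only on
# confirmed success, never for half-deployed resources. Schema is assumed.

def build_trace_record(request_id: str, approval_record: str, pipeline_run: str,
                       resource_ids: list[str], monthly_cost: int) -> dict:
    """Link deployed resources to the request and cost data that produced them."""
    return {
        "request_id": request_id,
        "approval_record": approval_record,
        "pipeline_run": pipeline_run,
        "resources": resource_ids,
        # Per-resource tags: this is what makes cost attribution queryable later.
        "cost_tags": {rid: {"request_id": request_id,
                            "monthly_cost_estimate": monthly_cost}
                      for rid in resource_ids},
    }

record = build_trace_record("REQ0012345", "APPR0067", "run-4821",
                            ["vm-app-01", "disk-app-01"], 180)
```

With a record like this per request, answering "which request and approver does this resource trace to?" is a lookup, not an investigation.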

Why one big workflow always fails

The temptation is to build one big workflow that does everything. User submits, workflow approves, workflow calls Azure, workflow creates CMDB record, workflow sends email. All in one.

This works until it doesn't.

Problems with monolithic workflows:

  • Debugging is hard. When something fails, you're stepping through a 50-node workflow trying to figure out which API call timed out.
  • Changes are risky. Touching the approval logic might break the deployment step. Adding a new offering means copying and modifying the whole thing.
  • Ownership is unclear. Is the ServiceNow team responsible for the Terraform templates? Is the cloud team responsible for the approval routing?
  • Testing is expensive. You can't test the IaC patterns without running through the whole request process.

Benefits of layered separation:

  • Clear failure boundaries. Layer 2 failed? Check the API integration. Layer 3 failed? Check the Terraform.
  • Independent evolution. Change the approval chain without touching the templates. Add a new IaC pattern without modifying ServiceNow.
  • Defined ownership. ServiceNow team owns Layers 1-2. Platform team owns Layer 3. Both own Layer 4.
  • Testable in isolation. Run Terraform plans without a ServiceNow request. Test API integration without deploying real resources.

Adding a new offering in days, not weeks

A well-layered system makes adding new offerings straightforward:

  1. Platform team develops the IaC pattern: Terraform or Bicep, tested in isolation
  2. Pattern is reviewed and merged to the infrastructure repo
  3. ServiceNow admin creates a catalog item with variables that map to the pattern's inputs
  4. Pipeline is configured (or reuses an existing parameterized pipeline)
  5. Pattern is available in the catalog

The ServiceNow work is minimal because it's just collecting inputs. The heavy lifting is in the IaC pattern, which can be developed and tested independently.
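The catalog item is essentially a declarative mapping from form variables to the pattern's inputs, which is why the ServiceNow side stays thin. A sketch, with hypothetical variable and pattern names:

```python
# Sketch of a catalog item as a mapping from form fields to IaC pattern
# inputs. Names and the repo path are illustrative assumptions.

CATALOG_ITEM = {
    "name": "Standard Linux VM",
    "pattern": "patterns/standard-linux-vm",   # path in the infrastructure repo
    "variables": {                             # form field -> pattern input variable
        "vm_size": "size",
        "project_code": "project",
        "cost_center": "cost_center",
        "environment": "environment",
    },
}

def to_pattern_inputs(item: dict, form: dict) -> dict:
    """Translate submitted form values into the IaC pattern's input variables."""
    return {var: form[field] for field, var in item["variables"].items()}
```

Adding a new offering then means adding a new pattern and a new mapping; the translation logic itself never changes.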

From request to running resource: minutes

A typical request flow:

  1. User browses the service catalog, selects "Standard Linux VM"
  2. Form collects size, project code, cost center, environment
  3. Cost estimate displays before submission
  4. User submits; request routes to manager for approval
  5. Manager approves; ServiceNow calls the Azure DevOps REST API
  6. Pipeline starts, runs Terraform plan, checks cost, validates policy
  7. All checks pass; Terraform apply runs
  8. Resources deploy with required tags, encryption, diagnostic settings
  9. Pipeline calls back to ServiceNow with success status and resource IDs
  10. CMDB updated, user notified, request closed

Time from submission to running resource: minutes, not days.

Separate the layers. The automation follows.

Self-service provisioning becomes fragile when everything runs in one workflow. Separate the concerns:

Layer                Owns                                    Doesn't Own
Request & Approval   Who can ask, who approves               How it's built
Integration          API communication, parameter passing    Business logic
IaC Execution        What gets built, security baselines     Approval policy
Post-Deployment      Traceability, reconciliation            Deployment mechanics

When each layer has clear boundaries, failures are contained, changes are safe, and the system can grow without collapsing under its own complexity.