Development Documentation
Emergency Procedures

This runbook covers recovery procedures for the most common operational failures. The platform is designed around three self-healing layers: Terraform for infrastructure, git sync for content, and dbt for data. Most failures can be resolved by re-running the appropriate layer.

SPN Credential Rotation

Service principal secrets expire after 1 year. Rotate before expiry to avoid CI/CD pipeline failures. See SPN Access Map — Credential Rotation for the full step-by-step procedure covering both sp-fabric-data-worker and sp-fabric-platform-admin.
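
A small scheduled check can flag a secret before it expires. The following is a minimal sketch, not part of the documented rotation procedure: the function name, the 30-day warning window, and the assumption that the caller already knows the expiry date are all illustrative.

```python
from datetime import date, timedelta

# Assumption: warn 30 days before expiry. The runbook does not prescribe a window.
WARN_WINDOW = timedelta(days=30)


def rotation_status(expires_on: date, today: date) -> str:
    """Classify a secret as 'expired', 'rotate-now', or 'ok'."""
    if expires_on <= today:
        return "expired"
    if expires_on - today <= WARN_WINDOW:
        return "rotate-now"
    return "ok"


if __name__ == "__main__":
    # Hypothetical expiry date, for illustration only.
    print(rotation_status(date(2026, 3, 1), date.today()))
```

Passing `today` explicitly keeps the function deterministic, so it can be exercised with fixed dates.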

Pipeline Failure Triage

Quick Diagnosis

| Symptom | Likely Cause | Action |
|---|---|---|
| Auth error (401/403) | SPN secret expired or permissions changed | Check secret expiry, verify service connection |
| Terraform error | State drift or API change | Run terraform plan locally to inspect |
| fabric-cicd error | GUID mismatch or TMDL syntax | Check parameter files, validate TMDL locally |
| dbt error | SQL syntax or source data issue | Run dbt build --target local first |

Pipeline Chain Recovery

The pipeline chain runs sequentially: infra-deploy > fabric-deploy + security-deploy + functions-deploy > dbt-dev-build. If a pipeline fails mid-chain:

  1. Fix the root cause (do not re-trigger blindly)
  2. Push the fix to the appropriate branch — the chain restarts automatically
  3. Do not manually trigger Azure DevOps pipelines

All shared-resource pipelines use lockBehavior: sequential. Manually triggered runs can cause queue conflicts.

dbt Build Failure Recovery

Local Diagnosis First

Always reproduce and fix locally before pushing:

cd dbt
dbt build --target local --profiles-dir .

Common dbt Failures

| Error | Cause | Fix |
|---|---|---|
| 15151 Cannot find the schema | Schema doesn't exist in Warehouse | Check dbt custom schema generation; may need CREATE SCHEMA |
| Case-sensitive column error | Fabric is case-sensitive for quoted identifiers | Use bracket notation [Column] or normalizing CTE |
| varchar(30) truncation | Bare cast(x as varchar) defaults to 30 chars | Always specify explicit length: varchar(500) |
| Contract mismatch | Column name in SELECT doesn't match contract | Ensure aliases match contract name: field exactly |
| Source not found | Bronze shortcut missing or renamed | Verify shortcuts via git sync, check sources.yml |
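
The varchar(30) truncation is easy to miss in review, so it may be worth scanning model SQL for casts that omit a length. The sketch below is one possible approach, assuming a plain regex scan is good enough. The function name is hypothetical, and the pattern does not understand SQL comments or string literals.

```python
import re

# Matches "as varchar" when it is NOT followed by "(length)".
BARE_VARCHAR = re.compile(r"\bas\s+varchar\b(?!\s*\()", re.IGNORECASE)


def find_bare_varchar_casts(sql: str) -> list[int]:
    """Return the 1-based line numbers where a bare varchar cast starts."""
    return [
        sql.count("\n", 0, match.start()) + 1
        for match in BARE_VARCHAR.finditer(sql)
    ]


if __name__ == "__main__":
    example = "select\n  cast(a as varchar) as a,\n  cast(b as varchar(500)) as b\nfrom t"
    print(find_bare_varchar_casts(example))
```

Scanning the whole text rather than line by line also catches casts where the type sits on the line after `as`.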

Full Rebuild

If Gold data is corrupted or missing:

cd dbt
dbt build --target <env> --profiles-dir . --full-refresh

This drops and recreates all tables. Use only as a last resort — normal incremental builds are preferred.

Workspace Access Emergency

If a workspace becomes inaccessible, use the procedure below that matches the cause: the workspace was deleted, or permissions were removed.

Workspace Deleted

  1. Run terraform apply — Terraform recreates the workspace and all child resources
  2. Re-sync content from git (portal: Source control > Update all, or python scripts/fabric_git_sync.py --env ENV)
  3. Rebuild Gold data: dbt build --target ENV

Permissions Lost

  1. Check Terraform state: terraform plan -var-file="environments/ENV/terraform.tfvars"
  2. If roles are missing, terraform apply restores them from tfvars
  3. For emergency access, use the Azure Portal to manually add geris_fabric_admin@geris.nl as Admin

Data Refresh Failure Recovery

Symptoms

  • Reports show stale data (check last_refresh timestamps in Fabric portal)
  • dbt build succeeded but semantic model shows old data

DirectLake Models

DirectLake models auto-refresh from the warehouse — no action needed after a successful dbt build. If stale:

  1. Verify the dbt build actually completed (check pipeline logs)
  2. Check if the semantic model fell back to DirectQuery mode (Fabric portal > Model settings)
  3. If in DirectQuery fallback, check for missing columns or table schema mismatches

Import Models

Import models require manual refresh or a scheduled refresh owned by geris_fabric_admin@geris.nl:

  1. Log in to Fabric portal as geris_fabric_admin@geris.nl
  2. Navigate to the semantic model > Settings > Scheduled refresh
  3. Verify credentials are valid (Take over if needed)
  4. Trigger a manual refresh

Monitoring and Alerting

Application Insights (Function App)

The Function App logs to Application Insights. Key queries:

// Recent function executions
traces
| where message contains 'Executed' and message contains 'Succeeded'
| project timestamp, message
| order by timestamp desc

// Export failures
traces
| where message contains 'export_failed'
| project timestamp, message
| order by timestamp desc

If the requests table is empty, check host.json — the "Host.Results": "Error" setting suppresses successful request telemetry. Change to "Information" to see all invocations.
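
The host.json check can be scripted. This is a minimal sketch that assumes the setting lives under `logging.logLevel`, which is where Azure Functions host.json keeps per-category log levels. The function name is hypothetical.

```python
import json
from typing import Optional


def host_results_level(host_json_text: str) -> Optional[str]:
    """Return the configured "Host.Results" log level, or None if unset."""
    config = json.loads(host_json_text)
    return config.get("logging", {}).get("logLevel", {}).get("Host.Results")


if __name__ == "__main__":
    sample = '{"version": "2.0", "logging": {"logLevel": {"Host.Results": "Error"}}}'
    level = host_results_level(sample)
    if level == "Error":
        print("Host.Results is 'Error': successful requests are not recorded")
```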

Pipeline Monitoring

Pipeline run status is visible in Azure DevOps (org: geris-devops, project: insights-requests). dbt does NOT produce JUnit XML — do not add PublishTestResults@2 to dbt pipelines. Use run_results.json for programmatic inspection.
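
As an illustration of programmatic inspection, the sketch below lists failed nodes from run_results.json. It assumes dbt's default target/ output directory and relies only on the `results[].unique_id`, `status`, and `message` fields. The function names are hypothetical.

```python
import json
from pathlib import Path

# dbt reports "error" for failed models and "fail" or "error" for failed tests.
FAILED_STATUSES = {"error", "fail"}


def failed_nodes(run_results: dict) -> list[tuple[str, str, str]]:
    """Return (unique_id, status, message) for every failed result."""
    failures = []
    for result in run_results.get("results", []):
        status = result.get("status")
        if status in FAILED_STATUSES:
            failures.append(
                (result.get("unique_id", "<unknown>"), status, result.get("message") or "")
            )
    return failures


def load_run_results(path: str = "target/run_results.json") -> dict:
    """Read the artifact written by the most recent dbt invocation."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    for unique_id, status, message in failed_nodes(load_run_results()):
        print(f"{status}: {unique_id}: {message}")
```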

CU Utilization

CU utilization monitoring is currently blocked: the Fabric Admin API requires the Capacity Admin role, which cannot be granted on trial capacity. It will activate automatically when paid capacity (F2+) is provisioned, with no code changes needed.

Self-Healing Recovery Matrix

| What Failed | Recovery Tool | Command |
|---|---|---|
| Workspace deleted | Terraform | terraform apply |
| Gold Warehouse deleted | Terraform + dbt | terraform apply, then dbt build |
| Lakehouse deleted | Terraform + git sync | terraform apply, then sync from git |
| Shortcuts missing | Git sync | python scripts/fabric_git_sync.py --env ENV |
| Semantic model deleted | Git sync | python scripts/fabric_git_sync.py --env ENV |
| Report deleted | Git sync | Re-sync from Git in Fabric portal |
| Gold data missing | dbt | dbt build --target ENV |
| RBAC wrong | Terraform | terraform apply |
| Everything destroyed | Full sequence | terraform apply > git sync > dbt build |
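
When several failures occur at once, the matrix implies running each required layer once, in the order terraform > git sync > dbt. The sketch below encodes the matrix as a lookup to show that ordering. The names are hypothetical, and it mirrors the rows above rather than any existing script.

```python
# Layer order taken from the "Everything destroyed" row.
LAYER_ORDER = ["terraform", "git sync", "dbt"]

# One entry per row of the recovery matrix.
LAYERS_BY_FAILURE = {
    "Workspace deleted": {"terraform"},
    "Gold Warehouse deleted": {"terraform", "dbt"},
    "Lakehouse deleted": {"terraform", "git sync"},
    "Shortcuts missing": {"git sync"},
    "Semantic model deleted": {"git sync"},
    "Report deleted": {"git sync"},
    "Gold data missing": {"dbt"},
    "RBAC wrong": {"terraform"},
    "Everything destroyed": {"terraform", "git sync", "dbt"},
}


def recovery_layers(failures: list[str]) -> list[str]:
    """Return the layers to re-run, deduplicated and in dependency order."""
    needed = set()
    for failure in failures:
        needed |= LAYERS_BY_FAILURE[failure]
    return [layer for layer in LAYER_ORDER if layer in needed]


if __name__ == "__main__":
    print(recovery_layers(["Gold data missing", "RBAC wrong"]))
```

An unknown failure name raises `KeyError`, which is deliberate: a failure outside the matrix needs manual triage rather than a guessed plan.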
