Local Development Setup

This page walks through setting up a local development environment from scratch. After completing these steps, you will be able to build and test the entire dbt pipeline locally using DuckDB, with no Fabric connectivity required.

System Requirements

The local-first loop runs the entire dbt graph against a DuckDB file under dev-data/. A handful of models (notably fact_inventory_snapshot, dim_logistics, and fact_trade) materialise intermediate results on the order of tens of GB during their build. DuckDB spills those intermediates to the directory set by DBT_DUCKDB_TEMP (default dev-data/.duckdb_tmp/), so size your free disk for the peak spill, not just for the database file.

Resource	Minimum	Recommended	Notes
RAM	16 GB	32 GB+	DuckDB's default is `memory_limit=80%` of system RAM, so more RAM means fewer disk spills and much faster `fact_inventory_snapshot` builds. With 16 GB everything still builds, just slowly.
Free disk (SSD)	80 GB	150 GB+	`dev-data/fabric_datalake.duckdb` (~1.6 GB) + Bronze parquets (~5 GB) + peak spill in `dev-data/.duckdb_tmp/`, which can reach 60 GB during `fact_inventory_snapshot`. NVMe strongly preferred over SATA because spill read/write is the dominant wait.
CPU	4 cores	8+ cores	The `local` target in `dbt/profiles.yml` runs 8 threads; DuckDB also uses intra-query parallelism within each model.
OS	Windows 11 / macOS / Linux	—	Repository is developed on Windows 11. Scripts use Unix shell (run via Git Bash on Windows).
Python	3.10	3.11	See table below.
Close DuckDB viewers before building	—	—	DBeaver or VS Code "Power User for dbt" keep the `.duckdb` file locked and will block dbt with "file being used by another process" errors. Close them (or connect read-only) during a build.

Running provisioning (fabric dev start …) additionally needs an az login session and the platform SPN credentials loaded from Key Vault (handled by scripts/_lib_spn.sh).
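
The disk and CPU minimums from the table above can be checked before a first build. The following is a minimal sketch, not part of the repository: the function name `preflight` and the constants are hypothetical, and the thresholds are simply copied from the "Minimum" column. RAM is left out because the Python standard library has no portable call for it.

```python
import os
import shutil

# Thresholds copied from the "Minimum" column of the System Requirements table.
MIN_FREE_DISK_GB = 80
MIN_CPU_CORES = 4


def preflight(path=".", free_gb=None, cores=None):
    """Return a list of human-readable warnings; an empty list means OK.

    free_gb and cores can be passed explicitly (useful for testing);
    otherwise they are measured on the current machine.
    """
    if free_gb is None:
        # disk_usage reports bytes for the filesystem that contains `path`.
        free_gb = shutil.disk_usage(path).free / 1024**3
    if cores is None:
        cores = os.cpu_count() or 1
    warnings = []
    if free_gb < MIN_FREE_DISK_GB:
        warnings.append(f"free disk {free_gb:.0f} GB < {MIN_FREE_DISK_GB} GB minimum")
    if cores < MIN_CPU_CORES:
        warnings.append(f"{cores} CPU cores < {MIN_CPU_CORES} minimum")
    return warnings


if __name__ == "__main__":
    for line in preflight() or ["disk and CPU meet the minimums"]:
        print(line)
```

Run it from the repository root, or pass the dev-data/ directory as `path` if that directory sits on a different volume.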

Prerequisites

Tool	Version	Purpose	Install
Python	3.11 (3.10-3.12 accepted; 3.13 incompatible)	dbt runtime, scripts	python.org
Node.js	LTS v20+	MCP servers (npx-based)	nodejs.org
Azure CLI	Latest	Fabric authentication, deployments	`winget install Microsoft.AzureCLI`
Terraform	>= 1.8	Feature environment provisioning	terraform.io
ODBC Driver 18	Latest	Direct Fabric Warehouse connections	Microsoft docs
Git	Latest	Version control	Included with Azure DevOps access
Docker	Latest (optional)	Terraform MCP server only	Docker Desktop

Why Python 3.11 specifically? The dbt-fabric adapter is incompatible with Python 3.13. While 3.10-3.12 all work, 3.11 is recommended for consistency across the team and CI.
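
As an illustration of the version rule above, a small check could classify the running interpreter. This is a sketch under the assumption that only the major and minor version matter; `check_python` is a hypothetical helper, not something the repository ships.

```python
import sys

# Range taken from the Prerequisites table: 3.10-3.12 accepted, 3.13 incompatible.
SUPPORTED = {(3, 10), (3, 11), (3, 12)}
RECOMMENDED = (3, 11)


def check_python(version=None):
    """Classify a (major, minor, ...) tuple as 'recommended', 'accepted' or 'unsupported'."""
    major, minor = (version or sys.version_info)[:2]
    if (major, minor) == RECOMMENDED:
        return "recommended"
    if (major, minor) in SUPPORTED:
        return "accepted"
    return "unsupported"


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {check_python()}")
```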

Step-by-Step Setup

1. Clone the Repository

git clone https://geris-devops@dev.azure.com/geris-devops/insights-requests/_git/fabric_monorepo
cd fabric_monorepo

2. Configure Git Hooks

git config core.hooksPath .githooks

This enables the drift-check post-commit hook and the gitleaks pre-commit secret scanner.

3. Install Python Dependencies

The easiest path is to run any scripts/fabric … subcommand. On first use the wrapper bootstraps a local .venv/ with Python 3.11 or 3.12, installs dbt-core, dbt-fabric==1.9.8, and dbt-duckdb==1.10.1 plus runtime extras (pyarrow, azure-storage-file-datalake, deltalake), and runs dbt deps. On subsequent invocations the bootstrap is a no-op.

The deltalake extra is there for sync_cloud_parquets.py, which reads cloud_only tables directly from OneLake Delta storage (abfss://<workspace_id>@onelake.dfs.fabric.microsoft.com/<warehouse_id>/Tables/dbo/<table>/) via the deltalake library, bypassing the Warehouse SQL endpoint. For large tables like fact_inventory_snapshot this cuts sync time from ~10 minutes to under a minute and consumes zero Warehouse CU.
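
To make the OneLake path layout concrete, here is a minimal sketch of how such a URI could be assembled and handed to the deltalake library. It is not the code of sync_cloud_parquets.py: both function names are hypothetical, and authentication is deliberately left to the caller through `storage_options`, since this page does not describe how the real script obtains credentials.

```python
def onelake_table_uri(workspace_id: str, warehouse_id: str, table: str) -> str:
    """Build the abfss:// URI of a dbo table's Delta folder in OneLake."""
    for name, value in (
        ("workspace_id", workspace_id),
        ("warehouse_id", warehouse_id),
        ("table", table),
    ):
        # Each part must be a single path segment, so reject empties and slashes.
        if not value or "/" in value:
            raise ValueError(f"{name} must be a non-empty path segment, got {value!r}")
    return (
        f"abfss://{workspace_id}@onelake.dfs.fabric.microsoft.com/"
        f"{warehouse_id}/Tables/dbo/{table}/"
    )


def read_delta_table(uri: str, storage_options: dict):
    """Load a Delta table into a pyarrow Table (needs the deltalake package)."""
    # Imported lazily so the URI helper above works without deltalake installed.
    from deltalake import DeltaTable

    return DeltaTable(uri, storage_options=storage_options).to_pyarrow_table()
```

Reading the Delta files this way never opens a SQL connection, which is why no Warehouse CU is consumed.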

If you prefer to do it manually:

bash scripts/setup-local.sh      # creates .venv/, installs dbt, runs dbt deps
pip install uv                   # provides uvx for Python-based MCP servers

Note: dbt-fabric and dbt-duckdb are pip packages, NOT dbt packages. Adding them to packages.yml breaks dbt deps.

4. (Optional) Refresh dbt Packages

cd dbt && dbt deps --profiles-dir .

This step is already performed by setup-local.sh / the fabric wrapper; re-run it only if dbt/packages.yml changes.

5. Verify Local Build

cd dbt && dbt build --target local --profiles-dir .

This runs the full pipeline against DuckDB using Parquet seed data from dev-data/. If this succeeds, your local environment is correctly set up.

6. Authenticate for Fabric Targets (Optional)

If you need to run against the DEV Fabric Warehouse:

az login

Then set environment variables and build:

export FABRIC_SERVER="<warehouse-sql-endpoint>.datawarehouse.fabric.microsoft.com"
export FABRIC_DATABASE="Gold_Warehouse"
cd dbt && dbt build --target dev --profiles-dir .

All Fabric targets use authentication: CLI, so dbt builds reuse the az login session and need no SPN credentials locally.
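
Before a build against dev, it can help to confirm that both variables are actually set. A minimal sketch follows; `missing_fabric_vars` is a hypothetical helper, and the variable names are the two exported in the step above.

```python
import os

# The two variables exported before building against the dev target.
REQUIRED_FABRIC_VARS = ("FABRIC_SERVER", "FABRIC_DATABASE")


def missing_fabric_vars(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_FABRIC_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_fabric_vars()
    if missing:
        print("set before building against dev: " + ", ".join(missing))
    else:
        print("Fabric environment variables are set")
```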

Understanding the Local Build

DuckDB and Parquet Seed Data

The local target uses a DuckDB file at dev-data/fabric_datalake.duckdb. On each run, the load_parquet_sources() macro bootstraps DuckDB by reading Parquet files from dev-data/. These Parquet files are representative samples of production data, committed to the repository.

The duckdb target (used by CI smoke tests) runs in-memory for even faster execution. Both targets produce the same results -- the difference is persistence.

What `--profiles-dir .` Does

The profiles.yml file lives in the dbt/ directory, not in the default ~/.dbt/ location. Run dbt commands from dbt/ and pass --profiles-dir . so dbt can find it. Forgetting the flag produces a "profile not found" error.

Available Targets

Target	Engine	Use Case
`local`	DuckDB (file)	Local development, manual testing
`duckdb`	DuckDB (in-memory)	CI smoke tests
`dev`	Fabric Warehouse	DEV environment
`ci`	Fabric Warehouse	CI slim builds
`uat`	Fabric Warehouse	UAT environment
`prod`	Fabric Warehouse	Production (pipelines only)
`feat-NAME`	Fabric Warehouse	Feature environments (auto-generated)
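
One way to avoid both a mistyped target and a forgotten --profiles-dir flag is to build the dbt command line in a single place. The sketch below is illustrative only: `dbt_argv` is a hypothetical helper, and the character set accepted after `feat-` is an assumption, since this page does not specify how feature names are formed.

```python
import re

# Fixed targets from the Available Targets table.
FIXED_TARGETS = {"local", "duckdb", "dev", "ci", "uat", "prod"}
# Assumed naming pattern for feature targets (letters, digits, "_" and "-").
FEATURE_TARGET = re.compile(r"feat-[A-Za-z0-9_-]+")


def dbt_argv(command: str, target: str) -> list[str]:
    """Return the argument list for a dbt command, always including --profiles-dir."""
    if target not in FIXED_TARGETS and not FEATURE_TARGET.fullmatch(target):
        raise ValueError(f"unknown dbt target: {target!r}")
    return ["dbt", command, "--target", target, "--profiles-dir", "."]
```

Because `--profiles-dir .` is relative, run the result with the dbt/ directory as the working directory, for example `subprocess.run(dbt_argv("build", "local"), cwd="dbt", check=True)`.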

MCP Server Setup

The project includes .mcp.json in the repository root with shared MCP server configurations. Claude Code (and other MCP-compatible tools) pick them up automatically. Nine servers are configured:

Server	Runtime	Auth	Purpose
microsoft-learn	Remote HTTP	None	Microsoft Learn docs search/fetch
powerbi-modeling	npx	Browser login	Semantic model editing: TMDL, DAX
fabric-prodev	npx	None	Fabric API specs, schemas, best practices
azure-devops	npx	Browser login	Pipelines, work items, PRs, wiki
dbt-core	uvx	None	Lineage, impact analysis, column tracing
terraform	Docker	None	Terraform Registry docs and module search
azure	npx	`az login`	276 tools across 57 Azure services
fabric-ops	uvx	`az login`	Read-only Fabric operational intel
duckdb	uvx	None	Query local DuckDB file

First-use authentication: Servers that require browser login (powerbi-modeling, azure-devops) will open a browser on the first tool call. Credentials are cached after that. Servers that require az login (azure, fabric-ops) need an active Azure CLI session.

Verifying MCP Servers

# No-auth servers (should start and exit cleanly):
npx -y @microsoft/fabric-mcp@latest server start --mode all < /dev/null
python -m uv tool run mcp-server-motherduck --db-path ./dev-data/fabric_datalake.duckdb < /dev/null

# Auth-required servers (may prompt browser login):
npx -y @azure/mcp@latest server start < /dev/null
npx -y @azure-devops/mcp geris-devops < /dev/null
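
Before starting servers one by one, a quick pass over .mcp.json can show which launchers (npx, uvx, docker) are missing from PATH. This is a sketch that assumes the file uses the common layout with a top-level `mcpServers` object whose entries carry a `command` key; both function names are hypothetical.

```python
import json
import shutil


def load_mcp_config(path=".mcp.json"):
    """Read the shared MCP configuration from the repository root."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def missing_runtimes(mcp_config: dict, which=shutil.which) -> dict:
    """Map each launcher command that is not on PATH to the servers that need it."""
    missing = {}
    for name, server in mcp_config.get("mcpServers", {}).items():
        # Remote HTTP servers have no local command, so they are skipped.
        command = server.get("command")
        if command and which(command) is None:
            missing.setdefault(command, []).append(name)
    return missing
```

Calling `missing_runtimes(load_mcp_config())` from the repository root returns an empty dict when every launcher is installed.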

Pre-Commit Hooks (Secret Scanning)

The repository uses gitleaks via the pre-commit framework. Set up once per clone:

pip install pre-commit
pre-commit install

Every git commit will automatically scan staged files for secret patterns. The CI pipeline also runs gitleaks as a safety net.

Common Issues

Symptom	Cause	Fix
"profile not found"	Missing `--profiles-dir .`	Add `--profiles-dir .` to every dbt command
"Env var required but not provided"	dbt parses ALL targets at startup	Add empty defaults: `env_var('X', '')`
DuckDB passes, Fabric fails	Dialect differences (case, datetime2)	See Dual-Dialect Patterns
"uvx not found"	`uv` not installed	Run `pip install uv`
"npx not found"	Node.js not installed	Install Node.js LTS v20+
"docker not found" (terraform MCP only)	Docker not installed or not running	Install Docker Desktop, ensure daemon is running
Pipeline not triggering after push	YAML parse error in pipeline file	Try manual queue -- the API returns the parse error
"Cannot find the object" in security scripts	Fabric returns error code 15151	Check for 15151, not "Invalid object name"

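The last row of the table can be sketched as a small helper that keys on the error number instead of the message text. It assumes the driver's error string includes the number 15151 somewhere; `is_missing_object_error` is a hypothetical name, not a function from the security scripts.

```python
import re

# Error number Fabric returns for "Cannot find the object".
MISSING_OBJECT_CODE = "15151"


def is_missing_object_error(message: str) -> bool:
    """True when an error message carries error number 15151."""
    # Match the number as a whole token so that e.g. 115151 does not count.
    pattern = rf"(?<!\d){MISSING_OBJECT_CODE}(?!\d)"
    return re.search(pattern, message) is not None
```

Matching on the number keeps the check stable even if the wording of the message differs between engines or driver versions.
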
Related Pages

Developer Workflow -- Full feature lifecycle overview
Feature Branch Tiers -- Provisioning isolated environments
Coding Conventions -- How code is written in this project
Dual-Dialect Patterns -- DuckDB vs Fabric SQL differences