Local Development Setup
This page walks through setting up a local development environment from scratch. After completing these steps, you will be able to build and test the entire dbt pipeline locally using DuckDB, with no Fabric connectivity required.
System Requirements
The local-first loop runs the entire dbt graph against a DuckDB file under dev-data/. A handful of models (notably fact_inventory_snapshot, dim_logistics and fact_trade) materialise intermediate results on the order of tens of GB during their build. DuckDB spills those intermediates to the directory named by DBT_DUCKDB_TEMP (default dev-data/.duckdb_tmp/), so size your free disk for the peak spill figure in the table below, not just for the final database file.
| Resource | Minimum | Recommended | Notes |
|---|---|---|---|
| RAM | 16 GB | 32 GB+ | DuckDB's default memory_limit is 80% of RAM, so more RAM means fewer disk spills and much faster fact_inventory_snapshot builds. With 16 GB everything still builds, just slowly. |
| Free disk (SSD) | 80 GB | 150 GB+ | dev-data/fabric_datalake.duckdb (~1.6 GB) + Bronze parquets (~5 GB) + peak spill in dev-data/.duckdb_tmp/, which can reach 60 GB during fact_inventory_snapshot. NVMe is strongly preferred over SATA because spill read/write is the dominant wait. |
| CPU | 4 cores | 8+ cores | The local target in dbt/profiles.yml runs 8 threads; DuckDB also uses intra-query parallelism within each model. |
| OS | Windows 11 / macOS / Linux | — | Repository is developed on Windows 11. Scripts use Unix shell (run via Git Bash on Windows). |
| Python | 3.10 | 3.11 | 3.10-3.12 accepted, 3.13 incompatible; see the Prerequisites table below. |
| Close DuckDB viewers before building | — | — | DBeaver or VS Code "Power User for dbt" keep the .duckdb file locked and will block dbt with "file being used by another process" errors. Close them (or connect read-only) during a build. |
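The disk requirement is the one most likely to fail mid-build, so it can be worth checking before a long run. The following is a minimal sketch, not a script from the repository: free_gb and has_free_disk are hypothetical helper names, and the parsing assumes POSIX `df -Pk` output in which the filesystem name contains no spaces.

```shell
# Hypothetical helper (not part of the repository): print the free space, in whole GB,
# on the filesystem that holds the given path (default: current directory).
free_gb() {
  # -P gives one line per filesystem; column 4 is the available space in 1K blocks.
  df -Pk "${1:-.}" | awk 'NR==2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# Succeeds when at least <min_gb> GB are free on the filesystem holding <path>.
has_free_disk() {
  local path="$1" min_gb="$2"
  [ "$(free_gb "$path")" -ge "$min_gb" ]
}

# Example use against the 80 GB minimum from the table above:
#   has_free_disk dev-data 80 || echo "less than 80 GB free for dev-data/" >&2
```

If DBT_DUCKDB_TEMP points at a different drive, run the same check against that path as well, since the spill directory is where most of the space goes.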
Running provisioning (fabric dev start …) additionally needs an az login session and the platform SPN credentials loaded from Key Vault (handled by scripts/_lib_spn.sh).
Prerequisites
| Tool | Version | Purpose | Install |
|---|---|---|---|
| Python | 3.11 (3.10-3.12 accepted; 3.13 incompatible) | dbt runtime, scripts | python.org |
| Node.js | LTS v20+ | MCP servers (npx-based) | nodejs.org |
| Azure CLI | Latest | Fabric authentication, deployments | winget install Microsoft.AzureCLI |
| Terraform | >= 1.8 | Feature environment provisioning | terraform.io |
| ODBC Driver 18 | Latest | Direct Fabric Warehouse connections | Microsoft docs |
| Git | Latest | Version control | Included with Azure DevOps access |
| Docker | Latest (optional) | Terraform MCP server only | Docker Desktop |
Why Python 3.11 specifically? The dbt-fabric adapter is incompatible with Python 3.13. While 3.10-3.12 all work, 3.11 is recommended for consistency across the team and CI.
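To make the accepted range explicit, here is a small sketch of a version gate. python_version_ok is a hypothetical helper for illustration; the repository's own bootstrap may perform this check differently.

```shell
# Hypothetical helper: accept only the interpreter versions listed as compatible
# above (3.10, 3.11, 3.12). $1 is a "major.minor[.patch]" string such as "3.11.9".
python_version_ok() {
  case "$1" in
    3.10|3.10.*|3.11|3.11.*|3.12|3.12.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use with whichever interpreter is first on PATH:
#   v="$(python -c 'import platform; print(platform.python_version())')"
#   python_version_ok "$v" || echo "Python $v is outside 3.10-3.12" >&2
```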
Step-by-Step Setup
1. Clone the Repository
git clone https://geris-devops@dev.azure.com/geris-devops/insights-requests/_git/fabric_monorepo
cd fabric_monorepo
2. Configure Git Hooks
git config core.hooksPath .githooks
This enables the drift-check post-commit hook and the gitleaks pre-commit secret scanner.
3. Install Python Dependencies
The easiest path is to run any scripts/fabric … subcommand. The wrapper auto-bootstraps a local .venv/ with Python 3.11/3.12, installs dbt-core, dbt-fabric==1.9.8 and dbt-duckdb==1.10.1 plus runtime extras (pyarrow, azure-storage-file-datalake, deltalake), and runs dbt deps on first use. On later invocations the bootstrap is a no-op.
sync_cloud_parquets.py reads cloud_only tables directly from OneLake Delta storage (abfss://<workspace_id>@onelake.dfs.fabric.microsoft.com/<warehouse_id>/Tables/dbo/<table>/) via the deltalake library, bypassing the Warehouse SQL endpoint. This reduces sync time for large tables like fact_inventory_snapshot from ~10 minutes to under a minute, and consumes zero Warehouse CU.
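To make the path template concrete, the sketch below assembles it from its three parts. onelake_table_uri is a hypothetical helper for illustration only and does not claim to mirror how sync_cloud_parquets.py builds the path; the IDs in the usage comment are placeholders.

```shell
# Hypothetical helper: build the OneLake Delta table path from the template quoted above.
# Arguments: <workspace_id> <warehouse_id> <table>
onelake_table_uri() {
  local workspace_id="$1" warehouse_id="$2" table="$3"
  printf 'abfss://%s@onelake.dfs.fabric.microsoft.com/%s/Tables/dbo/%s/\n' \
    "$workspace_id" "$warehouse_id" "$table"
}

# Example use with placeholder IDs:
#   onelake_table_uri "<workspace_id>" "<warehouse_id>" fact_inventory_snapshot
```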
If you prefer to set up the environment manually instead of relying on the wrapper:
bash scripts/setup-local.sh # creates .venv/, installs dbt, runs dbt deps
pip install uv # provides uvx for Python-based MCP servers
Note:
dbt-fabric and dbt-duckdb are pip packages, NOT dbt packages. Adding them to packages.yml breaks dbt deps.
4. (Optional) Refresh dbt Packages
cd dbt && dbt deps --profiles-dir .
This step is already performed by setup-local.sh / the fabric wrapper; re-run it only if dbt/packages.yml changes.
5. Verify Local Build
cd dbt && dbt build --target local --profiles-dir .
This runs the full pipeline against DuckDB using Parquet seed data from dev-data/. If this succeeds, your local environment is correctly set up.
6. Authenticate for Fabric Targets (Optional)
If you need to run against the DEV Fabric Warehouse:
az login
Then set environment variables and build:
export FABRIC_SERVER="<warehouse-sql-endpoint>.datawarehouse.fabric.microsoft.com"
export FABRIC_DATABASE="Gold_Warehouse"
cd dbt && dbt build --target dev --profiles-dir .
All Fabric targets use authentication: CLI, so dbt reuses the az login session and no SPN credentials are needed locally for dbt builds.
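A build against a Fabric target fails late if either variable is missing, so a quick pre-flight check can save a round trip. This is a sketch under the assumption that FABRIC_SERVER and FABRIC_DATABASE are the only variables the dev target needs; missing_fabric_vars is a hypothetical helper name.

```shell
# Hypothetical pre-flight check: print the name of each required variable
# that is unset or empty. Prints nothing when both are set.
missing_fabric_vars() {
  local name value
  for name in FABRIC_SERVER FABRIC_DATABASE; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Example use before a Fabric build:
#   [ -z "$(missing_fabric_vars)" ] || { echo "set: $(missing_fabric_vars)" >&2; }
```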
Understanding the Local Build
DuckDB and Parquet Seed Data
The local target uses a DuckDB file at dev-data/fabric_datalake.duckdb. On each run, the load_parquet_sources() macro bootstraps DuckDB by reading Parquet files from dev-data/. These Parquet files are representative samples of production data, committed to the repository.
The duckdb target (used by CI smoke tests) runs in-memory for even faster execution. Both targets produce the same results -- the difference is persistence.
What --profiles-dir . Does
The profiles.yml file lives in the dbt/ directory, not in the default ~/.dbt/ location. Every dbt command therefore needs --profiles-dir . (with dbt/ as the working directory) to find it. Forgetting this flag produces a "profile not found" error.
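One way to avoid forgetting the flag is a small shell wrapper. The sketch below is illustrative and not part of the repository: dbt_here is a hypothetical function name, and DBT_BIN is a hypothetical override that exists only so the wrapper can be dry-run with echo.

```shell
# Hypothetical wrapper: run dbt from the dbt/ directory (relative to the repository
# root) and always append --profiles-dir . so profiles.yml is found.
dbt_here() {
  # The subshell keeps the caller's working directory unchanged.
  ( cd dbt && "${DBT_BIN:-dbt}" "$@" --profiles-dir . )
}

# Example use from the repository root:
#   dbt_here build --target local
# Dry run that only prints the arguments dbt would receive:
#   DBT_BIN=echo dbt_here build --target local
```

dbt also reads the DBT_PROFILES_DIR environment variable, which may be a simpler option if you prefer not to wrap the command.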
Available Targets
| Target | Engine | Use Case |
|---|---|---|
| local | DuckDB (file) | Local development, manual testing |
| duckdb | DuckDB (in-memory) | CI smoke tests |
| dev | Fabric Warehouse | DEV environment |
| ci | Fabric Warehouse | CI slim builds |
| uat | Fabric Warehouse | UAT environment |
| prod | Fabric Warehouse | Production (pipelines only) |
| feat-NAME | Fabric Warehouse | Feature environments (auto-generated) |
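The table can be expressed as a small lookup, which is handy in personal scripts that should refuse to touch prod. This is a sketch only: target_engine and target_allowed_from_laptop are hypothetical helpers, and the rule that prod is off-limits locally is read from the "pipelines only" note above, not from any enforcement in the repository.

```shell
# Hypothetical helper: map a dbt target name from the table above to its engine.
target_engine() {
  case "$1" in
    local|duckdb) echo duckdb ;;
    dev|ci|uat|prod|feat-?*) echo fabric ;;   # feat-?* requires a non-empty NAME
    *) echo "unknown target: $1" >&2; return 1 ;;
  esac
}

# Succeeds only for known targets other than prod, which is reserved for pipelines.
target_allowed_from_laptop() {
  [ "$1" != "prod" ] && target_engine "$1" >/dev/null
}

# Example use:
#   target_allowed_from_laptop "$target" || echo "refusing to build $target" >&2
```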
MCP Server Setup
The project includes .mcp.json in the repository root with shared MCP server configurations. Claude Code (and other MCP-compatible tools) pick them up automatically. Nine servers are configured:
| Server | Runtime | Auth | Purpose |
|---|---|---|---|
| microsoft-learn | Remote HTTP | None | Microsoft Learn docs search/fetch |
| powerbi-modeling | npx | Browser login | Semantic model editing: TMDL, DAX |
| fabric-prodev | npx | None | Fabric API specs, schemas, best practices |
| azure-devops | npx | Browser login | Pipelines, work items, PRs, wiki |
| dbt-core | uvx | None | Lineage, impact analysis, column tracing |
| terraform | Docker | None | Terraform Registry docs and module search |
| azure | npx | az login | 276 tools across 57 Azure services |
| fabric-ops | uvx | az login | Read-only Fabric operational intel |
| duckdb | uvx | None | Query local DuckDB file |
First-use authentication: Servers that require browser login (powerbi-modeling, azure-devops) will open a browser on the first tool call. Credentials are cached after that. Servers that require az login (azure, fabric-ops) need an active Azure CLI session.
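Before the first tool call, it can help to confirm that the runtimes named in the table are actually on PATH. A minimal sketch follows; missing_tools is a hypothetical helper, and the tool list in the usage comment is inferred from the Runtime and Auth columns above.

```shell
# Hypothetical helper: print each named command that cannot be found on PATH.
# Prints nothing when every command is available.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example use for the MCP runtimes (docker is only needed for the terraform server):
#   missing_tools npx uvx docker az
```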
Verifying MCP Servers
# No-auth servers (should start and exit cleanly):
npx -y @microsoft/fabric-mcp@latest server start --mode all < /dev/null
python -m uv tool run mcp-server-motherduck --db-path ./dev-data/fabric_datalake.duckdb < /dev/null
# Auth-required servers (may prompt browser login):
npx -y @azure/mcp@latest server start < /dev/null
npx -y @azure-devops/mcp geris-devops < /dev/null
Pre-Commit Hooks (Secret Scanning)
The repository uses gitleaks via the pre-commit framework. Set up once per clone:
pip install pre-commit
pre-commit install
Every git commit will automatically scan staged files for secret patterns. The CI pipeline also runs gitleaks as a safety net.
Common Issues
| Symptom | Cause | Fix |
|---|---|---|
| "profile not found" | Missing --profiles-dir . | Add --profiles-dir . to every dbt command |
| "Env var required but not provided" | dbt parses ALL targets at startup | Add empty defaults: env_var('X', '') |
| DuckDB passes, Fabric fails | Dialect differences (case, datetime2) | See Dual-Dialect Patterns |
| "uvx not found" | uv not installed | Run pip install uv |
| "npx not found" | Node.js not installed | Install Node.js LTS v20+ |
| "docker not found" (terraform MCP only) | Docker not installed or not running | Install Docker Desktop, ensure daemon is running |
| Pipeline not triggering after push | YAML parse error in pipeline file | Queue a run manually -- the API response returns the parse error |
| "Cannot find the object" in security scripts | Fabric reports this condition as error code 15151 | Match on error code 15151 rather than on the text "Invalid object name" |
Related Pages
- Developer Workflow -- Full feature lifecycle overview
- Feature Branch Tiers -- Provisioning isolated environments
- Coding Conventions -- How code is written in this project
- Dual-Dialect Patterns -- DuckDB vs Fabric SQL differences