Why Terraform Workspaces Are the Wrong Tool for Environments

Ask “how do I manage dev, staging, and prod in Terraform?” and someone will say “use workspaces.” It is the intuitive answer — one codebase, switch workspace, deploy. It is also, for environments, the wrong answer, and the reasons why are worth understanding because they are really reasons about blast radius and clarity.

I am not saying workspaces are useless. I am saying that environments are the one job people most often reach for them to do, and the one job they are worst at. Let me make the case. Then, because the usual “just use a directory per environment” advice has a real weakness of its own, let me give you the two structures that actually work and a rule for choosing.

What a workspace actually is

A workspace is a separate state file inside the same backend, using the same configuration and the same credentials. terraform workspace select prod does not change your code, your backend, or your authentication; it only swaps which state file is active, invisibly, based on local state (a file under .terraform/ or the TF_WORKSPACE environment variable) that you cannot see in the code.

That sentence contains every problem.

Problem one: one backend, one credential, all environments

Workspaces all live in the same backend — the same Azure Storage account, the same container. Your production state sits in the same place as your dev state, reached with the same credentials.

This collapses the boundary that should be strongest in your whole estate. Real environment isolation means prod state lives in a prod subscription, behind prod RBAC, reachable only by prod pipelines. With workspaces, anyone and anything that can read dev state can read prod state — it is the same storage account with a different blob key. You have made the cheapest, least important environment the security peer of your most important one.
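
To make the shared boundary concrete, here is a minimal sketch of the single backend block that every workspace ends up using. The resource group, storage account, container, and key names are hypothetical placeholders, not taken from any real setup.

```hcl
# backend.tf: one backend definition serves every workspace.
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate"      # hypothetical name
    storage_account_name = "sttfstateshared" # hypothetical name
    container_name       = "tfstate"
    key                  = "app.tfstate"
  }
}

# `terraform workspace select prod` changes none of the values above.
# Each workspace's state is written as another blob in this same
# container, so access to the container is access to every environment.
```

Nothing in this block can express "prod lives somewhere else", which is the whole problem.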

Problem two: the invisible switch

The active workspace is local session state, not code. Nothing in your configuration tells you which environment you are pointed at. The entire safety of your deployment rests on a human (or a pipeline step) having run the right terraform workspace select beforehand.

The failure mode writes itself: someone is in prod, thinks they are in dev, runs apply. Nothing in the code would have stopped them, because the code is identical across environments — that is the whole selling point of workspaces, and it is exactly what removes your last guardrail. The most dangerous command in your toolkit (apply on prod) is gated by an invisible piece of shell state.

Problem three: conditionals keyed off the workspace name

Because one configuration serves every environment, the differences have to live inside the config — and with workspaces they get keyed off the workspace name:

			
locals {
  instance_count = terraform.workspace == "prod" ? 5 : 1
  sku            = terraform.workspace == "prod" ? "Premium" : "Standard"
  enable_backups = terraform.workspace == "prod" ? true : false
}

		

Multiply that across a real estate and your modules become a thicket of terraform.workspace == "prod" ternaries. The config no longer describes any one environment — you cannot read it and know what prod looks like, because prod is scattered across dozens of conditionals keyed off a string. Reviewing a prod change means mentally evaluating every ternary. That is the opposite of what infrastructure-as-code is supposed to give you.

Hold on to this problem — it is subtler than it looks, and we come back to it, because the conditional is not the villain; what it keys off is.

First, the thing that actually matters: isolated state

Before arguing about folder layouts, name the invariant, because it is the whole point:

Each environment must have its own state, in its own backend, inside its own trust boundary.

Prod state in a prod subscription, prod RBAC, reachable only by prod pipelines. Dev state somewhere a dev mistake stays contained. That is the property workspaces destroy by putting every environment in one backend behind one credential. Everything below is just different ways to organise code on top of that invariant — and any structure that keeps separate backends per environment has already beaten workspaces on the thing that counts.

There are two good structures. They make different trade-offs, and the popular advice only tells you about one of them.

Structure A: a directory per environment

The commonly recommended model — a directory per environment, each with its own backend, calling shared modules:

			
environments/
  dev/      { main.tf  backend.tf  dev.tfvars }
  staging/  { main.tf  backend.tf  staging.tfvars }
  prod/     { main.tf  backend.tf  prod.tfvars }
modules/
  network/  app/

		

The real logic lives in modules/; each environment directory is a thin root that wires modules together with an environment-specific backend and values.
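
As a rough illustration, a thin prod root under this layout might look like the sketch below. The backend names and the module inputs and outputs (address_space, subnet_id, instance_count) are assumptions made up for the example, and the matching variable declarations are left out for brevity.

```hcl
# environments/prod/backend.tf
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate-prod" # hypothetical name
    storage_account_name = "sttfstateprod"   # hypothetical name
    container_name       = "tfstate"
    key                  = "app.tfstate"
  }
}

# environments/prod/main.tf
module "network" {
  source        = "../../modules/network"
  address_space = var.address_space # hypothetical module input
}

module "app" {
  source         = "../../modules/app"
  subnet_id      = module.network.subnet_id # hypothetical module output
  instance_count = var.instance_count       # hypothetical module input
}
```

The dev and staging roots would repeat the same two files with their own backend values, which is exactly the wiring that has to be kept in step by hand.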

What it gets right: the boundary is physical. You cd environments/prod — there is no invisible switch to forget. State is isolated by construction. Environments can diverge intentionally (prod can call a dr module dev does not) without a single conditional.

Its real weakness — and it is real: the thin roots duplicate. Add a module, bump a version, change a wiring argument, and you must edit it in every environment directory. Miss one and dev and prod silently diverge. The duplication is small, but small duplication is exactly where inconsistency hides. If you have ever found staging running a module version prod did not, this is why.

Structure B: one shared root, per-environment backend and variables

The structure that eliminates that duplication: a single root configuration, with environments expressed only as backend config and variable files.

			
main.tf            # the ONE root, shared by every environment
variables.tf
backends/
  dev.hcl  staging.hcl  prod.hcl     # backend per env (state + subscription)
env/
  dev.tfvars  staging.tfvars  prod.tfvars   # only the values that differ
modules/  ...

		

Because a backend block cannot take variables, you use a partial backend — an empty backend "azurerm" {} — and bind everything at run time:

			
terraform init -reconfigure -backend-config=backends/prod.hcl
terraform plan  -var-file=env/prod.tfvars

What it gets right: there is exactly one copy of the logic, so every environment is consistent by construction. Differences are pure data in tfvars. Adding a module changes one file for all environments at once. On the consistency axis it is strictly better than Structure A.
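
For reference, the partial backend and the per-environment backend file used by the init command above could look roughly like this. The two files are shown in one listing for brevity; every name and the subscription ID are placeholders, not real values.

```hcl
# main.tf: partial backend, deliberately left empty.
terraform {
  backend "azurerm" {}
}

# backends/prod.hcl: supplied via -backend-config at init time.
resource_group_name  = "rg-tfstate-prod"                      # hypothetical name
storage_account_name = "sttfstateprod"                        # hypothetical name
container_name       = "tfstate"
key                  = "app.tfstate"
subscription_id      = "00000000-0000-0000-0000-000000000000" # placeholder
```

Because each environment gets its own file, dev.hcl and staging.hcl can point at entirely different storage accounts and subscriptions while main.tf never changes.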

Its catch, and you must see it clearly: it re-introduces the invisible switch I attacked workspaces for. With workspaces it was terraform workspace select prod; here it is -backend-config=backends/prod.hcl plus -var-file=env/prod.tfvars. Which environment you are touching is again decided by command-line flags, not by where you stand. You can init against prod’s backend and apply dev’s vars; you can switch var-files without re-running init and stay bound to the backend you initialised last.

Two things save it, and they are the reason it still beats workspaces:

Separate backends bound the blast radius. prod.hcl points at a prod storage account in a prod subscription behind prod RBAC. As long as the identity you run as has access to only one environment, a fat-fingered flag ends in “access denied,” not “dev changes applied to prod state.” Workspaces never had this: one backend, one credential, no floor under the mistake.
A pipeline removes the human choice entirely. When CI/CD hardcodes the backend config and var-file per stage, no person ever types those flags. The prod stage only ever uses prod’s backend and prod’s vars, defined in version-controlled YAML you review like any other code. The invisible switch becomes a visible pipeline definition.

So Structure B’s footgun is real for a human at a laptop and essentially gone behind a pipeline.
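
If humans do still run Structure B by hand, one optional extra guard (my suggestion, not something the structure requires) is to have each tfvars file state which subscription it is meant for and fail the plan on a mismatch. The sketch below assumes Terraform 1.2 or later and the azurerm provider; the variable name expected_subscription_id is made up for illustration. It checks the credentials against the var-file only and does not inspect the backend.

```hcl
# variables.tf
variable "expected_subscription_id" {
  type        = string
  description = "Subscription this environment's tfvars file is meant for."
}

# main.tf
data "azurerm_client_config" "current" {
  lifecycle {
    # Fails the plan when the var-file and the active credentials disagree.
    postcondition {
      condition     = self.subscription_id == var.expected_subscription_id
      error_message = "The var-file does not match the subscription you are authenticated against."
    }
  }
}
```

Each env/*.tfvars file would then set expected_subscription_id to its own subscription, so applying dev's vars with prod credentials stops before anything changes.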

The distinction that makes Structure B safe: data, not the workspace name

Here is the subtle point promised earlier, and it is the one most people get wrong. Structure B still needs to express per-environment differences — including the occasional structural one, like “prod has a DR replica, dev does not.” That is a conditional. Did we not just condemn conditionals?

No — we condemned conditionals keyed off terraform.workspace. Keying off input data is completely different:

			
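# variables.tf
variable "enable_dr" {
  type        = bool
  description = "Whether to deploy the DR storage account in this environment."
}

# env/prod.tfvars -> enable_dr = true
# env/dev.tfvars  -> enable_dr = false

# main.tf
resource "azurerm_storage_account" "dr" {
  count = var.enable_dr ? 1 : 0 # created only when the tfvars file says so
  # ...
}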

		

Compare the two:

terraform.workspace == "prod" couples logic to an invisible string, scatters decisions through the code, and you cannot read one environment’s behaviour in isolation — you have to mentally run every ternary against a name set in your shell.
var.enable_dr is a declared input, set explicitly in a file you can open. prod.tfvars says enable_dr = true. The module just reacts to a boolean like any reusable module should. You read prod.tfvars and you know exactly what prod is.

The rule: a module reacting to its inputs is good design; a module sniffing the workspace name is the anti-pattern. Express every difference — values and structure — as data in tfvars, never as a workspace-name lookup. Do that and Structure B stays readable no matter how many environments you add.
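
Applied to the earlier locals block, the same three settings could be expressed as data along these lines. The variable names mirror that example; the file layout follows Structure B, and the three files are shown in one listing for brevity.

```hcl
# variables.tf: every difference is a declared input.
variable "instance_count" { type = number }
variable "sku" { type = string }
variable "enable_backups" { type = bool }

# env/prod.tfvars
instance_count = 5
sku            = "Premium"
enable_backups = true

# env/dev.tfvars
instance_count = 1
sku            = "Standard"
enable_backups = false
```

A reviewer can now open env/prod.tfvars and read what prod is, without evaluating a single ternary.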

So which structure should you use?

Approach	Separate backends	Code duplication	The switch is…
Workspaces	❌ shared	none	invisible (shell)
A — directory per env	✅	thin-root (drifts)	visible (you `cd`)
B — shared root + per-env config	✅	none	flag-driven (tame with a pipeline)

My recommendation, and the rule behind it:

Pipeline-driven applies (most teams): use Structure B. Zero duplication, true isolation, and the pipeline removes the only real downside by binding backend and vars per stage. This is the best default for a team shipping through CI/CD.
Humans running Terraform locally, or environments that genuinely diverge a lot: use Structure A. When applies happen at a laptop, the physical cd environments/prod boundary is worth the small thin-root duplication, because the flag-driven switch in B is a live hazard without a pipeline in front of it.
Want B’s DRY-ness and A’s explicit directories: reach for Terragrunt. It is built for exactly this — it generates the backend config, keeps a single source module, and gives you a real per-environment directory tree with no hand-managed -backend-config flags. It costs you a dependency and a learning curve; it buys you both halves at once. If you find yourself hand-rolling Structure B’s plumbing across many stacks, this is the tool you are reinventing.
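
To give a feel for the Terragrunt option, here is a minimal sketch of one environment's terragrunt.hcl. The directory layout, backend names, and inputs are assumptions for illustration, and real setups usually move the shared parts into a parent file; treat it as a starting point rather than a reference configuration.

```hcl
# environments/prod/terragrunt.hcl
terraform {
  source = "../../modules//app" # the single shared module
}

# Terragrunt writes backend.tf for you from this block.
remote_state {
  backend = "azurerm"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    resource_group_name  = "rg-tfstate-prod" # hypothetical name
    storage_account_name = "sttfstateprod"   # hypothetical name
    container_name       = "tfstate"
    key                  = "app.tfstate"
  }
}

# Per-environment values, passed to the module as input variables.
inputs = {
  instance_count = 5
  enable_dr      = true
}
```

You still cd into environments/prod to work on prod, but the module code exists once and nobody types a -backend-config flag.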

All three respect the invariant. Workspaces are the only option on the table that does not.

So when are workspaces right?

They have a real, narrow use: short-lived, structurally identical copies of the same environment, owned by the same team, in the same trust boundary. Per-developer sandboxes. Ephemeral per-pull-request environments that a pipeline spins up and tears down. Cases where “same code, same backend, throwaway parallel state” is precisely what you want and the weak isolation does not matter because it is all dev anyway.

That is the tell. Workspaces are for parallel copies within one trust boundary. Environments are different trust boundaries — different blast radius, different credentials, different people allowed to touch them. The moment your “workspaces” differ in how much a mistake costs, you have crossed out of what workspaces are for.
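
For that narrow case, using the workspace name is reasonable as long as it only feeds naming, not logic. A minimal sketch, assuming a pipeline that creates one workspace per pull request (the name pattern and region are hypothetical):

```hcl
# Per-pull-request sandbox inside a single dev trust boundary.
resource "azurerm_resource_group" "sandbox" {
  name     = "rg-app-${terraform.workspace}" # e.g. rg-app-pr-123
  location = "eastus"                        # hypothetical region
}
```

Every copy is structurally identical; the workspace name keeps the parallel copies from colliding and decides nothing else.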

The takeaway

Workspaces fail as an environment model because they unify exactly what environments exist to separate: state, credentials, and blast radius. The fix is not a single blessed folder layout — it is the invariant (separate backend per environment, in its own trust boundary) plus the right structure for how you ship:

One shared root with per-environment backend and tfvars, behind a pipeline, for most teams — consistent by construction, isolated, and footgun-free once CI/CD binds the flags.
A directory per environment when humans apply locally or environments diverge structurally.
Terragrunt when you want both and will pay for a tool to get there.

And whichever you pick, express every difference as data your reviewer can read — never as a conditional that sniffs the workspace name. Get the invariant and that one habit right, and the structure debate becomes what it should be: a question of ergonomics, not of safety.

Andrey Krasikov

Senior Cloud Architect with 25+ years in IT and 10+ years designing enterprise Azure and AWS solutions. Microsoft Azure Solutions Architect Expert. Specialising in cloud-native architectures, Infrastructure as Code (Terraform, Bicep), DevOps pipelines, data platforms, and AI-powered workloads. Helped 100+ organisations migrate, modernise, and optimise their cloud environments. Based in the USA — connect on LinkedIn or explore my open-source work on GitHub.