ADVANCED
1 min DevOps

Terraform State Locking

Scenario

You are attempting to run terraform apply for a critical production update, but it fails immediately with an error: Error acquiring the state lock. The message says the lock was created by your colleague “Bob” 3 hours ago. Bob is currently on a flight and unreachable.

Question

What does this error mean, and what are the safe (and unsafe) steps to resolve this to proceed with your deployment?

Expected Answer

  1. Explanation: Terraform uses locking (e.g., DynamoDB with S3 backend) to prevent concurrent state modifications which could corrupt the state file.
  2. Safe Steps:
    • Verify no CI/CD jobs are actually running cleanly (zombie process?).
    • If confirmed valid stale lock: terraform force-unlock <LOCK_ID>.
  3. Warning: Emphasize that force-unlock is dangerous. If Bob’s process is still writing (e.g., a hung network connection), forcing unlock could corrupt the state. Always verify the process is truly dead first.