Terraform Drift Detection & Tools | DistantJob - Remote Recruitment Agency

Terraform Drift Detection & Tools

Cesar Fazio

Imagine your infrastructure has a mind of its own. That’s the problem Terraform drift detection is meant to catch! Drift is a stealthy problem where your live cloud resources quietly diverge from the blueprint defined in your code. The mismatch poses a serious risk that can lead to unexpected outages, broken automation, and dangerous security holes.

Drift occurs whenever the infrastructure is modified outside the normal IaC workflow. This includes manual changes made directly in the cloud console, changes by external automation tools (like a Kubernetes controller or an Ansible script), or even unexpected resource failures or terminations.

In this article, we will cover how to detect Terraform drift and how to resolve it, whether that means restoring your infrastructure to what the code defines or updating the code to match the change.

What Is Terraform Drift?

Terraform Drift, also known as configuration drift, occurs when the infrastructure (resources deployed on a cloud provider, such as AWS, Azure, or GCP) differs from the infrastructure defined in the Terraform code (the .tf configuration files).

In short, your actual infrastructure doesn’t align with the desired configuration.

How Drift Occurs

Configuration drift can occur for several reasons. The most common one is a direct manual change. It happens when someone adds, removes, or changes a resource directly through the cloud provider’s web console without updating the Terraform code.

Drift can also occur when other configuration tools (such as Ansible, Chef, or Puppet) or scripts make changes to the infrastructure that are not reflected in the Terraform files. The cloud provider itself can also be set up to change some resources automatically, which causes drift as well.

Why Drift Is a Problem

Drift poses a significant risk. It leads to inconsistencies between environments, and it can introduce vulnerabilities or violate compliance standards (such as HIPAA or PCI DSS). It can also result in unintended changes (for example, the next apply reversing a manual fix) or in failures when the actual state is not what Terraform expects.

  • Breaks the Single Source of Truth: Your Terraform code is no longer reliable.
  • Unexpected terraform apply results: The next execution will attempt to reconcile the difference, which can lead to unintended resource deletion, creation, or configuration changes, potentially causing application downtime.
  • Compromised Security & Compliance: Manual changes can easily bypass security audits and introduce vulnerabilities or non-compliant configurations.
  • Disaster Recovery Failure: When relying on Terraform to rebuild, only the coded infrastructure will be created, missing any critical drifted resources.
  • Increased Debugging Complexity: It becomes significantly harder to troubleshoot issues when the code doesn’t match reality.

Terraform Drift Detection: Strategies and Management

Terraform has mechanisms to detect and manage drift. It can refresh the actual state of your resources, compare it with the configuration, and report the differences before it changes anything. For the individual user (CLI), drift detection is manual: it happens only when a “plan” or “apply” command runs. For teams, platforms like HCP Terraform make detection automatic and continuous, which is the most recommended approach for production environments.

Terraform Drift Detection can be manual or automatic, depending on the tool and platform you are using:

1. Manual Drift Detection (Terraform CLI – The Standard)

Using Terraform CLI (command line) locally or in a CI/CD pipeline, drift detection is manual and reactive.

| Feature | Description |
| --- | --- |
| Main Command | terraform plan |
| How it Works | Whenever you run terraform plan (or terraform apply), Terraform performs an implicit refresh. It queries the cloud provider’s API to fetch the real state and compares it with your configuration (.tf). |
| Manual Command | Detection only occurs when you run the command. If you go a week without running the plan, Terraform will not know about any manual changes made during that period. |
| Detection Type | Reactive (you initiate the check). |

Note: Many teams automate the execution of Terraform plans in CI/CD pipelines (for example, in Pull Requests), which makes detection routine, but it is still linked to an event (the PR or a scheduled pipeline).
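As a minimal sketch of such an on-demand check, the function below wraps a refresh-only plan, which reports what changed outside Terraform without proposing any resource changes. The function name show_drift and its directory argument are illustrative assumptions; -chdir, -refresh-only, -input=false, and -no-color are standard Terraform CLI options (refresh-only mode needs Terraform 0.15.4 or later).

```shell
# Sketch: print out-of-band changes for the configuration in a directory.
# show_drift is a hypothetical helper name, not part of Terraform.
show_drift() {
  drift_dir="${1:-.}"   # directory holding the .tf files (default: current)
  # -refresh-only: compare state with real infrastructure, propose no resource changes
  # -input=false:  never prompt, so the command is safe to run from scripts
  terraform -chdir="$drift_dir" plan -refresh-only -input=false -no-color
}
```

Running show_drift ./infra (a placeholder path) prints the refresh-only plan for that directory. Nothing is modified, because plan never applies changes.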

2. Automatic Drift Detection (HCP Terraform / Terraform Cloud)

The commercial platform HCP Terraform (formerly Terraform Cloud), along with Terraform Enterprise, offers an automatic and continuous drift detection feature.

| Feature | Description |
| --- | --- |
| Main Feature | Health Assessments |
| How it Works | HCP Terraform performs background and scheduled checks (continuous) on your workspaces. It automatically executes the refresh and comparison process, without the need for manual intervention. |
| Automatic and Scheduled | The platform actively monitors the infrastructure at defined intervals (for example, every few hours) and sends alerts (via email, Slack, etc.) if the real state diverges from the configuration. |
| Detection Type | Proactive and Continuous. |
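Health assessments are switched on per workspace (or organization-wide) in the HCP Terraform settings. As a rough sketch only, the same setting can be toggled through the workspaces API. The function name, the TFE_TOKEN variable, and the organization and workspace arguments are assumptions for illustration, and the assessments-enabled attribute should be checked against the current API documentation and your plan’s feature set before relying on it.

```shell
# Sketch: enable health assessments on one workspace through the HCP Terraform API.
# enable_health_assessments is a hypothetical helper; TFE_TOKEN is assumed to hold an API token.
enable_health_assessments() {
  hcp_org="$1"         # organization name (placeholder)
  hcp_workspace="$2"   # workspace name (placeholder)
  # JSON:API payload; "assessments-enabled" is the workspace attribute assumed here
  hcp_payload='{"data":{"type":"workspaces","attributes":{"assessments-enabled":true}}}'
  curl --silent --show-error --fail \
    --request PATCH \
    --header "Authorization: Bearer ${TFE_TOKEN:?set TFE_TOKEN to an API token}" \
    --header "Content-Type: application/vnd.api+json" \
    --data "$hcp_payload" \
    "https://app.terraform.io/api/v2/organizations/${hcp_org}/workspaces/${hcp_workspace}"
}
```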

3. Third-Party Tools

Standard Terraform commands, such as “terraform plan”, are excellent for detecting drift on managed resources (resources in your state file that have been changed in the cloud). However, they cannot detect resources created manually in the cloud that are not present in your Terraform code or state file.

Open-source tools such as cloud-concierge go beyond the state vs. config comparison. They connect directly to the cloud provider and audit and list the resources that Terraform does not manage, which provides a more complete view of the infrastructure.

| Tool Name | Key Functionality | Core Advantage (Beyond terraform plan) | Open Source / Commercial |
| --- | --- | --- | --- |
| DriftCTL | Cloud drift detection and listing of unmanaged resources. Compares live infrastructure against Terraform state. | Directly identifies resources not in any Terraform state file (unmanaged resources) by scanning the cloud provider. | In maintenance mode after Snyk acquisition, it might not work with the current Terraform versions. |
| cloud-concierge | Detects drift and unmanaged resources. Can also codify unmanaged resources and create matching state import blocks. | Provides a full replacement for DriftCTL functionality and adds the ability to automatically codify and generate import blocks for unmanaged resources. | Open Source |
| Snyk IaC (Post-acquisition) | Drift detection and security scanning of Infrastructure as Code. Integrates some of DriftCTL’s functionality. | Focuses on security and compliance by integrating drift and unmanaged resource detection into a broader security platform. | Commercial |
| Terraformer | Generates Terraform configuration and state from existing cloud infrastructure (Reverse Terraform). | Its primary use is to discover and import existing resources into Terraform management, which is a method to address unmanaged resources. | Open Source |
| CloudQuery | ETL (Extract, Transform, Load) framework that converts cloud infrastructure into a SQL database for analysis. | Allows for complex SQL queries against your cloud inventory to find any resource that lacks a specific Terraform-managed tag or is otherwise unexpected. | Open Source / Commercial |
| Overmind | Impact analysis and real-time dependency discovery for IaC changes. It detects relationships that span services and accounts. | While focused on change impact, it discovers dependencies, including those for resources not in Terraform, to calculate a comprehensive “blast radius.” | Commercial |
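The idea these tools share can be reduced to a set difference: everything the cloud provider reports, minus everything Terraform already tracks. The sketch below shows only that comparison step. It assumes you already have two text files with one resource ID per line (how those lists are exported is tool-specific and not covered here), and list_unmanaged is a hypothetical helper name.

```shell
# Sketch: list resource IDs that exist in the cloud but not in Terraform state.
# Both input files are assumed to hold one resource ID per line.
list_unmanaged() {
  live_ids_file="$1"      # IDs exported from the cloud provider (assumed input)
  managed_ids_file="$2"   # IDs extracted from Terraform state (assumed input)
  tmp_live=$(mktemp) && tmp_managed=$(mktemp) || return 1
  # comm needs sorted input; use one fixed locale for sort and comm
  LC_ALL=C sort -u "$live_ids_file" > "$tmp_live"
  LC_ALL=C sort -u "$managed_ids_file" > "$tmp_managed"
  # -23 drops lines unique to the second file and lines common to both,
  # leaving only IDs that appear in the live list alone
  LC_ALL=C comm -23 "$tmp_live" "$tmp_managed"
  rm -f "$tmp_live" "$tmp_managed"
}
```

Every ID printed is a candidate unmanaged resource that still needs a human decision: import it, delete it, or leave it outside Terraform on purpose.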

4. Integrating Terraform Plan into CI/CD Pipelines

A crucial best practice for maintaining infrastructure-as-code (IaC) is automating the “terraform plan” command within your CI/CD pipeline. A scheduled execution (often managed via a cron job or an equivalent scheduler in your CI/CD system) serves as a continuous drift detection mechanism without requiring any actual resource modification.

The core principle is to run “terraform plan” with the -detailed-exitcode flag against your live infrastructure state and then monitor its exit code. Without that flag, a successful plan exits with 0 whether or not it found changes.

With -detailed-exitcode, an exit code of 0 means the plan succeeded and no changes are needed, which confirms that the deployed infrastructure matches the desired state defined in your Terraform configuration files.

An exit code of 2 means the plan succeeded and changes are pending, which signals a drift: an out-of-band change has occurred, or the Terraform code no longer reflects the deployed reality. An exit code of 1 means the plan itself failed with an error.
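A scheduled job can turn this into a simple verdict. The sketch below relies on Terraform’s -detailed-exitcode flag, which makes plan return 0 for no changes, 1 for an error, and 2 for pending changes. The function name check_drift and the verdict strings are illustrative assumptions, not Terraform output.

```shell
# Sketch: run a plan and translate its exit code into a drift verdict.
# check_drift is a hypothetical helper name, not part of Terraform.
check_drift() {
  drift_dir="${1:-.}"   # directory holding the .tf files (default: current)
  drift_rc=0
  # -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes present
  terraform -chdir="$drift_dir" plan -detailed-exitcode -input=false -no-color >/dev/null || drift_rc=$?
  case "$drift_rc" in
    0) echo "in-sync" ;;
    2) echo "drift-detected" ;;
    *) echo "plan-error" ;;
  esac
  # Pass the code through so the CI job can fail or alert on it
  return "$drift_rc"
}
```

A pipeline step could call check_drift ./infra (a placeholder path) and treat any non-zero status as a reason to notify the team. Note that the status is also 2 when the code contains changes that were never applied, so the verdict is most meaningful on a branch that is already deployed.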

How to Resolve Drift

Detecting the drift is only the first step. The more critical part is deciding how to resolve it. The team has two main options to restore compliance: revert the manual change, which forces reality back to match your code, or update the IaC code to match reality.

Reverting to the original state

If the change was unintentional or unauthorized, running terraform apply on the current code will force the infrastructure back into compliance by provisioning the changes defined in the configuration.
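One cautious way to do this, sketched below, is to save the plan to a file, review it, and then apply exactly that file, so the apply cannot pick up anything that was not reviewed. The function names and the tfplan file name are illustrative assumptions; -out, terraform show, and applying a saved plan file are standard Terraform CLI features.

```shell
# Sketch: revert drift in two steps, with a review in between.
# plan_revert and apply_revert are hypothetical helper names.
plan_revert() {
  revert_plan_file="${1:-tfplan}"
  # Save the proposed changes so the later apply uses exactly this plan
  terraform plan -input=false -no-color -out="$revert_plan_file" || return 1
  # Print the saved plan in human-readable form for review
  terraform show -no-color "$revert_plan_file"
}

apply_revert() {
  revert_plan_file="${1:-tfplan}"
  # Applying a saved plan file does not ask for confirmation
  terraform apply -input=false "$revert_plan_file"
}
```

Run plan_revert first, read the output to confirm it only undoes the unwanted change, and then run apply_revert.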

Update the IaC Code in the .tf file

If the manual change was intentional and approved (such as an emergency fix), the Terraform code (.tf files) must first be updated to reflect the new infrastructure reality. Once the code matches, running terraform plan should report no pending changes, and running terraform apply (or terraform apply -refresh-only) records the real values in the state file, so that the state correctly reflects the new desired state.
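The sequence can be sketched as follows, assuming the .tf files have already been edited to match the approved change. The function name accept_drift and its messages are illustrative assumptions; -refresh-only and -detailed-exitcode are standard Terraform CLI flags.

```shell
# Sketch: accept an approved manual change after the .tf files were updated.
# accept_drift is a hypothetical helper name, not part of Terraform.
accept_drift() {
  # Record the real-world values in the state file; no resources are modified.
  # Terraform asks for confirmation because -auto-approve is not passed.
  terraform apply -refresh-only || return 1
  accept_rc=0
  # Exit code 0 here means the updated code matches the infrastructure
  terraform plan -detailed-exitcode -input=false -no-color >/dev/null || accept_rc=$?
  if [ "$accept_rc" -eq 0 ]; then
    echo "code matches infrastructure"
  else
    echo "code still differs from infrastructure (plan exit code $accept_rc)"
  fi
  return "$accept_rc"
}
```

If the final plan still reports changes, the resource block does not yet describe the manual change accurately and needs another edit.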

Handling Unmanaged Cloud Resources

One of the most challenging types of drift is an unmanaged resource: a resource created directly in the cloud that is not present in your Terraform code or state file. Tools like DriftCTL and cloud-concierge are essential for detecting these.

To bring an unmanaged resource under Terraform’s control and resolve the drift, you can use the “terraform import” command (Terraform 1.5 and later also support declarative import blocks in the configuration). The command links an existing cloud resource to a new resource block you define in your configuration files and records that link in your state file.

  1. Define the Block: Write a new resource block in your .tf file that accurately describes the existing cloud resource (e.g., a specific S3 bucket).
  2. Run Import: Execute the “terraform import” command, referencing the new code path and the actual Cloud Provider ID of the resource.

For example:

terraform import aws_s3_bucket.example example-bucket-name

Then, run “terraform plan”. If successful, the plan should show no pending changes, confirming the resource is now managed, and the drift is resolved.
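The import and the follow-up check can be combined in a small wrapper, sketched below. The function name import_and_verify and its messages are illustrative assumptions, and the resource block for the given address must already exist in the configuration before the function is called.

```shell
# Sketch: import an existing resource, then confirm the plan is clean.
# import_and_verify is a hypothetical helper name, not part of Terraform.
import_and_verify() {
  import_address="$1"   # resource address in the code, e.g. aws_s3_bucket.example
  import_id="$2"        # provider-specific ID, e.g. the bucket name
  terraform import "$import_address" "$import_id" || return 1
  import_rc=0
  # Exit code 0 = resource block matches the real resource, 2 = differences remain
  terraform plan -detailed-exitcode -input=false -no-color >/dev/null || import_rc=$?
  if [ "$import_rc" -eq 0 ]; then
    echo "imported and in sync: $import_address"
  else
    echo "imported, but plan is not clean (exit code $import_rc): $import_address"
  fi
  return "$import_rc"
}
```

For the example above, the call would be import_and_verify aws_s3_bucket.example example-bucket-name. An exit code of 2 usually means the new resource block needs adjusting until it describes the real bucket.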

Conclusion

The best practices for Terraform drift detection start with the “terraform plan” command for basic detection and add automatic, continuous monitoring through platforms like HCP Terraform for production environments. For a complete picture, leverage third-party tools like DriftCTL or cloud-concierge to identify and audit crucial unmanaged resources.

Once detected, resolving drift means either performing a forced reconciliation (reverting the manual change) or formally updating your IaC code to accept the change. For unmanaged resources, the “terraform import” command is essential for bringing them under management.

Cesar Fazio

César is a digital marketing strategist and business growth consultant with experience in copywriting. Self-taught and passionate about continuous learning, César works at the intersection of technology, business, and strategic communication. In recent years, he has expanded his expertise to product management and Python, incorporating software development and Scrum best practices into his repertoire. This combination of business acumen and technical prowess allows him to structure scalable digital products aligned with real market needs. Currently, he collaborates with DistantJob, providing insights on marketing, branding, and digital transformation, always with a pragmatic, ethical, and results-oriented approach that stays far from vanity metrics and focuses on measurable performance.
