What Checkov Catches — and What It Misses
Checkov is excellent at catching misconfigurations deterministically and fast. But it answers WHAT is wrong — not WHY it matters for your workload or WHAT breaks if this fails. This post explains exactly where that gap sits.
If you run Checkov in CI, you are catching a real class of misconfiguration early and automatically. That is valuable — do not remove it. Checkov, tfsec, and Trivy are essential tools that every Terraform pipeline should include.
But there is a ceiling on what any static rule-based linter can evaluate. The gap between “passes Checkov” and “passes a Well-Architected review” is real, specific, and worth understanding — because the findings in that gap are often the ones that carry production blast radius.
The framing that matters:
Linters catch violations. Architecture reviews catch patterns.
Checkov asks: “Is this config valid?”
ArchGuard asks: “Is this architecture sound for your workload?”
Those are different questions. This post explains precisely what separates them.
What Checkov is actually doing
Checkov applies a library of checks, currently over 1,000 for Terraform (as of April 2026), each of which evaluates a specific attribute on a specific resource type against a known-bad condition. CKV_AWS_79 checks that http_tokens = "required" is set in the metadata_options block of aws_instance. CKV_AWS_19 checks that server-side encryption is configured on aws_s3_bucket_server_side_encryption_configuration. These are deterministic: pass or fail, no judgment required.
This model is fast, reproducible, and free of false negatives for the conditions it checks. It is also necessarily bounded. A check can only evaluate what is present in the resource attributes it reads. It cannot reason about:
- Relationships between resources: whether an IAM role can access a specific S3 bucket, or whether a Lambda can assume a cross-account role.
- The workload context: whether this database is a production critical-path store or a dev scratch environment that does not need multi-AZ.
- The blast radius of a finding: what else becomes accessible if this specific configuration is exploited.
- The significance of what is absent: whether a Lambda function missing a Dead Letter Queue, an RDS instance missing cross-region automated backups, or a VPC with no Flow Logs actually matters for this workload. A check can flag a missing attribute, but it cannot say which omissions are acceptable.
None of these limitations are bugs in Checkov. They are architectural properties of static rule-based analysis. The tool is doing exactly what it is designed to do.
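To make the boundary concrete, here is a minimal sketch of what an attribute-level check amounts to. The resource dictionaries are a hypothetical, simplified shape chosen for illustration, not Checkov's internal data model, and `check_imdsv2_required` is an illustrative name written in the spirit of CKV_AWS_79 rather than its actual implementation.

```python
from typing import Any

# Hypothetical, simplified shape for a parsed Terraform resource.
# This is NOT Checkov's internal representation.
Resource = dict[str, Any]


def check_imdsv2_required(resource: Resource) -> str:
    """Pass/fail on one attribute of one resource type."""
    if resource.get("type") != "aws_instance":
        return "SKIPPED"  # the check only applies to a single resource type
    metadata = resource.get("attributes", {}).get("metadata_options", {})
    return "PASSED" if metadata.get("http_tokens") == "required" else "FAILED"


hardened = {
    "type": "aws_instance",
    "name": "web",
    "attributes": {"metadata_options": {"http_tokens": "required"}},
}
legacy = {"type": "aws_instance", "name": "batch", "attributes": {}}
bucket = {"type": "aws_s3_bucket", "name": "reports", "attributes": {}}

results = {r["name"]: check_imdsv2_required(r) for r in (hardened, legacy, bucket)}
print(results)
```

The function receives one resource and nothing else. It has no way to ask which role is attached to the instance, what that role can reach, or whether the instance serves production traffic, which is the structural limit described above.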
What Checkov doesn’t know about your workload
It doesn’t know if this is prod or dev
A Checkov check has the same severity regardless of whether it runs against your production customer database or a developer’s temporary sandbox. A Well-Architected review, by contrast, asks about workload context first: what is this service’s SLA? Who are the users? What is the recovery time objective?
A dev RDS instance without multi-AZ is a cost-appropriate choice. The same configuration on a production database handling payment data is a Reliability pillar finding (REL 10: “How do you use fault isolation to protect your workload?”). The Terraform is identical — only the context changes the severity.
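A rough sketch of that idea: the same attribute value yields a different severity once a workload brief is supplied. The `WorkloadContext` fields and the severity labels are assumptions made for illustration, not a documented ArchGuard or Well-Architected schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadContext:
    """Hypothetical workload brief; field names are illustrative only."""
    environment: str            # "prod" or "dev"
    handles_payment_data: bool


def single_az_severity(multi_az: bool, ctx: WorkloadContext) -> str:
    """Rate a single-AZ database according to what depends on it."""
    if multi_az:
        return "none"
    if ctx.environment != "prod":
        return "info"  # cost-appropriate for a scratch environment
    return "critical" if ctx.handles_payment_data else "high"


dev = WorkloadContext(environment="dev", handles_payment_data=False)
prod = WorkloadContext(environment="prod", handles_payment_data=True)

# Identical Terraform attribute (multi_az = false), different verdicts.
print(single_az_severity(False, dev))
print(single_az_severity(False, prod))
```

The second argument is the part a linter never receives. Without it, the only options are to flag every single-AZ instance or none of them.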
It doesn’t know the blast radius of a finding
Not all misconfigurations are equal. An overly permissive IAM role on a Lambda function that only processes non-sensitive queue messages carries very different blast radius than the same policy on an EC2 instance with access to a KMS key that encrypts customer PII.
Checkov reports both as identical violations of CKV_AWS_40 or CKV_AWS_289. An architectural review maps the lateral movement path: if this role is compromised, which other resources are reachable, and in what sequence?
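One way to picture the lateral movement question is as reachability over a graph of "can access" relationships. The sketch below assumes such a graph has already been derived from IAM policies and key grants; the resource names and edges are invented for illustration.

```python
from collections import deque

# Hypothetical "can reach" edges; in practice these would be derived from
# IAM policies, key policies, and resource policies.
ACCESS_GRAPH = {
    "role/queue-worker": ["sqs/jobs"],
    "role/app-server": ["kms/pii-key", "s3/uploads"],
    "kms/pii-key": ["rds/customers", "s3/pii-exports"],
    "s3/uploads": [],
    "sqs/jobs": [],
    "rds/customers": [],
    "s3/pii-exports": [],
}


def blast_radius(graph: dict[str, list[str]], start: str) -> list[str]:
    """Breadth-first walk: everything reachable from a compromised principal, in hop order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


print(blast_radius(ACCESS_GRAPH, "role/queue-worker"))
print(blast_radius(ACCESS_GRAPH, "role/app-server"))
```

Both roles could carry the same overly permissive policy and fail the same check, yet the first reaches one queue while the second reaches the PII key and everything encrypted with it.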
It doesn’t know which WAF pillar is most critical for your use case
A healthcare SaaS has different risk priorities than a batch data pipeline or an e-commerce checkout service. The AWS Well-Architected Framework has six pillars, and their relative weight depends on the workload. A linter applies the same ruleset to every Terraform file regardless.
Three findings an architect flags that Checkov cannot settle
Example 1 — S3 bucket with private ACL and an account-wide bucket policy
The Checkov S3 checks for ACLs, encryption, and public access all pass on this bucket: the ACL is private, encryption is enabled, and public access is blocked. What Checkov does not check is whether the bucket policy grants access to a principal that is broader than the workload requires.
```hcl
# storage.tf: all Checkov rules pass, but the bucket policy is overly permissive
resource "aws_s3_bucket" "reports" {
  bucket = "acme-reports-prod"
}

# CKV_AWS_20 → PASS (ACL is private)
resource "aws_s3_bucket_acl" "reports" {
  bucket = aws_s3_bucket.reports.id
  acl    = "private"
}

# CKV_AWS_19 → PASS (encryption enabled)
resource "aws_s3_bucket_server_side_encryption_configuration" "reports" {
  bucket = aws_s3_bucket.reports.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# CKV_AWS_53 → PASS (public access blocked)
resource "aws_s3_bucket_public_access_block" "reports" {
  bucket                  = aws_s3_bucket.reports.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Checkov: no rule evaluates whether this principal scope is appropriate.
# Any IAM principal in account 123456789012, including a compromised CI role,
# can perform s3:* on this bucket.
resource "aws_s3_bucket_policy" "reports" {
  bucket = aws_s3_bucket.reports.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { AWS = "arn:aws:iam::123456789012:root" }
      Action    = "s3:*"
      Resource = [
        aws_s3_bucket.reports.arn,
        "${aws_s3_bucket.reports.arn}/*",
      ]
    }]
  })
}
```
All S3 Checkov checks pass, but the bucket policy allows any principal in the AWS account
The architectural finding: arn:aws:iam::123456789012:root names the account itself as the principal, which hands the access decision to that account's IAM policies. Any IAM principal in the account whose identity policy allows S3 actions can perform s3:* on the bucket. A compromised CI role, a Lambda function, or a misconfigured assume-role chain can read and delete customer report data. No Checkov rule evaluates “is this principal scope appropriate for this bucket’s sensitivity?” because that requires a judgment about the workload.
```hcl
# storage.tf: scoped to the specific roles that need access
resource "aws_s3_bucket_policy" "reports" {
  bucket = aws_s3_bucket.reports.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect    = "Allow"
        Principal = { AWS = aws_iam_role.report_generator.arn }
        Action    = ["s3:GetObject", "s3:PutObject"]
        Resource  = "${aws_s3_bucket.reports.arn}/*"
      },
      {
        Effect    = "Allow"
        Principal = { AWS = aws_iam_role.report_reader.arn }
        Action    = "s3:GetObject"
        Resource  = "${aws_s3_bucket.reports.arn}/*"
      }
    ]
  })
}
```
Scoped to the two specific roles that need access, nothing broader
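Detecting an account-wide principal is mechanically simple; deciding whether it is acceptable is not. The sketch below shows only the mechanical half, as a small Python helper that lists principals in Allow statements naming a whole account or everyone. The function name and the role ARNs in the scoped example are hypothetical.

```python
import re

# Matches an account-root principal such as arn:aws:iam::123456789012:root
ACCOUNT_ROOT = re.compile(r"^arn:aws:iam::\d{12}:root$")


def broad_principals(policy: dict) -> list[str]:
    """Return principals in Allow statements that name a whole account or everyone."""
    found = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if principal == "*":
            found.append("*")
            continue
        if not isinstance(principal, dict):
            continue
        aws = principal.get("AWS", [])
        for arn in ([aws] if isinstance(aws, str) else aws):
            if arn == "*" or ACCOUNT_ROOT.match(arn):
                found.append(arn)
    return found


account_wide = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
        "Action": "s3:*",
        "Resource": "*",
    }],
}
scoped = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        # Hypothetical role ARN for illustration.
        "Principal": {"AWS": "arn:aws:iam::123456789012:role/report-generator"},
        "Action": ["s3:GetObject", "s3:PutObject"],
        "Resource": "*",
    }],
}

print(broad_principals(account_wide))
print(broad_principals(scoped))
```

A helper like this can surface the pattern, but it would flag a shared tooling bucket and a customer-data bucket identically. Whether account-wide access is too broad still depends on what the bucket holds.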
Example 2 — RDS without multi-AZ on a production database
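This RDS instance has encryption at rest, deletion protection, and a 7-day backup retention period, each covered by a Checkov check. The one configuration that determines whether this database survives an AZ failure is multi_az. Checkov does ship a check for it (CKV_AWS_157), but the check fires identically on every single-AZ instance, dev or prod, so teams commonly skip or suppress it. What it cannot tell you is that on this particular database the finding matters.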
```hcl
# database.tf: single-AZ RDS; passes the Checkov encryption and backup rules
resource "aws_db_instance" "main" {
  identifier              = "prod-db"
  engine                  = "postgres"
  engine_version          = "15.3"
  instance_class          = "db.t3.medium"
  allocated_storage       = 100
  username                = "admin"
  deletion_protection     = true  # CKV_AWS_293 → PASS
  storage_encrypted       = true  # CKV_AWS_16 → PASS
  backup_retention_period = 7     # CKV_AWS_133 → PASS
  multi_az                = false # CKV_AWS_157 flags this the same way in dev and prod; WAF REL 10 makes it a finding here
}
```
The architectural finding maps to AWS Well-Architected Reliability Pillar REL 10: “How do you use fault isolation to protect your workload?” For a production database, single-AZ deployment means any AWS infrastructure failure in that AZ takes the database offline. Typical RDS failover to a multi-AZ standby completes in 60–120 seconds; cold recovery from a snapshot backup takes 15–40 minutes. For a service with an SLA, this is a Reliability finding regardless of what the linter says.
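The recovery figures above become clearer when set against a downtime budget. This is a worked calculation only: the two availability targets are hypothetical, and the recovery times are the upper bounds quoted in the text.

```python
def monthly_error_budget_minutes(availability_pct: float, days: int = 30) -> float:
    """Downtime allowed in one month for a given availability target."""
    return days * 24 * 60 * (1 - availability_pct / 100)


# Upper-bound recovery figures quoted in the text, in minutes.
MULTI_AZ_FAILOVER = 120 / 60  # 60-120 seconds
SNAPSHOT_RESTORE = 40         # 15-40 minutes

for target in (99.9, 99.95):  # hypothetical SLA targets
    budget = monthly_error_budget_minutes(target)
    print(
        f"{target}%: budget {budget:.1f} min, "
        f"failover uses {MULTI_AZ_FAILOVER / budget:.0%}, "
        f"restore uses {SNAPSHOT_RESTORE / budget:.0%}"
    )
```

A 99.9% monthly target allows about 43 minutes of downtime, so one worst-case snapshot restore consumes nearly all of it. At 99.95% the budget is about 21.6 minutes and a single restore exceeds it, while a multi-AZ failover stays well inside either budget.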
```hcl
# database.tf: multi-AZ enabled for production workload
resource "aws_db_instance" "main" {
  identifier              = "prod-db"
  engine                  = "postgres"
  engine_version          = "15.3"
  instance_class          = "db.t3.medium"
  allocated_storage       = 100
  username                = "admin"
  deletion_protection     = true
  storage_encrypted       = true
  backup_retention_period = 7
  multi_az                = true # automatic standby in a second AZ; WAF REL 10 compliant
}
```
Example 3 — Lambda processing async events without a Dead Letter Queue
Checkov's DLQ check (CKV_AWS_116) looks for a dead_letter_config block on every Lambda function, but it cannot see how the function is invoked. For synchronous Lambda invocations, where the caller receives the response and handles failures, a missing DLQ may be acceptable, and the check is often suppressed as noise. For asynchronous invocations (S3 event triggers, SNS subscriptions, EventBridge rules), it is an Operational Excellence finding. The attribute-level result is identical in both cases.
```hcl
# lambda.tf: async Lambda with no failure handling
resource "aws_lambda_function" "order_processor" {
  function_name = "order-processor"
  runtime       = "python3.12"
  handler       = "handler.process"
  role          = aws_iam_role.lambda_role.arn
  filename      = data.archive_file.lambda_zip.output_path

  # No dead_letter_config block (CKV_AWS_116 flags this for sync and async alike)
  # No on-failure destination for async invocations
  # No reserved_concurrent_executions limit (CKV_AWS_115)
  # Checkov cannot tell which of these omissions matter for this function
}
```
AWS retries asynchronous Lambda invocations twice on failure by default. After the third attempt, the event is silently discarded unless a Dead Letter Queue or on-failure destination is configured. For a payment processing function or an order confirmation handler, that means lost events with no visibility. The Well-Architected Operational Excellence Pillar requires that workloads have mechanisms to surface and recover from operational events, and a Lambda that silently drops async failures has neither.
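The retry-then-discard behaviour can be illustrated with a toy model. This is a plain Python simulation of the semantics described above (one initial attempt plus two retries), not a call into any AWS API; `deliver_async` and the `order_id` payloads are invented for the example.

```python
def deliver_async(event, handler, dlq=None, max_attempts=3):
    """Toy model of async invocation: one initial try plus two retries, then DLQ or drop."""
    for _ in range(max_attempts):
        try:
            return handler(event)
        except Exception:
            continue  # retry until attempts are exhausted
    if dlq is not None:
        dlq.append(event)  # the failure is captured and visible
    return None            # without a DLQ the event is simply gone


attempts = []


def always_fails(event):
    attempts.append(event)
    raise RuntimeError("downstream unavailable")


dlq = []
deliver_async({"order_id": 1}, always_fails)           # no DLQ: event is dropped
deliver_async({"order_id": 2}, always_fails, dlq=dlq)  # DLQ: event is retained
print(len(attempts), dlq)
```

Both events are attempted three times. Only the second leaves any trace afterwards, which is the difference between an incident someone can investigate and an order that never existed as far as the system is concerned.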
```hcl
# lambda.tf: async Lambda with DLQ, failure destination, and concurrency cap
resource "aws_lambda_function" "order_processor" {
  function_name = "order-processor"
  runtime       = "python3.12"
  handler       = "handler.process"
  role          = aws_iam_role.lambda_role.arn
  filename      = data.archive_file.lambda_zip.output_path

  dead_letter_config {
    target_arn = aws_sqs_queue.order_dlq.arn
  }

  reserved_concurrent_executions = 100 # prevents runaway concurrency
}

# The on-failure destination is set on the function's async invoke config,
# not inside aws_lambda_function itself.
resource "aws_lambda_function_event_invoke_config" "order_processor" {
  function_name = aws_lambda_function.order_processor.function_name

  destination_config {
    on_failure {
      destination = aws_sqs_queue.order_dlq.arn
    }
  }
}

resource "aws_sqs_queue" "order_dlq" {
  name                       = "order-processor-dlq"
  message_retention_seconds  = 1209600 # 14 days
  visibility_timeout_seconds = 300
}
```
How to use Checkov and architectural review together
The answer is not to replace Checkov — it is to understand what each layer covers.
Checkov belongs in CI as a fast, mandatory gate on known-bad configurations. It catches attribute-level violations early, before code review, and with zero false-negative risk for the conditions it checks. Nothing else does this as well.
Architectural review belongs at the workload level: before production deployments, before major architectural changes, and periodically as infrastructure evolves. It operates on the Terraform as a system — reading relationships between resources, mapping blast radius, and evaluating findings against the workload context.
The ArchGuard methodology explicitly runs deterministic checks first, then layers architectural reasoning on top. Checkov findings are an input — not something ArchGuard competes with.
What each layer covers
| Capability | Checkov / tfsec / Trivy | Architectural review |
|---|---|---|
| Attribute-level misconfigurations | ✓ Yes | ✓ Yes |
| Cross-resource relationships | ✗ No | ✓ Yes |
| Workload context (prod vs dev) | ✗ No | ✓ Yes |
| Blast radius per finding | ✗ No | ✓ Yes |
| Absent resource detection | Limited | ✓ Yes |
| WAF pillar alignment | Partial | ✓ Yes |
| Speed in CI pipeline | ✓ Fast | On-demand |
| Cost | Free (OSS) | Paid review |
For a deeper treatment, see the post on the five most common Terraform patterns that pass Checkov but fail a Well-Architected Security review. It walks through each one with HCL before/after examples and the specific WAF control it violates.
Frequently asked questions
Is this post saying Checkov is insufficient?
No. Checkov is an essential CI tool for catching attribute-level misconfigurations deterministically and at zero cost. This post explains the structural limits of static rule-based analysis — limits that apply to any linter, not a flaw in Checkov specifically.
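Which Checkov rules cover the three examples in this post?
For the S3 bucket policy example: no Checkov rule evaluates whether an account-wide principal is too broad for the bucket's sensitivity. For the RDS multi-AZ example: CKV_AWS_157 checks the multi_az attribute, but it returns the same result in every environment. For the Lambda DLQ example: CKV_AWS_116 checks for dead_letter_config and CKV_AWS_115 for a concurrency limit, but neither knows whether the function is invoked asynchronously.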
Can custom Checkov policies cover some of these gaps?
Custom policies (Python or YAML) can extend Checkov with new attribute checks. They cannot reason about workload context, cross-resource blast radius, or the significance of absent resources relative to the workload type — those require contextual judgment that rule-based systems cannot encode generically.
How does ArchGuard know the workload context?
ArchGuard asks. Before analysis begins, a workload brief captures the service type, environment (production vs. development), SLA requirements, and data sensitivity. This context informs every finding's severity — the same Terraform configuration can be Low severity in a dev environment and Critical in production.
tfsec is now part of Trivy. Should I migrate?
tfsec was merged into Trivy in 2023 and is no longer maintained as a standalone tool. Trivy now covers tfsec's IaC scanning capability alongside container and SBOM scanning. Note that in March 2026, a supply chain attack affected Trivy's GitHub Actions tags — pin to a specific verified digest rather than a floating tag when using Trivy in CI.
See What ArchGuard Finds Beyond Your Linter
Upload your Terraform and get a Well-Architected review that covers what static analysis cannot — blast radius, workload context, and cross-service patterns.