@rexwhitten
Created January 28, 2026 18:06

To send Step Functions logs to a log group with a custom name (i.e., one not starting with /aws/vendedlogs/), you need two distinct sets of permissions in your Terraform.

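  1. The IAM Execution Role: This is what the Step Function assumes to execute your logic (e.g., calling Lambdas). Per the AWS Step Functions documentation, it must also carry the CloudWatch Logs delivery permissions (logs:CreateLogDelivery, logs:GetLogDelivery, logs:UpdateLogDelivery, logs:DeleteLogDelivery, logs:ListLogDeliveries, logs:PutResourcePolicy, logs:DescribeResourcePolicies, logs:DescribeLogGroups) so that log delivery can be configured. It does not write the log events itself.
  2. The CloudWatch Resource Policy (The "Other Change"): This is a separate resource that tells CloudWatch, "Allow the log delivery service (delivery.logs.amazonaws.com) to write to this specific Log Group." Step Functions logs are vended logs, so this is the principal that writes them, not states.amazonaws.com.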

Here is the complete Terraform configuration.

The Terraform Code

variable "app_name" {
  default = "my-payment-processor"
}

# ------------------------------------------------------------------
# 1. The CloudWatch Log Group (Custom Name)
# ------------------------------------------------------------------
resource "aws_cloudwatch_log_group" "sfn_logs" {
  name              = "/custom/apps/${var.app_name}"
  retention_in_days = 30
}

# ------------------------------------------------------------------
# 2. THE CRITICAL "OTHER CHANGE": Resource Policy
# ------------------------------------------------------------------
# This explicitly allows the CloudWatch Logs delivery service to write to your
# custom log group. Step Functions logs are vended logs, so the writer is
# delivery.logs.amazonaws.com rather than states.amazonaws.com.
# If no resource policy grants this access, log events cannot be delivered.

resource "aws_cloudwatch_log_resource_policy" "sfn_logging_policy" {
  policy_name = "${var.app_name}-logging-policy"

  policy_document = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "AllowStepFunctionsLogging"
        Effect = "Allow"
        Principal = {
          Service = "delivery.logs.amazonaws.com"
        }
        # Only the write actions belong here; the log-delivery management
        # actions (logs:CreateLogDelivery, etc.) belong on the execution role.
        Action = [
          "logs:CreateLogStream",
          "logs:PutLogEvents"
        ]
        # Scoping this is safer than "*" but requires care with ARNs
        Resource = "${aws_cloudwatch_log_group.sfn_logs.arn}:*"
      }
    ]
  })
}

# ------------------------------------------------------------------
# 3. The IAM Execution Role (For the Logic)
# ------------------------------------------------------------------
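# This role allows the Step Function to DO things (like invoke Lambda).
# It also needs the CloudWatch Logs delivery permissions granted below so that
# log delivery can be configured; it does not write the log events itself.

resource "aws_iam_role" "sfn_exec_role" {
  name = "${var.app_name}-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "states.amazonaws.com"
        }
      }
    ]
  })
}

# Example permissions: invoke a Lambda function and manage log delivery
resource "aws_iam_policy" "sfn_lambda_policy" {
  name = "${var.app_name}-lambda-invoke"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "lambda:InvokeFunction"
        ]
        Resource = "*" # Restrict this to your specific Lambda ARNs in production
      },
      {
        # Log delivery permissions listed in the AWS Step Functions documentation,
        # which specifies Resource "*" for these actions.
        Effect = "Allow"
        Action = [
          "logs:CreateLogDelivery",
          "logs:GetLogDelivery",
          "logs:UpdateLogDelivery",
          "logs:DeleteLogDelivery",
          "logs:ListLogDeliveries",
          "logs:PutResourcePolicy",
          "logs:DescribeResourcePolicies",
          "logs:DescribeLogGroups"
        ]
        Resource = "*"
      }
    ]
  })
}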

resource "aws_iam_role_policy_attachment" "sfn_attach" {
  role       = aws_iam_role.sfn_exec_role.name
  policy_arn = aws_iam_policy.sfn_lambda_policy.arn
}

# ------------------------------------------------------------------
# 4. The State Machine
# ------------------------------------------------------------------
resource "aws_sfn_state_machine" "sfn_state_machine" {
  name     = "${var.app_name}-machine"
  role_arn = aws_iam_role.sfn_exec_role.arn

  # Minimal Definition
  definition = jsonencode({
    Comment = "A Hello World example"
    StartAt = "HelloWorld"
    States = {
      HelloWorld = {
        Type = "Pass"
        End  = true
      }
    }
  })

  logging_configuration {
    # IMPORTANT: You must append :* to the ARN here
    log_destination        = "${aws_cloudwatch_log_group.sfn_logs.arn}:*"
    include_execution_data = true
    level                  = "ALL"
  }
  
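  # Ensure the resource policy and the role's permissions exist before the
  # State Machine is created and tries to use the log destination
  depends_on = [
    aws_cloudwatch_log_resource_policy.sfn_logging_policy,
    aws_iam_role_policy_attachment.sfn_attach
  ]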
}

Why two different permission areas?

  1. IAM Role (aws_iam_role): This is the ID card the Step Function carries. It says "I am allowed to touch these Lambdas and these DynamoDB tables," and, for logging, "I am allowed to set up log delivery." It does not write the log events itself; the CloudWatch Logs delivery service (delivery.logs.amazonaws.com) does that on your behalf.
  2. Resource Policy (aws_cloudwatch_log_resource_policy): This is the security guard at the door of the Log Group. It says "I will allow the delivery.logs.amazonaws.com service to come in and write data."

A Warning on Limits

You are limited to 10 Resource Policies per region per account. Each aws_cloudwatch_log_resource_policy resource consumes one of them, so deploying this stack an 11th time in the same account and region will fail.

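Solution: If you have many state machines, create one single shared Resource Policy that allows delivery.logs.amazonaws.com to write to a wildcard (all log groups under a common prefix, or * for all log groups), or list multiple specific ARNs in that one policy statement. Keep in mind that a single policy document is capped at 5,120 characters, so a long ARN list eventually has to give way to a wildcard.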
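
The shared-policy approach could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a drop-in module: the variable sfn_log_group_arns, the resource name shared_sfn_logging, and the policy name shared-sfn-log-delivery are all hypothetical, and the log groups are assumed to live in the same account and region as the policy. It uses delivery.logs.amazonaws.com as the principal, which is the one AWS documents for services that deliver vended logs to CloudWatch Logs, and it lists specific ARNs rather than * to keep the grant narrow.

```hcl
# Hypothetical input: ARNs of every log group that Step Functions logs to,
# passed in by the caller (for example, from aws_cloudwatch_log_group resources).
variable "sfn_log_group_arns" {
  type = list(string)
}

resource "aws_cloudwatch_log_resource_policy" "shared_sfn_logging" {
  # Hypothetical name; a single policy per region serves every state machine
  policy_name = "shared-sfn-log-delivery"

  policy_document = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "AllowLogDeliveryToSfnLogGroups"
        Effect = "Allow"
        Principal = {
          Service = "delivery.logs.amazonaws.com"
        }
        Action = [
          "logs:CreateLogStream",
          "logs:PutLogEvents"
        ]
        # Append ":*" so the grant covers the log streams inside each group
        Resource = [for arn in var.sfn_log_group_arns : "${arn}:*"]
      }
    ]
  })
}
```

With something like this in a shared stack, each application stack would omit its own aws_cloudwatch_log_resource_policy and only add its log group ARN to the list.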
