AWS Glue Crawler Can’t Connect to S3 Bucket on Another Account: Problem Solved

November 10, 2023 | By Gerald | Filed in: AWS Services.

If you’ve ever encountered a situation where your AWS Glue crawler can’t connect to an S3 bucket created on another account, then this post is for you. It can be frustrating when data that should be easily accessible seems just out of reach. But don’t worry! I’ll guide you through the solution.

The Problem

You have two accounts:

  • Account A which hosts the AWS Glue service,
  • Account B containing the S3 bucket that you want to access from Account A.

The AWS Glue crawler in Account A needs to access the S3 bucket in Account B, but it’s unable to connect.

The Solution

To fix this issue, we need to set up cross-account access. This involves creating IAM policies and roles that allow services in one account to access resources in another account. Here’s how you can do this:

On Account B

In the S3 bucket's Permissions tab, apply the following bucket policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<ACCOUNT-A>:root"
            },
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::your-s3-bucket-datasets/*",
                "arn:aws:s3:::your-s3-bucket-datasets"
            ]
        }
    ]
}

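This policy grants Account A, through its root principal, permission to use s3:* actions on the bucket and the objects in it. Account A then passes that access on to the Glue role, AWSGlueServiceRole-project-name, through the role's own permissions policy. Note that s3:* is far broader than a crawler needs, so consider narrowing it once the connection works. Make sure to create this AWS Glue role in Account A.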
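If granting s3:* to all of Account A is broader than you want, a narrower variant can name only the crawler's role and only read actions. The Python sketch below builds such a bucket policy using nothing but the standard library. The function name and the sample account ID are hypothetical, and the action list assumes the crawler only needs to list the bucket and read its objects. Because the policy names the role directly, the role has to exist in Account A before S3 will accept it.

```python
import json


def build_bucket_policy(bucket: str, account_a_id: str, role_name: str) -> dict:
    """Return a read-only bucket policy for a single role in Account A."""
    role_arn = f"arn:aws:iam::{account_a_id}:role/{role_name}"
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself
                # (no trailing slash).
                "Sid": "AllowCrawlerToListBucket",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arn,
            },
            {
                # Object-level actions apply to the keys inside the bucket.
                "Sid": "AllowCrawlerToReadObjects",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:GetObject",
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder values: substitute your own bucket, account ID, and role.
    policy = build_bucket_policy(
        "your-s3-bucket-datasets",
        "111111111111",
        "AWSGlueServiceRole-project-name",
    )
    print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the bucket policy editor in Account B.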

On Account A

Attach the following trust policy to your AWS Glue role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Principal": {
                "Service": "glue.amazonaws.com",
                "AWS": "arn:aws:iam::<ACCOUNT-B>:root"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

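This policy allows the Glue service to assume the IAM role. The "AWS" principal additionally lets identities in Account B assume the role, which the crawler itself does not require, so you can drop that line if nothing in Account B needs it. Note that the policy as shown contains no Condition block, so no external ID is enforced.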
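For comparison, here is a minimal Python sketch that generates the trust policy. By default it trusts only the Glue service, and it adds account principals only when you pass them in. The function name and the sample account ID are hypothetical.

```python
import json
from typing import Iterable


def build_trust_policy(extra_account_ids: Iterable[str] = ()) -> dict:
    """Return a trust policy that lets AWS Glue assume the role.

    Any account IDs passed in are added as additional trusted principals.
    """
    principal = {"Service": "glue.amazonaws.com"}
    account_arns = [f"arn:aws:iam::{acct}:root" for acct in extra_account_ids]
    if account_arns:
        principal["AWS"] = account_arns
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowGlueToAssumeRole",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Glue-only trust policy; pass ["222222222222"] (a placeholder
    # Account B ID) to also trust that account.
    print(json.dumps(build_trust_policy(), indent=4))
```

The output goes into the role's trust relationship in Account A.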

By setting up these roles and policies, you ensure that the Glue crawler in Account A can access the S3 bucket in Account B securely. Now, your data should be within reach again!
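One piece that is easy to miss: for cross-account S3 access, the request has to be allowed on both sides, so the Glue role in Account A also needs a permissions policy of its own that covers the bucket in Account B. The Python sketch below builds one such policy. The function name is hypothetical, and the read-only action list is an assumption about what your crawler needs.

```python
import json


def build_role_s3_policy(bucket: str) -> dict:
    """Return an identity policy granting read access to one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is checked against the bucket ARN.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arn,
            },
            {
                # Reading is checked against the object ARNs.
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder bucket name taken from the bucket policy in this post.
    print(json.dumps(build_role_s3_policy("your-s3-bucket-datasets"), indent=4))
```

Attach the result to the Glue role as an inline or customer-managed policy, alongside whatever Glue permissions the role already has.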

Conclusion

Cross-account access issues with AWS Glue crawlers connecting to S3 buckets can certainly cause headaches, but they aren’t insurmountable. By correctly configuring IAM roles and policies, we can easily bridge the gap between our services and extract value from our data.

Feel free to return to this post anytime you need help solving similar AWS Glue and S3 cross-account connectivity issues.
