Granting AWS Glue Crawler Access to a Cross-Account S3 Bucket

March 9, 2026 / May 3, 2024 by Linuxbeast

When your data lives in one AWS account and your Glue Crawler runs in another, you need to set up cross-account permissions on both sides. This guide shows you how to grant a Glue Crawler cross-account S3 bucket access by configuring the S3 bucket policy in the source account and the IAM role in the Glue account.

Architecture Overview

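|                    | Account A (Glue)                          | Account B (S3 Data)                 |
| ------------------ | ----------------------------------------- | ----------------------------------- |
| Account ID         | 111111111111                              | 222222222222                        |
| What it has        | Glue Crawler, Glue Data Catalog, IAM role | S3 bucket with the source data      |
| What you configure | IAM role with S3 + Glue permissions       | S3 bucket policy allowing Account A |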

Both sides must allow the access: the bucket policy in Account B permits requests from Account A, and the IAM role policy in Account A permits the Glue Crawler to make them.

Step 1: Add a Bucket Policy in Account B

In Account B (the account that owns the S3 bucket), add a bucket policy that allows Account A to read the data. Go to the S3 bucket > Permissions > Bucket policy and add:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowGlueAccountAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111111111111:root"
      },
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-data-bucket",
        "arn:aws:s3:::your-data-bucket/*"
      ]
    }
  ]
}
  • Replace 111111111111 with Account A’s actual account ID
  • Replace your-data-bucket with your S3 bucket name
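  • s3:ListBucket applies to the bucket ARN itself (arn:aws:s3:::your-data-bucket); the Crawler needs it to list objects
  • s3:GetObject and s3:GetObjectVersion apply to the object ARNs (arn:aws:s3:::your-data-bucket/*); the Crawler needs them to read the actual files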

A Glue Crawler only reads data. It doesn’t need s3:PutObject unless you also run Glue jobs that write back to this bucket.

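For a tighter policy, scope the Principal to the specific Glue role ARN instead of the account root. The role must already exist when you save the bucket policy, because S3 rejects a policy that names a principal it cannot resolve, so create the role from Step 2 first: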

"Principal": {
  "AWS": "arn:aws:iam::111111111111:role/AWSGlueServiceRole-crawler"
}
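
If you manage several buckets, the bucket policy above can be generated rather than hand-edited. The following is a minimal sketch, not the author's tooling: the function name build_bucket_policy and its role_name parameter are hypothetical, and the account ID, bucket name, and role name are the article's placeholders.

```python
import json
from typing import Optional

# The read-only actions used in the article's bucket policy.
READ_ACTIONS = ["s3:GetObject", "s3:GetObjectVersion", "s3:ListBucket"]


def build_bucket_policy(glue_account_id: str, bucket: str,
                        role_name: Optional[str] = None) -> dict:
    """Return a read-only bucket policy that trusts Account A (hypothetical helper)."""
    if not (glue_account_id.isdigit() and len(glue_account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    # Account root by default; a single role when role_name is given.
    suffix = f"role/{role_name}" if role_name else "root"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowGlueAccountAccess",
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{glue_account_id}:{suffix}"},
            "Action": list(READ_ACTIONS),
            "Resource": [
                f"arn:aws:s3:::{bucket}",    # bucket ARN, matched by s3:ListBucket
                f"arn:aws:s3:::{bucket}/*",  # object ARNs, matched by s3:GetObject*
            ],
        }],
    }


if __name__ == "__main__":
    policy = build_bucket_policy("111111111111", "your-data-bucket",
                                 role_name="AWSGlueServiceRole-crawler")
    print(json.dumps(policy, indent=2))
```

Paste the printed JSON into the Bucket policy editor in Account B. Omitting role_name produces the account-root variant shown first.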

Step 2: Create the Glue IAM Role in Account A

In Account A (where the Glue Crawler runs), create an IAM role that the Glue service can assume.

Trust policy

This trust policy allows the Glue service to assume the role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "glue.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
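
If you prefer to script the role instead of using the console, the trust policy above can be passed to IAM's CreateRole API. This is a sketch under stated assumptions: it uses boto3 (imported only when the call is made), the helper names create_role_request and create_role are hypothetical, and the role name is the article's example.

```python
import json

# Trust policy from the article: lets the Glue service assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "glue.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}


def create_role_request(role_name: str) -> dict:
    """Build keyword arguments for IAM's CreateRole API (hypothetical helper)."""
    return {
        "RoleName": role_name,
        # IAM expects the trust policy as a JSON string, not a dict.
        "AssumeRolePolicyDocument": json.dumps(TRUST_POLICY),
        "Description": "Role assumed by AWS Glue crawlers",
    }


def create_role(request: dict) -> str:
    """Create the role and return its ARN. Needs boto3 and Account A credentials."""
    import boto3  # imported lazily so create_role_request() works without boto3

    iam = boto3.client("iam")
    response = iam.create_role(**request)
    return response["Role"]["Arn"]


if __name__ == "__main__":
    # Only builds and prints the request; call create_role(request) to apply it.
    request = create_role_request("AWSGlueServiceRole-crawler")
    print(json.dumps(request, indent=2))
```

Separating the request builder from the API call lets you review the payload before anything is created in the account.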

Permissions policy

Attach an inline or managed policy that grants the role access to the cross-account S3 bucket and the Glue Data Catalog:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CrossAccountS3Access",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-data-bucket",
        "arn:aws:s3:::your-data-bucket/*"
      ]
    },
    {
      "Sid": "GlueCatalogAccess",
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:CreateDatabase",
        "glue:GetTable",
        "glue:CreateTable",
        "glue:UpdateTable",
        "glue:BatchCreatePartition",
        "glue:GetPartition",
        "glue:BatchGetPartition"
      ],
      "Resource": "*"
    },
    {
      "Sid": "CloudWatchLogs",
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:111111111111:log-group:/aws-glue/*"
    }
  ]
}

This policy has three parts:

  • S3 access: matches the actions allowed in the bucket policy. A cross-account request succeeds only when both the bucket policy and the role policy allow the action.
  • Glue Data Catalog: lets the Crawler create and update tables and partitions in the catalog
  • CloudWatch Logs: lets the Crawler write logs for debugging (optional but recommended)
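
Because both policies have to allow the same S3 actions, a quick comparison can catch drift before you run the crawler. The sketch below is an illustration only: the helper names s3_actions and compare_s3_actions are hypothetical, and it compares action names as exact strings, so wildcards such as s3:Get* are not expanded and Resource and Principal are not checked.

```python
def s3_actions(policy: dict) -> set:
    """Collect every s3: action named in the policy's Allow statements."""
    actions = set()
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        listed = statement.get("Action", [])
        if isinstance(listed, str):  # IAM allows a single string or a list
            listed = [listed]
        actions.update(a for a in listed if a.startswith("s3:"))
    return actions


def compare_s3_actions(role_policy: dict, bucket_policy: dict) -> dict:
    """Report S3 actions that appear on one side only (exact-name comparison)."""
    role = s3_actions(role_policy)
    bucket = s3_actions(bucket_policy)
    return {
        "only_in_role": sorted(role - bucket),
        "only_in_bucket": sorted(bucket - role),
    }


if __name__ == "__main__":
    role_policy = {"Statement": [{"Effect": "Allow",
                                  "Action": ["s3:GetObject", "s3:ListBucket"]}]}
    bucket_policy = {"Statement": [{"Effect": "Allow",
                                    "Action": ["s3:GetObject", "s3:GetObjectVersion",
                                               "s3:ListBucket"]}]}
    print(compare_s3_actions(role_policy, bucket_policy))
```

An action listed under only_in_role will be denied by Account B; an action under only_in_bucket is granted by the bucket but unusable by the role.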

Step 3: Create the Glue Crawler

In the AWS Glue console in Account A:

  1. Go to Crawlers > Create crawler
  2. Set the data source to S3 and enter the path: s3://your-data-bucket/path/to/data/
  3. For the IAM role, select the role you created in Step 2
  4. Choose the target database in the Glue Data Catalog (or create a new one)
  5. Run the crawler

If the crawler fails with an “Access Denied” error, double-check that both the bucket policy and the IAM role policy use the same S3 actions and resource ARNs.
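
The console steps above can also be scripted. The sketch below assumes boto3 and credentials for Account A. The crawler name cross-account-crawler and database name cross_account_db are hypothetical placeholders, as are the helper names; the role ARN and S3 path are the article's examples.

```python
def crawler_request(name: str, role_arn: str, database: str, s3_path: str) -> dict:
    """Build keyword arguments for Glue's CreateCrawler API (hypothetical helper)."""
    if not s3_path.startswith("s3://"):
        raise ValueError("s3_path must start with s3://")
    return {
        "Name": name,
        "Role": role_arn,                               # role from Step 2
        "DatabaseName": database,                       # target Data Catalog database
        "Targets": {"S3Targets": [{"Path": s3_path}]},  # bucket in Account B
    }


def create_and_start(request: dict) -> None:
    """Create the crawler, then start it. Needs boto3 and Account A credentials."""
    import boto3  # imported lazily so crawler_request() works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


if __name__ == "__main__":
    # Only builds and prints the request; call create_and_start(request) to apply it.
    request = crawler_request(
        name="cross-account-crawler",  # hypothetical crawler name
        role_arn="arn:aws:iam::111111111111:role/AWSGlueServiceRole-crawler",
        database="cross_account_db",   # hypothetical database name
        s3_path="s3://your-data-bucket/path/to/data/",
    )
    print(request)
```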

Troubleshooting

| Error | Likely Cause |
| ----- | ------------ |
| Access Denied on S3 | Bucket policy in Account B doesn’t allow the Glue role or account root from Account A |
| Access Denied on S3 (role is correct) | The IAM role in Account A is missing the S3 permissions; both sides must allow |
| Crawler runs but finds no tables | The S3 path is wrong, or the data format isn’t recognized by Glue classifiers |
| Crawler can’t write to Data Catalog | The IAM role is missing glue:CreateTable or glue:UpdateTable permissions |
| KMS decrypt error | The S3 bucket uses SSE-KMS encryption; add kms:Decrypt to the role and the KMS key policy |
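
For the KMS case, the two statements mentioned in the table might look like the output of the sketch below. It assumes the key lives in Account B. The key ARN, the Sid values, and the helper name kms_decrypt_statements are hypothetical, and your key policy may need additional statements that are not shown here.

```python
import json


def kms_decrypt_statements(key_arn: str, role_arn: str) -> tuple:
    """Return (role policy statement, key policy statement) granting kms:Decrypt."""
    # Added to the Glue role's permissions policy in Account A.
    role_statement = {
        "Sid": "DecryptSourceData",
        "Effect": "Allow",
        "Action": "kms:Decrypt",
        "Resource": key_arn,
    }
    # Added to the KMS key policy in the key's account.
    key_policy_statement = {
        "Sid": "AllowGlueRoleDecrypt",
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": "kms:Decrypt",
        "Resource": "*",  # in a key policy, "*" refers to the key the policy is attached to
    }
    return role_statement, key_policy_statement


if __name__ == "__main__":
    role_stmt, key_stmt = kms_decrypt_statements(
        key_arn="arn:aws:kms:us-east-1:222222222222:key/EXAMPLE-KEY-ID",  # hypothetical
        role_arn="arn:aws:iam::111111111111:role/AWSGlueServiceRole-crawler",
    )
    print(json.dumps(role_stmt, indent=2))
    print(json.dumps(key_stmt, indent=2))
```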

Conclusion

Cross-account Glue Crawler access requires matching permissions on both sides: a bucket policy in the data account and an IAM role with S3 + Glue permissions in the crawler account. Keep the policies scoped to the minimum actions the crawler needs.

For other cross-account patterns, see How to Copy S3 Bucket Objects Across AWS Accounts or How to Set Up Cross-Account Access in AWS with AssumeRole.
