In this guide, I will show you how to build an AWS Lambda project that fetches HubSpot company records and saves them to an S3 bucket. We use AWS Step Functions to resume the batch process safely, which helps avoid Lambda timeout problems.
✅ What This Lambda Function Does
- Computes a `lastmodifieddate` cutoff from the `OFFSET_MINUTES` environment variable
- Fetches records in batches (for example, 100 at a time)
- Stops before hitting the Lambda timeout limit
- Saves each batch to S3
- Uses pagination (HubSpot's `after` cursor, called `offset` here) to get more data
- Can continue using Step Functions
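The steps above can be sketched as a plain-Python loop. This is a minimal illustration, not the real handler: `fetch_batch` is a stub standing in for the HubSpot API, and `max_batches` is an artificial cap that forces the "incomplete + next_offset" resume path the same way the time budget does.

```python
import time

# Fake paginated source: 5 batches of records, keyed by offset (illustrative).
FAKE_BATCHES = {i: ([f"record-{i}-{j}" for j in range(3)], i + 1 if i < 4 else None)
                for i in range(5)}

def fetch_batch(offset):
    """Stand-in for the HubSpot API call; returns (records, next_offset)."""
    return FAKE_BATCHES.get(offset, ([], None))

def run(offset=0, time_limit=840, max_batches=None):
    """Process batches until done, or until the time budget is spent."""
    start = time.time()
    processed = 0
    while True:
        records, next_offset = fetch_batch(offset)
        if not records:
            break
        processed += 1
        if next_offset is None:
            break
        offset = next_offset
        # Stop early if we are close to the Lambda timeout, returning the
        # offset so a later invocation can resume exactly where we stopped.
        if time.time() - start > time_limit or (max_batches and processed >= max_batches):
            return {"status": "incomplete", "next_offset": offset}
    return {"status": "complete"}
```

A run that is cut short hands back `next_offset`; passing that value in as `offset` on the next call picks the sync up where it stopped.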
🗂️ Project Structure
```
lambda_hubspot_sync/
│
├── handler.py
├── requirements.txt
└── utils.py
```
🔧 handler.py
This is the main Lambda function.
```python
import os
import time
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

from utils import get_company_records, write_to_s3

BATCH_SIZE = 100
TIME_LIMIT = 840  # Stop at 14 mins (Lambda max is 15 mins)
BUCKET_NAME = os.environ['BUCKET_NAME']
HUBSPOT_TZ = ZoneInfo("UTC")  # Default: UTC+00:00

# Set your look-back window in minutes
offset_mins = int(os.environ['OFFSET_MINUTES'])

# Calculate the time X minutes ago in ISO 8601 format (UTC)
minutes_ago = datetime.now(HUBSPOT_TZ) - timedelta(minutes=offset_mins)
LAST_MODIFIED = minutes_ago.isoformat()


def lambda_handler(event, context):
    start_time = time.time()
    offset = event.get("offset", 0)

    while True:
        records, next_offset = get_company_records(LAST_MODIFIED, BATCH_SIZE, offset)
        if not records:
            print("No more records.")
            break

        filename = f"hubspot_companies_batch_{offset}.json"
        write_to_s3(records, BUCKET_NAME, filename)

        if not next_offset:
            break
        offset = next_offset

        if time.time() - start_time > TIME_LIMIT:
            print("Reached safe time limit.")
            return {"status": "incomplete", "next_offset": offset}

    return {"status": "complete"}
```
🔧 utils.py
This handles API calls and S3 uploads.
```python
import json
import os

import boto3
import requests

HUBSPOT_API_KEY = os.environ['HUBSPOT_API_KEY']
S3 = boto3.client("s3")


def get_company_records(last_modified, limit, offset):
    url = "https://api.hubapi.com/crm/v3/objects/companies/search"
    headers = {
        "Authorization": f"Bearer {HUBSPOT_API_KEY}",
        "Content-Type": "application/json",
    }
    payload = {
        "filterGroups": [{
            "filters": [{
                "propertyName": "lastmodifieddate",
                "operator": "GTE",
                "value": last_modified,
            }]
        }],
        "limit": limit,
        "after": offset,
    }
    resp = requests.post(url, headers=headers, json=payload)
    resp.raise_for_status()
    data = resp.json()
    companies = data.get("results", [])
    next_offset = data.get("paging", {}).get("next", {}).get("after")
    return companies, next_offset


def write_to_s3(data, bucket, filename):
    S3.put_object(
        Bucket=bucket,
        Key=filename,
        Body=json.dumps(data, indent=2).encode("utf-8"),
    )
    print(f"Wrote {len(data)} records to s3://{bucket}/{filename}")
```
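The pagination contract `get_company_records` relies on: HubSpot's v3 search response carries the next cursor under `paging.next.after`, and omits `paging` entirely on the last page. A quick sketch with illustrative sample payloads:

```python
def next_cursor(response: dict):
    """Return the `after` cursor from a HubSpot v3 search response, or None on the last page."""
    return response.get("paging", {}).get("next", {}).get("after")

# A middle page includes paging.next.after ...
page = {"results": [{"id": "1"}], "paging": {"next": {"after": "100"}}}
# ... while the final page has no "paging" key at all.
last_page = {"results": [{"id": "2"}]}
```

The chained `.get(..., {})` calls mean a missing `paging` key simply yields `None` instead of raising `KeyError`, which is what the handler's `if not next_offset` check expects.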
📦 requirements.txt
This lists the Python dependencies to package with the function.
```
boto3
requests
```
🧪 Lambda Environment Variables
| Key | Example Value |
|---|---|
| `BUCKET_NAME` | your-s3-bucket-name |
| `HUBSPOT_API_KEY` | your-hubspot-app-token |
| `OFFSET_MINUTES` | 60 |
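Note that `OFFSET_MINUTES` is a look-back window in minutes, not a timestamp. A minimal sketch of how the handler turns it into the ISO 8601 cutoff sent to HubSpot:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

def last_modified_cutoff(offset_minutes: int) -> str:
    """Return the ISO 8601 UTC timestamp `offset_minutes` in the past."""
    cutoff = datetime.now(ZoneInfo("UTC")) - timedelta(minutes=offset_minutes)
    return cutoff.isoformat()
```

With `OFFSET_MINUTES=60`, only companies modified in the last hour match the `lastmodifieddate GTE` filter.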
🔁 Use Step Functions to Continue the Process
Sometimes a single Lambda invocation is not enough to process all records. We use AWS Step Functions to re-invoke the function with the last offset until the sync is complete.
🗺️ Step Function Workflow
```json
{
  "Comment": "State Machine to process HubSpot records in batches",
  "StartAt": "Get HubSpot Data",
  "States": {
    "Get HubSpot Data": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:your-region:your-account-id:function:YourLambdaFunctionName",
      "Parameters": {
        "offset.$": "$.offset"
      },
      "ResultPath": "$.lambdaResult",
      "Next": "Check If More Data"
    },
    "Check If More Data": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.lambdaResult.status",
          "StringEquals": "complete",
          "Next": "Done"
        },
        {
          "Variable": "$.lambdaResult.status",
          "StringEquals": "incomplete",
          "Next": "Set Next Offset"
        }
      ]
    },
    "Set Next Offset": {
      "Type": "Pass",
      "Parameters": {
        "offset.$": "$.lambdaResult.next_offset"
      },
      "Next": "Get HubSpot Data"
    },
    "Done": {
      "Type": "Succeed"
    }
  }
}
```

The `Set Next Offset` Pass state copies `next_offset` from the Lambda result into `$.offset` before looping back, so each re-invocation resumes where the last one stopped. Start the execution with the input `{"offset": 0}`.
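The state machine's loop can be exercised locally with a fake handler: keep invoking while the result is `"incomplete"`, feeding `next_offset` back in as `offset`, exactly as the Choice state does. The handler below is illustrative, not the real one:

```python
def fake_lambda_handler(event, context=None):
    """Pretend handler: completes once the offset reaches 300."""
    offset = event.get("offset", 0)
    if offset >= 300:
        return {"status": "complete"}
    return {"status": "incomplete", "next_offset": offset + 100}

def drive(handler, offset=0):
    """Mimic the Step Functions loop: re-invoke until status == 'complete'."""
    invocations = 0
    while True:
        result = handler({"offset": offset})
        invocations += 1
        if result["status"] == "complete":
            return invocations
        offset = result["next_offset"]
```

Starting from offset 0, this takes four invocations (three "incomplete" batches plus the final "complete" one), mirroring how one long sync spreads across several Lambda runs.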
🧭 How to Set Up
- Go to AWS Step Functions Console
- Create a state machine and paste the JSON
- Give permission to call the Lambda function
- Done!
⏰ Trigger Step Function with CloudWatch
You can trigger the process every 5 minutes using this cron:
```
cron(0/5 * * * ? *)
```
Or trigger it from other events, such as an EventBridge rule that matches specific log events.
🔐 IAM Permissions
Make sure you have these permissions:
Step Functions Role
```json
{
  "Effect": "Allow",
  "Action": "lambda:InvokeFunction",
  "Resource": "arn:aws:lambda:your-region:your-account-id:function:YourLambdaFunctionName"
}
```
Lambda Role
```json
{
  "Effect": "Allow",
  "Action": [
    "s3:PutObject",
    "logs:CreateLogGroup",
    "logs:CreateLogStream",
    "logs:PutLogEvents"
  ],
  "Resource": "*"
}
```
✅ Final Thoughts
This solution helps you:
- Sync data from HubSpot in safe batches
- Avoid timeout issues
- Continue processing using Step Functions
- Automate using CloudWatch rules
Let me know if you want to add error handling, custom filtering, or push this to a CI/CD pipeline next!