Snapshot your HubSpot pipeline weekly in Google Sheets using code and cron
Prerequisites
- Node.js 18+ or Python 3.9+
- HubSpot private app token (scopes: crm.objects.deals.read, crm.schemas.deals.read)
- Google Cloud project with the Google Sheets API enabled
- Google service account with a JSON key file, or OAuth2 credentials
- The service account's email address shared as an Editor on the target Google Sheet
- A scheduling environment: cron, GitHub Actions, or a cloud function
Step 1: Set up the project
```bash
# Verify your HubSpot token works
curl -s "https://api.hubapi.com/crm/v3/pipelines/deals" \
  -H "Authorization: Bearer $HUBSPOT_TOKEN" | head -c 200
```

Create a service account in the Google Cloud Console, download the JSON key file, and share your Google Sheet with the service account email (e.g., snapshot@your-project.iam.gserviceaccount.com) as an Editor. Store the key file path in the GOOGLE_SERVICE_ACCOUNT_KEY environment variable.
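Before wiring anything else up, it can also be worth validating the key file itself. A minimal sketch, where `check_google_key` is a hypothetical helper (not part of either SDK) that confirms the file parses and surfaces the client_email you need to share the Sheet with:

```python
import json

def check_google_key(path):
    """Validate a service account key file and return the email to share the Sheet with."""
    with open(path) as f:
        key = json.load(f)
    required = {"type", "client_email", "private_key"}
    missing = required - key.keys()
    if missing:
        raise ValueError(f"Key file is missing fields: {sorted(missing)}")
    return key["client_email"]
```

Run it once against the downloaded file; if it raises, you most likely downloaded the wrong credential type (an OAuth client JSON instead of a service account key).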
Step 2: Fetch pipeline stages
Get a mapping of stage IDs to human-readable names.
```bash
curl -s "https://api.hubapi.com/crm/v3/pipelines/deals" \
  -H "Authorization: Bearer $HUBSPOT_TOKEN" \
  | jq '.results[0].stages[] | {id: .id, label: .label}'
```

Step 3: Search for active deals
Use the HubSpot Search API to pull all deals with pagination.
```python
from hubspot.crm.deals import PublicObjectSearchRequest, Filter, FilterGroup

def get_all_deals(client):
    pipeline_filter = Filter(
        property_name="pipeline",
        operator="EQ",
        value="default",
    )
    filter_group = FilterGroup(filters=[pipeline_filter])
    all_deals = []
    after = 0
    while True:
        request = PublicObjectSearchRequest(
            filter_groups=[filter_group],
            properties=[
                "dealname", "amount", "dealstage",
                "closedate", "createdate",
            ],
            sorts=[{"propertyName": "amount", "direction": "DESCENDING"}],
            limit=100,
            after=after,
        )
        response = client.crm.deals.search_api.do_search(
            public_object_search_request=request
        )
        all_deals.extend(response.results)
        if response.paging and response.paging.next:
            after = response.paging.next.after
        else:
            break
    return all_deals
```

All HubSpot property values come back as strings, even numbers: amount returns "50000", not 50000. Always parse with float() / parseFloat() and handle None/null values.
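A small defensive parser makes that explicit. A sketch, where `parse_amount` is a hypothetical helper covering the values the API can hand back:

```python
def parse_amount(raw):
    """Coerce a HubSpot property value (always a string, possibly None) to float."""
    if raw in (None, ""):
        return 0.0
    try:
        return float(raw)
    except ValueError:
        return 0.0

print(parse_amount("50000"))  # 50000.0
print(parse_amount(None))     # 0.0
```

Falling back to 0.0 on unparseable input keeps the snapshot running; log those cases if you need to audit dirty data instead of silently zeroing it.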
Step 4: Calculate snapshot metrics
Aggregate deals into total value, count, and per-stage breakdowns.
```python
from datetime import datetime, timezone

def calculate_snapshot(deals, stage_map, stage_order):
    total_value = 0
    by_stage = {}
    for deal in deals:
        props = deal.properties
        amount = float(props.get("amount") or 0)
        total_value += amount
        stage_id = props.get("dealstage", "unknown")
        stage_name = stage_map.get(stage_id, stage_id)
        if stage_name not in by_stage:
            by_stage[stage_name] = {"count": 0, "value": 0}
        by_stage[stage_name]["count"] += 1
        by_stage[stage_name]["value"] += amount
    # Build the row: date, total value, deal count, then per-stage count/value pairs
    today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    row = [today, total_value, len(deals)]
    for stage_name in stage_order:
        data = by_stage.get(stage_name, {"count": 0, "value": 0})
        row.append(data["count"])
        row.append(data["value"])
    return row
```

Step 5: Append to Google Sheets
Write the snapshot row to Google Sheets using the Sheets API v4 append method.
```python
import os

from google.oauth2 import service_account
from googleapiclient.discovery import build

def append_to_sheet(row, spreadsheet_id, range_name="Sheet1!A1"):
    creds = service_account.Credentials.from_service_account_file(
        os.environ["GOOGLE_SERVICE_ACCOUNT_KEY"],
        scopes=["https://www.googleapis.com/auth/spreadsheets"],
    )
    service = build("sheets", "v4", credentials=creds)
    body = {"values": [row]}
    result = service.spreadsheets().values().append(
        spreadsheetId=spreadsheet_id,
        range=range_name,
        valueInputOption="USER_ENTERED",
        insertDataOption="INSERT_ROWS",
        body=body,
    ).execute()
    print(f"Appended {result.get('updates', {}).get('updatedRows', 0)} row(s)")
    return result
```

The Sheets API append method finds the last row with data in the specified range and inserts below it. valueInputOption: "USER_ENTERED" means Google Sheets will parse values the same way it would if you typed them: numbers stay as numbers, dates as dates.
The service account is a separate Google identity. It can't see your sheets by default. You must share the spreadsheet with the service account email address (xxx@yyy.iam.gserviceaccount.com) and give it Editor access. Without this, you'll get a 403 error.
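One thing append does not do is create column headers. Before the first run, write a header row that mirrors the snapshot layout from Step 4. A sketch, where `build_header` is a hypothetical helper; its result would be written once to row 1 with the Sheets API's values().update method:

```python
def build_header(stage_order):
    """Header row matching the snapshot layout: date, totals, then per-stage pairs."""
    header = ["date", "total_value", "deal_count"]
    for stage in stage_order:
        header.append(f"{stage} count")
        header.append(f"{stage} value")
    return header

print(build_header(["Qualified", "Closed Won"]))
# ['date', 'total_value', 'deal_count', 'Qualified count', 'Qualified value',
#  'Closed Won count', 'Closed Won value']
```

Keeping the header builder and the row builder driven by the same stage_order list guarantees the columns line up even if you later add stages.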
Step 6: Wire it all together
```python
#!/usr/bin/env python3
"""Pipeline snapshot: HubSpot → Google Sheets"""
import os
import subprocess
import sys
from datetime import datetime, timezone

try:
    from hubspot import HubSpot
    from google.oauth2 import service_account
    from googleapiclient.discovery import build
except ImportError:
    # Install against the interpreter actually running this script
    subprocess.check_call([
        sys.executable, "-m", "pip", "install", "-q",
        "hubspot-api-client", "google-api-python-client", "google-auth",
    ])
    from hubspot import HubSpot
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

HUBSPOT_TOKEN = os.environ.get("HUBSPOT_TOKEN")
SPREADSHEET_ID = os.environ.get("GOOGLE_SPREADSHEET_ID")
GOOGLE_KEY_FILE = os.environ.get("GOOGLE_SERVICE_ACCOUNT_KEY")
if not all([HUBSPOT_TOKEN, SPREADSHEET_ID, GOOGLE_KEY_FILE]):
    print("ERROR: Missing env vars: HUBSPOT_TOKEN, GOOGLE_SPREADSHEET_ID, GOOGLE_SERVICE_ACCOUNT_KEY")
    sys.exit(1)

# 1. Get stages
client = HubSpot(access_token=HUBSPOT_TOKEN)
pipelines = client.crm.pipelines.pipelines_api.get_all(object_type="deals")
stage_map = {}
stage_order = []
for pipeline in pipelines.results:
    for stage in pipeline.stages:
        stage_map[stage.id] = stage.label
        stage_order.append(stage.label)
print(f"Found {len(stage_map)} stages: {', '.join(stage_order)}")

# 2. Get deals (paginated)
from hubspot.crm.deals import PublicObjectSearchRequest, Filter, FilterGroup

pipeline_filter = Filter(property_name="pipeline", operator="EQ", value="default")
all_deals = []
after = 0
while True:
    request = PublicObjectSearchRequest(
        filter_groups=[FilterGroup(filters=[pipeline_filter])],
        properties=["dealname", "amount", "dealstage", "closedate", "createdate"],
        sorts=[{"propertyName": "amount", "direction": "DESCENDING"}],
        limit=100,
        after=after,
    )
    response = client.crm.deals.search_api.do_search(public_object_search_request=request)
    all_deals.extend(response.results)
    if response.paging and response.paging.next:
        after = response.paging.next.after
    else:
        break
print(f"Found {len(all_deals)} deals")

# 3. Calculate snapshot
total_value = 0
by_stage = {}
for deal in all_deals:
    props = deal.properties
    amount = float(props.get("amount") or 0)
    total_value += amount
    stage_name = stage_map.get(props.get("dealstage", ""), props.get("dealstage", "unknown"))
    if stage_name not in by_stage:
        by_stage[stage_name] = {"count": 0, "value": 0}
    by_stage[stage_name]["count"] += 1
    by_stage[stage_name]["value"] += amount
today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
row = [today, total_value, len(all_deals)]
for stage_name in stage_order:
    data = by_stage.get(stage_name, {"count": 0, "value": 0})
    row.append(data["count"])
    row.append(data["value"])
print(f"Snapshot: {today} | ${total_value:,.0f} | {len(all_deals)} deals")

# 4. Append to Google Sheets
creds = service_account.Credentials.from_service_account_file(
    GOOGLE_KEY_FILE, scopes=["https://www.googleapis.com/auth/spreadsheets"]
)
service = build("sheets", "v4", credentials=creds)
result = service.spreadsheets().values().append(
    spreadsheetId=SPREADSHEET_ID,
    range="Sheet1!A1",
    valueInputOption="USER_ENTERED",
    insertDataOption="INSERT_ROWS",
    body={"values": [row]},
).execute()
print(f"Appended to Google Sheets: {result.get('updates', {}).get('updatedRows', 0)} row(s)")
```

Step 7: Schedule with cron or GitHub Actions
Cron (server-based):
```bash
# crontab -e
0 8 * * 1 cd /path/to/pipeline-snapshot && python snapshot.py >> /var/log/pipeline-snapshot.log 2>&1
```

GitHub Actions (serverless):
```yaml
# .github/workflows/pipeline-snapshot.yml
name: Weekly Pipeline Snapshot
on:
  schedule:
    - cron: '0 13 * * 1'  # 13:00 UTC = 8 AM EST (9 AM during daylight time)
  workflow_dispatch: {}  # Allow manual runs
jobs:
  snapshot:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install hubspot-api-client google-api-python-client google-auth
      - run: python snapshot.py
        env:
          HUBSPOT_TOKEN: ${{ secrets.HUBSPOT_TOKEN }}
          GOOGLE_SPREADSHEET_ID: ${{ secrets.GOOGLE_SPREADSHEET_ID }}
          GOOGLE_SERVICE_ACCOUNT_KEY: ${{ secrets.GOOGLE_SERVICE_ACCOUNT_KEY }}
```

For GitHub Actions, store the service account JSON key as a secret. You have two options: (1) save the entire JSON content as a secret and write it to a temp file at runtime, or (2) base64-encode the key, store that as a secret, and decode it in the workflow. Option 1 is simpler: just use echo "$KEY_JSON" > /tmp/sa-key.json and set GOOGLE_SERVICE_ACCOUNT_KEY=/tmp/sa-key.json.
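Either option can also be handled inside the script, skipping the temp file entirely. A sketch, where `load_key_info` is a hypothetical helper that accepts either raw or base64-encoded JSON; its result can be passed to service_account.Credentials.from_service_account_info, google-auth's file-free counterpart to from_service_account_file:

```python
import base64
import json

def load_key_info(raw):
    """Parse a service account key passed as raw JSON or base64-encoded JSON."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return json.loads(base64.b64decode(raw))
```

This keeps the key out of the filesystem altogether, which matters on shared runners.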
Rate limits
| API | Limit | Impact |
|---|---|---|
| HubSpot general | 150 req / 10 sec per account | No concern for weekly snapshot |
| HubSpot Search | 5 req / sec | Matters with pagination (add 200ms delay between pages) |
| Google Sheets API | 300 req / min per project | No concern for a single append |
| Google Sheets API | 60 req / min per user per project | No concern |
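The 200 ms delay is easy to fold into the pagination loop from Step 3. A generic sketch, where `do_search` stands in for the HubSpot client call and is assumed to return an object with .results and .paging attributes like the SDK's search response:

```python
import time

def paged_search(do_search, delay=0.2):
    """Collect every page of a search, pausing between pages to stay
    under the HubSpot Search API's per-second limit."""
    results = []
    after = 0
    while True:
        page = do_search(after)
        results.extend(page.results)
        if page.paging and page.paging.next:
            after = page.paging.next.after
            time.sleep(delay)  # throttle before requesting the next page
        else:
            break
    return results
```

For a weekly snapshot the delay costs almost nothing; a pipeline of 1,000 deals is 10 pages, so about 2 extra seconds per run.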
Trade-offs
- Full control over data transformation, column ordering, and snapshot logic
- Version controlled — track changes in git, review in PRs
- No per-execution cost — runs on existing infrastructure (GitHub Actions free tier includes 2,000 min/month)
- Google Sheets API is free — standard quotas are generous for weekly snapshots
- Maintenance overhead — you handle errors, monitoring, and credential rotation
- No visual debugging — logs, not a visual flow builder
Need help implementing this?
We build and optimize automation systems for mid-market businesses. Let's discuss the right approach for your team.