Snapshot your HubSpot pipeline weekly in Google Sheets using code and cron

Complexity: medium | Cost: $0

Prerequisites
  • Node.js 18+ or Python 3.9+
  • HubSpot private app token (scopes: crm.objects.deals.read, crm.schemas.deals.read)
  • Google Cloud project with the Google Sheets API enabled
  • Google service account with a JSON key file, or OAuth2 credentials
  • The service account's email address shared as an Editor on the target Google Sheet
  • A scheduling environment: cron, GitHub Actions, or a cloud function

Step 1: Set up the project

# Verify your HubSpot token works
curl -s "https://api.hubapi.com/crm/v3/pipelines/deals" \
  -H "Authorization: Bearer $HUBSPOT_TOKEN" | head -c 200

Google service account setup

Create a service account in the Google Cloud Console, download the JSON key file, and share your Google Sheet with the service account email (e.g., snapshot@your-project.iam.gserviceaccount.com) as an Editor. Store the key file path in the GOOGLE_SERVICE_ACCOUNT_KEY environment variable.

Step 2: Fetch pipeline stages

Get a mapping of stage IDs to human-readable names.

curl -s "https://api.hubapi.com/crm/v3/pipelines/deals" \
  -H "Authorization: Bearer $HUBSPOT_TOKEN" \
  | jq '.results[0].stages[] | {id: .id, label: .label}'
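The same mapping can be built in Python by reducing the pipelines response to a dict. The `sample_response` below is an illustrative stub, not real API output:

```python
# Reduce a /crm/v3/pipelines/deals response to {stage_id: label}.
# `sample_response` is a hypothetical stand-in for the real payload.
sample_response = {
    "results": [
        {
            "id": "default",
            "stages": [
                {"id": "appointmentscheduled", "label": "Appointment Scheduled"},
                {"id": "closedwon", "label": "Closed Won"},
            ],
        }
    ]
}

def build_stage_map(response: dict) -> dict:
    """Map stage IDs to human-readable labels across all pipelines."""
    return {
        stage["id"]: stage["label"]
        for pipeline in response["results"]
        for stage in pipeline["stages"]
    }

stage_map = build_stage_map(sample_response)
print(stage_map["closedwon"])  # Closed Won
```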

Step 3: Search for active deals

Use the HubSpot Search API to pull all deals with pagination.

import time

from hubspot.crm.deals import PublicObjectSearchRequest, Filter, FilterGroup

def get_all_deals(client):
    pipeline_filter = Filter(
        property_name="pipeline",
        operator="EQ",
        value="default"
    )
    filter_group = FilterGroup(filters=[pipeline_filter])

    all_deals = []
    after = 0

    while True:
        request = PublicObjectSearchRequest(
            filter_groups=[filter_group],
            properties=[
                "dealname", "amount", "dealstage",
                "closedate", "createdate"
            ],
            sorts=[{"propertyName": "amount", "direction": "DESCENDING"}],
            limit=100,
            after=after
        )
        response = client.crm.deals.search_api.do_search(
            public_object_search_request=request
        )
        all_deals.extend(response.results)

        if response.paging and response.paging.next:
            after = response.paging.next.after
            time.sleep(0.2)  # stay under the Search API rate limit between pages
        else:
            break

    return all_deals
Property values are strings

All HubSpot property values come back as strings — even numbers. amount returns "50000" not 50000. Always parse with float() / parseFloat() and handle None/null values.
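A minimal illustration of the string-typed properties; the `props` dict below is a stand-in for a real deal's properties:

```python
# HubSpot returns every property as a string (or None when unset).
props = {"amount": "50000", "dealname": "Acme renewal", "closedate": None}

amount = float(props.get("amount") or 0)     # "50000" -> 50000.0
missing = float(props.get("closedate") or 0) # None -> 0.0 instead of a TypeError

print(amount, missing)  # 50000.0 0.0
```

The `or 0` fallback covers both missing keys and `None` values before the `float()` conversion.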

Step 4: Calculate snapshot metrics

Aggregate deals into total value, count, and per-stage breakdowns.

from datetime import datetime, timezone
 
def calculate_snapshot(deals, stage_map, stage_order):
    total_value = 0
    by_stage = {}
 
    for deal in deals:
        props = deal.properties
        amount = float(props.get("amount") or 0)
        total_value += amount
 
        stage_id = props.get("dealstage", "unknown")
        stage_name = stage_map.get(stage_id, stage_id)
 
        if stage_name not in by_stage:
            by_stage[stage_name] = {"count": 0, "value": 0}
        by_stage[stage_name]["count"] += 1
        by_stage[stage_name]["value"] += amount
 
    # Build the row: date, total value, deal count, then per-stage count/value pairs
    today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    row = [today, total_value, len(deals)]
 
    for stage_name in stage_order:
        data = by_stage.get(stage_name, {"count": 0, "value": 0})
        row.append(data["count"])
        row.append(data["value"])
 
    return row
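Because the row layout is positional, the sheet's header row has to match it exactly. A small helper (hypothetical, not part of the script above) can generate a matching header from the same `stage_order`:

```python
def build_header(stage_order):
    """Header matching the snapshot row: date, totals, then count/value per stage."""
    header = ["date", "total_value", "deal_count"]
    for stage_name in stage_order:
        header.append(f"{stage_name} count")
        header.append(f"{stage_name} value")
    return header

print(build_header(["Qualified", "Closed Won"]))
# ['date', 'total_value', 'deal_count', 'Qualified count', 'Qualified value', 'Closed Won count', 'Closed Won value']
```

Paste the output into row 1 of the sheet once, or write it with a one-time `values().update()` call.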

Step 5: Append to Google Sheets

Write the snapshot row to Google Sheets using the Sheets API v4 append method.

import os

from google.oauth2 import service_account
from googleapiclient.discovery import build
 
def append_to_sheet(row, spreadsheet_id, range_name="Sheet1!A1"):
    creds = service_account.Credentials.from_service_account_file(
        os.environ["GOOGLE_SERVICE_ACCOUNT_KEY"],
        scopes=["https://www.googleapis.com/auth/spreadsheets"]
    )
    service = build("sheets", "v4", credentials=creds)
 
    body = {"values": [row]}
    result = service.spreadsheets().values().append(
        spreadsheetId=spreadsheet_id,
        range=range_name,
        valueInputOption="USER_ENTERED",
        insertDataOption="INSERT_ROWS",
        body=body,
    ).execute()
 
    print(f"Appended {result.get('updates', {}).get('updatedRows', 0)} row(s)")
    return result
Append behavior

The Sheets API append method finds the last row with data in the specified range and inserts below it. valueInputOption: "USER_ENTERED" means Google Sheets will parse values the same way it would if you typed them — numbers stay as numbers, dates as dates.

Service account sharing

The service account is a separate Google identity. It can't see your sheets by default. You must share the spreadsheet with the service account email address (xxx@yyy.iam.gserviceaccount.com) and give it Editor access. Without this, you'll get a 403 error.

Step 6: Wire it all together

#!/usr/bin/env python3
"""Pipeline snapshot: HubSpot → Google Sheets"""
import os
import sys
from datetime import datetime, timezone
 
try:
    from hubspot import HubSpot
    from google.oauth2 import service_account
    from googleapiclient.discovery import build
except ImportError:
    import subprocess
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "-q",
         "hubspot-api-client", "google-api-python-client", "google-auth"],
        check=True,
    )
    from hubspot import HubSpot
    from google.oauth2 import service_account
    from googleapiclient.discovery import build
 
HUBSPOT_TOKEN = os.environ.get("HUBSPOT_TOKEN")
SPREADSHEET_ID = os.environ.get("GOOGLE_SPREADSHEET_ID")
GOOGLE_KEY_FILE = os.environ.get("GOOGLE_SERVICE_ACCOUNT_KEY")
 
if not all([HUBSPOT_TOKEN, SPREADSHEET_ID, GOOGLE_KEY_FILE]):
    print("ERROR: Missing env vars: HUBSPOT_TOKEN, GOOGLE_SPREADSHEET_ID, GOOGLE_SERVICE_ACCOUNT_KEY")
    sys.exit(1)
 
# 1. Get stages
client = HubSpot(access_token=HUBSPOT_TOKEN)
pipelines = client.crm.pipelines.pipelines_api.get_all(object_type="deals")
stage_map = {}
stage_order = []
for pipeline in pipelines.results:
    for stage in pipeline.stages:
        stage_map[stage.id] = stage.label
        stage_order.append(stage.label)
 
print(f"Found {len(stage_map)} stages: {', '.join(stage_order)}")
 
# 2. Get deals (paginated)
import time
from hubspot.crm.deals import PublicObjectSearchRequest, Filter, FilterGroup

pipeline_filter = Filter(property_name="pipeline", operator="EQ", value="default")
all_deals = []
after = 0
while True:
    request = PublicObjectSearchRequest(
        filter_groups=[FilterGroup(filters=[pipeline_filter])],
        properties=["dealname", "amount", "dealstage", "closedate", "createdate"],
        sorts=[{"propertyName": "amount", "direction": "DESCENDING"}],
        limit=100,
        after=after,
    )
    response = client.crm.deals.search_api.do_search(public_object_search_request=request)
    all_deals.extend(response.results)
    if response.paging and response.paging.next:
        after = response.paging.next.after
        time.sleep(0.2)  # stay under the Search API rate limit between pages
    else:
        break
 
print(f"Found {len(all_deals)} deals")
 
# 3. Calculate snapshot
total_value = 0
by_stage = {}
for deal in all_deals:
    props = deal.properties
    amount = float(props.get("amount") or 0)
    total_value += amount
    stage_id = props.get("dealstage") or "unknown"
    stage_name = stage_map.get(stage_id, stage_id)
    if stage_name not in by_stage:
        by_stage[stage_name] = {"count": 0, "value": 0}
    by_stage[stage_name]["count"] += 1
    by_stage[stage_name]["value"] += amount
 
today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
row = [today, total_value, len(all_deals)]
for stage_name in stage_order:
    data = by_stage.get(stage_name, {"count": 0, "value": 0})
    row.append(data["count"])
    row.append(data["value"])
 
print(f"Snapshot: {today} | ${total_value:,.0f} | {len(all_deals)} deals")
 
# 4. Append to Google Sheets
creds = service_account.Credentials.from_service_account_file(
    GOOGLE_KEY_FILE, scopes=["https://www.googleapis.com/auth/spreadsheets"]
)
service = build("sheets", "v4", credentials=creds)
result = service.spreadsheets().values().append(
    spreadsheetId=SPREADSHEET_ID,
    range="Sheet1!A1",
    valueInputOption="USER_ENTERED",
    insertDataOption="INSERT_ROWS",
    body={"values": [row]},
).execute()
 
print(f"Appended to Google Sheets: {result.get('updates', {}).get('updatedRows', 0)} row(s)")
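The script has no retry logic, so a transient 429 or 5xx from either API fails the whole run. One option is a small backoff wrapper around the network calls; the attempt count and delays below are arbitrary choices, not values from either API's documentation:

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(); on failure, retry with exponential backoff before giving up."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            time.sleep(base_delay * (2 ** attempt))

# Usage sketch: wrap the append call from step 5.
# result = with_retries(lambda: service.spreadsheets().values().append(...).execute())
```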

Step 7: Schedule with cron or GitHub Actions

Cron (server-based):

# crontab -e
0 8 * * 1 cd /path/to/pipeline-snapshot && python snapshot.py >> /var/log/pipeline-snapshot.log 2>&1

GitHub Actions (serverless):

# .github/workflows/pipeline-snapshot.yml
name: Weekly Pipeline Snapshot
on:
  schedule:
    - cron: '0 13 * * 1'  # 13:00 UTC = 8 AM EST (9 AM during EDT; GitHub cron uses UTC)
  workflow_dispatch: {}     # Allow manual runs
jobs:
  snapshot:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install hubspot-api-client google-api-python-client google-auth
      - name: Write the service account key to a file
        run: echo "$KEY_JSON" > /tmp/sa-key.json
        env:
          KEY_JSON: ${{ secrets.GOOGLE_SERVICE_ACCOUNT_KEY }}
      - run: python snapshot.py
        env:
          HUBSPOT_TOKEN: ${{ secrets.HUBSPOT_TOKEN }}
          GOOGLE_SPREADSHEET_ID: ${{ secrets.GOOGLE_SPREADSHEET_ID }}
          GOOGLE_SERVICE_ACCOUNT_KEY: /tmp/sa-key.json
Service account key in CI

For GitHub Actions, store the service account JSON key as a secret. You have two options: (1) save the entire JSON content as a secret and write it to a temp file at runtime, or (2) base64-encode the key, store that as a secret, and decode it in the workflow. Option 1 is simpler — just use echo "$KEY_JSON" > /tmp/sa-key.json and set GOOGLE_SERVICE_ACCOUNT_KEY=/tmp/sa-key.json.

Rate limits

API                 Limit                               Impact
HubSpot (general)   150 req / 10 sec per account        No concern for a weekly snapshot
HubSpot Search      5 req / sec                         Matters with pagination (add a 200 ms delay between pages)
Google Sheets API   300 req / min per project           No concern for a single append
Google Sheets API   60 req / min per user per project   No concern
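The Search API limit is the only one a weekly snapshot can realistically hit. The throttled pagination pattern can be sketched generically; `fetch_page` below is a hypothetical stand-in for the Search API call, returning `(items, next_cursor_or_None)`:

```python
import time

def fetch_all_pages(fetch_page, delay=0.2):
    """Pull paginated results, pausing `delay` seconds between requests."""
    items, after = [], 0
    while True:
        page, after = fetch_page(after)
        items.extend(page)
        if after is None:
            return items
        time.sleep(delay)  # throttle before requesting the next page

# Hypothetical fetcher returning two pages of fake deal IDs, keyed by cursor.
pages = {0: ([1, 2], 2), 2: ([3], None)}
print(fetch_all_pages(pages.__getitem__, delay=0))  # [1, 2, 3]
```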

Trade-offs

Pros:
  • Full control over data transformation, column ordering, and snapshot logic
  • Version controlled — track changes in git, review in PRs
  • No per-execution cost — runs on existing infrastructure (GitHub Actions free tier includes 2,000 min/month)
  • Google Sheets API is free — standard quotas are generous for weekly snapshots

Cons:
  • Maintenance overhead — you handle errors, monitoring, and credential rotation
  • No visual debugging — logs, not a visual flow builder

Need help implementing this?

We build and optimize automation systems for mid-market businesses. Let's discuss the right approach for your team.