Enrich HubSpot companies with technographic data from BuiltWith using code

Medium complexity · Cost: $0 · Recommended

Prerequisites
  • Node.js 18+ or Python 3.9+
  • HubSpot private app token with crm.objects.companies.read and crm.objects.companies.write scopes
  • BuiltWith API key (Pro plan or above for API access)
  • Custom HubSpot company properties: tech_stack_crm, tech_stack_marketing, tech_stack_analytics, tech_stack_all, tech_enrichment_date
  • A scheduling environment: cron or GitHub Actions
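
The custom properties in the list above can be created by hand in HubSpot's settings, or scripted. The following is a minimal sketch that assumes HubSpot's CRM Properties API (POST /crm/v3/properties/companies) and a token that also carries the crm.schemas.companies.write scope, which is not in the prerequisites above. The labels, the "companyinformation" group, and the helper names are illustrative choices, not part of the original script. The `session` argument is expected to behave like a `requests.Session`.

```python
PROPERTIES_URL = "https://api.hubapi.com/crm/v3/properties/companies"

# (internal name, label, HubSpot type, HubSpot fieldType)
# Labels are illustrative; only the internal names come from this guide.
TECH_PROPERTIES = [
    ("tech_stack_crm", "Tech Stack: CRM", "string", "text"),
    ("tech_stack_marketing", "Tech Stack: Marketing", "string", "text"),
    ("tech_stack_analytics", "Tech Stack: Analytics", "string", "text"),
    ("tech_stack_all", "Tech Stack: All", "string", "textarea"),
    ("tech_enrichment_date", "Tech Enrichment Date", "date", "date"),
]


def build_property_payload(name, label, prop_type, field_type,
                           group_name="companyinformation"):
    """Build the JSON body for one property definition."""
    return {
        "name": name,
        "label": label,
        "type": prop_type,
        "fieldType": field_type,
        "groupName": group_name,  # assumed default company property group
    }


def create_properties(session, token):
    """Create each property; skip ones that HubSpot reports as conflicting.

    `session` is any object with a requests-style .post() method.
    Returns the names of the properties that were newly created.
    """
    headers = {"Authorization": f"Bearer {token}",
               "Content-Type": "application/json"}
    created = []
    for spec in TECH_PROPERTIES:
        resp = session.post(PROPERTIES_URL, headers=headers,
                            json=build_property_payload(*spec), timeout=30)
        if resp.status_code == 409:
            # Assumption: 409 Conflict means the property already exists.
            continue
        resp.raise_for_status()
        created.append(spec[0])
    return created
```

With the `requests` library installed, `create_properties(requests.Session(), os.environ["HUBSPOT_ACCESS_TOKEN"])` would run it once before the first enrichment.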

Why code?

A script gives you full control over the category taxonomy, error handling, and batch logic. You can customize the CATEGORY_MAP to match your exact competitive landscape, add competitor detection flags, and export results to CSV for team review. Zero platform fees — the only cost is BuiltWith credits.

The trade-off is maintenance. You own the rate limiting, retry logic, and monitoring. There's no visual execution history. If non-technical team members need to adjust the category mapping or trigger a run, use n8n or Make instead.

Step 1: Set up the project

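# Install the only dependency the script needs
pip install requests

# Test the BuiltWith API (expects BUILTWITH_API_KEY to be exported in your shell)
curl -s "https://api.builtwith.com/v21/api.json?KEY=$BUILTWITH_API_KEY&LOOKUP=hubspot.com" \
  | python3 -m json.tool | head -50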
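
The script reads both credentials from environment variables, so a missing one surfaces as a bare KeyError. A small pre-flight check can give a clearer message. This is a sketch only; `missing_env_vars` is a hypothetical helper, not part of the original script.

```python
import os

# Variable names match the ones the script reads in Step 2.
REQUIRED_VARS = ("HUBSPOT_ACCESS_TOKEN", "BUILTWITH_API_KEY")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars()` at the top of `main()` and exiting with a message when the list is non-empty is one way to use it.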

Step 2: Fetch companies missing tech data from HubSpot

import requests
import os
import time
from datetime import datetime
 
HUBSPOT_ACCESS_TOKEN = os.environ["HUBSPOT_ACCESS_TOKEN"]
BUILTWITH_API_KEY = os.environ["BUILTWITH_API_KEY"]
HS_HEADERS = {"Authorization": f"Bearer {HUBSPOT_ACCESS_TOKEN}", "Content-Type": "application/json"}
 
def get_companies_without_tech(limit=50):
    """Return up to `limit` companies that have no tech_stack_crm value yet."""
    companies = []
    after = None  # paging cursor; omitted on the first request

    while len(companies) < limit:
        body = {
            "filterGroups": [{"filters": [{
                "propertyName": "tech_stack_crm",
                "operator": "NOT_HAS_PROPERTY"
            }]}],
            "properties": ["domain", "name"],
            "limit": min(100, limit - len(companies)),
        }
        if after is not None:
            body["after"] = after

        resp = requests.post(
            "https://api.hubapi.com/crm/v3/objects/companies/search",
            headers=HS_HEADERS,
            json=body,
            timeout=30,
        )
        resp.raise_for_status()
        data = resp.json()
        companies.extend(data["results"])

        next_page = data.get("paging", {}).get("next")
        if not next_page:
            break
        after = next_page["after"]
        time.sleep(0.2)  # 200ms between pages keeps us under the Search rate limit

    return companies
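
HubSpot's domain property is free text, so values such as "https://www.example.com/" can end up in it. The sketch below normalizes a value to a bare hostname before it is sent to BuiltWith. `normalize_domain` is a hypothetical helper, and whether BuiltWith needs this cleanup for every input is an assumption worth checking against a few of your own records.

```python
from urllib.parse import urlparse


def normalize_domain(raw):
    """Reduce a URL-ish string to a bare lowercase hostname, or None."""
    if not raw:
        return None
    value = raw.strip().lower()
    if not value:
        return None
    # urlparse only fills in the host when a scheme or "//" prefix is present.
    if "://" not in value:
        value = "//" + value
    try:
        host = urlparse(value).hostname or ""
    except ValueError:
        return None  # malformed input such as an unclosed IPv6 bracket
    if host.startswith("www."):
        host = host[4:]
    # A usable domain needs at least one dot ("localhost" is rejected).
    return host if "." in host else None
```

In `main()`, this could wrap the value read from `company["properties"].get("domain")` so that malformed entries are skipped instead of spending a lookup.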

Step 3: Look up tech stacks via BuiltWith

CATEGORY_MAP = {
    "crm": ["crm"],
    "marketing": ["marketing-automation", "email", "marketing"],
    "analytics": ["analytics", "web-analytics"],
    "ecommerce": ["ecommerce", "shopping-cart", "payment"],
    "hosting": ["hosting", "cdn", "cloud-paas"],
}
 
def lookup_tech_stack(domain):
    """Call BuiltWith and categorize technologies."""
    resp = requests.get(
        "https://api.builtwith.com/v21/api.json",
        params={"KEY": BUILTWITH_API_KEY, "LOOKUP": domain},
        timeout=30,  # avoid hanging the whole run on one slow lookup
    )
    resp.raise_for_status()
    data = resp.json()
 
    # Extract technologies from the nested response
    results = data.get("Results", [])
    if not results:
        return None
 
    paths = results[0].get("Result", {}).get("Paths", [])
    if not paths:
        return None
 
    technologies = paths[0].get("Technologies", [])
    if not technologies:
        return None
 
    # Categorize
    categorized = {cat: [] for cat in CATEGORY_MAP}
    for tech in technologies:
        tag = (tech.get("Tag") or "").lower()
        cats = [c.lower() for c in (tech.get("Categories") or [])]
        for category, keywords in CATEGORY_MAP.items():
            if any(kw in tag for kw in keywords) or any(kw in c for kw in keywords for c in cats):
                categorized[category].append(tech["Name"])
                break
 
    return {
        "tech_stack_crm": ", ".join(categorized["crm"]) or "None detected",
        "tech_stack_marketing": ", ".join(categorized["marketing"]) or "None detected",
        "tech_stack_analytics": ", ".join(categorized["analytics"]) or "None detected",
        "tech_stack_all": ", ".join(t["Name"] for t in technologies),
        "tech_count": len(technologies),
    }

BuiltWith response for unknown domains

If BuiltWith doesn't recognize a domain, it may return an empty Results array or a result with no Paths. Always check for empty/missing data at each nesting level. Don't assume the structure is always fully populated.
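
The level-by-level checks described above can be folded into one helper. The sketch below assumes the response shape used in Step 3 (Results, then Result, then Paths, then Technologies). Unlike the Step 3 code, which reads only the first path, it collects technologies from every path and de-duplicates them by name. Whether that is preferable depends on how BuiltWith splits paths for your domains. `extract_technologies` is a hypothetical name.

```python
def extract_technologies(data):
    """Collect technology dicts from every path, tolerating missing levels."""
    seen = set()
    technologies = []
    for result in (data or {}).get("Results") or []:
        paths = (result.get("Result") or {}).get("Paths") or []
        for path in paths:
            for tech in path.get("Technologies") or []:
                name = tech.get("Name")
                # Skip unnamed entries and names already collected.
                if name and name not in seen:
                    seen.add(name)
                    technologies.append(tech)
    return technologies
```

An empty list from this helper plays the same role as the `return None` branches in `lookup_tech_stack`.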

Step 4: Update HubSpot companies

def update_company_tech(company_id, tech_data):
    """Write tech stack data to HubSpot company."""
    properties = {
        **tech_data,
        "tech_enrichment_date": datetime.now().strftime("%Y-%m-%d"),
    }
    # Remove tech_count from HubSpot update (internal metric only)
    properties.pop("tech_count", None)
 
    resp = requests.patch(
        f"https://api.hubapi.com/crm/v3/objects/companies/{company_id}",
        headers=HS_HEADERS,
        json={"properties": properties},
        timeout=30,
    )
    resp.raise_for_status()
 
def main():
    companies = get_companies_without_tech(limit=50)
    print(f"Found {len(companies)} companies to enrich\n")
 
    enriched = 0
    skipped = 0
 
    for company in companies:
        domain = company["properties"].get("domain")
        name = company["properties"].get("name", "Unknown")
 
        if not domain:
            print(f"  {name} — no domain, skipping")
            skipped += 1
            continue
 
        tech_data = lookup_tech_stack(domain)
        if not tech_data:
            print(f"  {name} ({domain}) — no tech data found")
            skipped += 1
            time.sleep(2)
            continue
 
        update_company_tech(company["id"], tech_data)
        enriched += 1
        print(f"  {name} ({domain}) — {tech_data['tech_count']} technologies")
        print(f"    CRM: {tech_data['tech_stack_crm']}")
        print(f"    Marketing: {tech_data['tech_stack_marketing']}")
        print(f"    Analytics: {tech_data['tech_stack_analytics']}")
 
        time.sleep(2)  # BuiltWith rate limit
 
    print(f"\nDone. Enriched: {enriched}, Skipped: {skipped}")
 
if __name__ == "__main__":
    main()
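
As written, the loop in `main()` stops at the first HTTP error, because both API helpers call `raise_for_status()`. One way to make a run more resilient is to retry transient failures with exponential backoff. The following is a sketch under assumptions: the set of retryable status codes and the delays are illustrative defaults, and `call_with_retry` is a hypothetical helper.

```python
import time

# Status codes treated as transient; an illustrative choice, tune as needed.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `send()` until it returns a non-retryable response.

    `send` is a zero-argument callable returning an object with a
    .status_code attribute (for example, a lambda around requests.get).
    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts and
    returns the last response even if it is still an error.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        resp = send()
        if resp.status_code not in RETRYABLE_STATUS or attempt == max_attempts:
            return resp
        sleep(base_delay * 2 ** (attempt - 1))
```

A call site might look like `resp = call_with_retry(lambda: requests.get(url, params=params, timeout=30))` followed by the existing `resp.raise_for_status()`.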

Step 5: Schedule the script

# .github/workflows/tech-enrichment.yml
name: Technographic Enrichment
on:
  schedule:
    - cron: '0 3 * * 0'  # Weekly on Sunday at 3 AM UTC
  workflow_dispatch: {}
jobs:
  enrich:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install requests
      - run: python tech_enrich.py
        env:
          HUBSPOT_ACCESS_TOKEN: ${{ secrets.HUBSPOT_ACCESS_TOKEN }}
          BUILTWITH_API_KEY: ${{ secrets.BUILTWITH_API_KEY }}

Rate limits

API             Limit                                    Delay
BuiltWith       Varies by plan (typically 1-2 req/sec)   2 seconds between calls
HubSpot Search  5 req/sec                                200ms between pages
HubSpot PATCH   150 req/10 sec                           No delay needed
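
The fixed `time.sleep(2)` calls in the script always wait the full delay, even when the preceding work already took that long. A minimum-interval throttle waits only for the remainder. This is a sketch, with `MinIntervalThrottle` as a hypothetical helper. The injectable `clock` and `sleep` parameters exist only to make it testable.

```python
import time


class MinIntervalThrottle:
    """Ensure at least `min_interval` seconds pass between calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous wait(), None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

For example, create `builtwith_throttle = MinIntervalThrottle(2.0)` once and call `builtwith_throttle.wait()` immediately before each BuiltWith request.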

Cost

  • BuiltWith Pro: $295/mo for 500 API calls ($0.59/lookup). Enterprise: $495/mo for 2,000 calls ($0.25/lookup).
  • HubSpot: Free within API rate limits.
  • GitHub Actions: Free tier (2,000 min/month).
  • Budget tip: At 50 companies/week, you'll use 200 calls/month — well within the Pro plan's 500-call limit.

BuiltWith bills per lookup, not per technology

You pay 1 API call per domain lookup, regardless of how many technologies BuiltWith detects on that domain. A domain with 3 technologies costs the same as one with 300. This makes BuiltWith more cost-effective for companies with large tech stacks.

Common questions

How much does it cost to enrich 50 companies per week?

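50 BuiltWith lookups/week is about 200/month. On the Pro plan ($295/mo, 500 calls), that works out to an effective $1.48 per lookup ($295 / 200), because you pay the full monthly fee while using only 200 of the 500 included calls. At full usage the rate would be $0.59 per lookup. GitHub Actions and HubSpot are free. Total: $295/mo for BuiltWith + $0 for compute.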
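
The arithmetic above can be reproduced for other volumes with a few lines. This is a sketch using the plan figures quoted in this guide. The four-weeks-per-month simplification matches the 200/month estimate, and the function names are illustrative.

```python
def monthly_lookups(per_week, weeks_per_month=4):
    """Approximate monthly volume; 4 weeks/month matches the guide's estimate."""
    return per_week * weeks_per_month


def list_price_per_call(monthly_fee, included_calls):
    """Cost per call if every included call is used."""
    return monthly_fee / included_calls


def effective_cost_per_lookup(monthly_fee, lookups_used):
    """Cost per lookup actually made, given a flat monthly fee."""
    if lookups_used <= 0:
        raise ValueError("lookups_used must be positive")
    return monthly_fee / lookups_used


used = monthly_lookups(50)                      # 200 lookups
effective = effective_cost_per_lookup(295, used)  # 295 / 200
full_usage = list_price_per_call(295, 500)        # 295 / 500
```

The gap between `effective` and `full_usage` shows how much of the plan goes unused at 50 companies per week.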

What if BuiltWith's tag taxonomy changes?

BuiltWith occasionally updates tag names. The script's CATEGORY_MAP uses substring matching, so a change that keeps the keyword (for example, "analytics" becoming "web-analytics") is still caught. A full rename (for example, "crm" becoming "customer-relationship-management") is not, and the affected technologies silently drop out of their category. Every quarter, log the technologies that matched no category and update the map.
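
A quarterly check like this can be a small function that lists technologies matching no category. The sketch below copies CATEGORY_MAP from Step 3 and applies the same substring rule. It assumes, as Step 3 does, that each technology has a Tag string and a Categories list of strings. `find_uncategorized` is a hypothetical helper.

```python
CATEGORY_MAP = {
    "crm": ["crm"],
    "marketing": ["marketing-automation", "email", "marketing"],
    "analytics": ["analytics", "web-analytics"],
    "ecommerce": ["ecommerce", "shopping-cart", "payment"],
    "hosting": ["hosting", "cdn", "cloud-paas"],
}


def matches_any_category(tech, category_map=CATEGORY_MAP):
    """True if any keyword is a substring of the tag or of a category name."""
    tag = (tech.get("Tag") or "").lower()
    cats = [c.lower() for c in (tech.get("Categories") or [])]
    for keywords in category_map.values():
        for kw in keywords:
            if kw in tag or any(kw in c for c in cats):
                return True
    return False


def find_uncategorized(technologies, category_map=CATEGORY_MAP):
    """Return sorted, de-duplicated names of technologies with no match."""
    return sorted({
        tech.get("Name", "(unnamed)")
        for tech in technologies
        if not matches_any_category(tech, category_map)
    })
```

Printing the result for a handful of domains makes a renamed tag stand out, because its technologies move from a category into this list.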

Can I batch the HubSpot updates instead of patching one at a time?

Yes. HubSpot offers a batch update endpoint (POST /crm/v3/objects/companies/batch/update) that accepts up to 100 records per request. This reduces the number of HubSpot API calls but adds complexity to the update payload construction. For batches under 100, individual PATCH calls are simpler and well within rate limits.
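
The payload construction mentioned above mostly comes down to chunking. The sketch below builds request bodies for the batch update endpoint, assuming the `{"inputs": [{"id": ..., "properties": {...}}]}` body shape. `build_batch_payloads` and `chunked` are hypothetical helpers, and sending the payloads is left to the caller.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_payloads(updates, batch_size=100):
    """Turn (company_id, properties) pairs into batch update request bodies.

    batch_size defaults to 100, the per-request maximum stated above.
    """
    payloads = []
    for chunk in chunked(list(updates), batch_size):
        payloads.append({
            "inputs": [
                {"id": str(company_id), "properties": properties}
                for company_id, properties in chunk
            ]
        })
    return payloads
```

Each payload would then be sent with `requests.post("https://api.hubapi.com/crm/v3/objects/companies/batch/update", headers=HS_HEADERS, json=payload, timeout=30)`.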

Next steps

  • Detect specific competitors: add a check for competitor product names and set a boolean uses_competitor_product property for easy filtering
  • Segment by tech maturity: companies with 50+ technologies are likely tech-savvy enterprises; companies with 5-10 are leaner. The script currently drops tech_count before writing to HubSpot, so create a numeric tech_count property and stop removing it in update_company_tech if you want to segment on it.
  • Track changes over time: store previous tech stack data and compare on re-enrichment to detect when a company adds or drops a tool
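
The first and third ideas above both work on the comma-separated tech_stack_all string the script already produces. The sketch below shows one possible shape for each. The competitor names are placeholders, the helper names are hypothetical, and splitting on commas assumes no technology name contains a comma.

```python
# Placeholder list; replace with the products you actually compete with.
COMPETITOR_PRODUCTS = {"Salesforce", "Marketo"}


def split_stack(value):
    """Parse a 'A, B, C' string (as written by the script) into a set."""
    if not value or value == "None detected":
        return set()
    return {item.strip() for item in value.split(",") if item.strip()}


def uses_competitor(tech_stack_all, competitors=COMPETITOR_PRODUCTS):
    """True if any competitor name appears in the stack (case-insensitive)."""
    detected = {name.lower() for name in split_stack(tech_stack_all)}
    return any(c.lower() in detected for c in competitors)


def diff_stacks(previous, current):
    """Compare two tech_stack_all strings from successive enrichments."""
    before, after = split_stack(previous), split_stack(current)
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
    }
```

The boolean from `uses_competitor` could be written to the uses_competitor_product property, and a non-empty `diff_stacks` result could be logged or stored in a separate property of your choosing.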
