Enrich HubSpot companies with technographic data from BuiltWith using code

Medium complexity | Cost: $0 | Recommended

Prerequisites

  • Node.js 18+ or Python 3.9+ (the code below is the Python version and needs the requests package)
  • HubSpot private app token with crm.objects.companies.read and crm.objects.companies.write scopes
  • BuiltWith API key (Pro plan or above for API access)
  • Custom HubSpot company properties: tech_stack_crm, tech_stack_marketing, tech_stack_analytics, tech_stack_all, tech_enrichment_date
  • A scheduling environment: cron or GitHub Actions

Step 1: Set up the project

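# Install the only third-party dependency used by the script
pip install requests

# Test the BuiltWith API key with a single lookup (this counts as one lookup)
curl -s "https://api.builtwith.com/v21/api.json?KEY=$BUILTWITH_API_KEY&LOOKUP=hubspot.com" \
  | python3 -m json.tool | head -50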
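
Before running the later steps, it can help to confirm that both credentials are actually set. The following is a minimal sketch; `missing_env_vars` is a hypothetical helper name and is not part of either API.

```python
import os

# The two variables the script in Steps 2-4 reads from the environment
REQUIRED_VARS = ("HUBSPOT_TOKEN", "BUILTWITH_API_KEY")

def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not (env.get(name) or "").strip()]
```

Calling `missing_env_vars()` at the top of the script and exiting when the list is non-empty gives a clearer failure than the `KeyError` raised by `os.environ[...]`.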

Step 2: Fetch companies missing tech data from HubSpot

import requests
import os
import time
from datetime import datetime
 
HUBSPOT_TOKEN = os.environ["HUBSPOT_TOKEN"]
BUILTWITH_API_KEY = os.environ["BUILTWITH_API_KEY"]
HS_HEADERS = {"Authorization": f"Bearer {HUBSPOT_TOKEN}", "Content-Type": "application/json"}
 
def get_companies_without_tech(limit=50):
    companies = []
    after = 0
 
    while len(companies) < limit:
        resp = requests.post(
            "https://api.hubapi.com/crm/v3/objects/companies/search",
            headers=HS_HEADERS,
            json={
                "filterGroups": [{"filters": [{
                    "propertyName": "tech_stack_crm",
                    "operator": "NOT_HAS_PROPERTY"
                }]}],
                "properties": ["domain", "name"],
                "limit": min(100, limit - len(companies)),
                "after": after
            }
        )
        resp.raise_for_status()
        data = resp.json()
        companies.extend(data["results"])
 
        if data.get("paging", {}).get("next"):
            after = data["paging"]["next"]["after"]
            time.sleep(0.2)  # pause between pages to stay under the search rate limit
        else:
            break
 
    return companies
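
The `domain` values returned by this search come from CRM records, so they may carry a scheme, a `www.` prefix, or a path. A minimal sketch of cleaning them before the BuiltWith lookup follows; `normalize_domain` is a hypothetical helper, and stripping `www.` is an assumption about what you want to look up.

```python
from urllib.parse import urlparse

def normalize_domain(raw):
    """Reduce a URL-like string to a bare lowercase hostname, or None if empty."""
    if not raw or not raw.strip():
        return None
    value = raw.strip().lower()
    # urlparse only fills in the host when a scheme or leading // is present
    if "://" not in value:
        value = "//" + value
    host = urlparse(value).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host or None
```

In `main()`, passing the company's `domain` property through `normalize_domain` before calling `lookup_tech_stack` would also treat blank strings as missing domains.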

Step 3: Look up tech stacks via BuiltWith

CATEGORY_MAP = {
    "crm": ["crm"],
    "marketing": ["marketing-automation", "email", "marketing"],
    "analytics": ["analytics", "web-analytics"],
    "ecommerce": ["ecommerce", "shopping-cart", "payment"],
    "hosting": ["hosting", "cdn", "cloud-paas"],
}
 
def lookup_tech_stack(domain):
    """Call BuiltWith and categorize technologies."""
    resp = requests.get(
        "https://api.builtwith.com/v21/api.json",
        params={"KEY": BUILTWITH_API_KEY, "LOOKUP": domain},
        timeout=30,  # avoid hanging a scheduled run on a stalled connection
    )
    resp.raise_for_status()
    data = resp.json()
 
    # Extract technologies from the nested response
    results = data.get("Results", [])
    if not results:
        return None
 
    paths = results[0].get("Result", {}).get("Paths", [])
    if not paths:
        return None
 
    technologies = paths[0].get("Technologies", [])
    if not technologies:
        return None
 
    # Categorize
    categorized = {cat: [] for cat in CATEGORY_MAP}
    for tech in technologies:
        tag = (tech.get("Tag") or "").lower()
        cats = [c.lower() for c in (tech.get("Categories") or [])]
        for category, keywords in CATEGORY_MAP.items():
            if any(kw in tag for kw in keywords) or any(kw in c for kw in keywords for c in cats):
                categorized[category].append(tech["Name"])
                break
 
    return {
        "tech_stack_crm": ", ".join(categorized["crm"]) or "None detected",
        "tech_stack_marketing": ", ".join(categorized["marketing"]) or "None detected",
        "tech_stack_analytics": ", ".join(categorized["analytics"]) or "None detected",
        "tech_stack_all": ", ".join(t["Name"] for t in technologies),
        "tech_count": len(technologies),
    }

BuiltWith response for unknown domains

If BuiltWith doesn't recognize a domain, it may return an empty Results array or a result with no Paths. Check for empty or missing data at each nesting level rather than assuming the structure is fully populated. lookup_tech_stack returns None in each of these cases, and main() counts the company as skipped.
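
The level-by-level checks can be folded into one helper. This is a sketch that assumes the same `Results` → `Result` → `Paths` → `Technologies` nesting used in Step 3; `extract_technologies` is a hypothetical name. Unlike Step 3, which reads only the first path, it merges every path and de-duplicates by `Name`, which is a design choice rather than something the API requires.

```python
def extract_technologies(data):
    """Collect technology dicts from every path, tolerating missing levels."""
    seen = set()
    technologies = []
    for result in (data or {}).get("Results") or []:
        paths = ((result or {}).get("Result") or {}).get("Paths") or []
        for path in paths:
            for tech in (path or {}).get("Technologies") or []:
                name = tech.get("Name")
                # Keep the first occurrence of each technology name
                if name and name not in seen:
                    seen.add(name)
                    technologies.append(tech)
    return technologies
```

An empty list here plays the same role as the `None` returns in `lookup_tech_stack`.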

Step 4: Update HubSpot companies

def update_company_tech(company_id, tech_data):
    """Write tech stack data to HubSpot company."""
    properties = {
        **tech_data,
        "tech_enrichment_date": datetime.now().strftime("%Y-%m-%d"),
    }
    # Remove tech_count from HubSpot update (internal metric only)
    properties.pop("tech_count", None)
 
    resp = requests.patch(
        f"https://api.hubapi.com/crm/v3/objects/companies/{company_id}",
        headers=HS_HEADERS,
        json={"properties": properties}
    )
    resp.raise_for_status()
 
def main():
    companies = get_companies_without_tech(limit=50)
    print(f"Found {len(companies)} companies to enrich\n")
 
    enriched = 0
    skipped = 0
 
    for company in companies:
        domain = company["properties"].get("domain")
        name = company["properties"].get("name", "Unknown")
 
        if not domain:
            print(f"  {name} — no domain, skipping")
            skipped += 1
            continue
 
        tech_data = lookup_tech_stack(domain)
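        if not tech_data:
            # Nothing is written to HubSpot here, so the Step 2 search returns
            # this company again next run, which may use another BuiltWith call.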
            print(f"  {name} ({domain}) — no tech data found")
            skipped += 1
            time.sleep(2)
            continue
 
        update_company_tech(company["id"], tech_data)
        enriched += 1
        print(f"  {name} ({domain}) — {tech_data['tech_count']} technologies")
        print(f"    CRM: {tech_data['tech_stack_crm']}")
        print(f"    Marketing: {tech_data['tech_stack_marketing']}")
        print(f"    Analytics: {tech_data['tech_stack_analytics']}")
 
        time.sleep(2)  # BuiltWith rate limit
 
    print(f"\nDone. Enriched: {enriched}, Skipped: {skipped}")
 
if __name__ == "__main__":
    main()
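
As written, any HTTP error raised by `raise_for_status()` stops the whole run. One way to soften that is a small retry wrapper with exponential backoff. This is a sketch only: `call_with_retry` and its parameters are hypothetical, and the retry count and delays are assumptions to tune for your plan.

```python
import time

def call_with_retry(func, *args, retries=3, base_delay=2.0,
                    retry_on=(Exception,), sleep=time.sleep, **kwargs):
    """Call func; on a listed exception, wait and retry with doubling delays."""
    for attempt in range(retries + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == retries:
                raise  # out of attempts, surface the last error
            sleep(base_delay * (2 ** attempt))
```

In `main()`, this could look like `call_with_retry(lookup_tech_stack, domain, retry_on=(requests.RequestException,))`, so that only network and HTTP errors are retried.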

Step 5: Schedule the script

# .github/workflows/tech-enrichment.yml
name: Technographic Enrichment
on:
  schedule:
    - cron: '0 3 * * 0'  # Weekly on Sunday at 3 AM UTC
  workflow_dispatch: {}
jobs:
  enrich:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install requests
      - run: python tech_enrich.py
        env:
          HUBSPOT_TOKEN: ${{ secrets.HUBSPOT_TOKEN }}
          BUILTWITH_API_KEY: ${{ secrets.BUILTWITH_API_KEY }}

Rate limits

API             Limit                                     Delay
BuiltWith       Varies by plan (typically 1-2 req/sec)    2 seconds between calls
HubSpot Search  5 req/sec                                 200 ms between pages
HubSpot PATCH   150 req/10 sec                            No delay needed
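
The delays in the table can be enforced with a small throttle object instead of scattered `time.sleep` calls. This is a minimal sketch; the `Throttle` class is hypothetical, and the `clock` and `sleep` parameters exist only to make it testable.

```python
import time

class Throttle:
    """Block so that consecutive calls are at least min_interval seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

For example, `builtwith_throttle = Throttle(2.0)` and `search_throttle = Throttle(0.2)` match the table, with `wait()` called just before each request. Because only the remaining time is slept, work done between calls counts toward the interval.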

Cost

  • BuiltWith Pro: $295/mo for 500 API calls ($0.59/lookup). Enterprise: $495/mo for 2,000 calls ($0.25/lookup).
  • HubSpot: Free within API rate limits.
  • GitHub Actions: Free tier (2,000 min/month).
  • Budget tip: At 50 companies/week, you'll use roughly 200 to 220 calls/month, well within the Pro plan's 500-call limit. Domains that return no tech data still use a lookup.

BuiltWith bills per lookup, not per technology

You pay 1 API call per domain lookup, regardless of how many technologies BuiltWith detects on that domain. A domain with 3 technologies costs the same as one with 300. This makes BuiltWith more cost-effective for companies with large tech stacks.
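
The arithmetic behind these figures can be checked with a few lines. The sketch below uses only the plan prices and call limits quoted in this section; the function names are hypothetical. Four weekly runs give 200 lookups, while averaging 52 weeks over 12 months gives about 217.

```python
def monthly_lookups(companies_per_week, weeks_per_month=52 / 12):
    """Average lookups per month for a weekly batch of the given size."""
    return round(companies_per_week * weeks_per_month)

def cost_per_lookup(plan_price, included_calls):
    """Effective price of one lookup if the whole allowance is used."""
    return plan_price / included_calls

def fits_plan(companies_per_week, included_calls):
    """True if the average monthly volume stays within the plan's allowance."""
    return monthly_lookups(companies_per_week) <= included_calls
```

With the quoted numbers, `cost_per_lookup(295, 500)` is 0.59 and `cost_per_lookup(495, 2000)` is 0.2475. The per-lookup price only holds if every included call is used, so at about 217 lookups a month the effective Pro cost is higher than $0.59.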

Next steps

  • Detect specific competitors: add a check for competitor product names and set a boolean uses_competitor_product property for easy filtering
  • Segment by tech maturity: companies with 50+ technologies are likely tech-savvy enterprises; companies with 5-10 are leaner. The script computes tech_count but removes it before the HubSpot update in Step 4, so to segment on it, create a numeric tech_count company property and stop removing it.
  • Track changes over time: store previous tech stack data and compare on re-enrichment to detect when a company adds or drops a tool
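
The competitor check and the change tracking can both work from the comma-separated `tech_stack_all` string written in Step 4. The following is a minimal sketch; the function names are hypothetical, and it assumes technology names do not themselves contain commas.

```python
def parse_stack(value):
    """Split a comma-separated tech stack string into a set of names."""
    if not value or value == "None detected":
        return set()
    return {name.strip() for name in value.split(",") if name.strip()}

def uses_competitor(stack_value, competitors):
    """True if any competitor name matches a technology name, ignoring case."""
    stack = {name.lower() for name in parse_stack(stack_value)}
    return any(c.lower() in stack for c in competitors)

def diff_stacks(previous, current):
    """Report technologies added or dropped between two enrichment runs."""
    old, new = parse_stack(previous), parse_stack(current)
    return {"added": sorted(new - old), "dropped": sorted(old - new)}
```

The result of `uses_competitor` could be written to the `uses_competitor_product` property, and `diff_stacks` could compare the stored `tech_stack_all` value with the freshly fetched one before overwriting it. The competitor list itself is yours to supply.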
