Enrich HubSpot companies with technographic data from BuiltWith using code
Prerequisites
- Node.js 18+ or Python 3.9+
- HubSpot private app token with crm.objects.companies.read and crm.objects.companies.write scopes
- BuiltWith API key (Pro plan or above for API access)
- Custom HubSpot company properties: tech_stack_crm, tech_stack_marketing, tech_stack_analytics, tech_stack_all, tech_enrichment_date
- A scheduling environment: cron or GitHub Actions
Why code?
A script gives you full control over the category taxonomy, error handling, and batch logic. You can customize the CATEGORY_MAP to match your exact competitive landscape, add competitor detection flags, and export results to CSV for team review. Zero platform fees — the only cost is BuiltWith credits.
The trade-off is maintenance. You own the rate limiting, retry logic, and monitoring. There's no visual execution history. If non-technical team members need to adjust the category mapping or trigger a run, use n8n or Make instead.
Step 1: Set up the project
# Test the BuiltWith API
curl "https://api.builtwith.com/v21/api.json?KEY=$BUILTWITH_API_KEY&LOOKUP=hubspot.com" \
  | python3 -m json.tool | head -50

Step 2: Fetch companies missing tech data from HubSpot
import os
import time
from datetime import datetime

import requests

HUBSPOT_ACCESS_TOKEN = os.environ["HUBSPOT_ACCESS_TOKEN"]
BUILTWITH_API_KEY = os.environ["BUILTWITH_API_KEY"]
HS_HEADERS = {"Authorization": f"Bearer {HUBSPOT_ACCESS_TOKEN}", "Content-Type": "application/json"}


def get_companies_without_tech(limit=50):
    """Return up to `limit` companies that have no tech_stack_crm value yet."""
    companies = []
    after = 0
    while len(companies) < limit:
        resp = requests.post(
            "https://api.hubapi.com/crm/v3/objects/companies/search",
            headers=HS_HEADERS,
            json={
                "filterGroups": [{"filters": [{
                    "propertyName": "tech_stack_crm",
                    "operator": "NOT_HAS_PROPERTY"
                }]}],
                "properties": ["domain", "name"],
                "limit": min(100, limit - len(companies)),
                "after": after
            }
        )
        resp.raise_for_status()
        data = resp.json()
        companies.extend(data["results"])
        if data.get("paging", {}).get("next"):
            after = data["paging"]["next"]["after"]
            time.sleep(0.2)  # 200ms between pages (see Rate limits)
        else:
            break
    return companies

Step 3: Look up tech stacks via BuiltWith
CATEGORY_MAP = {
    "crm": ["crm"],
    "marketing": ["marketing-automation", "email", "marketing"],
    "analytics": ["analytics", "web-analytics"],
    "ecommerce": ["ecommerce", "shopping-cart", "payment"],
    "hosting": ["hosting", "cdn", "cloud-paas"],
}


def lookup_tech_stack(domain):
    """Call BuiltWith and categorize technologies."""
    resp = requests.get(
        "https://api.builtwith.com/v21/api.json",
        params={"KEY": BUILTWITH_API_KEY, "LOOKUP": domain}
    )
    resp.raise_for_status()
    data = resp.json()
    # Extract technologies from the nested response
    results = data.get("Results", [])
    if not results:
        return None
    paths = results[0].get("Result", {}).get("Paths", [])
    if not paths:
        return None
    technologies = paths[0].get("Technologies", [])
    if not technologies:
        return None
    # Categorize: each technology goes into the first category whose keyword matches
    categorized = {cat: [] for cat in CATEGORY_MAP}
    for tech in technologies:
        tag = (tech.get("Tag") or "").lower()
        cats = [c.lower() for c in (tech.get("Categories") or [])]
        for category, keywords in CATEGORY_MAP.items():
            if any(kw in tag for kw in keywords) or any(kw in c for kw in keywords for c in cats):
                categorized[category].append(tech["Name"])
                break
    return {
        "tech_stack_crm": ", ".join(categorized["crm"]) or "None detected",
        "tech_stack_marketing": ", ".join(categorized["marketing"]) or "None detected",
        "tech_stack_analytics": ", ".join(categorized["analytics"]) or "None detected",
        "tech_stack_all": ", ".join(t["Name"] for t in technologies),
        "tech_count": len(technologies),
    }

If BuiltWith doesn't recognize a domain, it may return an empty Results array or a result with no Paths. Always check for empty or missing data at each nesting level. Don't assume the structure is always fully populated.
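The nesting checks above can also be pulled into a single helper. The following is a minimal sketch, assuming the response shape the script already relies on (Results, then Result, then Paths, then Technologies). The name extract_technologies is hypothetical, the sample dictionaries are made up for illustration, and unlike the script this version collects technologies from every path rather than only the first.

```python
def extract_technologies(data):
    """Collect technology dicts from a BuiltWith-style response.

    Assumes the Results -> Result -> Paths -> Technologies nesting used by
    the script; any missing or null level is treated as empty.
    """
    technologies = []
    for result in (data or {}).get("Results") or []:
        paths = (result.get("Result") or {}).get("Paths") or []
        for path in paths:
            technologies.extend(path.get("Technologies") or [])
    return technologies


# Made-up sample responses: populated, empty, and partially populated.
full = {"Results": [{"Result": {"Paths": [
    {"Technologies": [{"Name": "HubSpot", "Tag": "crm"}]},
    {"Technologies": None},
]}}]}
print([t["Name"] for t in extract_technologies(full)])
print(extract_technologies({"Results": []}))
print(extract_technologies({"Results": [{"Result": {}}]}))
```

Collecting across all paths can return the same technology more than once, so deduplicate by name before joining the list if you adopt this approach.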
Step 4: Update HubSpot companies
def update_company_tech(company_id, tech_data):
    """Write tech stack data to HubSpot company."""
    properties = {
        **tech_data,
        "tech_enrichment_date": datetime.now().strftime("%Y-%m-%d"),
    }
    # Remove tech_count from HubSpot update (internal metric only)
    properties.pop("tech_count", None)
    resp = requests.patch(
        f"https://api.hubapi.com/crm/v3/objects/companies/{company_id}",
        headers=HS_HEADERS,
        json={"properties": properties}
    )
    resp.raise_for_status()


def main():
    companies = get_companies_without_tech(limit=50)
    print(f"Found {len(companies)} companies to enrich\n")
    enriched = 0
    skipped = 0
    for company in companies:
        domain = company["properties"].get("domain")
        name = company["properties"].get("name", "Unknown")
        if not domain:
            print(f"  {name} - no domain, skipping")
            skipped += 1
            continue
        tech_data = lookup_tech_stack(domain)
        if not tech_data:
            print(f"  {name} ({domain}) - no tech data found")
            skipped += 1
            time.sleep(2)
            continue
        update_company_tech(company["id"], tech_data)
        enriched += 1
        print(f"  {name} ({domain}) - {tech_data['tech_count']} technologies")
        print(f"    CRM: {tech_data['tech_stack_crm']}")
        print(f"    Marketing: {tech_data['tech_stack_marketing']}")
        print(f"    Analytics: {tech_data['tech_stack_analytics']}")
        time.sleep(2)  # BuiltWith rate limit
    print(f"\nDone. Enriched: {enriched}, Skipped: {skipped}")


if __name__ == "__main__":
    main()

Step 5: Schedule the script
# .github/workflows/tech-enrichment.yml
name: Technographic Enrichment
on:
  schedule:
    - cron: '0 3 * * 0' # Weekly on Sunday at 3 AM UTC
  workflow_dispatch: {}
jobs:
  enrich:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install requests
      - run: python tech_enrich.py
        env:
          HUBSPOT_ACCESS_TOKEN: ${{ secrets.HUBSPOT_ACCESS_TOKEN }}
          BUILTWITH_API_KEY: ${{ secrets.BUILTWITH_API_KEY }}

Rate limits
| API | Limit | Delay |
|---|---|---|
| BuiltWith | Varies by plan (typically 1-2 req/sec) | 2 seconds between calls |
| HubSpot Search | 5 req/sec | 200ms between pages |
| HubSpot PATCH | 150 req/10 sec | No delay needed |
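Fixed delays keep the script under these limits in normal operation, but an occasional 429 or 5xx response will still abort a run because the script calls raise_for_status() immediately. A minimal sketch of a retry wrapper with exponential backoff follows. The name call_with_backoff, the retry count, and the delay values are assumptions for illustration, not settings documented by either API.

```python
import time


def call_with_backoff(send, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is any zero-argument callable returning an object with a
    status_code attribute (for example, a requests.Response).
    Retries on 429 and 5xx, waiting base_delay * 2**attempt between tries.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        resp = send()
        retryable = resp.status_code == 429 or resp.status_code >= 500
        if not retryable or attempt == max_attempts - 1:
            return resp
        sleep(base_delay * (2 ** attempt))


# Demonstration with a stand-in response object instead of a real HTTP call.
class FakeResponse:
    def __init__(self, status_code):
        self.status_code = status_code


statuses = iter([429, 503, 200])
delays = []  # record the waits instead of actually sleeping
resp = call_with_backoff(lambda: FakeResponse(next(statuses)), sleep=delays.append)
print(resp.status_code, delays)  # 200 [2.0, 4.0]
```

In the script, the same wrapper could surround each call, for example call_with_backoff(lambda: requests.get(url, params=params)), followed by raise_for_status() on the returned response.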
Cost
- BuiltWith Pro: $295/mo for 500 API calls ($0.59/lookup). Enterprise: $495/mo for 2,000 calls ($0.25/lookup).
- HubSpot: Free within API rate limits.
- GitHub Actions: Free tier (2,000 min/month).
- Budget tip: At 50 companies/week, you'll use 200 calls/month — well within the Pro plan's 500-call limit.
You pay 1 API call per domain lookup, regardless of how many technologies BuiltWith detects on that domain. A domain with 3 technologies costs the same as one with 300. This makes BuiltWith more cost-effective for companies with large tech stacks.
Common questions
How much does it cost to enrich 50 companies per week?
50 BuiltWith lookups/week = 200/month. On the Pro plan ($295/mo, 500 calls), that's $1.48 per lookup when you factor in the monthly fee. GitHub Actions and HubSpot are free. Total: $295/mo for BuiltWith + $0 for compute.
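The difference between the list price per call and the effective price at partial usage is simple arithmetic. Here is a small sketch using the plan figures quoted in this guide; the four-weeks-per-month approximation and the function name cost_per_lookup are assumptions.

```python
def cost_per_lookup(monthly_fee, lookups):
    """Monthly plan fee divided by a number of lookups."""
    return monthly_fee / lookups


PRO_FEE, PRO_QUOTA = 295, 500  # Pro plan figures quoted in this guide
used = 50 * 4                  # 50 companies/week, approximated as 4 weeks/month

list_price = cost_per_lookup(PRO_FEE, PRO_QUOTA)  # price if the whole quota is used
effective = cost_per_lookup(PRO_FEE, used)        # price at actual usage
print(f"list: ${list_price:.2f}, effective: ${effective:.3f}, unused calls: {PRO_QUOTA - used}")
```

The effective price falls toward the list price as usage approaches the quota, so raising the weekly batch size is the cheapest way to lower the cost per lookup.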
What if BuiltWith's tag taxonomy changes?
BuiltWith occasionally updates tag names. The script's CATEGORY_MAP uses substring matching, so a variant that still contains a keyword (say, a hypothetical "crm-platform") is still caught, but a full rename (e.g., "crm" becoming "customer-relationship-management") is not. Once a quarter, log every technology that matches no category and update the map.
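One way to run that quarterly check is to list every technology whose tag and categories match no keyword. This is a minimal sketch: find_uncategorized is a hypothetical helper, the map is copied from the script, and the sample technologies are made up for illustration.

```python
CATEGORY_MAP = {
    "crm": ["crm"],
    "marketing": ["marketing-automation", "email", "marketing"],
    "analytics": ["analytics", "web-analytics"],
    "ecommerce": ["ecommerce", "shopping-cart", "payment"],
    "hosting": ["hosting", "cdn", "cloud-paas"],
}


def find_uncategorized(technologies, category_map):
    """Return sorted (name, tag) pairs for technologies matching no keyword."""
    keywords = [kw for kws in category_map.values() for kw in kws]
    unmatched = []
    for tech in technologies:
        tag = (tech.get("Tag") or "").lower()
        cats = [c.lower() for c in (tech.get("Categories") or [])]
        labels = [tag] + cats
        # Same substring test the script uses, applied to the tag and every category
        if not any(kw in label for kw in keywords for label in labels):
            unmatched.append((tech.get("Name", "?"), tag))
    return sorted(unmatched)


# Made-up sample data
sample = [
    {"Name": "HubSpot", "Tag": "crm", "Categories": []},
    {"Name": "Renamed CRM", "Tag": "customer-relationship-management", "Categories": None},
    {"Name": "Some CDN", "Tag": "cdn"},
]
for name, tag in find_uncategorized(sample, CATEGORY_MAP):
    print(f"uncategorized: {name} (tag: {tag})")
```

Most technologies on a typical site fall outside the five categories by design, so scan the report for tags that look like renamed versions of the ones you track rather than trying to empty it.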
Can I batch the HubSpot updates instead of patching one at a time?
Yes. HubSpot offers a batch update endpoint (POST /crm/v3/objects/companies/batch/update) that accepts up to 100 records per request. This reduces the number of HubSpot API calls but adds complexity to the update payload construction. For batches under 100, individual PATCH calls are simpler and well within rate limits.
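For larger runs, the batch payload can be built with a small chunking helper. The sketch below assumes you first collect updates in a dictionary keyed by company ID. build_batch_payloads and send_batches are hypothetical names, and the 30-second timeout is an arbitrary choice.

```python
def build_batch_payloads(updates, batch_size=100):
    """Split {company_id: properties} into batch/update request bodies.

    Each body has the shape {"inputs": [{"id": ..., "properties": {...}}]}
    with at most batch_size records.
    """
    inputs = [{"id": str(cid), "properties": props} for cid, props in updates.items()]
    return [
        {"inputs": inputs[i:i + batch_size]}
        for i in range(0, len(inputs), batch_size)
    ]


def send_batches(payloads, headers):
    """POST each payload to the HubSpot batch update endpoint."""
    import requests  # third-party; imported here so the helper above has no dependencies

    for payload in payloads:
        resp = requests.post(
            "https://api.hubapi.com/crm/v3/objects/companies/batch/update",
            headers=headers,
            json=payload,
            timeout=30,
        )
        resp.raise_for_status()


# 250 made-up updates split into requests of 100, 100, and 50 records
updates = {str(i): {"tech_stack_crm": "None detected"} for i in range(250)}
payloads = build_batch_payloads(updates)
print([len(p["inputs"]) for p in payloads])
```

Batching trades per-record error isolation for fewer calls, so log the company IDs in any batch that fails and decide whether to retry them individually.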
Next steps
- Detect specific competitors: add a check for competitor product names and set a boolean uses_competitor_product property for easy filtering.
- Segment by tech maturity: companies with 50+ technologies are likely tech-savvy enterprises; companies with 5-10 are leaner. Use tech_count for segmentation. The script currently drops tech_count before the HubSpot update, so write it to a custom number property first.
- Track changes over time: store previous tech stack data and compare on re-enrichment to detect when a company adds or drops a tool.
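The competitor check and the change tracking can both work on the comma-separated tech_stack_all string the script already writes. The following is a rough sketch: the function names and the competitor list are placeholders, and it assumes no technology name contains a comma.

```python
def split_stack(stack):
    """Turn a 'A, B, C' string into a set of trimmed names."""
    return {name.strip() for name in stack.split(",") if name.strip()}


def uses_competitor(tech_stack_all, competitor_names):
    """True if any competitor name appears in the stack (case-insensitive)."""
    detected = {name.lower() for name in split_stack(tech_stack_all)}
    return any(c.lower() in detected for c in competitor_names)


def diff_stacks(previous, current):
    """Report which technologies were added or dropped between two runs."""
    prev, curr = split_stack(previous), split_stack(current)
    return {"added": sorted(curr - prev), "dropped": sorted(prev - curr)}


COMPETITORS = ["Salesforce", "Marketo"]  # placeholder list; use your own
previous = "HubSpot, Google Analytics"
current = "Salesforce, Google Analytics"

print(uses_competitor(current, COMPETITORS))
print(diff_stacks(previous, current))
```

The boolean result maps directly to the uses_competitor_product property, and the diff can be stored in a note or a separate property before the new stack overwrites the old one.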