Rate limits in production: surviving HubSpot's per-app-install budget at scale.

A Worker that never throws a 429 at your user, with retry, backoff, and jitter tuned to HubSpot's published budget for your app's tier, fair sharing between syncs and on-demand actions, and a dashboard that tells you how close to the cap you are before HubSpot does. The reference page at /docs/rate-limits has the numbers; this guide has the patterns.

Time: ≈ 12 min
Outcome: A Worker that never throws a 429 to your user, with retry/backoff/jitter tuned for HubSpot's published budget, fair sharing across syncs and on-demand actions, and a dashboard that tells you how close to the cap you are before HubSpot does.
Prerequisites
  • A deployed HS-X project (if not, run through /docs/guides/getting-started first)
  • Your Worker is making real HubSpot calls — a sync, a workflow action, or a UI extension server function
  • Read access to your Cloudflare account so you can view Worker logs and metrics
  • Optional: the /docs/rate-limits reference open in another tab for the canonical numbers

Before you begin

The single most important thing to internalize: for a marketplace OAuth app, HubSpot rate-limits each install to 110 requests per 10 seconds, applied per (app, installed account). Each app you ship gets its own bucket on every portal it's installed in. Your app does not share that bucket with the Salesforce sync, the Zapier connector, or any other OAuth app the customer has installed. Private apps run on a different budget entirely (100 req/10s on Free/Starter, 190 req/10s on Pro/Enterprise). Confirm the exact tier numbers for your situation against the usage guidelines before tuning.

The 110/10s ceiling is a rolling window, not a fixed bucket that resets on the second: short spikes within the window are fine as long as the 10-second total stays under the cap. The other fact that shapes every decision in this guide is that batch endpoints cap at 100 records per request and count as 1 request against the limit, which makes batching the single largest lever you have for surviving high write volume. The full per-tier table lives at /docs/rate-limits; the rest of this page covers what to do with those numbers.

How HS-X tracks the budget

Every HubSpot response carries X-HubSpot-RateLimit-Daily, X-HubSpot-RateLimit-Daily-Remaining, X-HubSpot-RateLimit-Interval-Milliseconds, X-HubSpot-RateLimit-Max, and X-HubSpot-RateLimit-Remaining — the -Max/-Remaining pair describes the current rolling-window budget for your app on that portal. (HubSpot also still emits X-HubSpot-RateLimit-Secondly and -Secondly-Remaining, but those have been marked deprecated since 2018; don't build new code against them.) Those headers are the only honest source of remaining budget — counting calls in your own code drifts the moment another caller on the same app install makes a request. The runtime's HTTP layer reads them on every response and feeds them into a Durable Object token bucket keyed by (appId, portalId), which every sync, action, and tool on the Worker shares.
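Reading those headers by hand can be sketched as follows. This is a minimal sketch, not the runtime's implementation: `HeaderBag`, `RateLimitSnapshot`, and `readRateLimit` are hypothetical names, and the only things taken from HubSpot are the header names listed above.

```typescript
// Minimal structural type so the sketch works with a Web `Headers` object
// or any test double that exposes `get`.
interface HeaderBag {
  get(name: string): string | null;
}

interface RateLimitSnapshot {
  max: number;        // X-HubSpot-RateLimit-Max
  remaining: number;  // X-HubSpot-RateLimit-Remaining
  intervalMs: number; // X-HubSpot-RateLimit-Interval-Milliseconds
}

// Returns null when any header is missing or not numeric, so callers can
// tell "no data" apart from "zero budget left".
function readRateLimit(headers: HeaderBag): RateLimitSnapshot | null {
  const num = (name: string): number | null => {
    const raw = headers.get(name);
    if (raw === null || raw.trim() === '') return null;
    const n = Number(raw);
    return Number.isFinite(n) ? n : null;
  };
  const max = num('X-HubSpot-RateLimit-Max');
  const remaining = num('X-HubSpot-RateLimit-Remaining');
  const intervalMs = num('X-HubSpot-RateLimit-Interval-Milliseconds');
  if (max === null || remaining === null || intervalMs === null) return null;
  return { max, remaining, intervalMs };
}
```

Calling `readRateLimit(res.headers)` after every response gives you the numbers to log or to feed into whatever limiter you run in front of HubSpot.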

Budget topology
01 Your callers: syncs · workflow actions · UI server functions · agent tools
02 Shared bucket: Durable Object · keyed by (appId, portalId) · reads response headers
03 HubSpot: 110 req / 10s rolling (per app install) · Search API has its own 5 req/s

One more thing worth knowing up front: the Search API has its own separate 5-req/s cap that does not count against the 110/10s pool. It also caps page size at 200 records and won't paginate past a 10,000-record total per query. Hot paths through Search are the single most common cause of "we got 429s but our dashboard said we had headroom" tickets. We cover what to do about it in step 4.

Wrap HubSpot calls with retry + backoff

Design preview

The http.fetch retry helper, the http.batch.upsert wrapper, the shared (appId, portalId) token bucket, worker.metrics.enable(), worker.action(..., { concurrency: 1 }), worker.sync(..., { batch }), and the hs-x doctor --ratelimit command described in this guide are documented design surface for the runtime — they're how we plan for the patterns to land. Today, your handler receives context.fetch (a standard Web fetch bound to the installed account's OAuth token), and you wire retries, jitter, batching, and metrics yourself. The snippets below show the design-preview API; the real example after them is what you can implement against today's SDK.

You almost never want to call raw fetch against HubSpot without a retry layer, because there are five things you'd otherwise reimplement badly: reading the rate-limit headers on every response, serializing through a shared token bucket, retrying 429 and 5xx with exponential backoff plus jitter, surfacing the request to the dev-mode log with timing, and normalizing the hubspot-api-nodejs "throws on 404" misbehaviour into a regular response with a status field.

// Design preview — not yet wired in @hs-x/sdk.
import { defineWorker } from '@hs-x/sdk';
 
export default defineWorker(({ worker }) => {
  worker.action('enrich-contact', async ({ input, http }) => {
    const res = await http.fetch(`/crm/v3/objects/contacts/${input.id}`);
    if (res.status === 404) return { found: false };
    return { found: true, contact: res.body };
  });
});

The interesting bits are the parts that touch the rate limit:

  • Header parsing. Every response updates the shared bucket with the freshest X-HubSpot-RateLimit-Remaining value HubSpot reported. Even a request that succeeds feeds the next request's decision about whether to wait.
  • Pre-flight wait. Before the call goes out, the layer asks the bucket for a token. If headroom is below 5% (configurable), the call sleeps until the rolling window opens or yields to a higher-priority caller. You see this in the dev log as a RATELIMIT WAIT 412ms line.
  • Retry on 429. If HubSpot returns 429 anyway, the layer reads the Retry-After header, waits, and retries up to the configured max times with exponential backoff and full jitter (a uniform random wait between zero and the current backoff ceiling). The retry budget is per-call, not global, so a slow loop doesn't starve fast ones.
  • Idempotency. For GET, PUT, and DELETE the retry is unconditional. For POST it retries only when the response is a 429 or a 5xx that arrived before the body was acknowledged. That avoids the classic "we created the contact twice because the timeout happened mid-write" bug.
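The idempotency rule in the last bullet can be approximated in hand-rolled code. The sketch below is deliberately more conservative than the planned layer: it cannot tell whether a 5xx arrived before the body was acknowledged, so it never retries a POST on 5xx. `isRetryable` is a hypothetical helper, and treating PATCH as non-idempotent is an assumption of this sketch.

```typescript
type Method = 'GET' | 'PUT' | 'DELETE' | 'POST' | 'PATCH';

// Methods this sketch treats as safe to repeat (assumption: PATCH is not).
const IDEMPOTENT: ReadonlySet<Method> = new Set<Method>(['GET', 'PUT', 'DELETE']);

function isRetryable(method: Method, status: number): boolean {
  // A 429 means the request was turned away by the limiter, so repeating it is safe.
  if (status === 429) return true;
  // Success, redirects, and caller errors are final.
  if (status < 500 || status > 599) return false;
  // 5xx: the write may or may not have landed, so only repeat idempotent calls.
  return IDEMPOTENT.has(method);
}
```

In a retry loop, this check would replace a bare "429 or 5xx" test so that a POST that failed mid-write is surfaced to the caller instead of being silently repeated.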

What to do today: fetch-with-retry against context.fetch

The same shape, hand-rolled against the runtime you have right now:

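async function hubspotFetch(
  fetch: typeof globalThis.fetch,
  url: string,
  init: RequestInit = {},
  retry = { max: 5, baseMs: 200, capMs: 10_000 },
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, init);
    // Anything other than 429 or 5xx goes straight back to the caller.
    if (res.status !== 429 && res.status < 500) return res;
    // Out of retries: hand back the last failing response.
    if (attempt >= retry.max) return res;
    // Release the unread body before retrying so the connection can be reused.
    await res.body?.cancel();
    // Retry-After in seconds; a missing or non-numeric header falls through to backoff.
    const retryAfter = Number(res.headers.get('retry-after')) * 1000;
    const backoff = Math.min(retry.capMs, retry.baseMs * 2 ** attempt);
    const wait = Number.isFinite(retryAfter) && retryAfter > 0
      ? retryAfter
      : Math.random() * backoff; // full jitter: uniform in [0, backoff)
    await new Promise((r) => setTimeout(r, wait));
  }
}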

Calling raw fetch from your handler still counts against the portal limit, but nothing else in the Worker sees that the budget moved. If you go this route, funnel every HubSpot call through the helper above so backoff state is at least consistent within a single invocation.

Tune the retry policy per call site

The planned defaults are tuned for background work: { max: 5, baseMs: 200, capMs: 10_000 }. That's correct for a sync because the alternative — failing fast — means you lose the record. It's wrong for an interactive UI extension or a workflow action where the user is staring at a spinner. Override per call. (The http.fetch(url, { retry }) option is still design-preview — for now, pass a per-call retry object into the hubspotFetch helper above.)

// Design preview API. The three calls below are independent examples, one per call site.

// Interactive: fail fast, surface the error, let the UI offer a retry button.
const dealRes = await http.fetch(`/crm/v3/objects/deals/${id}`, {
  retry: { max: 1, baseMs: 100, capMs: 500 },
});

// Sync engine: patient, retries hard, takes its time.
const upsertRes = await http.fetch('/crm/v3/objects/contacts/batch/upsert', {
  method: 'POST',
  body: chunk,
  retry: { max: 8, baseMs: 500, capMs: 30_000 },
});

// Webhook handler: in between. The portal will retry the webhook itself
// if you 500, so don't sit on the connection forever.
const webhookRes = await http.fetch(url, {
  retry: { max: 3, baseMs: 200, capMs: 4_000 },
});

Picking max, baseMs, and capMs

A short tour of the three knobs and how they interact, because picking values without a model in your head usually produces something that's wrong in the worst way.

  • max is the number of retries, not attempts. max: 5 means up to 6 total calls. Set it low when there's a human waiting (1 or 2), high when there isn't (5 to 8). Beyond 8 you're rarely getting more reliability, just longer tail latency on a portal that's already saturated.
  • baseMs is the first backoff. The actual wait before retry N (counting from 0) is roughly random(0, min(capMs, baseMs * 2^N)). That scheme is usually called full jitter, and it avoids the synchronized-retry-storm failure mode where every caller backs off the same amount and slams the API at the same instant. Start at 200ms for interactive, 500ms for background.
  • capMs is the ceiling on a single backoff. Important when max is high, because with baseMs at 200, baseMs * 2^8 = 51,200ms, which is almost certainly longer than you want to wait. Set it to your user-perceptible budget for interactive calls (say 500ms to 1s) and to your sync's tolerance for tail latency for background work (10s to 30s is normal).
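To see how the three knobs interact, it helps to compute the backoff ceilings directly. The sketch below assumes the loop sleeps once before each retry, draws each wait uniformly between zero and the ceiling (the scheme often called full jitter), and ignores Retry-After. `RetryPolicy`, `backoffCeilingMs`, `jitteredWaitMs`, and `worstCaseWaitMs` are hypothetical names.

```typescript
interface RetryPolicy {
  max: number;    // number of retries, not attempts
  baseMs: number; // first backoff ceiling
  capMs: number;  // ceiling on any single backoff
}

// Upper bound of the backoff before retry n (n = 0 is the first retry).
function backoffCeilingMs(policy: RetryPolicy, n: number): number {
  return Math.min(policy.capMs, policy.baseMs * 2 ** n);
}

// A uniform draw between 0 and the ceiling. `rng` is injectable so the
// function can be tested deterministically.
function jitteredWaitMs(
  policy: RetryPolicy,
  n: number,
  rng: () => number = Math.random,
): number {
  return rng() * backoffCeilingMs(policy, n);
}

// Worst-case time spent sleeping across all retries, i.e. every draw
// lands on its ceiling. Request time itself is not included.
function worstCaseWaitMs(policy: RetryPolicy): number {
  let total = 0;
  for (let n = 0; n < policy.max; n++) total += backoffCeilingMs(policy, n);
  return total;
}
```

With the background defaults `{ max: 5, baseMs: 200, capMs: 10_000 }` the worst-case sleep is 6,200 ms. With `{ max: 8, baseMs: 500, capMs: 30_000 }` it is 91,500 ms, which is worth knowing before you put that policy inside anything that runs on a one-minute schedule.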

When the defaults are wrong on purpose

Two cases worth calling out. First, if your call is inside a workflow action that HubSpot itself will retry on failure, you want max: 1 to fail fast and let HubSpot's own retry handle it. Otherwise you've built two retry loops stacked on each other and your timing math is wrong. Second, if you're in a scheduled (cron) job that runs every minute, keep the worst-case total wait (roughly max × capMs, plus request time) below 60_000 so a retry storm can't make the next invocation overlap the previous one. Setting capMs below 60_000 on its own is not enough, because capMs bounds a single backoff, not the sum.

Batch correctly with /batch/upsert

Single-record writes are the most common reason a healthy-looking Worker hits the rate limit. A sync that writes 1,000 contacts one at a time burns 1,000 requests; the same sync using /crm/v3/objects/contacts/batch/upsert burns 10. Every write path in production should be batched unless you have a specific reason it can't be.

// Design preview — the worker.sync `batch` option is documented here for
// the API we're building toward. Today, drive the batch endpoint manually
// using the hubspotFetch helper from step 1.
import { defineSource, defineWorker, env } from '@hs-x/sdk';
 
export default defineWorker(({ worker }) => {
  worker.sync(stripeCustomers, {
    into: 'contacts',
    schedule: '15m',
    schema: {
      email: 'email',
      stripe_customer_id: 'string',
      mrr_cents: 'number',
      plan: { type: 'enum', values: ['free', 'starter', 'pro', 'enterprise'] },
    },
    // The two knobs are the chunk size (default 100, which is the HubSpot
    // max) and the idProperty used for upsert.
    batch: { size: 100, idProperty: 'stripe_customer_id' },
  });
});

If you need to write outside a sync — say, in a workflow action that processes a list of records — the design-preview http.batch.upsert helper will chunk the array into 100-record requests, run them with the configured concurrency, and return the combined result with per-record success and failure:

// Design preview — wrap a real POST to /crm/v3/objects/contacts/batch/upsert today.
const result = await http.batch.upsert('contacts', records, {
  idProperty: 'stripe_customer_id',
  concurrency: 2,
});
// result.created, result.updated, result.failed are all typed arrays.

The idProperty rules that bite people

HubSpot's /batch/upsert endpoint is upsert by a specific property, and the rules around that property are not symmetric with single-record upsert. Three things to know before you ship a batch write:

  • The idProperty must be unique on the object. That's email and hs_object_id for contacts out of the box, plus any custom property you've marked unique in property settings. If you try to upsert by a non-unique property, the call 400s on every record in the batch.
  • email as an idProperty does not support partial upsert on contacts. HubSpot's object APIs require a custom unique property when you want partial upserts on contacts — passing idProperty=email will reject the partial-upsert path. The fix is to declare a custom unique property (for example stripe_customer_id) and upsert by that; reserve email for the create path.
  • Properties not in the payload are not cleared. This is the upsert-vs-replace distinction. Sending { email, mrr_cents: 0 } sets mrr_cents to 0 and leaves every other property alone. If you want to clear a property explicitly, send it as null — omitting it does nothing.

Watching the chunk boundary

The 100-record cap is hard. http.batch.upsert chunks for you, but if you're rolling your own (say, because you need custom error handling per chunk), the call shape is POST /crm/v3/objects/{object}/batch/upsert with { inputs: [...100 records] }. Counting against the limit, this is one request, which is the whole point. A 1,000-record sync at 100-per-chunk is 10 requests; running them serially fits inside a single 10-second window with 100 requests of headroom for everything else on the portal.
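If you do roll your own chunking, it can be sketched like this. The `{ inputs: [...] }` envelope comes from the call shape above; the per-record shape `{ idProperty, id, properties }` is an assumption of this sketch and should be checked against the HubSpot API reference. `chunk` and `toBatchBodies` are hypothetical helpers.

```typescript
type PropertyValue = string | number | null;
type CrmRecord = Record<string, PropertyValue>;

// Assumed per-record shape for a batch upsert input; verify before shipping.
interface UpsertInput {
  idProperty: string;
  id: string;
  properties: CrmRecord;
}

const MAX_BATCH = 100; // per-request record cap for batch endpoints

// Splits `items` into consecutive groups of at most `size` elements.
function chunk<T>(items: readonly T[], size: number = MAX_BATCH): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// One JSON body per chunk, ready to POST to /crm/v3/objects/{object}/batch/upsert.
function toBatchBodies(records: readonly CrmRecord[], idProperty: string): string[] {
  const inputs: UpsertInput[] = records.map((properties) => {
    const id = properties[idProperty];
    if (id === null || id === undefined || id === '') {
      throw new Error(`record is missing ${idProperty}`);
    }
    return { idProperty, id: String(id), properties };
  });
  return chunk(inputs).map((group) => JSON.stringify({ inputs: group }));
}
```

For 1,000 records `toBatchBodies` returns 10 bodies, which is the 10-request figure from the paragraph above. Send each body through your retry wrapper so that a 429 on one chunk does not abandon the rest.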

Avoid the Search API in hot paths

The Search API (/crm/v3/objects/{object}/search) has its own rate limit that is not shared with the main pool, and it is tight: five requests per second, regardless of your subscription tier, with page size capped at 200 records and a 10,000-record hard ceiling per query. A UI extension that calls Search to filter a list of deals will hit the cap as soon as a handful of people refresh the same record in the same second. If your dashboard says you have plenty of headroom and you're still seeing 429s, this is almost always why.

| Pattern | API | Limit | Use when |
| --- | --- | --- | --- |
| Get one record by id | /crm/v3/objects/{type}/{id} | 110/10s app-install pool | You have the object id |
| Get by unique property | /crm/v3/objects/{type}/{id}?idProperty=email | 110/10s app-install pool | You have email or another unique prop |
| List paginated | /crm/v3/objects/{type} | 110/10s app-install pool | You need every record |
| Filter by 1 property | Stored list or pre-indexed property | n/a | You filter on this property often |
| Filter by 2+ properties | /crm/v3/objects/{type}/search | 5 req/s separate cap, 200/page, 10k total | Genuinely ad-hoc, low-frequency |

When Search is a smell

A Search call in a UI extension's hot path usually means a property you should have stored isn't stored. If you find yourself writing search where stripe_customer_id = X, the fix is to write stripe_customer_id as a unique HubSpot property on the contact (declare it in your schema, add unique: true), and then use the much faster objects/{id}?idProperty=stripe_customer_id endpoint. That's the shared pool, no 5-req/s cap, and a single round trip.
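Building that lookup path is a one-liner, but the escaping is easy to get wrong when the unique value is an email or contains reserved characters. A minimal sketch, where `recordByUniquePropertyPath` is a hypothetical helper and the path shape is the one named above:

```typescript
// Builds the "get by unique property" path. Every dynamic segment is
// percent-encoded so values such as emails survive the round trip.
function recordByUniquePropertyPath(
  objectType: string,
  idProperty: string,
  value: string,
): string {
  const type = encodeURIComponent(objectType);
  const id = encodeURIComponent(value);
  const prop = encodeURIComponent(idProperty);
  return `/crm/v3/objects/${type}/${id}?idProperty=${prop}`;
}
```

For example, `recordByUniquePropertyPath('contacts', 'stripe_customer_id', 'cus_123')` yields `/crm/v3/objects/contacts/cus_123?idProperty=stripe_customer_id`, which you can pass to your retry wrapper in place of a Search call.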

The other common Search anti-pattern is the dashboard query — "show me deals modified in the last hour, owned by user X, in stage Y." The right fix there is to run that query as part of a 15-minute sync into a small materialized object you control (a custom object or a KV namespace on the Worker), and have the extension read from your cache. Search stays for genuinely ad-hoc, low-frequency reporting.

When you actually need Search

Sometimes you do need the live query — a workflow action that takes a free-text filter input from the user, a one-off report, a debug tool. For those, two things help. First, set retry: { max: 8, baseMs: 1_000, capMs: 15_000 } on the call, because 429s on Search are common and the backoff needs to be on the order of the 1-second window, not the 10-second one. Second, gate access — in the planned worker.action(..., { concurrency: 1 }) option, a single action is serialized across the whole Worker so a Search-heavy endpoint slows down instead of 429-ing. Today, you can approximate this with an in-memory mutex around the call inside your handler.
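The in-memory mutex mentioned above can be sketched as a promise chain. `Mutex` is a hypothetical helper, not part of the SDK. It serializes calls only within a single Worker isolate, because isolates do not share memory, so treat it as an approximation of the planned concurrency option and not an equivalent.

```typescript
class Mutex {
  // Resolves when the most recently queued task has settled.
  private tail: Promise<void> = Promise.resolve();

  // Runs `task` only after every previously queued task has settled,
  // and resolves or rejects with that task's own outcome.
  run<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task);
    // Swallow the outcome on the chain itself so one failed task does
    // not block the tasks queued behind it.
    this.tail = result.then(
      () => undefined,
      () => undefined,
    );
    return result;
  }
}
```

Create one `Mutex` at module scope and wrap each Search call in `searchMutex.run(() => hubspotFetch(...))` (both names illustrative), so that concurrent handlers in the same isolate queue up instead of racing for the 5-req/s cap.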

Alert on headroom, not on 429 count

A 429 count alert is the wrong alert. It fires after the failure, it fires at a fixed threshold that bears no relationship to your portal's actual budget, and it tells you nothing about how close you came to failing on the requests that succeeded. The right metric is headroom — the minimum value of X-HubSpot-RateLimit-Remaining / X-HubSpot-RateLimit-Max the Worker has seen in the last N minutes.

Design preview

The planned worker.metrics.enable() helper and the suggested metric names (hubspot_ratelimit_headroom_pct, hubspot_ratelimit_429_total, etc.) are documented design surface — they're not yet emitted by the runtime. Today, log the X-HubSpot-RateLimit-Remaining / -Max headers from your retry wrapper and forward them to your observability stack (Cloudflare Analytics Engine, Datadog, Grafana) yourself.

Alert when the 5-minute minimum drops below 20%, page when it drops below 5%. By the time you're seeing 429s, headroom has been at zero for at least one 10-second window — the headroom alert gives you minutes of lead time on a problem the 429 alert tells you about after it's already user-visible.
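Until the runtime emits this metric, the 5-minute minimum and the two thresholds can be tracked by hand. The sketch below assumes you feed it the Remaining and Max values from each response; `HeadroomWindow` and `alertLevel` are hypothetical names, and state is per isolate.

```typescript
type AlertLevel = 'ok' | 'warn' | 'page';

class HeadroomWindow {
  private samples: { at: number; pct: number }[] = [];
  private readonly windowMs: number;

  constructor(windowMs: number = 5 * 60_000) {
    this.windowMs = windowMs;
  }

  // Record one reading taken from the rate-limit headers of a response.
  record(remaining: number, max: number, at: number = Date.now()): void {
    if (max <= 0) return; // ignore malformed readings
    this.samples.push({ at, pct: (remaining / max) * 100 });
    this.prune(at);
  }

  // Lowest headroom percentage inside the window, or null with no samples.
  min(now: number = Date.now()): number | null {
    this.prune(now);
    if (this.samples.length === 0) return null;
    return Math.min(...this.samples.map((s) => s.pct));
  }

  private prune(now: number): void {
    const cutoff = now - this.windowMs;
    this.samples = this.samples.filter((s) => s.at >= cutoff);
  }
}

// Thresholds from the text: alert below 20%, page below 5%.
function alertLevel(minPct: number | null): AlertLevel {
  if (minPct === null) return 'ok';
  if (minPct < 5) return 'page';
  if (minPct < 20) return 'warn';
  return 'ok';
}
```

Because each isolate only sees its own responses, forward the raw readings to your observability stack and compute the minimum there if you need one number per (app, portal).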

// Design preview — not yet wired in @hs-x/sdk.
import { defineWorker } from '@hs-x/sdk';
 
export default defineWorker(({ worker }) => {
  // Will emit headroom, 429 total, and request-duration metrics by default.
  // Configure scrape targets and alerts in your observability stack of choice.
  worker.metrics.enable();
});

The four signals worth a dashboard tile

  • Headroom (gauge). Minimum remaining / max over the last 5 minutes, per (app, portal). The leading indicator. Anything under 20% sustained is "you have a runaway loop, or a sync just gained an order of magnitude of work."
  • 429 rate (per minute). Not the count — the rate, normalized by total request volume. A spike from 0.1% to 5% means something changed; a steady 0.2% means HubSpot has occasional flakiness and your retry logic is working as designed.
  • Retry depth p99. The 99th-percentile number of retries before a call succeeds. If this climbs from 0 to 3 over a day, you're operating closer to the cap than you think — every call is paying a backoff tax even when it succeeds.
  • Batch chunk count. How many /batch/upsert chunks the Worker has sent in the window. A sync that suddenly goes from 10 chunks to 200 means your source's row count grew an order of magnitude and your next problem is going to be the daily limit, not the 10-second one.

See /docs/guides/monitoring for the full metrics surface: how to scrape, what to graph, and which alert thresholds correlate with real incidents in production.

Common rate-limit failures

The patterns below cover most of the rate-limit support tickets we see. If you're hitting 429s and the cause isn't on this list, instrument your retry wrapper to log the X-HubSpot-RateLimit-Remaining value seen on every response, broken down by caller and endpoint.

"We get 429s in bursts even when our average is under cap."

The 110/10s ceiling is a rolling window, not a per-second cap with reset. A sync that opportunistically blasts 200 requests in a single second will exceed the rolling 110/10s instantly, even if the next nine seconds are quiet. There is no published "burst credit" multiplier and no documented recovery cooldown beyond what the rolling window naturally enforces. The fix is to either batch (step 3, you go from 5,000 calls to 50) or to clamp concurrency so that no 10-second span ever carries more requests than the window allows, however they are distributed inside it.
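One way to clamp is a client-side sliding-window limiter that refuses to admit more than 110 requests in any 10-second span. This is a sketch under stated assumptions: `SlidingWindowLimiter` is a hypothetical class, it counts only the calls that pass through it, and it lives in one isolate, so the response headers remain the authoritative view of the budget.

```typescript
class SlidingWindowLimiter {
  private sent: number[] = []; // timestamps (ms) of admitted requests, oldest first
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 110, windowMs: number = 10_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns 0 and records the request if it fits in the window;
  // otherwise returns how many milliseconds to wait before trying again.
  acquire(now: number = Date.now()): number {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have aged out of the rolling window.
    while (this.sent.length > 0 && (this.sent[0] as number) <= cutoff) {
      this.sent.shift();
    }
    if (this.sent.length < this.limit) {
      this.sent.push(now);
      return 0;
    }
    // Full: the next slot opens when the oldest request ages out.
    return (this.sent[0] as number) + this.windowMs - now;
  }
}
```

A caller loops on `acquire()`, sleeping for the returned number of milliseconds until it gets 0, and only then sends the request. In practice you would set the limit somewhat below 110 to leave room for callers that bypass the limiter.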

"Sync engine and a workflow action are fighting for budget."

This is the case the shared token bucket exists to handle, but only if both callers go through the same HTTP layer. If one of them is calling raw fetch or using hubspot-api-nodejs directly without a shared limiter, it bypasses the bucket and starves the other. The fix is to route both through the same retry/limit wrapper, then bias the interactive caller — in the design-preview API that's http.fetch(url, { priority: 'high' }). Today, you can approximate this by giving the interactive path its own short-max retry config and the sync a long-max one, so the sync naturally backs off further when contention shows up.
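Another way to bias toward the interactive caller, different from the retry-config approach described above, is to reserve a slice of the window for it. The sketch below is an illustration of that idea only: `mayProceed`, the `Priority` type, and the 20% default reserve are assumptions, not runtime behavior.

```typescript
type Priority = 'high' | 'low';

// Low-priority callers are held back once headroom falls to `reservePct`
// or below, leaving the rest of the window for high-priority work.
function mayProceed(
  priority: Priority,
  remaining: number,
  max: number,
  reservePct: number = 20,
): boolean {
  // No budget data or nothing left: nobody proceeds.
  if (max <= 0 || remaining <= 0) return false;
  if (priority === 'high') return true;
  return (remaining / max) * 100 > reservePct;
}
```

Feed it the latest Remaining and Max values seen on a response. A sync that gets `false` sleeps and re-checks, while a workflow action or UI server function keeps going until the budget is actually exhausted.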

"Search API limit confused with REST limit."

This is the single most common one. Your dashboard shows the main pool at 30% utilized, and your Search-heavy extension is still 429-ing. That's because the 5-req/s Search cap is a completely separate bucket. Track Search headroom independently (count Search 200/429 responses per second, alert on the ratio). The fix is to either move the query off Search (see step 4) or to serialize Search calls behind a mutex in your handler.
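Counting Search responses per second can be sketched with a small in-memory tally. `SearchStats` is a hypothetical class, the 300-second retention is an arbitrary choice for this sketch, and the counts cover a single isolate only.

```typescript
class SearchStats {
  // One bucket per wall-clock second: successful vs rate-limited responses.
  private buckets = new Map<number, { ok: number; limited: number }>();
  private readonly retainSeconds = 300;

  // Call once per Search response with its HTTP status.
  record(status: number, at: number = Date.now()): void {
    const sec = Math.floor(at / 1000);
    const bucket = this.buckets.get(sec) ?? { ok: 0, limited: 0 };
    if (status === 429) bucket.limited++;
    else if (status >= 200 && status < 300) bucket.ok++;
    this.buckets.set(sec, bucket);
    // Drop buckets older than the retention period to bound memory.
    this.buckets.forEach((_, key) => {
      if (key < sec - this.retainSeconds) this.buckets.delete(key);
    });
  }

  // Share of Search responses in the last `seconds` seconds that were
  // 429s. Returns 0 when there was no Search traffic at all.
  limitedRatio(seconds: number, now: number = Date.now()): number {
    const end = Math.floor(now / 1000);
    let ok = 0;
    let limited = 0;
    for (let s = end - seconds + 1; s <= end; s++) {
      const bucket = this.buckets.get(s);
      if (bucket) {
        ok += bucket.ok;
        limited += bucket.limited;
      }
    }
    const total = ok + limited;
    return total === 0 ? 0 : limited / total;
  }
}
```

Record the status of every Search response in your retry wrapper, then export `limitedRatio(60)` as its own series so that a Search problem shows up even while the main pool looks healthy.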

Where next