
Rate limits

A Worker that never throws a 429 at your user, with retry, backoff, and jitter tuned to HubSpot's published budget for your app's tier, fair sharing between syncs and on-demand actions, and a dashboard that tells you how close to the cap you are before HubSpot does. The rate-limits reference has the numbers; this guide has the patterns.

Time: ≈ 12 min
Outcome: A Worker that never throws a 429 to your user, with retry/backoff/jitter tuned for HubSpot's published budget, fair sharing across syncs and on-demand actions, and a dashboard that tells you how close to the cap you are before HubSpot does.
Prerequisites:
A deployed HS-X project (if not, run through /docs/guides/getting-started first)
Your Worker is making real HubSpot calls — a sync, a workflow action, or a UI extension server function
Read access to your Cloudflare account so you can view Worker logs and metrics
Optional: the /docs/rate-limits reference open in another tab for the canonical numbers

TL;DR — HS-X's runtime rate limiter spends HubSpot's published budget for you: retry, backoff, and jitter tuned to your app's tier, fair sharing between syncs and on-demand actions, and visible headroom before HubSpot enforces it. Your user never sees a 429. Raw numbers live at /docs/rate-limits.

Before you begin

The single most important thing to internalize: for a marketplace OAuth app, HubSpot rate-limits each install to 110 requests per 10 seconds, applied per (app, installed account). Each app you ship gets its own bucket on every portal it's installed in; your app does not share that bucket with the Salesforce sync, the Zapier connector, or any other OAuth app the customer has installed. Private apps run on a different budget entirely (100 req/10s on Free/Starter, 190 req/10s on Pro/Enterprise). Confirm the exact tier numbers for your situation against the usage guidelines before tuning.

The 110/10s ceiling is a rolling window, not a fixed bucket that resets on the second: short spikes within the window are fine as long as the 10-second total stays under the cap. The other fact that shapes every decision in this guide is that batch endpoints cap at 100 records per request and count as 1 request against the limit, which is the single largest lever you have for surviving high write volume. The full per-tier table lives in the rate-limits reference; the rest of this page covers what to do with those numbers.

How HS-X tracks the budget

Every HubSpot response outside the Search API carries X-HubSpot-RateLimit-Daily, X-HubSpot-RateLimit-Daily-Remaining, X-HubSpot-RateLimit-Interval-Milliseconds, X-HubSpot-RateLimit-Max, and X-HubSpot-RateLimit-Remaining; the -Max/-Remaining pair describes the current rolling-window budget for your app on that portal. (HubSpot also still emits X-HubSpot-RateLimit-Secondly and -Secondly-Remaining, but those have been marked deprecated since 2018; don't build new code against them.) Those headers are the only honest source of remaining budget: counting calls in your own code drifts the moment another caller on the same app install makes a request. The runtime's HTTP layer reads them on every response that carries them and feeds them into a Durable Object token bucket keyed by (appId, portalId), which every sync, action, and tool on the Worker shares.
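As a rough illustration of what "reading the headers" involves, here is a minimal sketch. The names readBudget, Budget, and HeaderSource are hypothetical and not part of the HS-X SDK; the sketch only assumes an object with a Fetch-style get(name) method.

```typescript
// Hypothetical helper for illustration only; not an HS-X API.
// Anything with a Fetch-style `get` works here, including a real `Headers`.
type HeaderSource = { get(name: string): string | null };

interface Budget {
  max: number;        // X-HubSpot-RateLimit-Max
  remaining: number;  // X-HubSpot-RateLimit-Remaining
  intervalMs: number; // X-HubSpot-RateLimit-Interval-Milliseconds
  headroom: number;   // remaining / max, between 0 and 1
}

function readBudget(headers: HeaderSource): Budget | null {
  const num = (name: string): number | null => {
    const raw = headers.get(name);
    // Number('') is 0, so reject empty values explicitly.
    if (raw === null || raw.trim() === '') return null;
    const value = Number(raw);
    return Number.isFinite(value) ? value : null;
  };

  const max = num('X-HubSpot-RateLimit-Max');
  const remaining = num('X-HubSpot-RateLimit-Remaining');
  const intervalMs = num('X-HubSpot-RateLimit-Interval-Milliseconds');

  // Responses without usable budget headers yield no observation at all,
  // rather than a misleading "zero remaining".
  if (max === null || remaining === null || intervalMs === null || max <= 0) {
    return null;
  }
  return { max, remaining, intervalMs, headroom: remaining / max };
}
```

Returning null instead of a zeroed budget keeps a header-less response from being mistaken for an exhausted window.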

Budget topology

Your callers

syncs · workflow actions · UI server functions · agent tools

Shared bucket

Durable Object · keyed by (appId, portalId) · reads response headers

HubSpot

110 req / 10s rolling (per app install) · Search API has its own 5 req/s

One more thing worth knowing up front: the Search API has its own separate 5-req/s cap that does not count against the 110/10s pool. It also caps page size at 200 records and won't paginate past a 10,000-record total per query. Hot paths through Search are the single most common cause of "we got 429s but our dashboard said we had headroom" tickets. We cover what to do about it in step 4.

Use the managed ctx.hubspot client

You almost never want to call raw fetch against HubSpot. In an HS-X handler, HubSpot traffic should go through ctx.hubspot — either the typed SDK namespaces or the raw verb helpers like hubspot.get('/crm/v3/...'). That one boundary owns install OAuth tokens, rate-limit leases, live header observations, 429/5xx retry, Retry-After handling, and the separate Search bucket. ctx.http.fetch intentionally refuses HubSpot hosts so that API calls do not bypass that accounting.

import { defineWorker } from '@hs-x/sdk';
 
const worker = defineWorker('enrich');
 
worker.action('enrich-contact', {
  label: 'Enrich contact',
  input: { id: { type: 'string', required: true } },
  async handler({ input, hubspot }) {
    const contact = await hubspot.crm.objects.contacts.get(String(input.id), {
      properties: ['email', 'firstname', 'lastname'],
    });
 
    return { found: true, contact };
  },
});
 
export default worker;

The interesting bits are the parts that touch the rate limit:

Header parsing. Every response updates the shared bucket with the freshest X-HubSpot-RateLimit-Remaining value HubSpot reported. Even a request that succeeds feeds the next request's decision about whether to wait.
Pre-flight budget leases. Before the call goes out, the runtime asks the per-portal bucket for a short-lived lease. If local budget is already exhausted, the capability returns a structured retry-later/backpressure result instead of spending a real HubSpot request.
Retry on 429 and 5xx. If HubSpot returns 429 or a transient gateway error anyway, the layer retries up to the configured max times with exponential backoff and jitter. A short Retry-After is honored in-process; a long one becomes platform backpressure so the Worker does not sit open for minutes.
Search isolation. Paths with a /search segment draw from the conservative Search bucket instead of the general pool. Search responses omit rate-limit headers, so HS-X paces them by defaults plus 429 feedback.
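To make the lease idea concrete, here is a single-process model of a token bucket with an injectable clock. It is a sketch, not HS-X's implementation (the real bucket lives in a Durable Object), and the TokenBucket class and its method names are assumptions made for illustration.

```typescript
// Illustrative model only. Time is passed in explicitly so the behaviour is
// deterministic and easy to test.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefillMs = nowMs;
  }

  // Returns 0 when a token was taken, otherwise the milliseconds to wait
  // before one becomes available (the "retry later" signal).
  tryLease(nowMs: number): number {
    this.refill(nowMs);
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return 0;
    }
    return Math.ceil(((1 - this.tokens) * 1000) / this.refillPerSecond);
  }

  // Clamp the local estimate to what HubSpot last reported as remaining,
  // so other callers on the same install are accounted for.
  observe(remaining: number): void {
    this.tokens = Math.min(this.tokens, Math.max(0, remaining));
  }

  private refill(nowMs: number): void {
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSec * this.refillPerSecond,
    );
    this.lastRefillMs = nowMs;
  }
}
```

The observe step is what ties the local estimate back to the response headers: the bucket can only ever become more pessimistic from an observation, never more optimistic.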

Tune retry and bucket defaults

The managed defaults are tuned for background work: retry: { max: 5, baseMs: 200, capMs: 10_000 }, a general bucket at 100 tokens refilling 10 per second, and a Search bucket at 4 tokens refilling 4 per second. That is intentionally conservative: it stays below the published floor for OAuth apps and leaves headroom for other handlers in the same portal. Override the defaults in hsx.config.ts when your app has a different latency budget or an approved higher API tier.

import { defineApp } from '@hs-x/sdk';
 
export default defineApp({
  name: 'email-guard',
  distribution: 'marketplace',
  auth: 'oauth',
  platformVersion: '2026.03',
  scopes: ['crm.objects.contacts.read', 'crm.objects.contacts.write'],
  rateLimits: {
    retry: { max: 3, baseMs: 200, capMs: 4_000 },
    bucket: {
      // Keep the default general bucket, but make Search more conservative.
      searchCapacity: 3,
      searchRefillPerSecond: 3,
    },
  },
});

Set rateLimits.retry to false only when you want HubSpot/Cloudflare to handle retries outside the request. Use that sparingly: it removes the in-process protection that turns short 429s into a successful handler response. rateLimits.bucket.requestedTokensPerCall is an advanced escape hatch for expensive helper calls that represent more than one HubSpot request; most apps should leave it unset.

Picking max, baseMs, and capMs

A short tour of the three knobs and how they interact, because picking values without a model in your head usually produces something that's wrong in the worst way.

max is the number of retries, not attempts. max: 5 means up to 6 total calls. Set it low when there's a human waiting (1 or 2), high when there isn't (5 to 8). Beyond 8 you're rarely getting more reliability, just longer tail latency on a portal that's already saturated.
baseMs is the first backoff. The actual wait on attempt N is roughly random(0, min(capMs, baseMs * 2^N)). That's full jitter, which avoids the synchronized-retry-storm failure mode where every caller backs off the same amount and slams the API at the same instant. Start at 200ms for interactive, 500ms for background.
capMs is the ceiling on a single backoff. Important when max is high, because baseMs * 2^8 = 51,200ms is almost certainly longer than you want to wait. Set it to your user-perceptible budget for interactive calls (say 500ms to 1s) and to your sync's tolerance for tail latency for background work (10s to 30s is normal).
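The random(0, min(capMs, baseMs * 2^N)) formula from the list above can be sketched as follows. backoffMs is a hypothetical helper, not an HS-X export, and it assumes N starts at 0 for the first retry, which the guide does not state explicitly.

```typescript
// Hypothetical helper mirroring the formula above. `rand` is injectable so
// the result can be made deterministic in tests.
function backoffMs(
  attempt: number, // assumed to be 0 for the first retry
  baseMs: number,
  capMs: number,
  rand: () => number = Math.random,
): number {
  // The exponential ceiling, clamped by capMs.
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  // Pick uniformly between 0 and the ceiling.
  return Math.floor(rand() * ceiling);
}

// Upper bound of each wait for retry: { max: 5, baseMs: 200, capMs: 10_000 }.
const ceilings = Array.from({ length: 5 }, (_, n) =>
  Math.min(10_000, 200 * 2 ** n),
);
// ceilings is [200, 400, 800, 1600, 3200]
```

With these defaults capMs never comes into play; it only starts to matter once baseMs * 2^N exceeds it, which for baseMs: 200 and capMs: 10_000 happens from N = 6 onward.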

When the defaults are wrong on purpose

Two cases worth calling out. First, if the app is mostly interactive workflow actions that HubSpot itself will retry on failure, consider max: 1 so you fail fast and let HubSpot's own retry handle long waits; otherwise you've built two retry loops stacked on each other and your timing math is wrong. Second, if you're in a scheduled (cron) job that runs every minute, keep the worst-case total backoff (the sum of the per-retry ceilings, at most max × capMs) below 60_000 so a retry storm can't make the next invocation overlap the previous one.
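When checking a retry configuration against a cron interval, it can help to add up the backoff ceilings rather than look at capMs alone. The sketch below uses a hypothetical worstCaseWaitMs helper, assumes attempt numbering starts at 0, and counts only time spent waiting, not the time each request itself takes.

```typescript
// Hypothetical helper: sums the per-retry backoff ceilings to give an upper
// bound on time spent sleeping between attempts.
function worstCaseWaitMs(max: number, baseMs: number, capMs: number): number {
  let total = 0;
  for (let attempt = 0; attempt < max; attempt++) {
    total += Math.min(capMs, baseMs * 2 ** attempt);
  }
  return total;
}

const managedDefault = worstCaseWaitMs(5, 200, 10_000);  // 6_200
const heavyBackground = worstCaseWaitMs(8, 500, 30_000); // 91_500

// A one-minute cron job tolerates the first configuration, not the second.
const fitsInOneMinute = (waitMs: number): boolean => waitMs < 60_000;
```

The second configuration has a capMs well under a minute and still overruns the interval, which is why the total is the number to watch.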

Batch correctly with /batch/upsert

Single-record writes are the most common reason a healthy-looking Worker hits the rate limit. A sync that writes 1,000 contacts one at a time burns 1,000 requests; the same sync using /crm/v3/objects/contacts/batch/upsert burns 10. Every write path in production should be batched unless you have a specific reason it can't be.

// Design preview — the worker.sync `batch` option is documented here for
// the API we're building toward. Today, drive the batch endpoint manually
// through ctx.hubspot.
import { defineSource, defineWorker } from '@hs-x/sdk';
 
const worker = defineWorker('stripe-sync');
 
// stripeCustomers is a source created with defineSource; its definition is omitted here.
worker.sync(stripeCustomers, {
  into: 'contacts',
  schedule: '15m',
  schema: {
    email: 'email',
    stripe_customer_id: 'string',
    mrr_cents: 'number',
    plan: { type: 'enumeration', options: ['free', 'starter', 'pro', 'enterprise'] },
  },
  // The two knobs are the chunk size (default 100, which is the HubSpot
  // max) and the idProperty used for upsert.
  batch: { size: 100, idProperty: 'stripe_customer_id' },
});
 
export default worker;

If you need to write outside a sync — say, in a workflow action that processes a list of records — chunk the records yourself and call the batch endpoint through ctx.hubspot. Each chunk still goes through the managed auth, retry, and rate-limit layer:

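// Upsert records in chunks of 100, the batch endpoint's per-request cap.
// `hubspot` is the managed client passed to the handler.
async function upsertContacts(hubspot, records) {
  for (let i = 0; i < records.length; i += 100) {
    const chunk = records.slice(i, i + 100);
    await hubspot.post('/crm/v3/objects/contacts/batch/upsert', {
      body: {
        inputs: chunk.map((record) => ({
          // HubSpot's batch upsert reads idProperty from each input object.
          idProperty: 'stripe_customer_id',
          id: record.stripe_customer_id,
          properties: record,
        })),
      },
    });
  }
}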

The idProperty rules that bite people

HubSpot's /batch/upsert endpoint is upsert by a specific property, and the rules around that property are not symmetric with single-record upsert. Three things to know before you ship a batch write:

The idProperty must be unique on the object. That's email and hs_object_id for contacts out of the box, plus any custom property you've marked unique in property settings. If you try to upsert by a non-unique property, the call 400s on every record in the batch.
email as an idProperty does not support partial upsert on contacts. HubSpot's object APIs require a custom unique property when you want partial upserts on contacts — passing idProperty=email will reject the partial-upsert path. The fix is to declare a custom unique property (for example stripe_customer_id) and upsert by that; reserve email for the create path.
Properties not in the payload are not cleared. This is the upsert-vs-replace distinction. Sending { email, mrr_cents: 0 } sets mrr_cents to 0 and leaves every other property alone. If you want to clear a property explicitly, send it as null — omitting it does nothing.
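The omit-versus-null distinction in the last rule is easy to get wrong when records come from another system with optional fields. Here is a minimal sketch of shaping one record into an upsert input; toUpsertInput and its types are hypothetical, and the null-clears-a-property behavior is taken from the rule above rather than verified here.

```typescript
type PropertyValue = string | number | boolean | null | undefined;

interface UpsertInput {
  id: string;
  properties: Record<string, string | number | boolean | null>;
}

// Hypothetical helper: undefined keys are dropped (so HubSpot leaves those
// properties untouched), explicit nulls are kept (so they are sent as clears).
function toUpsertInput(
  record: Record<string, PropertyValue>,
  idProperty: string,
): UpsertInput {
  const id = record[idProperty];
  if (id === null || id === undefined || id === '') {
    throw new Error(`record is missing its ${idProperty}`);
  }
  const properties: UpsertInput['properties'] = {};
  for (const [key, value] of Object.entries(record)) {
    if (value !== undefined) properties[key] = value;
  }
  return { id: String(id), properties };
}

const input = toUpsertInput(
  { stripe_customer_id: 'cus_1', mrr_cents: 0, plan: undefined, firstname: null },
  'stripe_customer_id',
);
// input.properties keeps mrr_cents: 0 and firstname: null, and has no `plan` key.
```

Failing fast on a missing identifier keeps one bad record from being discovered only after HubSpot rejects the request.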

Watching the chunk boundary

The 100-record cap is hard. The call shape is POST /crm/v3/objects/{object}/batch/upsert with { inputs: [...100 records] }. Counting against the limit, this is one request, which is the whole point. A 1,000-record sync at 100-per-chunk is 10 requests; running them serially fits inside a single 10-second window with 100 requests of headroom for everything else on the portal.
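The chunk arithmetic above can be checked with a small generic helper. chunk is a hypothetical utility, and the headroom figure assumes the 110/10s app-install pool that the arithmetic in this section implies.

```typescript
// Hypothetical utility: split a list into chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number = 100): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

const chunks = chunk(Array.from({ length: 1_000 }, (_, i) => i));
const requestsUsed = chunks.length;           // 10
const windowCap = 110;                        // assumed 10-second budget
const headroomLeft = windowCap - requestsUsed; // 100
```

The final chunk is simply shorter when the record count is not a multiple of 100, so no padding or special case is needed.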

Avoid the Search API in hot paths

The Search API (/crm/v3/objects/{object}/search) has its own rate limit, separate from the main pool, and it is tight: five requests per second regardless of your subscription tier, with page size capped at 200 records and a hard ceiling of 10,000 records per query. A UI extension that calls Search to filter a list of deals will hit the cap with a handful of people refreshing the same record in the same second. If your dashboard says you have plenty of headroom and you're still seeing 429s, this is almost always why.

Pattern	API	Limit	Use when
Get one record by id	`/crm/v3/objects/{type}/{id}`	110/10s app-install pool	You have the object id
Get by unique property	`/crm/v3/objects/{type}/{id}?idProperty=email`	110/10s app-install pool	You have email or another unique prop
List paginated	`/crm/v3/objects/{type}`	110/10s app-install pool	You need every record
Filter by 1 property	Stored list or pre-indexed property	n/a	You filter on this property often
Filter by 2+ properties	`/crm/v3/objects/{type}/search`	5 req/s separate cap, 200/page, 10k total (HS-X's limiter defaults to 4 req/s to keep headroom)	Genuinely ad-hoc, low-frequency

When Search is a smell

A Search call in a UI extension's hot path usually means a property you should have stored isn't stored. If you find yourself writing search where stripe_customer_id = X, the fix is to write stripe_customer_id as a unique HubSpot property on the contact (declare it in your schema, add unique: true), and then use the much faster objects/{id}?idProperty=stripe_customer_id endpoint. That's the shared pool, no 5-req/s cap, and a single round trip.

The other common Search anti-pattern is the dashboard query — "show me deals modified in the last hour, owned by user X, in stage Y." The right fix there is to run that query as part of a 15-minute sync into a small materialized object you control (a custom object or a KV namespace on the Worker), and have the extension read from your cache. Search stays for genuinely ad-hoc, low-frequency reporting.

When you actually need Search

Sometimes you do need the live query — a workflow action that takes a free-text filter input from the user, a one-off report, a debug tool. For those, two things help. First, tune the app's Search bucket conservatively with rateLimits.bucket.searchCapacity and searchRefillPerSecond, because 429s on Search are common and the bucket has no response headers to track. Second, gate access for the Search-heavy handler with a small in-memory mutex or queue so the endpoint slows down instead of 429-ing under a burst.
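One way to make a Search-heavy handler slow down instead of failing is to space calls out in memory. The sketch below is a pacer rather than a full queue; SearchPacer is a hypothetical class, and because it lives in memory it only coordinates callers within a single Worker isolate.

```typescript
// Hypothetical pacer: hands out start times at least `minGapMs` apart.
// Time is passed in so the logic stays deterministic.
class SearchPacer {
  private nextFreeMs = 0;
  private readonly minGapMs: number;

  constructor(requestsPerSecond: number) {
    this.minGapMs = 1000 / requestsPerSecond;
  }

  // Reserve the next slot and return how long the caller should wait
  // before sending its Search request.
  reserve(nowMs: number): number {
    const startMs = Math.max(nowMs, this.nextFreeMs);
    this.nextFreeMs = startMs + this.minGapMs;
    return startMs - nowMs;
  }
}

// Three calls arriving in the same millisecond get spread over half a second.
const pacer = new SearchPacer(4);
const waits = [pacer.reserve(1_000), pacer.reserve(1_000), pacer.reserve(1_000)];
// waits is [0, 250, 500]
```

In a handler, the returned delay would typically be awaited with a timer before the Search call goes out. A burst then shows up as added latency rather than as 429s.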

Alert on headroom, not on 429 count

A 429 count alert is the wrong alert. It fires after the failure, it fires at a fixed threshold that bears no relationship to your portal's actual budget, and it tells you nothing about how close you came to failing on the requests that succeeded. The right metric is headroom — the minimum value of X-HubSpot-RateLimit-Remaining / X-HubSpot-RateLimit-Max the Worker has seen in the last N minutes.

Alert when the 5-minute minimum drops below 20%, page when it drops below 5%. By the time you're seeing 429s, headroom has been at zero for at least one 10-second window — the headroom alert gives you minutes of lead time on a problem the 429 alert tells you about after it's already user-visible.
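The thresholds above reduce to a small classification over recent header observations. This is a sketch with hypothetical names (HeadroomSample, classifyHeadroom); how samples are collected and where alerts are routed is left out.

```typescript
interface HeadroomSample {
  atMs: number;      // when the response was observed
  remaining: number; // X-HubSpot-RateLimit-Remaining
  max: number;       // X-HubSpot-RateLimit-Max
}

type AlertLevel = 'ok' | 'alert' | 'page';

// Classify by the minimum remaining/max ratio seen inside the window.
function classifyHeadroom(
  samples: HeadroomSample[],
  nowMs: number,
  windowMs: number = 5 * 60_000,
): AlertLevel {
  const recent = samples.filter(
    (s) => s.max > 0 && s.atMs <= nowMs && nowMs - s.atMs <= windowMs,
  );
  if (recent.length === 0) return 'ok'; // no traffic, nothing to alert on
  const minHeadroom = Math.min(...recent.map((s) => s.remaining / s.max));
  if (minHeadroom < 0.05) return 'page';
  if (minHeadroom < 0.2) return 'alert';
  return 'ok';
}
```

Using the minimum rather than the average is the point: a single near-empty window is the signal, and averaging would hide it.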

import { defineApp } from '@hs-x/sdk';
 
export default defineApp({
  // ...
  rateLimits: {
    metrics: { enabled: true },
  },
});

With metrics enabled, the generated Worker wires loggerRateLimitMetrics() into the managed HubSpot client. The runtime emits structured logger events named hubspot_ratelimit_remaining, hubspot_ratelimit_headroom_pct, hubspot_ratelimit_429_total, and hubspot_ratelimit_backpressure_total, so they appear wherever runtime logger output is captured, including Workers Logs, with portal, surface, family, request path, status, and retry-after fields where available.

The four signals worth a dashboard tile

Headroom (gauge). Minimum remaining / max over the last 5 minutes, per (app, portal). The leading indicator. Anything under 20% sustained is "you have a runaway loop, or a sync just gained an order of magnitude of work."
429 rate (per minute). Not the count — the rate, normalized by total request volume. A spike from 0.1% to 5% means something changed; a steady 0.2% means HubSpot has occasional flakiness and your retry logic is working as designed.
Retry depth p99. The 99th-percentile number of retries before a call succeeds. If this climbs from 0 to 3 over a day, you're operating closer to the cap than you think — every call is paying a backoff tax even when it succeeds.
Batch chunk count. How many /batch/upsert chunks the Worker has sent in the window. A sync that suddenly goes from 10 chunks to 200 means your source's row count grew an order of magnitude and your next problem is going to be the daily limit, not the 10-second one.
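Two of the tiles above are simple aggregations over per-call records. A minimal sketch, assuming each completed call is logged with its final status and retry count; CallRecord, rate429, and percentile are hypothetical names, and the percentile uses the nearest-rank method.

```typescript
interface CallRecord {
  status: number;  // final HTTP status of the call
  retries: number; // retries needed before the call finished
}

// Share of calls that ended in a 429, normalized by total volume.
function rate429(calls: CallRecord[]): number {
  if (calls.length === 0) return 0;
  return calls.filter((c) => c.status === 429).length / calls.length;
}

// Nearest-rank percentile: the smallest value with at least p percent of
// the samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1] ?? 0;
}

const calls: CallRecord[] = [
  { status: 200, retries: 0 },
  { status: 200, retries: 0 },
  { status: 200, retries: 1 },
  { status: 429, retries: 3 },
];
const retryDepthP99 = percentile(calls.map((c) => c.retries), 99); // 3
const rate = rate429(calls);                                       // 0.25
```

With very few samples the p99 is just the maximum, so this tile only becomes informative once the window holds a few hundred calls.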

The monitoring guide covers the full metrics surface — how to scrape, what to graph, and which alert thresholds correlate with real incidents in production.

Common rate-limit failures

The patterns below cover most of the rate-limit support tickets we see. If you're hitting 429s and the cause isn't on this list, instrument your retry wrapper to log the X-HubSpot-RateLimit-Remaining value seen on every response, broken down by caller and endpoint.

"We get 429s in bursts even when our average is under cap."

The 110/10s ceiling is a rolling window, not a per-second cap with reset. A sync that opportunistically blasts 200 requests in a single second will exceed the rolling 110/10s instantly, even if the next nine seconds are quiet. There is no published "burst credit" multiplier and no documented recovery cooldown beyond what the rolling window naturally enforces. The fix is to either batch (step 3, which takes you from 5,000 calls to 50) or to clamp concurrency so you can't push more than the window allows in any one-second slice.
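To see why a one-second burst fails under a rolling window, it helps to model the window as a log of send times. This is a sketch of the general sliding-window technique, not a description of how HubSpot counts internally; SlidingWindowLimiter is a hypothetical class and the 110/10s figures are passed in as parameters.

```typescript
// Hypothetical limiter: allows at most `limit` sends in any `windowMs` span.
class SlidingWindowLimiter {
  private readonly sentAtMs: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // True if a request may go out at `nowMs`; the send is recorded when allowed.
  tryAcquire(nowMs: number): boolean {
    // Drop timestamps that have aged out of the rolling window.
    let oldest = this.sentAtMs[0];
    while (oldest !== undefined && nowMs - oldest >= this.windowMs) {
      this.sentAtMs.shift();
      oldest = this.sentAtMs[0];
    }
    if (this.sentAtMs.length >= this.limit) return false;
    this.sentAtMs.push(nowMs);
    return true;
  }
}

// A burst of 200 attempts in the same instant: only the first 110 fit.
const limiter = new SlidingWindowLimiter(110, 10_000);
let allowed = 0;
for (let i = 0; i < 200; i++) {
  if (limiter.tryAcquire(0)) allowed++;
}
// allowed is 110
```

The rejected attempts stay rejected until the original burst ages out of the window, which is why a quiet nine seconds afterwards does not help.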

"Sync engine and a workflow action are fighting for budget."

This is the case the shared token bucket exists to handle, but only if both callers go through the same HTTP layer. If one of them is calling raw fetch or using hubspot-api-nodejs directly without a shared limiter, it bypasses the bucket and starves the other. The fix is to route both through ctx.hubspot, batch the sync's writes, and keep the app-level retry cap low enough that interactive actions hand control back to HubSpot instead of sitting open behind a sync burst.

"Search API limit confused with REST limit."

The single most common one. Your dashboard shows the main pool at 30% utilized, and your Search-heavy extension is still 429-ing. That's because the 5-req/s Search cap is a completely separate bucket. Track Search headroom independently (count Search 200/429 responses per second, alert on the ratio). The fix is to either move the query off Search (see step 4) or to serialize Search calls behind a mutex in your handler.

Where next

Reference · Rate limits — the canonical per-tier table, every published header, and the exact rolling-window semantics.
How to · Monitoring and observability — the full metrics surface, how to scrape it, and the alert thresholds we use internally on production portals.
How to · Marketplace listing — what HubSpot's app review team looks for in rate-limit handling before approving a public listing, and how to demonstrate compliance.