# JamBase Data — Agentic Developer Guide (agents.md)

> **For coding agents** — Cursor, Claude Code, Windsurf, Aider,
> Continue, Replit Agent, or any other AI that writes code for a
> human developer.
>
> Companion to https://data.jambase.com/llms.txt (the declarative
> reference) and https://data.jambase.com/llms-full.txt (the
> long-form reference). This file is the **procedural** counterpart:
> what to install, what to name things, what to do first, second,
> third.
>
> If you (the agent) are looking for a single thing to do right
> now: read §3 "Hello World" and produce a 10-line script that
> prints one event. Everything else is reachable from there.
>
> **Last reviewed: 2026-05-30.** Plan tiers, rate-limit values,
> and pricing change over time — always confirm the specifics
> against the live https://data.jambase.com/llms.txt before
> hard-coding them.

---

## 1. What this file is

You are a coding agent and a human developer just asked you to
"wire JamBase Data into this codebase." Read this whole file
once, then act. It is the build-time counterpart to the runtime
MCP install card at https://data.jambase.com/mcp.

JamBase Data is a normalized live-music data API (concerts,
festivals, artists, venues, livestreams) with ~25 years of history,
~5M performances across ~616K artists and ~91K venues, sourced
from 60+ ingest partners and exposed at
`https://api.data.jambase.com/v3` behind a Bearer-token API key.

The agent workflow this file is designed for:

1. Paste the human-facing prompt at https://data.jambase.com/agents
   (or just hand the agent the URL of this file) into Cursor /
   Claude Code / Windsurf / Replit Agent.
2. The agent fetches this file plus
   https://data.jambase.com/llms.txt.
3. The agent scaffolds env, auth, a Hello World, the paginated bulk
   seed, and the `dateModifiedFrom` delta loop into the user's
   codebase.

If you are an **end-user** asking a chatbot a music question
("who's playing at Red Rocks?"), you want the MCP server instead
(install card at https://data.jambase.com/mcp, see §14). That is
the runtime surface, not the build-time one. The MCP server is
what a connected ChatGPT / Claude / Cursor chat calls; this file
is what a coding agent reads while scaffolding a backend that
calls the REST API.

The two surfaces are complementary, not competing. The MCP server
is ideal for interactive answers inside an LLM chat; the REST API
plus this guide is ideal for ingest pipelines, recommenders, fan
apps, and warehouses. Most production deployments end up using
both.

This document is intentionally self-contained. If you follow
every section in order, you produce a working ingest pipeline
against your database of choice with correct auth, pagination
etiquette, backoff, delta sync, and tombstone handling, and the
agent should not need to ask any further questions.

Plan-tier annotations appear in `[brackets]` after each section
heading. Read §15 "Plan-tier cheatsheet" once before scaffolding
production code so the agent does not lean on features the user's
plan does not include.

---

## 2. Get a key `[Developer free]`

The user needs an API key before any code runs.

- Sign-up URL: https://data.jambase.com — there is a 14-day free
  trial, no credit card required.
- After signup the user lands at `/account/api-keys`. Each key
  starts with the literal prefix `jbd_` so the user can recognize
  it in their secret manager.
- Recommended env-var name: `JBD_API_KEY`. Treat it as the
  canonical name in every snippet, README, `.env.example`, and
  CI secret you scaffold. Do not invent alternatives like
  `JAMBASE_API_KEY`, `JAMBASE_KEY`, or `JBD_TOKEN`.
- Recommended companion env vars (set defaults if missing):

  ```env
  JBD_API_KEY=jbd_REPLACE_ME
  JBD_BASE_URL=https://api.data.jambase.com/v3
  ```

- Never echo the key value to the console or commit it to source
  control. Add `.env` to `.gitignore` if it is not already there.
- The MCP server needs no second credential: it authenticates
  with OAuth on first connect rather than with this key (§14), so
  wiring up `/mcp` later requires no extra secret.

If the user pastes a key that does not start with `jbd_`, that is
either a legacy v1 UUID key (v1 sunsets 2026-09-30, do not use it
for new code) or a typo. Stop and ask for a `jbd_*` key before
writing any v3 code.
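
The prefix check above can run before any request is made. The sketch below is one way to do it in Python; `classify_key` is a hypothetical helper name, and the assumption that legacy v1 keys are canonical 8-4-4-4-12 hex UUIDs is ours, not something this guide confirms. It never prints or returns the key itself.

```python
import re

# Assumption: a legacy v1 key is a canonical UUID (8-4-4-4-12 hex digits).
_UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def classify_key(key: str) -> str:
    """Return 'v3', 'legacy_v1', or 'unknown'. Never echoes the key."""
    key = key.strip()
    # A v3 key is the literal prefix plus at least one more character.
    if key.startswith("jbd_") and len(key) > len("jbd_"):
        return "v3"
    if _UUID_RE.match(key):
        return "legacy_v1"
    return "unknown"
```

Anything other than `"v3"` is the cue to stop and ask the user for a `jbd_*` key.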

---

## 3. Hello World `[Developer free]`

The goal here is one authenticated `GET /v3/events?geoMetroId=jambase:1`
returning real rows that you print or log. `jambase:1` is the New
York metro. Any metro works, but New York reliably has events on
any given day, which makes the snippet a stable smoke test.

### curl

```bash
curl -sS \
  -H "Authorization: Bearer ${JBD_API_KEY}" \
  "https://api.data.jambase.com/v3/events?geoMetroId=jambase:1&perPage=5" \
  | jq '.events[] | {name, startDate, venue: .location.name}'
```

### Node (>= 18 — native `fetch`)

```ts
// hello-world.ts
const base = process.env.JBD_BASE_URL ?? "https://api.data.jambase.com/v3";
const key = process.env.JBD_API_KEY;
if (!key) throw new Error("JBD_API_KEY is not set");

const res = await fetch(`${base}/events?geoMetroId=jambase:1&perPage=5`, {
  headers: { Authorization: `Bearer ${key}` },
});
if (!res.ok) throw new Error(`JBD ${res.status}: ${await res.text()}`);

const body = (await res.json()) as {
  events: Array<{
    identifier: string;
    name: string;
    startDate: string;
    location?: { name?: string };
  }>;
};
for (const e of body.events) {
  console.log(`${e.startDate}  ${e.location?.name ?? "?"}  ${e.name}`);
}
```

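Run with `JBD_API_KEY=jbd_... npx tsx hello-world.ts`. The script
uses top-level `await`, so it has to run as an ES module: set
`"type": "module"` in `package.json` or name the file
`hello-world.mts`. You should see five upcoming NYC events with
start dates, venues, and names within ~300 ms.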

### Python (>= 3.10)

```python
# hello_world.py
import os, sys, httpx

base = os.getenv("JBD_BASE_URL", "https://api.data.jambase.com/v3")
key = os.getenv("JBD_API_KEY") or sys.exit("JBD_API_KEY is not set")

r = httpx.get(
    f"{base}/events",
    params={"geoMetroId": "jambase:1", "perPage": 5},
    headers={"Authorization": f"Bearer {key}"},
    timeout=10.0,
)
r.raise_for_status()
for e in r.json()["events"]:
    print(e["startDate"], (e.get("location") or {}).get("name"), e["name"])
```

Run with `JBD_API_KEY=jbd_... python hello_world.py` (install the
one dependency first with `pip install httpx`).

If you see a 401 here, the key is wrong or unset; if you see a
403, the key is valid but the plan does not include the requested
endpoint — see §4 below. Anything else is unexpected and should
be surfaced verbatim, not swallowed.

---

## 4. Auth + headers `[all]`

- **Auth:** `Authorization: Bearer ${JBD_API_KEY}` on every
  request. Nothing else is accepted.
- **Never use `?apikey=`** querystring auth. That is a legacy v1
  pattern; v3 rejects it with a 401. Do not let the agent
  "improve" things by moving the key out of the header — it will
  break the request.
- **Never log the key.** On 401 errors, log the request URL and
  the response status; never log the `Authorization` header
  value or the response body's `key` field if it echoes one.
- **401 vs 403:** 401 means the key is missing, malformed, or
  revoked. 403 means the key is valid but the plan does not
  authorize the requested endpoint, parameter, or tier-gated
  capability (e.g. `expandPastEvents=true` on a Developer-tier
  key). Surface these to the user as distinct errors — telling a
  Developer-tier user "auth failed" when their plan does not
  include historical data wastes a support cycle.
- **Required user-agent:** none is required, but you should set
  one so the JBD team can see your integration in access logs.
  Recommended shape: `MyApp/1.2.3 (+contact@example.com)`.
- **Content-Type / Accept:** the API returns JSON; you do not
  need to set `Accept`. On POST endpoints (none in the read
  path), use `Content-Type: application/json`.

If the agent is scaffolding a `fetch` wrapper, include all of:

```ts
// src/jbd.ts
const BASE = process.env.JBD_BASE_URL ?? "https://api.data.jambase.com/v3";

// Typed error so callers (and the backoff in §6) can branch on the
// HTTP status, honor Retry-After, and read the structured errorCode
// (see §17) without re-parsing a string message.
export class JbdError extends Error {
  constructor(
    public readonly status: number,
    public readonly retryAfterSec: number | null,
    public readonly errorCode: string | null,
    message: string
  ) {
    super(message);
    this.name = "JbdError";
  }
}

export async function jbd<T>(path: string, init: RequestInit = {}): Promise<T> {
  const key = process.env.JBD_API_KEY;
  if (!key) throw new Error("JBD_API_KEY is not set");
  const url = path.startsWith("http") ? path : `${BASE}${path}`;
  const headers = new Headers(init.headers);
  headers.set("Authorization", `Bearer ${key}`);
  if (!headers.has("User-Agent")) headers.set("User-Agent", "jbd-starter/0.1");
  const res = await fetch(url, { ...init, headers });
  if (!res.ok) {
    const body = await res.text().catch(() => "");
    // Retry-After (integer seconds) is set on 429. Capture it here so
    // the backoff wrapper in §6 can honor it — it is NOT part of the
    // message string, so a regex over `.message` would never see it.
    const ra = res.headers.get("Retry-After");
    const retryAfterSec = ra && /^\d+$/.test(ra.trim()) ? Number(ra.trim()) : null;
    // Best-effort: pull the first structured errorCode out of the body
    // (the `{ success, errors: [{ errorCode }] }` shape from §17).
    // Gateway-level problems return problem+json instead and have none.
    let errorCode: string | null = null;
    try {
      const parsed = JSON.parse(body) as { errors?: Array<{ errorCode?: string }> };
      errorCode = parsed.errors?.[0]?.errorCode ?? null;
    } catch {
      /* empty or problem+json body — nothing structured to surface */
    }
    throw new JbdError(
      res.status,
      retryAfterSec,
      errorCode,
      `JBD ${res.status} ${res.statusText} on ${path}: ${body}`
    );
  }
  return (await res.json()) as T;
}
```

This wrapper is referenced in every snippet below. The shape is
deliberately tiny so the agent can drop it in without pulling a
new dependency.
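
For a Python codebase, the same typed-error idea might look like the sketch below. It is not an official client: `error_from_response` is a hypothetical helper name, and the body shape it parses is the `{ success, errors: [{ errorCode }] }` shape the wrapper above refers to. It takes plain values (status, headers, body text) so it works with any HTTP library.

```python
import json
from typing import Mapping, Optional


class JbdError(Exception):
    """Carries the HTTP status, Retry-After seconds, and errorCode, if any."""

    def __init__(self, status: int, retry_after_sec: Optional[int],
                 error_code: Optional[str], message: str) -> None:
        super().__init__(message)
        self.status = status
        self.retry_after_sec = retry_after_sec
        self.error_code = error_code


def error_from_response(status: int, headers: Mapping[str, str], body: str) -> JbdError:
    # Header names are case-insensitive; normalize in case a plain dict is passed.
    lowered = {k.lower(): v for k, v in headers.items()}
    ra = (lowered.get("retry-after") or "").strip()
    # Only the integer-seconds form is honored; anything else becomes None.
    retry_after = int(ra) if ra.isascii() and ra.isdigit() else None

    error_code = None
    try:
        parsed = json.loads(body)
        errors = parsed.get("errors") if isinstance(parsed, dict) else None
        if isinstance(errors, list) and errors and isinstance(errors[0], dict):
            error_code = errors[0].get("errorCode")
    except ValueError:
        pass  # empty or non-JSON body: nothing structured to surface

    return JbdError(status, retry_after, error_code, f"JBD {status}: {body}")
```

Callers can then branch on `err.status` (401 vs 403 vs 429) instead of matching on the message string.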

---

## 5. Pagination etiquette `[all]`

All list endpoints accept `page` (1-based) and `perPage` (max
**100**). Responses include a `pagination` object:

```json
{
  "events": [ ... ],
  "pagination": {
    "page": 1,
    "perPage": 100,
    "totalItems": 4221,
    "totalPages": 43
  }
}
```

Rules of the road:

- **Always set `perPage=100`** for ingest loops. Smaller pages
  burn more requests against the plan quota.
- **Loop until `pagination.page >= pagination.totalPages`.** Use
  `>=` rather than `===` so the loop still terminates if
  `totalPages` comes back as 0. Do not infer "I am done" from an
  empty `events[]`: that is the signal something went wrong (e.g.
  the filter excluded everything), not the signal you have paged
  through.
- **Do not parallelize page fetches** unless you have explicit
  burst headroom on your plan. Page 1 → page 2 → page 3 serially
  is the correct default. The hourly and per-minute rate limits
  (§6) are shared across all of your keys; firing 10 page fetches
  in parallel will trip 429 and slow the run down overall.
- **`totalItems` is best-effort, not strict.** If a sync runs
  across a publishing event (new events added mid-run), the
  count can shift by one or two between requests. Re-read
  `totalPages` from every response and stop when `page` reaches
  it; do not wait for an empty page.
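
To make the quota cost of small pages concrete, here is a small arithmetic sketch using the sample `pagination` object above (4,221 items). The helper names are hypothetical; only the `perPage` cap of 100 and the `page >= totalPages` stop rule come from this guide.

```python
def pages_needed(total_items: int, per_page: int = 100) -> int:
    """Number of requests needed to read total_items at a given page size."""
    if not 1 <= per_page <= 100:
        raise ValueError("perPage must be between 1 and 100")
    # Ceiling division without floating point.
    return -(-total_items // per_page)


def is_last_page(page: int, total_pages: int) -> bool:
    """Stop rule for the loop; >= also terminates when total_pages is 0."""
    return page >= total_pages


print(pages_needed(4221, 100))  # 43, matching totalPages in the sample
print(pages_needed(4221, 20))   # 212, roughly five times the requests
```

The same read costs 43 requests at `perPage=100` and 212 at `perPage=20`, which is why ingest loops should always use the maximum.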

A correct paginated read in Node:

```ts
import { jbd } from "./jbd.js";

interface Page<T> {
  events?: T[];
  artists?: T[];
  venues?: T[];
  pagination: { page: number; perPage: number; totalPages: number };
}

export async function* pagedEvents(query: Record<string, string>) {
  let page = 1;
  // Loop until totalPages, surfacing each event individually.
  // The caller can stop early; the generator stops fetching too.
  // eslint-disable-next-line no-constant-condition
  while (true) {
    const qs = new URLSearchParams({ ...query, page: String(page), perPage: "100" });
    const body = await jbd<Page<{ identifier: string }>>(`/events?${qs}`);
    for (const e of body.events ?? []) yield e;
    if (page >= body.pagination.totalPages) return;
    page += 1;
  }
}
```

Same shape in Python with a generator:

```python
def paged_events(client, **query):
    page = 1
    while True:
        r = client.get("/events", params={**query, "page": page, "perPage": 100})
        r.raise_for_status()
        body = r.json()
        for e in body.get("events", []):
            yield e
        if page >= body["pagination"]["totalPages"]:
            return
        page += 1
```

---

## 6. Rate-limit + backoff `[all]`

The gateway enforces **two windows at once** — a sustained hourly
cap and a burst per-minute cap (a token bucket per window) — and
advertises both on every response (success _and_ 429) using the
IETF `RateLimit` headers, so you can adapt without parsing prose
error bodies:

- `RateLimit-Policy: hour;q=18000;w=3600, minute;q=600;w=60` —
  the two enforced windows. `q` is the quota (bucket size) and
  `w` is the window length in seconds. The numbers above are the
  Pro tier; your plan's values differ (see §15).
- `RateLimit: hour;r=17999;t=1, minute;r=599;t=1` — the live
  state of each window. `r` is requests remaining and `t` is
  seconds until that window refills to full.

There is **no** `RateLimit-Limit` / `RateLimit-Remaining` /
`RateLimit-Reset` triple — that was an older single-window draft
and the gateway does not emit it. Parse the comma-separated
`RateLimit` list and respect whichever window has the smaller
remaining (`r`).
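
A minimal parsing sketch for the two headers, assuming they always follow the `name;key=value;key=value` list shape shown above. `parse_ratelimit` and `tightest_window` are hypothetical helper names, not part of any JamBase SDK.

```python
from typing import Dict, Optional, Tuple


def parse_ratelimit(header: str) -> Dict[str, Dict[str, int]]:
    """Parse 'hour;r=17999;t=1, minute;r=599;t=1' into a dict per window."""
    windows: Dict[str, Dict[str, int]] = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";") if p.strip()]
        if not parts:
            continue
        name, params = parts[0], {}
        for part in parts[1:]:
            key, sep, value = part.partition("=")
            value = value.strip()
            # Keep only integer parameters (r, t, q, w); skip anything else.
            if sep and value.isascii() and value.isdigit():
                params[key.strip()] = int(value)
        windows[name] = params
    return windows


def tightest_window(header: str) -> Optional[Tuple[str, int, int]]:
    """Return (window, remaining, seconds_to_refill) for the smallest r."""
    candidates = [
        (name, p["r"], p.get("t", 0))
        for name, p in parse_ratelimit(header).items()
        if "r" in p
    ]
    return min(candidates, key=lambda c: c[1]) if candidates else None


print(tightest_window("hour;r=17999;t=1, minute;r=599;t=1"))
# ('minute', 599, 1)
```

The same `parse_ratelimit` function also reads `RateLimit-Policy`, since that header uses the same list syntax with `q` and `w` parameters.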

On a 429 (`Too Many Requests`) the gateway also sets
`Retry-After` (integer seconds). **Always honor it** — it is the
authoritative "wait this long right now" signal, where the
`RateLimit` headers describe steady state. Do not invent your own
backoff while `Retry-After` is set.

A correct backoff shape (Node), using the `JbdError` from §4 so it
reads the captured status and `Retry-After` directly rather than
scraping a string:

```ts
// Append to src/jbd.ts, below jbd() and JbdError from §4, so the
// later snippets can import jbdWithBackoff from "./jbd.js".

export async function jbdWithBackoff<T>(path: string, init: RequestInit = {}): Promise<T> {
  for (let attempt = 0; attempt < 5; attempt += 1) {
    try {
      return await jbd<T>(path, init);
    } catch (err) {
      // Honor Retry-After when the gateway sent it, else fall back
      // to exponential backoff.
      if (err instanceof JbdError && err.status === 429) {
        const waitMs = err.retryAfterSec
          ? err.retryAfterSec * 1000
          : Math.min(2 ** attempt * 250, 8000);
        await new Promise((r) => setTimeout(r, waitMs));
        continue;
      }
      if (err instanceof JbdError && err.status >= 500) {
        // Retry 5xx with exponential backoff and jitter.
        await new Promise((r) =>
          setTimeout(r, Math.min(2 ** attempt * 500, 16000) + Math.random() * 250)
        );
        continue;
      }
      throw err;
    }
  }
  throw new Error(`JBD ${path} failed after 5 attempts`);
}
```

Notes:

- **Never retry 4xx other than 429.** A 400 / 401 / 403 / 404
  will produce the same error on every retry; you are just
  burning quota.
- **Do retry 5xx** with exponential backoff plus jitter. Without
  jitter, clients that failed at the same moment retry at the same
  moment, which is the most common cause of retry storms against
  an already struggling upstream.
- **Cap your max attempts** (5 above) so a sustained outage
  surfaces to the operator instead of looping forever.

For higher-throughput plans (Pro+ and Enterprise), both the
hourly and per-minute buckets get larger but the header shape is
identical. The wrapper above does not need to change as you
upgrade plans.
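
For Python callers, the retry decision can be isolated as a pure function and wrapped around whatever HTTP client the codebase uses. This is a sketch that mirrors the constants in the Node wrapper above (250 ms base capped at 8 s for 429, 500 ms base capped at 16 s plus up to 250 ms of jitter for 5xx); `retry_delay_sec` is a hypothetical name, and the `rand` parameter exists only so tests can pin the jitter.

```python
import random
from typing import Callable, Optional


def retry_delay_sec(
    status: int,
    attempt: int,
    retry_after_sec: Optional[int] = None,
    rand: Callable[[], float] = random.random,
) -> Optional[float]:
    """Seconds to wait before retrying, or None if the status is not retryable."""
    if status == 429:
        # Honor Retry-After when present. A missing or zero value falls
        # through to exponential backoff, as in the Node wrapper.
        if retry_after_sec:
            return float(retry_after_sec)
        return min(2 ** attempt * 0.25, 8.0)
    if status >= 500:
        # Exponential backoff plus up to 250 ms of jitter.
        return min(2 ** attempt * 0.5, 16.0) + rand() * 0.25
    # 400 / 401 / 403 / 404: retrying produces the same error.
    return None
```

A caller loops for at most five attempts, sleeps for the returned delay, and re-raises immediately when the function returns `None`.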

---

## 7. Suggested local schema `[Developer free]`

The minimum table set for a fan app, concert tracker, or
recommender that joins live-music data to user actions:

```sql
-- Events: the unit of "a show happening at a place on a date."
CREATE TABLE events (
  identifier        text PRIMARY KEY,            -- e.g. 'jambase:12908254'
  name              text NOT NULL,
  start_date        timestamptz NOT NULL,
  end_date          timestamptz,
  event_status      text NOT NULL DEFAULT 'scheduled',
  venue_id          text,                        -- FK → venues.identifier
  url               text,
  date_modified     timestamptz NOT NULL,        -- the JBD-side modified-at watermark
  raw               jsonb NOT NULL               -- keep the full payload for re-projection
);
CREATE INDEX events_start_date_idx ON events (start_date);
CREATE INDEX events_venue_idx      ON events (venue_id);
CREATE INDEX events_date_mod_idx   ON events (date_modified);

-- Artists: performers, normalized.
CREATE TABLE artists (
  identifier   text PRIMARY KEY,
  name         text NOT NULL,
  raw          jsonb NOT NULL
);

-- Venues: where shows happen, normalized.
CREATE TABLE venues (
  identifier        text PRIMARY KEY,
  name              text NOT NULL,
  address_locality  text,
  address_region    text,
  address_country   text,
  geo_lat           double precision,
  geo_lng           double precision,
  raw               jsonb NOT NULL
);

-- Join table: many-to-many between events and artists.
CREATE TABLE event_performers (
  event_id    text NOT NULL REFERENCES events (identifier)  ON DELETE CASCADE,
  artist_id   text NOT NULL REFERENCES artists (identifier),
  PRIMARY KEY (event_id, artist_id)
);

-- Tombstones: ids the gateway has told us are gone.
-- Critical for not silently resurrecting deleted ids on the
-- next bulk seed.
CREATE TABLE event_tombstones (
  identifier   text PRIMARY KEY,
  noticed_at   timestamptz NOT NULL DEFAULT now()
);

-- Sync state: the watermark for the next delta loop.
CREATE TABLE sync_state (
  resource              text PRIMARY KEY,
  last_sync_started_at  timestamptz NOT NULL
);
```

Rules:

- **Key everything on `identifier`.** The `identifier` field is
  always shaped like `<source>:<id>`. Treat `jambase:*` as the
  stable canonical id; other source slugs (`spotify:*`,
  `ticketmaster:*`, `musicbrainz:*`, `seatgeek:*`, etc.) are
  cross-platform aliases for reverse lookup, not primary keys.
- **Persist `raw`.** The normalized columns above are the hot
  path; `raw jsonb` is the safety net so you can re-project new
  fields without re-fetching.
- **Persist `date_modified`** on every row. This is the
  high-water mark you compare against on the delta loop. Without
  it you cannot prove a row is current.
- **Do not delete on absence.** A row missing from a future
  paginated read is not a deletion signal — see §10.
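
Since every identifier is shaped like `<source>:<id>`, a tiny parser keeps the canonical-vs-alias decision in one place. This is a sketch with hypothetical names (`Identifier`, `parse_identifier`, `is_canonical`); it assumes the source slug never contains a colon and splits on the first one.

```python
from typing import NamedTuple


class Identifier(NamedTuple):
    source: str  # e.g. 'jambase', 'spotify', 'musicbrainz'
    id: str      # the source-specific id, kept as an opaque string


def parse_identifier(value: str) -> Identifier:
    """Split '<source>:<id>' on the first colon; reject malformed input."""
    source, sep, raw_id = value.partition(":")
    if not sep or not source or not raw_id:
        raise ValueError(f"not a <source>:<id> identifier: {value!r}")
    return Identifier(source, raw_id)


def is_canonical(value: str) -> bool:
    """True only for jambase:* ids, the ones safe to use as primary keys."""
    return parse_identifier(value).source == "jambase"
```

Rejecting non-`jambase:*` values at the upsert boundary is one way to keep aliases out of the `identifier` primary-key column.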

---

## 8. Bulk seed `[Developer free]`

The canonical bulk-seed shape: paginate `/v3/events` with
`perPage=100`, upsert on `identifier`, persist a watermark **at
the start of the run** so the next delta loop has a safe lower
bound. The watermark is captured before the first request to
avoid a race where events get modified mid-run and are skipped
by the next delta.

Reference: https://data.jambase.com/llms.txt §"Building a Local
Cache" — that section is the declarative companion to the
procedural shape below.

```ts
// src/seed.ts
import { jbdWithBackoff } from "./jbd.js";
import { db } from "./db.js"; // your own Postgres client

interface Event {
  identifier: string;
  name: string;
  startDate: string;
  endDate?: string;
  eventStatus?: string;
  dateModified: string;
  location?: { identifier?: string };
}

interface EventsPage {
  events: Event[];
  pagination: { page: number; perPage: number; totalPages: number };
}

export async function seedEvents(query: Record<string, string> = {}) {
  const startedAt = new Date().toISOString();

  let page = 1;
  for (;;) {
    const qs = new URLSearchParams({ ...query, page: String(page), perPage: "100" });
    const body = await jbdWithBackoff<EventsPage>(`/events?${qs}`);

    await db.tx(async (t) => {
      for (const e of body.events) {
        await t.query(
          `INSERT INTO events (identifier, name, start_date, end_date, event_status, venue_id, date_modified, raw)
           VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
           ON CONFLICT (identifier) DO UPDATE SET
             name          = EXCLUDED.name,
             start_date    = EXCLUDED.start_date,
             end_date      = EXCLUDED.end_date,
             event_status  = EXCLUDED.event_status,
             venue_id      = EXCLUDED.venue_id,
             date_modified = EXCLUDED.date_modified,
             raw           = EXCLUDED.raw`,
          [
            e.identifier,
            e.name,
            e.startDate,
            e.endDate ?? null,
            e.eventStatus ?? "scheduled",
            e.location?.identifier ?? null,
            e.dateModified,
            JSON.stringify(e),
          ]
        );
      }
    });

    if (page >= body.pagination.totalPages) break;
    page += 1;
  }

  await db.query(
    `INSERT INTO sync_state (resource, last_sync_started_at)
     VALUES ('events', $1)
     ON CONFLICT (resource) DO UPDATE SET last_sync_started_at = EXCLUDED.last_sync_started_at`,
    [startedAt]
  );
}
```

What to tell the user:

- A "bulk seed" of NYC for the next 90 days is on the order of
  1,000–2,000 events and completes in 1–2 minutes against a
  Developer-tier key.
- A national US seed for a year is larger (10k+ events) and
  closer to 10–15 minutes; budget for it and run it overnight
  the first time.
- Geo-scoped seeds (`geoMetroId`, `geoCityId`, `geoCountryId`,
  `latLng=...&radius=...`) are the only way to keep a free-tier
  seed under quota. A naked `/v3/events` with no filter will
  paginate forever.

---

## 9. Delta sync `[Developer free]`

After the seed, you sync incrementally with `dateModifiedFrom`.
The watermark you pass back is the **prior** `last_sync_started_at`
minus a small grace window (5 minutes is the recommended
default) to absorb clock skew and any "modified at" rows the
gateway commits just after the previous run's snapshot read.

```ts
// src/delta.ts
import { jbdWithBackoff } from "./jbd.js";
import { db } from "./db.js";
// Assumes the per-row INSERT ... ON CONFLICT from §8 has been factored
// out of seedEvents into an exported upsertEvent helper, and that the
// Event interface is exported alongside it; neither is shown above.
import { upsertEvent, type Event } from "./seed.js";

// Pass the same filters the seed used (e.g. geoMetroId) so the delta
// stays scoped to the rows you actually cache.
export async function deltaSyncEvents(query: Record<string, string> = {}) {
  const startedAt = new Date();
  const startedAtIso = startedAt.toISOString();

  // Read the previous watermark; first run falls back to a
  // safe-old value so it behaves like a re-seed.
  const { rows } = await db.query<{ last_sync_started_at: Date }>(
    `SELECT last_sync_started_at FROM sync_state WHERE resource = 'events'`
  );
  const prev = rows[0]?.last_sync_started_at ?? new Date(Date.now() - 90 * 86400_000);

  // 5-minute grace window.
  const dateModifiedFrom = new Date(prev.getTime() - 5 * 60_000).toISOString();

  let page = 1;
  for (;;) {
    const qs = new URLSearchParams({
      ...query,
      dateModifiedFrom,
      page: String(page),
      perPage: "100",
    });
    const body = await jbdWithBackoff<{
      events: Event[];
      pagination: { totalPages: number };
    }>(`/events?${qs}`);

    // Same UPSERT as the seed; never DELETE on absence.
    for (const e of body.events) {
      await upsertEvent(e);
    }

    if (page >= body.pagination.totalPages) break;
    page += 1;
  }

  await db.query(
    `INSERT INTO sync_state (resource, last_sync_started_at)
     VALUES ('events', $1)
     ON CONFLICT (resource) DO UPDATE SET last_sync_started_at = EXCLUDED.last_sync_started_at`,
    [startedAtIso]
  );
}
```

Cadence recommendations:

- **Fan-facing surfaces** (concert trackers, recommenders that
  show "upcoming shows for you"): hourly is the sweet spot. The
  data does not move faster than that in practice.
- **Analytics warehouses**: nightly. Pull yesterday's delta plus
  the grace window into a staging table, then merge into the
  fact tables.
- **Pre-show ticketing flows**: 15-minute is the floor we
  recommend on REST. If you need second-fresh, switch to
  Enterprise warehouse delivery — see §13.

Common mistakes the agent should not make:

- **Do not** use `datePublishedFrom` for delta sync. It filters
  by **first-publish** time, which never moves once a row exists,
  so a status update (e.g. an event cancellation) bumps
  `dateModified` but not the publish time — a `datePublishedFrom`
  loop will miss cancellations and silently keep "active" rows
  that the canon has marked cancelled. `dateModifiedFrom` is the
  only correct delta key. (There is no `dateCreatedFrom`
  parameter; `datePublishedFrom` is the publish-time filter, and
  it is the one to avoid here.)
- **Do not** persist `lastEventReceived.dateModified` as the
  next watermark. That is **end-of-stream** time; if any row in
  the batch fails to upsert, you have already advanced the
  watermark past it. Always snapshot the watermark at the
  **start** of the run.
- **Do not** parallelize page fetches across watermarks. The
  delta loop is naturally fast enough on every plan; the
  complexity is not worth the bugs.
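
The watermark arithmetic is small enough to isolate and unit-test. A minimal Python sketch, mirroring the 5-minute grace window and the 90-day first-run fallback from the Node loop above; `date_modified_from` is a hypothetical helper name, and the UTC `...Z` output format is an assumption based on the ISO 8601 strings the Node snippet sends.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

GRACE = timedelta(minutes=5)            # recommended default grace window
FIRST_RUN_LOOKBACK = timedelta(days=90)  # fallback when no watermark exists


def date_modified_from(
    prev_started_at: Optional[datetime],
    now: datetime,
    grace: timedelta = GRACE,
) -> str:
    """Lower bound for the next delta: prior run start minus the grace window."""
    base = prev_started_at if prev_started_at is not None else now - FIRST_RUN_LOOKBACK
    if base.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    lower_bound = (base - grace).astimezone(timezone.utc)
    return lower_bound.isoformat(timespec="seconds").replace("+00:00", "Z")


prev = datetime(2026, 5, 30, 12, 0, tzinfo=timezone.utc)
print(date_modified_from(prev, now=prev + timedelta(hours=1)))
# 2026-05-30T11:55:00Z
```

Note that `now` only matters on the first run; on every later run the bound derives from the stored start-of-run watermark, never from the last row received.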

---

## 10. Cancellations vs tombstones `[Developer free]`

This trips up every first-time integrator. Read this section
twice.

A **cancellation** is an `eventStatus = "cancelled"` field on the
event payload. It arrives as a **normal update** on the delta
loop. Your UPSERT handles it for free — the row stays in your
table, just with a different status. Render "cancelled" UI from
that field; never delete the row, because (a) the show might be
rescheduled and you want the audit trail, (b) "shows I've been
to" loggers want to keep cancelled rows for completeness, and
(c) downstream joins (analytics, "users who bought tickets to
shows that were cancelled") need the row to exist.

A **tombstone** is a different signal. Some events are
hard-deleted from the JBD canon (duplicates merged into another
id, test rows scrubbed, etc.). The signal for "this id is gone" is
**a 404 from `/v3/events/id/{identifier}`** on an id that was
previously in your cache. Tombstones are rare in practice
(< 0.1% of ids per quarter) but they do happen.

Suggested tombstone sweeper, runs nightly:

```ts
// src/tombstones.ts
import { jbdWithBackoff, JbdError } from "./jbd.js";
import { db } from "./db.js";

export async function sweepTombstones(batchSize = 200) {
  // Pull a batch of cached ids last modified more than 30 days ago.
  // (Fresh ids don't need re-checking; they were just confirmed.)
  // Caveat: re-checking a live row does not change its date_modified,
  // so this query returns the same oldest rows on every run. To rotate
  // through the whole cache, track a per-row "last checked" timestamp
  // (a column the §7 schema does not define) and order by that instead.
  const { rows } = await db.query<{ identifier: string }>(
    `SELECT identifier FROM events
     WHERE date_modified < now() - interval '30 days'
     ORDER BY date_modified ASC
     LIMIT $1`,
    [batchSize]
  );

  for (const { identifier } of rows) {
    try {
      await jbdWithBackoff(`/events/id/${encodeURIComponent(identifier)}`);
      // 2xx → the id is still live; nothing to do.
    } catch (err) {
      // Branch on the typed status from §4 instead of scraping the message.
      if (err instanceof JbdError && err.status === 404) {
        // Tombstone. Move the row from `events` to `event_tombstones`.
        await db.tx(async (t) => {
          await t.query(
            `INSERT INTO event_tombstones (identifier) VALUES ($1)
                         ON CONFLICT DO NOTHING`,
            [identifier]
          );
          await t.query(`DELETE FROM events WHERE identifier = $1`, [identifier]);
        });
        continue;
      }
      // Anything else (exhausted 429 / 5xx retries, auth errors) is not
      // a tombstone signal. Surface it; the next run revisits the batch.
      throw err;
    }
  }
}
```

Reference architecture detail:
https://data.jambase.com/llms.txt §"Building a Local Cache" is
the declarative companion to this section.

The same shape applies to artists and venues. The endpoint paths
are `/v3/artists/id/{identifier}` and `/v3/venues/id/{identifier}`.
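
The distinction between the two signals reduces to a small decision table. The sketch below is illustrative only: `cache_action` and the action labels it returns are hypothetical names, while the rules (404 means tombstone, `eventStatus = "cancelled"` means a normal upsert, 429 / 5xx means try again later) come from this section and §6.

```python
from typing import Optional


def cache_action(status_code: int, event_status: Optional[str] = None) -> str:
    """Map the result of GET /events/id/{identifier} to a local cache action."""
    if status_code == 404:
        # Hard-deleted upstream: move the row to event_tombstones.
        return "tombstone"
    if 200 <= status_code < 300:
        # A cancellation is an ordinary update; the row stays in `events`.
        return "upsert_cancelled" if event_status == "cancelled" else "upsert"
    if status_code == 429 or status_code >= 500:
        # Transient: neither a deletion nor an update signal.
        return "retry_later"
    # 400 / 401 / 403: a caller or plan problem; surface it to the operator.
    return "error"
```

Only the `"tombstone"` branch ever deletes from `events`; every other branch leaves the cached row in place.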

---

## 11. Cross-platform IDs `[Startup+ for external IDs; jambase:* on every plan]`

The lookup path is shaped identically for every family
(`events`, `artists`, `venues`, `streams`):

```
GET /v3/{family}/id/{source}:{id}
```

Examples (each one is a single round-trip — no search, no fuzzy
name matching, no mapping table to maintain):

```bash
# Spotify artist ID → JBD artist
curl -H "Authorization: Bearer ${JBD_API_KEY}" \
  "https://api.data.jambase.com/v3/artists/id/spotify:3WrFJ7ztbogyGnTHbHJFl2"

# Ticketmaster event id → JBD event
curl -H "Authorization: Bearer ${JBD_API_KEY}" \
  "https://api.data.jambase.com/v3/events/id/ticketmaster:G5vYZ9Y5w8aAk"

# MusicBrainz MBID → JBD artist
curl -H "Authorization: Bearer ${JBD_API_KEY}" \
  "https://api.data.jambase.com/v3/artists/id/musicbrainz:b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d"

# SeatGeek venue id → JBD venue
curl -H "Authorization: Bearer ${JBD_API_KEY}" \
  "https://api.data.jambase.com/v3/venues/id/seatgeek:1234"
```

How to use:

- For a **new** integration, store the `jambase:*` identifier as
  your primary key, then call the cross-platform lookup once per
  user-supplied id (Spotify, Ticketmaster, etc.) to enrich.
- For a **migration** from an existing Spotify-keyed table, walk
  every row, call `/v3/artists/id/spotify:<id>` with the bare
  Spotify artist ID, and add a `jambase_id` column alongside.
- The full per-family allowlist of source slugs lives at
  `/v3/lookups/{event|artist|venue|stream}-data-sources` — call
  it once at startup if you want to validate user input
  defensively.
- Plan note: external-id lookups (Spotify, Ticketmaster, etc.)
  require the Startup tier or above. The `jambase:*` lookup path
  works on every plan including Developer-free.
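
A small path builder keeps the `{family}/id/{source}:{id}` shape in one place. This is a sketch: `lookup_path` is a hypothetical helper, the family set is taken from the list at the top of this section, and keeping the colon literal (as the curl examples do) while percent-encoding each side is our assumption about what the gateway expects.

```python
from urllib.parse import quote

FAMILIES = {"events", "artists", "venues", "streams"}


def lookup_path(family: str, source: str, external_id: str) -> str:
    """Build the path (relative to the /v3 base URL) for a cross-platform lookup."""
    if family not in FAMILIES:
        raise ValueError(f"unknown family: {family!r}")
    if not source or ":" in source or not external_id:
        raise ValueError("source and id must be non-empty; source may not contain ':'")
    # Encode each half separately so the separating colon stays literal.
    return f"/{family}/id/{quote(source, safe='')}:{quote(external_id, safe='')}"


print(lookup_path("artists", "spotify", "3WrFJ7ztbogyGnTHbHJFl2"))
# /artists/id/spotify:3WrFJ7ztbogyGnTHbHJFl2
```

To validate `source` against the live allowlist rather than just its shape, fetch the `-data-sources` lookup for the family once at startup and check membership before building the path.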

---

## 12. Historical backfill `[Pro+ for past events; Enterprise for geo-historical at scale]`

By default `/v3/events` returns upcoming shows. To include past
events, set `expandPastEvents=true`:

```bash
curl -H "Authorization: Bearer ${JBD_API_KEY}" \
  "https://api.data.jambase.com/v3/artists/id/jambase:2207/events?expandPastEvents=true&perPage=100"
```

(That example fetches the first page of Phish's full history;
paginate to get the rest. The artist endpoint is the simplest
"all shows by X" pattern.)

Useful patterns:

- **Backfill an artist's career** — `/v3/artists/id/jambase:<id>/events?expandPastEvents=true`,
  paginated. Plan: Pro+.
- **Backfill a venue's history** — `/v3/venues/id/jambase:<id>/events?expandPastEvents=true`,
  same paging shape. Plan: Pro+.
- **Geo-historical at scale** — "every show at every metro in
  the US since 2010." This is bursty enough that you should
  contact us and switch to Enterprise / warehouse delivery
  (§13) rather than paginate it over REST.

If the agent is scaffolding a recommender or a
"shows-I've-been-to" feature, the user almost certainly wants
historical data, and the integration will look broken on any plan
below Pro+: Developer, Startup, and Pro all return zero past
events. (The 14-day Trial is the one exception: it includes a
rolling 1-year past-events window so evaluators can see the
feature.) Add an explicit plan check at the top of the seed loop
and surface a clean error if the user is on the wrong plan,
whether the gateway answers with the 403 described in §4 or with
an empty result set.
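
One possible shape for that plan check, as a sketch. How the codebase learns the user's plan is left open (this guide does not describe an endpoint for it), so the plan name is passed in by the caller; the lowercase plan strings, the `PlanError` class, and the assumption that Enterprise includes everything Pro+ does are ours.

```python
class PlanError(RuntimeError):
    """Raised when the configured plan cannot serve past events."""


def past_events_access(plan: str) -> str:
    """Return 'full' or 'rolling_1y', or raise PlanError for plans without history."""
    normalized = plan.strip().lower()
    if normalized in {"pro+", "enterprise"}:
        return "full"
    if normalized == "trial":
        # The 14-day Trial exposes a rolling 1-year past-events window.
        return "rolling_1y"
    # Developer, Startup, and Pro return zero past events.
    raise PlanError(
        f"expandPastEvents=true needs Pro+ or Enterprise; current plan is {plan!r}"
    )
```

Calling this before the first `expandPastEvents=true` request turns a silent empty backfill into an actionable message for the user.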

---

## 13. Warehouse delivery `[Enterprise]`

When the API loop stops making sense:

- You need **second-fresh** data instead of minutes-fresh.
- You want to land data in **BigQuery / Snowflake / Databricks /
  Redshift / S3 / GCS / Azure Blob / SFTP** without writing your
  own loader.
- You're hitting > 50–100 requests/second sustained on the REST
  API and the right answer is a feed, not a bigger plan.

Enterprise delivery options (pick one or combine):

- **BigQuery / Snowflake / Databricks / Redshift** — managed
  daily-or-faster snapshots into a customer-owned dataset.
- **S3 / GCS / Azure Blob / SFTP** — gzipped JSON / NDJSON / CSV
  drops with manifest files. Standard "watch the manifest"
  loader patterns apply.
- **Webhooks** for `event.created` / `event.updated` /
  `event.cancelled` — same delta shape as the REST loop but
  pushed, so you do not poll.

Recommended migration shape: **keep the REST API for the
deltas**, and use the Enterprise feed for the **bulk seed**.
That gets you a multi-year history in a single dataset drop
without spending the next two weeks paginating, and the REST
deltas still match the existing wrapper in §9.

To scope a feed, get in touch via
https://data.jambase.com/contact and include your target
warehouse / drop format and rough sustained volume.

---

## 14. What MCP gives you for free `[all]`

JamBase Data also ships an MCP (Model Context Protocol) server.
The install card — copy-paste setup for ChatGPT, Claude, Cursor,
Gemini, etc. — lives at https://data.jambase.com/mcp, and the
server endpoint those configs point at is
**`https://mcp.jambase.com/mcp`**. The server exposes the same
data as a set of callable tools (`jambase.search_events`,
`jambase.search_artists`, `jambase.search_venues`, etc.) for use
**from inside an LLM chat** (ChatGPT, Claude, Cursor's chat
panel, Gemini, Replit Agent).

Why this matters to a coding agent doing build-time work:

- **As the human user**, install the MCP server in your own
  assistant so you can query the data live while you code:
  "show me the schema for a Phish event in NYC" gets you a real
  row to design your normalizer against, no curl required.
- **As the coding agent**, you do not call the MCP server
  yourself — you call the REST API. MCP is for the human; REST
  is for the codebase you are building.
- **As the application you are building**, you can offer your
  end-users an MCP-flavored chat interface on top of your own
  cache. Keep in mind that the JamBase MCP server itself is hosted
  by JBD, not by your app.

Recommended one-paragraph mention to surface to the user once
the build-time integration is working: "While you're here, you
can also install the JamBase MCP server into your coding
assistant so it can query the live API directly while you work.
See https://data.jambase.com/mcp."

Auth note: MCP uses OAuth 2.1 + Dynamic Client Registration on
first connect (no API key to paste); REST uses the long-lived
`jbd_*` key. The two surfaces share the same plan quota.

---

## 15. Plan-tier cheatsheet `[all]`

Lifted from https://data.jambase.com/llms.txt §"Pricing" and
§"What Will I Pay?". Always confirm against the live values on
those URLs; the table here is for quick agent reference.

| Plan       | Best for                                  | Key sections in this file             |
| ---------- | ----------------------------------------- | ------------------------------------- |
| Developer  | Prototypes, side projects, free trial     | §2, §3, §7–§10                        |
| Startup    | Production fan apps + cross-platform IDs  | All of Developer + §11                |
| Pro        | Higher-volume production, future events   | All of Startup + bigger §6/§9 budgets |
| Pro+       | Heavy use **plus** past/historical events | All of Pro + §12                      |
| Enterprise | Warehouse delivery, SLAs, custom volume   | All of Pro+ + §13                     |

Specific gates the agent should respect when scaffolding code:

- `expandPastEvents=true` (past / historical events) — **Pro+ and
  up only.** On Developer, Startup, and Pro you get a 403 (the
  14-day Trial is the one exception — a rolling 1-year window).
  Surface the plan requirement instead of silently dropping past
  events. Note this is a tier _above_ Pro — Pro itself is
  future-events-only.
- Cross-platform external IDs (`spotify:`, `ticketmaster:`,
  `musicbrainz:`, `seatgeek:`, etc.) — Startup and up only. The
  `jambase:*` slug works on every plan.
- Geo-historical search at scale — Enterprise only at baseline;
  lower tiers must have it granted per-subscription after review.
- Warehouse feeds, custom SLAs, second-fresh deltas — Enterprise
  only.

If you (the agent) are scaffolding code that lights up only on
upgrade, leave the call sites in place with an explicit comment
("requires Startup tier") rather than commenting the code out —
it makes the upgrade path a one-line change instead of a
re-scaffold.
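
One way to keep gated call sites in place is a small tier table that filters request flags, sketched below in Python. The tier order and the two minimum tiers are taken from the table and bullets above. The function name and the idea of passing the plan as a string are illustrative assumptions, and the Trial is deliberately left out because it is not a rung on this ladder.

```python
# Tier order from the cheatsheet table, lowest to highest.
PLAN_ORDER = ["developer", "startup", "pro", "pro+", "enterprise"]

# Minimum tier for each gated flag, per the bullets above.
MIN_PLAN = {
    "expandExternalIdentifiers": "startup",
    "expandPastEvents": "pro+",
}


def gated_params(plan: str, wanted: list[str]) -> dict[str, str]:
    """Return only the flags the given plan is allowed to send."""
    rank = PLAN_ORDER.index(plan.strip().lower())  # ValueError on unknown plan
    params = {}
    for flag in wanted:
        # Each flag requires the tier listed in MIN_PLAN; it lights up on upgrade.
        if rank >= PLAN_ORDER.index(MIN_PLAN[flag]):
            params[flag] = "true"
    return params
```

With this shape, upgrading is a config change: the caller keeps requesting every flag it wants, and the filter starts letting more of them through.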

---

## 16. What to do when you hit a wall `[all]`

In order of effort:

1. **Re-read https://data.jambase.com/llms.txt** — most "is this
   thing supported?" questions are answered there in one
   paragraph. The long-form companion is
   https://data.jambase.com/llms-full.txt.
2. **Try the request in curl with `-i`**. The
   `RateLimit-*` headers and the response body almost always
   tell you what's wrong (wrong plan, expired key, mistyped
   parameter, missing geo filter).
3. **Look up the OpenAPI spec** at
   https://data.jambase.com/openapi.json. Every endpoint,
   every parameter, every response shape lives there.
4. **Search the docs at https://data.jambase.com/api/docs/getting-started.**
   The "Documentation" table lists every guide we've published
   (authentication, attribution, rate-limits, pagination,
   errors, historical-data, bulk-ingest, reverse-id-lookup).
5. **Ask in your own assistant via MCP** (§14). The MCP server
   has live access to the same data; "give me a sample event
   payload" is faster than scrolling docs.
6. **Email support@jambase.com** or use
   https://data.jambase.com/contact — JBD support reads every
   message. Include: the curl request, the response headers, and
   the rough use case. We will route to whoever can answer
   fastest.
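
If you want the codebase itself to capture what step 2 and step 6 ask for, a sketch like the following can help. It filters headers by the `RateLimit-` prefix rather than assuming exact header names, and the `support_report` helper is a hypothetical convenience, not part of any JBD client.

```python
def rate_limit_headers(headers: dict[str, str]) -> dict[str, str]:
    """Pick out the RateLimit-* headers, ignoring header-name case."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower().startswith("ratelimit-")
    }


def support_report(method: str, url: str, status: int, headers: dict[str, str]) -> str:
    """Format the request line, status, and rate-limit headers for a ticket.

    Only RateLimit-* headers are included, so an Authorization header
    passed in by mistake never ends up in the report.
    """
    lines = [f"{method} {url}", f"status: {status}"]
    lines += [f"{k}: {v}" for k, v in sorted(rate_limit_headers(headers).items())]
    return "\n".join(lines)
```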

If you (the coding agent) are stuck on a question this file does
not answer, the right answer is almost always "ask the user to
file a feedback ticket at https://data.jambase.com/contact" —
don't invent behavior the docs do not specify, because the next
agent run will guess differently and you'll have built a
contradiction into the user's codebase.

Attribution: when you render JBD data to a human, follow the
attribution requirements at
https://data.jambase.com/api/docs/attribution (one-time read,
~5 minutes). The requirements are short; include the required
attribution component in any UI you scaffold, without waiting
for the user to ask.

---

## 17. Beyond events — the rest of the surface `[mixed]`

§1–§16 walk the events ingest loop end-to-end because that is the
90% case. The same auth, pagination, backoff, and delta patterns
apply unchanged to every other resource family. This section is
the map; https://data.jambase.com/llms-full.txt has the full
parameter lists and sample payloads for each.

### 17.1 Streams (livestreams) `[all]`

Livestreams are a first-class family with the same shape as
events:

- `GET /v3/streams` — search upcoming streams. Same date filters
  as events (`eventDatePreset`, `eventDateFrom`, `eventDateTo`),
  plus `streamDataSource`, `dateModifiedFrom`, `datePublishedFrom`.
- `GET /v3/streams/id/{streamDataSource}:{streamId}` — single
  stream by id (same `/{family}/id/{source}:{id}` shape as §11).

Each stream carries `isLiveBroadcast` and a `broadcastOfEvent`
link back to the underlying event, so "is there a livestream for
this show?" is a join on the event id you already cache. Run a
parallel delta loop on `/streams` with `dateModifiedFrom` exactly
like §9 if you surface livestreams.
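
The join described above can be sketched as a simple in-memory index. This assumes `broadcastOfEvent` is an object carrying an `identifier` key that matches the event id you cache; that key name is a guess, so check a real stream payload before relying on it.

```python
from collections import defaultdict
from typing import Any


def index_streams_by_event(streams: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Group stream rows by the id of the event they broadcast.

    Assumes each stream's `broadcastOfEvent` is an object with an
    `identifier` key (hypothetical shape).
    """
    by_event = defaultdict(list)
    for stream in streams:
        event_ref = stream.get("broadcastOfEvent") or {}
        event_id = event_ref.get("identifier")
        if event_id:  # skip streams with no linked event
            by_event[event_id].append(stream)
    return dict(by_event)
```

"Is there a livestream for this show?" then becomes a dictionary lookup on the cached event id.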

### 17.2 Geography lookups `[all]`

The `geo*` filters in §3/§8 (`geoMetroId`, `geoCityId`,
`geoStateIso`, `geoCountryIso2`) take ids you resolve from the
geography endpoints — call these once at startup and cache them,
they change rarely:

- `GET /v3/geographies/metros` — metros (a metro groups several
  cities; `expandMetroCities=true` returns the member cities).
- `GET /v3/geographies/cities` — cities (`geoCityName`,
  `geoStateIso` to disambiguate).
- `GET /v3/geographies/states` — states / regions / provinces
  (`stateHasUpcomingEvents=true` to list only active ones).
- `GET /v3/geographies/countries` — countries
  (`countryHasUpcomingEvents=true`).

Pattern: resolve the user's "near me" / city picker to a metro or
city id once, then pass that id to `/v3/events`. Do not try to
fuzzy-match place names against the event feed directly.
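
A minimal sketch of the resolve-once pattern, in Python. The fetch function is injected so the example stays independent of your HTTP wrapper. The row keys `name` and `identifier` are assumptions about the metro payload, not confirmed field names.

```python
from typing import Callable, Optional


class MetroResolver:
    """Resolve a metro name to an id once, then reuse the cached mapping."""

    def __init__(self, fetch_metros: Callable[[], list]):
        # `fetch_metros` is your own function that calls
        # GET /v3/geographies/metros and returns the list of metro rows.
        self._fetch = fetch_metros
        self._ids: Optional[dict] = None

    def metro_id(self, name: str) -> str:
        if self._ids is None:  # first call only: load and index the list
            self._ids = {
                str(row["name"]).lower(): str(row["identifier"])  # assumed keys
                for row in self._fetch()
            }
        # Exact, case-insensitive match; raises KeyError for unknown names.
        return self._ids[name.strip().lower()]
```

The returned id is what you would pass as `geoMetroId` on `/v3/events`.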

### 17.3 Artist / venue / genre search `[all]`

When you have a **name** rather than an id, use the search
parameters on the list endpoints (id lookups in §11 are for when
you already have a cross-platform id):

- `GET /v3/artists?artistName=...` — also `genreSlug`,
  `artistHasUpcomingEvents=true`.
- `GET /v3/venues?venueName=...` — also the `geo*` filters and
  `venueHasUpcomingEvents=true`.
- `GET /v3/events?artistName=...&venueName=...&genreSlug=...` —
  the event search accepts the same name filters directly.
- `GET /v3/genres` — the genre taxonomy; each entry has a
  `genreSlug` you pass to the `genreSlug` filter elsewhere.

Name search is best-effort matching; for a stable key, resolve a
name to its `jambase:*` id once and store the id.
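
Artist names often contain characters that must be percent-encoded (`/`, `&`, spaces, accents), so build the query string with a proper encoder instead of string concatenation. A small sketch using only the parameters listed above; the function name is illustrative.

```python
from urllib.parse import urlencode

API_BASE = "https://api.data.jambase.com/v3"


def artist_search_url(artist_name: str, upcoming_only: bool = False) -> str:
    """Build a GET /v3/artists search URL with a safely encoded name."""
    params = {"artistName": artist_name}
    if upcoming_only:
        params["artistHasUpcomingEvents"] = "true"
    return f"{API_BASE}/artists?{urlencode(params)}"
```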

### 17.4 Lookups (allowlisted source slugs) `[all]`

`GET /v3/lookups/{event|artist|venue|stream}-data-sources`
returns the valid `source` slugs for the §11 cross-platform id
path (`spotify`, `ticketmaster`, `musicbrainz`, `seatgeek`, …).
Call the relevant one at startup if you want to validate
user-supplied source slugs defensively before building a lookup
URL.
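
A sketch of that defensive check, in Python. It assumes you have already loaded the allowlist from the relevant lookups endpoint into a set, and that the plural family names below match the `/{family}/id/{source}:{id}` path shape. Only the `streams` form is shown explicitly in this file, so treat the other three as assumptions.

```python
from urllib.parse import quote

# Path families for the id-lookup shape (only "streams" is confirmed above).
FAMILIES = {"events", "artists", "venues", "streams"}


def lookup_path(family: str, source: str, external_id: str, allowed_sources: set[str]) -> str:
    """Build an id-lookup path, rejecting unknown families and source slugs."""
    if family not in FAMILIES:
        raise ValueError(f"unknown family: {family!r}")
    if source not in allowed_sources:
        raise ValueError(f"source slug {source!r} is not in the allowlist")
    # Encode the id so characters like '/' cannot alter the path.
    return f"/v3/{family}/id/{source}:{quote(external_id, safe='')}"
```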

### 17.5 Structured error bodies `[all]`

Application-level errors (`/v3/...` validation failures) return a
plural `errors` array — this is the shape the §4 wrapper parses
into `JbdError.errorCode`:

```json
{
  "success": false,
  "errors": [
    {
      "errorCode": "parameter_invalid",
      "errorMessage": "The parameter `eventDateFrom` must be an RFC 3339 date."
    }
  ],
  "request": { "url": "/v3/events?eventDateFrom=NOT_A_DATE" }
}
```

`errorCode` is one of: `no_lookup_results`, `lookup_error`,
`parameter_missing`, `parameter_invalid`, `identifier_missing`,
`identifier_invalid`, `parameter_out_of_range`, `general_error`.
Branch on `errorCode`, not on the human-readable `errorMessage`
(that text can change). HTTP status map: `400` bad request, `401`
auth, `403` plan-gate, `404` not found, `429` rate limit, `500`
server. **Gateway-level** problems (auth, rate limiting, plan
gates) instead return an RFC 7807 `application/problem+json` body
(`type` / `title` / `status` / `detail` / `instance`) and carry
no `errorCode` — that is why the §4 wrapper treats `errorCode` as
nullable.
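
The two body shapes can be told apart by content type. The sketch below is an illustration of that branching, not the §4 wrapper itself; `ParsedError` and `parse_error` are hypothetical names, and it assumes the body is valid JSON.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedError:
    status: int
    error_code: Optional[str]  # None for gateway-level problem+json bodies
    message: str


def parse_error(status: int, content_type: str, body: str) -> ParsedError:
    """Normalize both error shapes into one record."""
    data = json.loads(body)
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/problem+json":
        # RFC 7807 body: no errorCode, so fall back to detail/title text.
        return ParsedError(status, None, data.get("detail") or data.get("title") or "")
    errors = data.get("errors") or []
    if errors:
        first = errors[0]  # branch on the first entry's errorCode
        return ParsedError(status, first.get("errorCode"), first.get("errorMessage", ""))
    return ParsedError(status, None, "")
```

Callers branch on `error_code` when it is set and on `status` when it is `None`.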

### 17.6 Expansion parameters `[mixed]`

List and id endpoints accept `expand*` flags that inline related
data so you avoid a second round-trip. The common ones:
`expandExternalIdentifiers` (cross-platform ids on the row,
Startup+), `expandArtistSameAs` (artist's `sameAs` social links),
`expandMetroCities` (member cities on a metro), and
`expandPastEvents` (past events; Pro+, see §12). Each flag
increases response size, so request only the ones you store. The
full per-endpoint list lives in llms-full.txt under each
endpoint's **Parameters** line.

---

**End of agents.md.** If you (the agent) followed this file
end-to-end and the user now has a working Hello World, paginated
seed, and `dateModifiedFrom` delta loop running against their
local store with correct UPSERT + tombstone handling, you have
shipped a complete v1 JamBase Data integration. Welcome aboard.
