Own Your Vercel Analytics: I Built a Drain-to-Postgres Pipeline
Vercel has no API to query Web Analytics history. The only export path is a Drain. Here is the HMAC-signed receiver, the Postgres rollup, and the one-session build that owns the data.
設定をやめて、構築を始めよう。
AIオーケストレーション付きSaaSビルダーテンプレート。
Vercel has no public API to query your Web Analytics history. The only programmatic way to get raw events out is a Web Analytics Drain — a Pro/Enterprise feature that streams every pageview to an endpoint you control, signed with HMAC-SHA1. So I asked Claude Code to build a pipeline that owns the data: Drain → a signature-verifying edge function → Postgres → a SQL rollup view. It took one session.
This post covers why the API doesn't exist, the exact architecture we shipped, the real signature-verification code (constant-time, SHA1), the rollup SQL, and the honest caveat that drains are forward-only — you collect from day one or not at all.
Table of Contents
- Why There Is No Vercel Analytics API
- The Three Ways to Get Data Out
- The Architecture We Built
- Verifying the Drain Signature (HMAC-SHA1)
- Storing Events in Postgres with RLS
- The Rollup View
- The Forward-Only Trap and Other Gotchas
- Building It in One Claude Code Session
- Frequently Asked Questions
設定をやめて、構築を始めよう。
AIオーケストレーション付きSaaSビルダーテンプレート。
Why There Is No Vercel Analytics API
I wanted my own copy of my pageview data — row-level, queryable, mine. The first thing I checked was whether Vercel exposes a REST endpoint to pull Web Analytics history. It does not.
The Vercel REST API has endpoints to create, update, and validate a Drain. None of them read your analytics. There is no GET /analytics that returns events or aggregates. A community feature request asking for exactly that was acknowledged by a Vercel team member but remains unbuilt as of mid-2026. The dashboard shows you charts; it does not hand you the data behind them through an API.
The one documented path to get raw events into your own database is a Web Analytics Drain: Vercel streams events to an HTTPS endpoint you run, and you store them. The drain "immediately starts forwarding data based on your configuration" the moment you create it. That is the whole export story.
This matters because the conclusion is structural, not a missing feature you can wait out: if you want to own your Vercel analytics, you build a receiver and a database. There is no shortcut API.
The Three Ways to Get Data Out
Before building anything, here is the honest comparison of every documented export path.
| Method | Historical backfill? | Real-time? | Granularity | Plan | Cost | Verdict |
|---|---|---|---|---|---|---|
| Public REST API | — | — | — | — | — | Does not exist. No endpoint queries Web Analytics history. |
| CSV export (dashboard) | Yes (selected range) | No (manual) | Aggregated only; capped ~250 rows | All plans | Included | Manual, aggregated, useless for a pipeline. |
| Web Analytics Drain | No — forward-only | Yes (HTTPS POST) | Raw per-event (~35 fields) | Pro & Enterprise | $0.50/GB | The only way to get raw events into your own DB. |
The CSV export is the trap people fall into first. It exports aggregated data — unique visitors and page views for the range you pick — and it is capped at about 250 entries per export. That is fine for eyeballing last month. It cannot drive a pipeline, and it is not row-level events.
Drains are Pro and Enterprise only. Hobby and Pro Trial accounts have to upgrade. Billing is $0.50 per GB of drained data, the same rate for logs, traces, web analytics, and speed insights. For a site doing tens of thousands of pageviews a month, that is cents.
The Drain is the only real option, so that is what we built on.
The Architecture We Built
Four pieces, each doing one job:
- Web Analytics Drain (Pro, forward-only) — Vercel POSTs newline-delimited JSON (NDJSON) pageview events to our endpoint, signed with an HMAC-SHA1 digest in the
x-vercel-signatureheader. - A serverless receiver — we used a Supabase Edge Function with JWT verification turned off (the drain isn't a logged-in user; we authenticate it by signature instead). It verifies the signature against a shared secret, then inserts rows.
- A Postgres table — RLS on, writable by the service role only. The function holds the service key; nothing else can write.
- A SQL rollup view — aggregates views and visitors per path over 3, 7, and 30 days, with locale normalized so
/en/pricingand/pricingcollapse correctly.
Optionally, a token-gated JSON report endpoint and a weekly cron that emails the summary sit on top. The data layer is the load-bearing part.
Here is the flow:
Vercel Web Analytics
│ HTTPS POST, NDJSON body
│ x-vercel-signature: <hex HMAC-SHA1 of raw body>
▼
Edge Function (JWT off)
│ 1. read RAW body (bytes, not parsed)
│ 2. recompute HMAC-SHA1(secret, rawBody)
│ 3. constant-time compare vs header → 403 on mismatch
│ 4. parse NDJSON, insert rows
▼
Postgres table (RLS on, service-role insert only)
▼
SQL rollup VIEW (views + visitors per path, 3/7/30d, locale-normalized)
▼
(optional) token-gated /report → weekly cron emailThe drain's transport contract is simple and worth pinning down: it's an HTTPS POST, your endpoint must return 200 OK, and the body arrives as either a JSON array of event objects or NDJSON — one event per line. The schema carries the value vercel.analytics.v2 and each event has around 35 fields: eventType, path, timestamp, referrer, route, country/region/city, OS/device info, vercelEnvironment, and more. We store the fields we query on and keep the rest as raw JSON.
Verifying the Drain Signature (HMAC-SHA1)
This is the part that has to be exactly right, because it is your only authentication. The receiver is publicly reachable; the signature is what proves a request actually came from Vercel and not from someone POSTing junk at your function.
Vercel signs each drain delivery with an x-vercel-signature header. The value is an HMAC-SHA1 hex digest of the raw request body, keyed by the drain's signature secret. You recompute the same HMAC and compare. The docs explicitly recommend a constant-time comparison to resist timing attacks.
Two things bite people here, so I'll say them plainly:
- It is HMAC-SHA1, not SHA256. The common webhook convention is SHA256; this is not that. Use SHA1 or the signature never matches.
- You must hash the raw body bytes, not a parsed-and-re-serialized object.
JSON.parse(body)thenJSON.stringify(...)reorders keys and changes whitespace, and the digest comes out different. In an edge/Web-standard runtime, readawait request.text()and hash that string before you parse anything.
Here is the verification, in TypeScript, the way it runs in the edge function:
import { createHmac, timingSafeEqual } from "node:crypto";
const DRAIN_SECRET = Deno.env.get("VERCEL_DRAIN_SECRET")!;
function verifyDrainSignature(rawBody: string, headerSig: string | null): boolean {
if (!headerSig) return false;
// HMAC-SHA1 over the RAW body bytes, keyed by the drain secret, hex output.
const expected = createHmac("sha1", DRAIN_SECRET)
.update(rawBody, "utf8")
.digest("hex");
const a = Buffer.from(expected, "hex");
const b = Buffer.from(headerSig, "hex");
// Length check first — timingSafeEqual throws on length mismatch.
if (a.length !== b.length) return false;
// Constant-time compare so we don't leak the secret via response timing.
return timingSafeEqual(a, b);
}And the request handler that uses it. Note the ordering: read raw text, verify, then parse.
Deno.serve(async (req) => {
if (req.method !== "POST") {
return new Response("Method Not Allowed", { status: 405 });
}
// CRITICAL: read the raw body once, before any parsing.
const rawBody = await req.text();
const sig = req.headers.get("x-vercel-signature");
if (!verifyDrainSignature(rawBody, sig)) {
return new Response("invalid signature", { status: 403 });
}
// NDJSON: one JSON event per line. (Vercel can also send a JSON array.)
const events = rawBody
.trim()
.split("\n")
.filter(Boolean)
.map((line) => JSON.parse(line));
// Ack fast, store async — keep the 200 quick so the drain stays healthy.
await storeEvents(events);
return new Response("ok", { status: 200 });
});A naive === comparison would technically work but leaks information through response timing — the docs call this out, so we use timingSafeEqual. On a mismatch we return 403 and Vercel logs the failed delivery.
Storing Events in Postgres with RLS
The table is boring on purpose. A few typed columns for the fields we filter and group on, a raw jsonb column for everything else, and a received_at timestamp.
create table public.web_analytics_events (
id bigint generated always as identity primary key,
event_type text not null,
path text not null,
-- locale-normalized path, e.g. /en/pricing -> /pricing
path_norm text not null,
referrer text,
route text,
country text,
device text,
occurred_at timestamptz not null, -- event timestamp from Vercel
received_at timestamptz not null default now(),
raw jsonb not null
);
create index on public.web_analytics_events (path_norm, occurred_at);
-- RLS on. No public policies. Only the service role (which bypasses RLS)
-- can write — and only the edge function holds the service key.
alter table public.web_analytics_events enable row level security;With RLS enabled and no policies granting access, the anon and authenticated roles can't read or write the table at all. The edge function uses the service role key, which bypasses RLS, so it can insert. That key lives in the function's environment and nowhere a browser can see it. The receiver does the writing; the table is otherwise sealed.
Locale normalization happens at insert time — strip a leading /en, /fr, etc. prefix into path_norm — so the rollup groups /pricing traffic together regardless of the visitor's language path. We keep the original path too.
The Rollup View
The reason to own the data is to query it however you want. A single view gives per-path views and unique visitors over three windows.
create or replace view public.web_analytics_rollup as
with base as (
select
path_norm,
occurred_at,
-- A pageview event; a per-session identifier lives in raw for "visitors".
(raw ->> 'sessionId') as session_id
from public.web_analytics_events
where event_type = 'pageview'
)
select
path_norm,
count(*) filter (where occurred_at >= now() - interval '3 days') as views_3d,
count(*) filter (where occurred_at >= now() - interval '7 days') as views_7d,
count(*) filter (where occurred_at >= now() - interval '30 days') as views_30d,
count(distinct session_id) filter (where occurred_at >= now() - interval '3 days') as visitors_3d,
count(distinct session_id) filter (where occurred_at >= now() - interval '7 days') as visitors_7d,
count(distinct session_id) filter (where occurred_at >= now() - interval '30 days') as visitors_30d
from base
group by path_norm
order by views_30d desc;count(*) gives pageviews; count(distinct session_id) approximates visitors. FILTER (WHERE ...) lets one pass over the table produce all six numbers. Querying it is one line: select * from web_analytics_rollup limit 20; and you have your top pages by 30-day views with the shorter windows alongside.
This is the payoff. The dashboard gives you a chart. The view gives you where, join, group by, and a column you can graph in whatever you already use.
The Forward-Only Trap and Other Gotchas
The single most important thing to understand about drains: they replay nothing.
A drain forwards data from the moment you create it onward. Analytics that happened before the drain existed are gone for export purposes — there is no backfill anywhere in the docs. So the correct move is to create the drain on day one of caring about the data, not the day you decide to build the pipeline. We lost the history that predated our drain. You will too, unless you set it up early.
The rest of the gotchas, in the order they'll trip you up:
- CSV is not a backfill. It's aggregated and capped at ~250 rows. It cannot reconstruct the raw events you missed.
- Hash the raw body, not a parsed object. Re-serializing reorders keys and the HMAC won't match. Read
request.text()first. - It's SHA1, not SHA256. Don't assume the usual webhook convention.
- Use constant-time comparison. A naive
===leaks timing information about the secret. - Your endpoint must return 200, fast. If more than 80% of deliveries fail, or more than 50 fail in an hour, Vercel emails you and flags the drain as errored. Ack the request quickly and do the database write async so a slow insert never holds the 200 hostage.
- Volume costs money. At $0.50/GB, a high-traffic site generates real payload volume. Vercel offers a sampling-rate control to cap cost — but sampling means you no longer hold 100% of events, which defeats "own all my data." Decide which you want.
- Don't confuse the config API with a query API. Vercel's REST API can create/update/validate a drain. It cannot read your analytics. The pipe is configurable; the data is not readable through it.
None of these are exotic. They're just specific, and getting any one of them wrong produces a silent failure — a drain that 403s every delivery, or a signature that never matches.
Building It in One Claude Code Session
The whole thing — receiver, signature verification, table, RLS, rollup view, and a token-gated report endpoint — came together in one Claude Code session with Claude Opus 4.8 (the current default Opus model, $5/M input and $25/M output). I described the architecture, pointed it at the Vercel drain docs, and it wrote the edge function, the SQL migration, and the deploy steps.
The parts where having a model that reads the docs closely paid off were exactly the gotchas above: it used HMAC-SHA1 (not the reflexive SHA256), it read the raw body before parsing, and it reached for timingSafeEqual without being told. Those are the three things a copy-pasted webhook handler gets wrong.
If you're new to wiring an edge function to Postgres, the Claude Code + Supabase setup walkthrough covers the project structure. For the broader "describe it, let the agent build the whole feature" workflow this post is an instance of, see vibe coding a full app with Claude Code. And the optional weekly email summary is a scheduled task — a cron that hits the rollup view and sends the digest.
The lesson isn't "AI wrote my code." It's that the hard part of this build was knowing the four constraints — no API, forward-only, SHA1-over-raw-body, RLS-sealed table — and once those were stated, the implementation was mechanical. Own the constraints and the code follows.
Frequently Asked Questions
Does Vercel have an API to export Web Analytics data?
No. As of mid-2026 there is no public REST API to query Web Analytics history. Vercel's own documentation and an acknowledged-but-unbuilt community feature request confirm the only programmatic option is a Web Analytics Drain that streams raw events to an endpoint you control. The REST API can configure a drain; it cannot read your analytics.
Can I get historical Vercel Analytics data with a Drain?
No. Drains are forward-only. Vercel starts forwarding data the moment you create the drain and replays nothing from before that point. Set the drain up before you need the data. For a quick look at past aggregates, the dashboard CSV export gives up to about 250 aggregated rows — not raw events, and not a backfill.
What plan do I need for Vercel Drains?
Pro or Enterprise. Hobby and Pro Trial accounts must upgrade to Pro. Billing is $0.50 per GB of drained data, the same rate across logs, traces, web analytics, and speed insights.
How do I verify a Vercel Drain webhook is genuine?
Vercel sends an x-vercel-signature header containing an HMAC-SHA1 hex digest of the raw request body, keyed by your drain's signature secret. Recompute the HMAC over the raw body bytes and compare with a constant-time comparison; reject with 403 on mismatch. Two traps: it is SHA1, not SHA256, and you must hash the raw bytes, not a parsed-and-re-serialized object.
What format does a Vercel Web Analytics Drain send?
An HTTPS POST in either JSON (an array of event objects) or NDJSON (one JSON event per line). Each event follows the vercel.analytics.v2 schema with about 35 fields such as eventType, path, timestamp, country, and device info. Your endpoint must respond 200 OK.
How do I keep drain costs under control?
Billing is $0.50 per GB. For high-traffic sites, Vercel offers a configurable sampling rate to reduce volume. The tradeoff is real: sampling means you no longer capture 100% of events, which conflicts with the goal of owning a complete copy. If completeness matters more than cost, leave sampling off and budget for the volume.
Wrapping Up
There is no Vercel Web Analytics API, and there isn't going to be one soon. The only way to own your data is a Web Analytics Drain into a database you control — which means a signature-verifying receiver (HMAC-SHA1 over the raw body, constant-time compare), an RLS-sealed Postgres table written only by that receiver, and a rollup view that turns raw pageviews into the numbers you actually look at.
The one thing you cannot get back is history before the drain existed. Stand it up before you need it. Everything else — the function, the schema, the rollup — is a single focused build, and a tool like Claude Code that reads the docs carefully gets the three signature gotchas right on the first pass. The constraints are the whole game; the code is downstream of them.
Posted by @speedy_devv
設定をやめて、構築を始めよう。
AIオーケストレーション付きSaaSビルダーテンプレート。