Quickstart
Three steps from signup to extracted data.
- Sign up and mint an API key under Settings → API Keys. Keys look like ds_live_… and are tenant-scoped.
- POST a file to /api/v1/extract with the X-API-Key header. The response holds the document id.
- Either poll GET /api/v1/extract/:id until status is extracted (or approved), or register a webhook and we’ll POST the result to you.
Authentication
Every request sends the key in an X-API-Key header. Keys are tenant-scoped, revocable from the dashboard, and carry a per-key rate limit configurable by the tenant admin.
X-API-Key: ds_live_3f8a…
- Transport: HTTPS only. HTTP requests are refused.
- Rotation: generate a new key, switch clients over, then revoke the old one. Never ship keys in browser or mobile code.
- Rate limit: default 60 requests per minute per key; requests over the limit get HTTP 429 with a Retry-After header.
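The rate-limit behavior above can be handled with a small client-side wrapper. This is a minimal sketch, not an official client: retryDelayMs and withRateLimitRetry are hypothetical names, the 1s fallback and the limit of 3 attempts are assumptions, and Retry-After is assumed to carry either a number of seconds or an HTTP date (the two forms HTTP allows).

```typescript
// Hypothetical helper: how long to wait before retrying a 429 response.
// Assumes Retry-After is either delta-seconds or an HTTP date.
function retryDelayMs(
  retryAfter: string | null,
  fallbackMs = 1000, // assumed default when the header is missing or unparsable
  now: number = Date.now(),
): number {
  if (retryAfter === null || retryAfter.trim() === '') return fallbackMs;
  const seconds = Number(retryAfter);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
  const date = Date.parse(retryAfter);
  return Number.isNaN(date) ? fallbackMs : Math.max(0, date - now);
}

// The minimal response shape the wrapper needs; a fetch Response satisfies it.
interface MinimalResponse {
  status: number;
  headers: { get(name: string): string | null };
}

// Hypothetical wrapper: re-sends the request while the server answers 429,
// waiting for the Retry-After interval between attempts.
async function withRateLimitRetry<T extends MinimalResponse>(
  send: () => Promise<T>,
  maxAttempts = 3, // assumed cap, not a documented value
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    const res = await send();
    if (res.status !== 429 || attempt >= maxAttempts) return res;
    const delay = retryDelayMs(res.headers.get('Retry-After'));
    await new Promise<void>((resolve) => setTimeout(resolve, delay));
  }
}
```

Usage would look like withRateLimitRetry(() => fetch(url, { headers })), so each retry issues a fresh request.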
Endpoints
Two endpoints cover the full lifecycle. Base URL: https://docusift.co.
| Method | Path | Purpose |
|---|---|---|
| POST | /api/v1/extract | Upload a document (multipart/form-data). Returns 202 with the created document id. |
| GET | /api/v1/extract/:id | Read document status and, once complete, the extracted data plus an inlined ocr object with the rendered Markdown and raw OCR JSON. |
Upload a document
Each request accepts one or more files. Files that fail validation land in errors while valid files still process.
curl
curl -X POST https://docusift.co/api/v1/extract \
-H "X-API-Key: ds_…" \
-F "file=@invoice-1042.pdf"
# 202 Accepted
{
"data": [
{
"id": "doc_01HZ…",
"file_name": "invoice-1042.pdf",
"mime_type": "application/pdf",
"source": "api",
"status": "uploaded"
}
],
"errors": []
}
node.js
import { readFileSync } from 'node:fs';
const form = new FormData();
form.set(
'file',
new Blob([readFileSync('invoice-1042.pdf')], { type: 'application/pdf' }),
'invoice-1042.pdf',
);
const res = await fetch('https://docusift.co/api/v1/extract', {
method: 'POST',
headers: { 'X-API-Key': process.env.DOCUSIFT_API_KEY! },
body: form,
});
const { data, errors } = await res.json();
const [doc] = data;
console.log(doc.id);
Poll for the result
Typical completion takes a few seconds. Start polling at a 1s interval and back off to 5s, or skip polling entirely by registering a webhook.
curl
curl https://docusift.co/api/v1/extract/doc_01HZ… \
-H "X-API-Key: ds_…"
# 200 OK
{
"data": {
"id": "doc_01HZ…",
"file_name": "invoice-1042.pdf",
"doc_type": "invoice",
"status": "extracted",
"classification_confidence": 0.99,
"extraction_confidence": 0.96,
"data": {
"invoice_number": "INV-1042",
"vendor_name": "Acme Co.",
"total_amount": 1284.50,
"currency": "USD",
"line_items": [ /* … */ ]
},
"ocr": {
"markdown": "# Invoice INV-1042\n\n...",
"json": { "pages": [ { "page_number": 1, "words": [ /* … */ ] } ] }
}
}
}
node.js
// `id` is the document id returned by the upload call.
const res = await fetch(
`https://docusift.co/api/v1/extract/${id}`,
{ headers: { 'X-API-Key': process.env.DOCUSIFT_API_KEY! } },
);
const { data } = await res.json();
if (data.status === 'extracted' || data.status === 'approved') {
// data.data holds the extracted fields
console.log(data.data.invoice_number);
}
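The single GET above can be wrapped in a loop that follows the suggested 1s to 5s backoff. The following is a sketch under assumptions: pollDelayMs and waitForDocument are hypothetical names, doubling between 1s and 5s is one reasonable reading of "backing off" (the docs give only the start and the ceiling), and the cap of 20 polls is arbitrary.

```typescript
// Statuses that end polling, per the Quickstart.
const DONE_STATUSES = new Set(['extracted', 'approved']);

// Delay before the next poll (attempt is 0-based): 1s, 2s, 4s, then 5s.
// Doubling is an assumption; only the 1s start and 5s ceiling are documented.
function pollDelayMs(attempt: number): number {
  return Math.min(1000 * 2 ** attempt, 5000);
}

// fetchDoc is supplied by the caller and performs GET /api/v1/extract/:id,
// returning the inner `data` object of the response.
async function waitForDocument<T extends { status: string }>(
  fetchDoc: () => Promise<T>,
  maxAttempts = 20, // assumed cap, not a documented value
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const doc = await fetchDoc();
    if (DONE_STATUSES.has(doc.status)) return doc;
    await new Promise<void>((resolve) => setTimeout(resolve, pollDelayMs(attempt)));
  }
  throw new Error(`document not ready after ${maxAttempts} polls`);
}
```

A document that stays in needs_review keeps the loop polling until the attempt cap is hit, so callers that route such documents to a reviewer may want to add that status to the stop set.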
Response envelope
Every response wraps its payload in a data key. For a single document the shape is { data: {…fields} }; for collections it’s { data: [ … ], errors: [ … ] }. Extracted fields live under data.data.
- Status lifecycle: uploaded → classifying → extracted / needs_review → approved.
- Confidence: classification_confidence and extraction_confidence are both floats in [0, 1].
- Timestamps: all timestamps are ISO-8601 UTC strings or null.
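For typed clients, the envelope and lifecycle above could be modeled roughly as follows. This is an illustrative sketch, not a published schema: the type names are hypothetical, only fields shown in the examples are included, and which fields are optional or nullable is an assumption.

```typescript
// Status values taken from the lifecycle above.
type DocumentStatus =
  | 'uploaded'
  | 'classifying'
  | 'extracted'
  | 'needs_review'
  | 'approved';

// Only fields that appear in the examples; optionality is an assumption.
interface DocumentRecord {
  id: string;
  status: DocumentStatus;
  classification_confidence?: number | null;
  extraction_confidence?: number | null;
  data?: Record<string, unknown> | null; // extracted fields (data.data in a response)
}

interface SingleEnvelope {
  data: DocumentRecord;
}

interface CollectionEnvelope {
  data: DocumentRecord[];
  errors: unknown[]; // the shape of error entries is not documented here
}

// True once extracted fields can be read from doc.data.
// Whether needs_review documents also carry data is not stated, so they are excluded.
function hasExtractedFields(doc: DocumentRecord): boolean {
  return (doc.status === 'extracted' || doc.status === 'approved') && doc.data != null;
}

// Checks the documented [0, 1] range for a confidence value.
function isConfidence(value: unknown): value is number {
  return typeof value === 'number' && value >= 0 && value <= 1;
}
```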
File support
| MIME type | Extensions | Max size |
|---|---|---|
| application/pdf | .pdf | 30 MB |
| image/jpeg | .jpg, .jpeg | 30 MB |
| image/png | .png | 30 MB |
| image/tiff | .tif, .tiff | 30 MB |
Scanned, native, handwritten, and multi-page documents all run through the same pipeline. Oversized or unsupported files appear in the errors array without blocking the rest of the batch.
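Checking files against the table before upload saves a round trip. The sketch below is a client-side convenience only and does not replace server validation: checkUpload is a hypothetical helper, it infers the MIME type from the extension, and it reads "30 MB" as 30,000,000 bytes (the more conservative interpretation, since the docs don't say whether the limit is decimal or binary).

```typescript
// Assumption: "30 MB" is treated as decimal megabytes, the stricter reading.
const MAX_UPLOAD_BYTES = 30 * 1000 * 1000;

// Extension-to-MIME mapping taken from the File support table.
const MIME_BY_EXTENSION: Record<string, string> = {
  '.pdf': 'application/pdf',
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.png': 'image/png',
  '.tif': 'image/tiff',
  '.tiff': 'image/tiff',
};

interface UploadCheck {
  ok: boolean;
  mimeType?: string; // set when ok is true
  reason?: string; // set when ok is false
}

// Hypothetical pre-flight check based on file name and size.
function checkUpload(fileName: string, sizeBytes: number): UploadCheck {
  const dot = fileName.lastIndexOf('.');
  const extension = dot === -1 ? '' : fileName.slice(dot).toLowerCase();
  const mimeType = MIME_BY_EXTENSION[extension];
  if (mimeType === undefined) {
    return { ok: false, reason: `unsupported extension "${extension}"` };
  }
  if (sizeBytes > MAX_UPLOAD_BYTES) {
    return { ok: false, reason: 'file exceeds 30 MB' };
  }
  return { ok: true, mimeType };
}
```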
Webhooks
Register a target URL in Settings → Integrations. We POST a signed JSON payload on document.processed, with up to 3 retries on a backoff schedule (60s, 5m, 15m).
example payload
POST https://your-app.example.com/webhooks/docusift
Content-Type: application/json
X-DocuSift-Event: document.processed
X-DocuSift-Signature: sha256=5e3a…b9f1
{
"event": "document.processed",
"timestamp": "2026-04-23T12:04:11.203Z",
"data": {
"id": "doc_01HZ…",
"doc_type": "invoice",
"status": "approved",
"data": { /* same extracted fields */ }
}
}
Verify the signature
The X-DocuSift-Signature header is an HMAC-SHA256 of the raw request body using your webhook secret. Always verify in constant time.
node.js
import { createHmac, timingSafeEqual } from 'node:crypto';
export function verifyDocuSiftSignature(
rawBody: string,
header: string,
secret: string,
): boolean {
const expected = 'sha256=' +
createHmac('sha256', secret).update(rawBody).digest('hex');
const a = Buffer.from(header);
const b = Buffer.from(expected);
return a.length === b.length && timingSafeEqual(a, b);
}
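In a receiver, the check has to run against the raw body before any JSON parsing, because re-serialized JSON may not match the signed bytes. Here is a framework-agnostic sketch of that order of operations. handleWebhook and signBody are hypothetical names, and the response codes (401, 400, 200) are assumptions, since the docs do not state which responses trigger a retry.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Builds the header value described above: "sha256=" + hex HMAC of the raw body.
function signBody(rawBody: string, secret: string): string {
  return 'sha256=' + createHmac('sha256', secret).update(rawBody).digest('hex');
}

interface WebhookResult {
  status: number; // HTTP status the receiver would answer with (assumed values)
  documentId?: string; // set for a verified document.processed event
}

// Hypothetical handler: verify first, parse second.
function handleWebhook(
  rawBody: string,
  signatureHeader: string | undefined,
  secret: string,
): WebhookResult {
  const expected = Buffer.from(signBody(rawBody, secret));
  const received = Buffer.from(signatureHeader ?? '');
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  if (received.length !== expected.length || !timingSafeEqual(received, expected)) {
    return { status: 401 };
  }

  let payload: { event?: string; data?: { id?: string } };
  try {
    payload = JSON.parse(rawBody);
  } catch {
    return { status: 400 };
  }

  // Acknowledge events this sketch does not handle without acting on them.
  if (payload.event !== 'document.processed') return { status: 200 };
  return { status: 200, documentId: payload.data?.id };
}
```

With a web framework, make sure the route receives the unparsed body (for example a raw-body middleware) and pass that string in unchanged.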
Error codes
Errors respond with a { detail: string } body and a status code from this table.
| Status | Meaning | What to do |
|---|---|---|
| 400 | Malformed request body or missing file | Check multipart encoding; retry with a valid file. |
| 401 | Missing or invalid API key | Confirm X-API-Key. Keys revoke instantly. |
| 404 | Document does not belong to the caller’s tenant | Check the id and the key’s tenant. |
| 413 | File exceeds 30 MB | Downscale, split, or compress the source. |
| 415 | Unsupported MIME type | Upload one of pdf / jpeg / png / tiff. |
| 429 | Per-key rate limit hit | Respect Retry-After. Back off with jitter. |
| 5xx | Transient server error | Retry with exponential backoff; contact support if persistent. |
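The table maps naturally onto a small decision helper in client code. The sketch below is illustrative: actionForStatus and backoffWithJitterMs are hypothetical names, the action labels are invented for this example, and the 1s base and 30s cap are assumptions (the table asks for backoff with jitter but gives no numbers).

```typescript
// Invented labels summarizing the "What to do" column.
type ErrorAction =
  | 'fix_request' // 400, 404, 413, 415: retrying unchanged will not help
  | 'check_key' // 401
  | 'honor_retry_after' // 429
  | 'retry_with_backoff' // 5xx
  | 'none'; // not an error listed in the table

function actionForStatus(status: number): ErrorAction {
  if (status === 401) return 'check_key';
  if (status === 429) return 'honor_retry_after';
  if (status >= 500 && status <= 599) return 'retry_with_backoff';
  if (status === 400 || status === 404 || status === 413 || status === 415) {
    return 'fix_request';
  }
  return 'none';
}

// Exponential backoff with full jitter: a random delay between 0 and
// min(capMs, baseMs * 2^attempt). Base and cap values are assumptions.
function backoffWithJitterMs(
  attempt: number,
  baseMs = 1000,
  capMs = 30000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

Passing random as a parameter keeps the delay calculation deterministic in tests.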