All notable changes to DocuSift are recorded here. The format follows Keep a Changelog and the project adheres to Semantic Versioning.
1.0.0 — 2026-05-15
First public release of DocuSift — a multi-tenant document processing platform that classifies and extracts structured data from PDFs and images without per-vendor templates.
Document extraction
- Template-free extraction using multimodal LLMs (Anthropic, OpenAI, Google) — classifies the document type and returns structured JSON in a single pass.
- Accepts PDF, JPG, PNG, and TIFF up to 30 MB; scanned, native, handwritten, and multi-page documents all run through the same pipeline.
- Two confidence scores per document (classification + extraction) with a per-tenant auto-approve threshold — above the bar auto-approves, below it lands in the Review queue.
- Review queue with side-by-side PDF + editable fields, duplicate detection via content hash, and a processing-attempt timeline with exponential-backoff retries.
- Ingest sources: drag-and-drop upload, mobile camera capture, email intake with SPF/DKIM/DMARC enforcement, and the public REST API.
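
The auto-approve rule described above can be sketched as follows. This is a minimal illustration, not DocuSift's actual code: the type and function names are hypothetical, and it assumes that both scores must reach the tenant threshold and that a score exactly equal to the threshold counts as passing, neither of which this changelog specifies.

```typescript
// Hypothetical names throughout; the real schema is not shown in this changelog.
type Decision = "auto_approved" | "needs_review";

interface ConfidenceScores {
  classification: number; // confidence in the detected document type, 0..1
  extraction: number; // confidence in the extracted fields, 0..1
}

// Assumption: the weaker of the two scores decides the outcome, and a score
// equal to the threshold is treated as passing.
function routeDocument(scores: ConfidenceScores, tenantThreshold: number): Decision {
  const weakest = Math.min(scores.classification, scores.extraction);
  return weakest >= tenantThreshold ? "auto_approved" : "needs_review";
}
```

Taking the minimum means a confidently classified document with a shaky extraction still goes to the Review queue.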
Authentication & accounts
- Email + password signup with mandatory verification; password reset via time-limited tokens.
- Multi-factor authentication (TOTP) with recovery codes, lockout after repeated failures, and step-up challenges on sensitive actions.
- OIDC SSO with auto-provisioning and role mapping (per-tenant `SsoProvider`).
- Social login via Google and Microsoft built on the same OIDC primitives.
- Session management with JWT refresh tokens, device tracking, and per-session revocation.
- GDPR: one-click personal-data export (JSON) and self-service account deletion, both audited end-to-end.
- Email-verification enforcement on sensitive actions (invites, API keys, billing mutations, integration OAuth) with a dismissible nag banner carrying a rate-limited resend.
Billing & plans
- Multi-gateway checkout across Stripe, Razorpay, and Dodo Payments so tenants can use the gateway best suited to their region.
- Free, Pro, and Enterprise plans with per-plan feature flags and page quotas.
- Mid-cycle plan upgrades and downgrades with prorated usage accounting.
- Page-level usage metering feeding the admin analytics dashboard.
- Webhook-driven payment-event ingestion with idempotent processing.
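
One common way to make webhook ingestion idempotent is to key each event by its gateway-issued ID and skip IDs that were already handled. The sketch below illustrates that idea only: the event shape and class name are hypothetical, and the in-memory `Set` stands in for whatever durable store (for example a table with a unique constraint) a real deployment would use.

```typescript
// Hypothetical event shape; real gateway payloads carry many more fields.
interface PaymentEvent {
  id: string; // unique event ID issued by the payment gateway
  type: string;
}

class IdempotentProcessor {
  // In-memory stand-in for a durable store of processed event IDs.
  private readonly seen = new Set<string>();
  private readonly handler: (event: PaymentEvent) => void;

  constructor(handler: (event: PaymentEvent) => void) {
    this.handler = handler;
  }

  process(event: PaymentEvent): "processed" | "duplicate" {
    if (this.seen.has(event.id)) return "duplicate";
    this.handler(event);
    // Record the ID only after the handler succeeds, so a failed delivery
    // can be retried by the gateway.
    this.seen.add(event.id);
    return "processed";
  }
}
```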
Integrations
- QuickBooks Online OAuth + invoice/bill push with field mapping and sandbox validation.
- Xero OAuth + data push through the same routing-rule engine.
- AWS S3 storage with presigned URLs; tenants can optionally bring their own bucket with per-tenant credentials.
- Outbound webhooks with HMAC-SHA256 signatures, exponential-backoff retries, and a per-tenant sync-history drawer.
- Routing rules for conditional auto-sync (e.g. only push invoices over a threshold to QuickBooks).
- Trash / restore workflow for documents and integrations so destructive actions stay reversible.
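
For receivers of the signed outbound webhooks, verification could look like the sketch below. It assumes the signature is the hex-encoded HMAC-SHA256 of the raw request body under a shared secret; the changelog does not state the header name or the encoding, so treat those details, and the function names, as hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: signature = hex(HMAC-SHA256(secret, raw request body)).
function signPayload(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

function verifySignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = Buffer.from(signPayload(rawBody, secret), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws when lengths differ, so check the length first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Verify against the raw bytes as received, before any JSON parsing, since re-serializing the body can change whitespace or key order and invalidate the signature.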
Public API
- `POST /api/v1/extract` for multipart document upload and `GET /api/v1/extract/:id` for polling status + extracted payload; every response follows the `{ data: ... }` envelope.
- API-key auth with per-key rate limits and a rotation workflow.
- Webhook push on `document.processed` with signed payloads for event-driven integrations.
- OpenAPI schema + Swagger UI at `/docs` for integrators.
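
A client-side polling loop against the status endpoint might look like the sketch below. The endpoint path and the `{ data: ... }` envelope come from this changelog; everything else is an assumption made for illustration, including the `ExtractJob` fields, the `Authorization: Bearer` header, and the retry defaults. The schema served at `/docs` is the authoritative reference.

```typescript
// Envelope shape from the changelog: every response is { data: ... }.
interface Envelope<T> {
  data: T;
}

// Hypothetical payload fields; the changelog does not list them.
interface ExtractJob {
  id: string;
  status: string;
  fields?: Record<string, unknown>;
}

// Minimal fetch-like contract so the loop can be exercised without a network.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type FetchLike = (url: string, init?: { headers?: Record<string, string> }) => Promise<HttpResponse>;

function unwrapEnvelope<T>(body: unknown): T {
  if (typeof body !== "object" || body === null || !("data" in body)) {
    throw new Error("response is missing the { data: ... } envelope");
  }
  return (body as Envelope<T>).data;
}

async function pollExtract(
  fetchImpl: FetchLike,
  baseUrl: string,
  id: string,
  apiKey: string,
  isSettled: (job: ExtractJob) => boolean,
  maxAttempts = 20,
  delayMs = 1500,
): Promise<ExtractJob> {
  const url = `${baseUrl}/api/v1/extract/${encodeURIComponent(id)}`;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Assumed auth scheme; the real header name may differ.
    const res = await fetchImpl(url, { headers: { Authorization: `Bearer ${apiKey}` } });
    if (!res.ok) throw new Error(`poll failed with HTTP ${res.status}`);
    const job = unwrapEnvelope<ExtractJob>(await res.json());
    if (isSettled(job)) return job;
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`document ${id} did not settle after ${maxAttempts} attempts`);
}
```

The caller supplies `isSettled` because the set of terminal status values is not documented here. Integrations that can receive webhooks may prefer the `document.processed` push over polling.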
Admin & tenant management
- Multi-tenant isolation with `super_admin`, `tenant_admin`, and `tenant_user` roles.
- Admin growth dashboard: signup funnel, active tenants, plan distribution, page-count trend.
- Per-tenant audit log queryable by actor, table, and action.
- Tenant suspension with structured reason codes and an internal `TenantNote` stream for ops context.
- Invite / role / remove flows for team members with audit coverage.
Marketing & public site
- Landing page with animated hero, pricing, testimonials, and a developer section with copy-ready upload / poll / webhook code samples.
- Lead magnets — ROI calculator, free extraction audit, sandbox request, and template-free extraction guide PDF — with UTM attribution carried through to signup.
- Public pages: pricing, contact, terms, privacy, and this changelog, each with per-page SEO metadata and an auto-generated sitemap.
- Interim beta gate on public signup; scheduled to be removed on 2026-05-15.
UX & design system
- Light and dark themes driven entirely by CSS custom properties in `frontend/src/index.css`; `[data-theme='dark']` switches the whole surface.
- Token-first CSS: every component consumes `var(--*)`; token-check and style-check scripts enforce the rule in CI.
- i18n scaffolding (`react-i18next`) with English as the default locale and a drop-in path for additional locales.
- Reusable primitives: toast notifications, promise-based confirm dialogs, drawers, onboarding wizard, PDF viewer, markdown renderer.
- Mobile-first responsive layout tested at 320 / 768 / 1024 / 1440 viewports; WCAG 2.1 AA contrast and keyboard reachability.
Observability & compliance
- Structured Pino logs with request-ID correlation across the stack.
- Per-route rate limiting via `@fastify/rate-limit` with stricter limits on auth and API-key routes.
- Security headers via `@fastify/helmet`; CORS configured per origin.
- LLM call logging with token counts and cost attribution (`LlmCallLog`).
- Per-tenant data-retention policy (default 365 days) with a deletion scheduler.
- Consent tracking and privacy-by-design data model — secrets (password hashes, MFA keys, token hashes, OAuth tokens) never leave the server.
Developer experience
- TypeScript end-to-end across the Fastify backend and the React 19 / Vite 7 frontend.
- Prisma 6 with auto-generated migrations and a seed workflow for local bootstrapping.
- Vitest + Testing Library with axe-core accessibility assertions wired into component tests.
- ESLint + Prettier + token-check + style-check quality gates bound to the standard task loop.
- `pg-boss` background-job queue for async work (extraction, sync, email retries).