API reference

Stable v0.1. The /extract shape is locked; everything else may change before v1.0.

Auth

API consumers authenticate with bearer tokens. Mint a token on the dashboard. The plaintext is shown once; we store only a SHA-256 hash. Lost tokens cannot be recovered, only revoked and replaced.

Authorization: Bearer omk_your-token-here
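
The store-only-the-hash scheme can be sketched as follows. This is a minimal illustration, not the server's actual code: the `omk_` prefix comes from the header example, while the token length, the UTF-8 encoding before hashing, and the names `mint_token`, `verify_token` and `auth_header` are assumptions made for the example.

```python
import hashlib
import hmac
import secrets

TOKEN_PREFIX = "omk_"  # prefix seen in the header example; token length below is assumed


def mint_token() -> tuple[str, str]:
    """Return (plaintext, sha256_hex). Only the hex digest would be persisted."""
    plaintext = TOKEN_PREFIX + secrets.token_urlsafe(32)
    digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return plaintext, digest


def verify_token(presented: str, stored_digest: str) -> bool:
    """Hash the presented token and compare digests in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)


def auth_header(token: str) -> dict[str, str]:
    """Build the Authorization header a client sends with each request."""
    return {"Authorization": f"Bearer {token}"}
```

Because only the digest is kept, there is nothing to recover a lost plaintext from, which is why a lost token can only be revoked and replaced.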

Pricing

Credit type	What 1 credit buys	Free tier (signup)
LLM	$0.001 of LLM cost (token-priced via the cascade router)	5,000 ($5)
GPU	1 second of GPU wall time (mineru / paddle parses)	600 (10 min)

A typical 30-page slide-deck parse with chart-VLM costs ~3 LLM credits + ~30 GPU credits. A 64-page IC memo through the full pack pipeline costs ~780 LLM credits ($0.78). Top-ups land in v1 via Stripe.
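
The credit arithmetic above can be checked with a few lines of Python. The conversion rates come from the pricing table; the helper names are hypothetical, and `jobs_affordable` assumes every job costs the same number of credits.

```python
LLM_CREDITS_PER_USD = 1000   # pricing table: 1 LLM credit = $0.001 of LLM cost
GPU_SECONDS_PER_CREDIT = 1   # pricing table: 1 GPU credit = 1 second of GPU wall time


def llm_credits_to_usd(credits: int) -> float:
    """Dollar value of a number of LLM credits."""
    return credits / LLM_CREDITS_PER_USD


def gpu_credits_to_minutes(credits: int) -> float:
    """GPU wall time, in minutes, bought by a number of GPU credits."""
    return credits * GPU_SECONDS_PER_CREDIT / 60


def jobs_affordable(llm_balance: int, gpu_balance: int,
                    llm_per_job: int, gpu_per_job: int) -> int:
    """How many identical jobs fit in a balance; the scarcer credit type decides."""
    return min(llm_balance // llm_per_job, gpu_balance // gpu_per_job)


print(llm_credits_to_usd(780))            # 0.78, the IC memo figure
print(gpu_credits_to_minutes(600))        # 10.0, the free GPU tier
print(jobs_affordable(5000, 600, 3, 30))  # 20 slide-deck parses on the free tier
```

At the quoted typical costs, GPU credits run out first on the free tier: 600 GPU credits cover about 20 slide-deck parses, while 5,000 LLM credits would cover far more.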

Endpoints

POST /extract

Run an extraction pass on an uploaded PDF. The request body is multipart/form-data.

Field	Type	Default	Description
`file`	file	—	The PDF (required).
`parser`	string	`docling`	One of `docling`, `paddleocr_vl`, `mineru`, `mineru_remote`.
`enable_chart_vlm`	bool	`false`	Run the chart-VLM enhancer over figure regions.
`chart_vlm_model`	string	(profile default)	Override the VLM model — e.g. `openrouter/qwen/qwen3-vl-30b-a3b-instruct`.

Example — parse a PDF with chart-VLM on a GPU host:

curl -X POST https://api.omega-extract.com/extract \
  -H "Authorization: Bearer omk_..." \
  -F "file=@uber-q1.pdf" \
  -F "parser=mineru_remote" \
  -F "enable_chart_vlm=true"
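
A rough Python equivalent of the curl call, using only the standard library. It is a sketch: the helper name `build_extract_request` is hypothetical, and the hand-rolled multipart encoding assumes field names and the filename contain no quotes or line breaks. It builds the request without sending it.

```python
import urllib.request
import uuid

API_URL = "https://api.omega-extract.com/extract"  # host taken from the curl example


def build_extract_request(token: str, pdf_bytes: bytes, filename: str,
                          fields: dict[str, str]) -> urllib.request.Request:
    """Assemble a multipart/form-data POST for /extract (not sent here)."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields such as parser and enable_chart_vlm.
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    # The required file field carrying the PDF bytes.
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         "Content-Type: application/pdf\r\n\r\n").encode("utf-8")
        + pdf_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the JSON body. A third-party HTTP client with built-in multipart support would shorten this considerably.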

Response (200):

{
  "job_id": "9c1f…",
  "status": "done",
  "parser": "mineru_remote",
  "pages": 29,
  "blocks": 318,
  "tables": 12,
  "figures_described": 20,
  "llm_credits_charged": 4,
  "gpu_credits_charged": 75,
  "elapsed_parse_s": 75.3,
  "elapsed_vlm_s": 50.8,
  "result": { "pages": [ ... ] }
}
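
A client might load the response into a small typed structure. The sketch below covers a subset of the fields and infers their types from the example only. `ExtractSummary` and `parse_extract_response` are hypothetical names, and `result` is left as an untyped dict because its page schema is not documented here.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class ExtractSummary:
    job_id: str
    status: str
    parser: str
    pages: int
    llm_credits_charged: int
    gpu_credits_charged: int
    result: dict[str, Any]  # page-level schema is not specified in this reference


def parse_extract_response(raw: str) -> ExtractSummary:
    """Pick the billing and status fields out of a 200 response body."""
    data = json.loads(raw)
    return ExtractSummary(
        job_id=data["job_id"],
        status=data["status"],
        parser=data["parser"],
        pages=data["pages"],
        llm_credits_charged=data["llm_credits_charged"],
        gpu_credits_charged=data["gpu_credits_charged"],
        result=data["result"],
    )
```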

Errors:

Code	When
401	Missing or invalid bearer token.
402	Insufficient credits — top up.
400	Unknown parser, bad PDF, missing field.
502	Parse failed upstream (e.g. GPU unreachable).
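
One way a client could act on these codes is sketched below. The status-to-cause mapping comes from the table; the `NextStep` names are hypothetical, and treating 502 as retryable is an assumption, since this reference does not say whether a failed parse is charged or safe to resubmit.

```python
from enum import Enum


class NextStep(Enum):
    OK = "ok"
    FIX_TOKEN = "fix_token"      # 401: missing or invalid bearer token
    TOP_UP = "top_up"            # 402: insufficient credits
    FIX_REQUEST = "fix_request"  # 400: unknown parser, bad PDF, missing field
    RETRY_LATER = "retry_later"  # 502: upstream parse failure (retry is assumed safe)
    UNKNOWN = "unknown"          # any status this reference does not document


_DOCUMENTED = {
    200: NextStep.OK,
    400: NextStep.FIX_REQUEST,
    401: NextStep.FIX_TOKEN,
    402: NextStep.TOP_UP,
    502: NextStep.RETRY_LATER,
}


def next_step(status_code: int) -> NextStep:
    """Map an /extract HTTP status to a suggested client action."""
    return _DOCUMENTED.get(status_code, NextStep.UNKNOWN)
```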

GET /extract/jobs?limit=20

List your recent extraction jobs. The limit query parameter caps how many jobs are returned (20 in the path above).

GET /credits/api

Bearer-auth balance check. Returns {"llm": int, "gpu": int}.
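
A pre-flight balance check might look like the sketch below. It assumes both GET endpoints live on the same host as the curl example and that /extract/jobs also accepts the bearer token; `build_get` and `can_afford` are hypothetical helper names. The requests are built but not sent.

```python
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "https://api.omega-extract.com"  # host from the curl example; assumed shared


def build_get(path: str, token: str,
              params: Optional[dict] = None) -> urllib.request.Request:
    """Build a bearer-authenticated GET request for the given API path."""
    query = f"?{urllib.parse.urlencode(params)}" if params else ""
    return urllib.request.Request(
        f"{BASE_URL}{path}{query}",
        method="GET",
        headers={"Authorization": f"Bearer {token}"},
    )


def can_afford(balance: dict, llm_needed: int, gpu_needed: int) -> bool:
    """Check a {"llm": int, "gpu": int} balance against an estimated job cost."""
    return balance["llm"] >= llm_needed and balance["gpu"] >= gpu_needed


jobs_request = build_get("/extract/jobs", "omk_your-token-here", {"limit": 20})
balance_request = build_get("/credits/api", "omk_your-token-here")
```

Checking the balance before submitting a large document lets a client avoid a 402 response.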

Frontend (session-cookie auth)

The dashboard at /dashboard calls these endpoints, authenticated by the session cookie:

POST /auth/signup, POST /auth/login, POST /auth/logout, GET /auth/me
POST /tokens (returns plaintext once), GET /tokens, DELETE /tokens/{id}
GET /credits, GET /credits/history?limit=50

Production checklist

Before opening this API to the internet, replace these v0 stubs:

Auth provider. Email/password is convenient for pilots; for production, swap services/api/users.py for an OAuth integration (Auth0, WorkOS, Clerk).
Payments. Replace POST /credits/grant (admin-keyed) with a Stripe webhook that grants on payment_intent.succeeded.
Database. Set OMEGA_API_DATABASE_URL to a Postgres URL. The schema is created via SQLAlchemy on first boot for v0; v1 ships Alembic migrations.
Cookie security. Set OMEGA_API_SECRET_KEY to a random 256-bit value. Behind HTTPS, enable the cookie's Secure flag in users.py.
Rate limiting. The /extract endpoint has no per-token rate limit yet. That is fine for one or two pilot customers; put nginx limit_req in front of it before general availability.
Observability. Wire Langfuse callbacks into the cascade router (already supported via core/observability/langfuse_tracer.py) so per-request token costs are exact instead of heuristic-priced.
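
For the cookie-security item, a random 256-bit value can be generated with Python's `secrets` module. This is a sketch that assumes a hex-encoded key is acceptable; the reference does not state which encoding OMEGA_API_SECRET_KEY expects, and both helper names are hypothetical.

```python
import secrets


def generate_secret_key() -> str:
    """Return 32 random bytes (256 bits) as a 64-character hex string."""
    return secrets.token_hex(32)


def is_256_bit_hex(value: str) -> bool:
    """True only if value is exactly 64 hex characters, i.e. 32 bytes."""
    if len(value) != 64:
        return False
    try:
        return len(bytes.fromhex(value)) == 32
    except ValueError:
        return False


if __name__ == "__main__":
    # Print a fresh key to paste into the deployment's environment.
    print(generate_secret_key())
```

A check like `is_256_bit_hex` at startup would catch a placeholder or truncated key before any session cookie is signed with it.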