How DocuMind works — AI PDF chat, RAG & API

Personal Workspace

You sign in with your email (one-time code). Your account is a personal workspace: you create chats, upload PDFs, and ask questions in natural language.

Documents

Global documents — PDFs you add under “Global Documents” are available across your account. Good for policies, manuals, or reference material you want in every chat context.
Per-chat documents — When you upload while a chat is open, files can be tied to that chat so answers stay scoped to that conversation’s knowledge.
Uploads are PDF only right now. Text is extracted (including OCR on sparse pages when needed) and embedded for search.

Chat

Each chat keeps its own message history. The model uses retrieved chunks from the documents that apply to that session (global and/or chat-scoped, depending on what you uploaded and how keys are scoped—see API below).

Company mode

If your organization uses company logins, DocuMind can attach users to a shared company library based on email domain (e.g. you@acme.com).

HR uploads — Typically only an HR-style address (e.g. hr@yourcompany.com) can upload company PDFs. Those become the knowledge base for employees at the same company.
Employees — Can read and chat against those documents; they do not replace HR uploads unless your deployment allows it.
Visibility — Admins may control whether document counts are visible to all employees (see the sidebar options when you are logged in as HR).
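To make the domain-based grouping above concrete, here is a small Python sketch. It is not DocuMind's implementation: the function name `company_scope`, the set of HR-style local parts, and the returned fields are all assumptions chosen for illustration, and a real deployment may decide upload rights differently.

```python
HR_LOCAL_PARTS = {"hr"}  # assumption: only "hr@<domain>" counts as HR-style


def company_scope(email):
    """Split an email into a company domain and a hypothetical upload flag."""
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {email!r}")
    # The domain groups users into one company; the local part decides uploads.
    return {"company": domain, "can_upload": local in HR_LOCAL_PARTS}
```

Under these assumptions, `company_scope("you@acme.com")` and `company_scope("hr@acme.com")` share the same `company` value, and only the second has `can_upload` set.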

API keys are available only for personal accounts, not for company-type users. For company libraries, use the web UI, or route requests through your own backend that calls DocuMind with appropriate auth, if your deployment exposes that internally.

HTTP API

Call POST /api/v1/chat for a full JSON reply, or POST /api/v1/chat/stream for SSE token streaming. Send your user-created API key in the header Authorization: Bearer <api_key>.

Create keys (personal only)

In the sidebar, use API keys (global documents) for a key that only sees global uploads.
For a key scoped to one chat, create it from the chat flow in the app (scope chat + chat name) so retrieval uses that chat’s document scope.

Request body

JSON object: message (string, required) and optional history — an array of {"role":"user"|"assistant","content":"..."} (up to ~120 turns). Responses are compact for API use.
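As a rough illustration of the body described above, the following Python sketch assembles the JSON payload and trims history to the most recent entries. The helper name `build_chat_body` and the hard cap of 120 are assumptions made for this example; the text only says "up to ~120 turns", so the exact limit and how the server counts a turn may differ.

```python
import json

MAX_HISTORY = 120  # assumption: the docs only say "up to ~120 turns"
ALLOWED_ROLES = {"user", "assistant"}


def build_chat_body(message, history=None, max_history=MAX_HISTORY):
    """Return a JSON string for POST /api/v1/chat or /api/v1/chat/stream."""
    if not isinstance(message, str) or not message.strip():
        raise ValueError("message must be a non-empty string")
    cleaned = []
    for turn in history or []:
        role, content = turn.get("role"), turn.get("content")
        if role not in ALLOWED_ROLES or not isinstance(content, str):
            raise ValueError(f"invalid history entry: {turn!r}")
        cleaned.append({"role": role, "content": content})
    # Keep only the most recent entries so the request stays under the cap.
    start = max(0, len(cleaned) - max_history)
    return json.dumps({"message": message, "history": cleaned[start:]})
```

Validating roles on the client side is a design choice for this sketch; it surfaces malformed history before the request is sent rather than relying on a server-side error.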

Non-streaming: `POST /api/v1/chat`

# Replace YOUR_BASE_URL and YOUR_API_KEY. Response: {"reply":"..."} or {"error":"..."}.
$ curl -sS -X POST "YOUR_BASE_URL/api/v1/chat" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message":"Summarize my uploaded policy.","history":[]}'

# PowerShell: same endpoint; the single-quoted string keeps the JSON double quotes intact.
PS> $body = '{"message":"Summarize my uploaded policy.","history":[]}'
PS> Invoke-RestMethod -Uri "YOUR_BASE_URL/api/v1/chat" -Method Post `
  -Headers @{ Authorization = "Bearer YOUR_API_KEY" } -Body $body -ContentType "application/json"

// Browser or Node 18+ (fetch). Use your real origin in production.
const BASE = "YOUR_BASE_URL";
const key = "YOUR_API_KEY";
const res = await fetch(BASE + "/api/v1/chat", {
  method: "POST",
  headers: {
    "Authorization": "Bearer " + key,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    message: "Summarize my uploaded policy.",
    history: [{ role: "user", content: "Hi" }, { role: "assistant", content: "Hello!" }],
  }),
});
const data = await res.json();
console.log(data.reply);

# pip install requests
import requests
r = requests.post(
    "YOUR_BASE_URL/api/v1/chat",
    headers={"Authorization": "Bearer YOUR_API_KEY", "Content-Type": "application/json"},
    json={"message": "Summarize my uploaded policy.", "history": []},
)
print(r.json())
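The non-streaming response is described above as either {"reply":"..."} or {"error":"..."}. A minimal sketch of unpacking that payload in Python follows; `ChatApiError` and `extract_reply` are hypothetical names for this example, and treating `error` as a printable message is an assumption the text does not confirm.

```python
class ChatApiError(Exception):
    """Raised when the API payload carries an error instead of a reply."""


def extract_reply(payload):
    """Return the reply text from a parsed /api/v1/chat response body."""
    if not isinstance(payload, dict):
        raise ChatApiError(f"unexpected response type: {type(payload).__name__}")
    if "error" in payload:
        # Surface the server-reported error rather than returning None.
        raise ChatApiError(str(payload["error"]))
    if "reply" not in payload:
        raise ChatApiError("response has neither 'reply' nor 'error'")
    return payload["reply"]
```

With the `requests` example above, this would be called as `extract_reply(r.json())`.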

Streaming: `POST /api/v1/chat/stream`

Same body and Authorization header. The response is text/event-stream. Each SSE data line is JSON: deltas {"t":"d","c":"..."}, then {"t":"done"}, or errors {"t":"e","m":"...","code":...} (still HTTP 200 in many cases—parse the event type).

# Raw SSE on stdout; use -N so curl disables buffering.
$ curl -N -X POST "YOUR_BASE_URL/api/v1/chat/stream" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{"message":"Say hello in one sentence.","history":[]}'

# Windows Command Prompt (cmd.exe): curl.exe streams SSE (ships with Windows 10+). ^ continues the line.
C:\> curl.exe -N -X POST "YOUR_BASE_URL/api/v1/chat/stream" ^
  -H "Authorization: Bearer YOUR_API_KEY" ^
  -H "Content-Type: application/json" ^
  -d "{\"message\":\"Say hello in one sentence.\",\"history\":[]}"

// Fetch + ReadableStream: parse lines starting with "data: ".
const BASE = "YOUR_BASE_URL";
const res = await fetch(BASE + "/api/v1/chat/stream", {
  method: "POST",
  headers: {
    Authorization: "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
    Accept: "text/event-stream",
  },
  body: JSON.stringify({ message: "Say hello.", history: [] }),
});
const reader = res.body.getReader();
const dec = new TextDecoder();
let buf = "", out = "";
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  buf += dec.decode(value, { stream: true });
  const lines = buf.split("\n");
  buf = lines.pop() || "";
  for (const line of lines) {
    const m = line.match(/^data:\s*(.+)/);
    if (!m) continue;
    const j = JSON.parse(m[1]);
    if (j.t === "d" && j.c) out += j.c;
  }
}
console.log(out);

# pip install requests
import json, requests
with requests.post(
    "YOUR_BASE_URL/api/v1/chat/stream",
    headers={"Authorization": "Bearer YOUR_API_KEY", "Content-Type": "application/json"},
    json={"message": "Say hello.", "history": []},
    stream=True,
) as r:
    for line in r.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data:"):
            continue
        j = json.loads(line[5:].lstrip())
        if j.get("t") == "d" and j.get("c"):
            print(j["c"], end="", flush=True)
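The loop above prints deltas but ignores the done and error events. A sketch that handles all three event types from the stream format described earlier might look like this; `StreamError` and `collect_stream` are hypothetical names, and it assumes each event arrives as a single `data:` line, as in the examples on this page.

```python
import json


class StreamError(Exception):
    """Raised when the stream carries an error event ({"t":"e",...})."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.code = code


def collect_stream(lines):
    """Join delta text from an iterable of SSE lines until {"t":"done"}."""
    parts = []
    for line in lines:
        if not line or not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        event = json.loads(line[5:].strip())
        kind = event.get("t")
        if kind == "d":
            parts.append(event.get("c") or "")
        elif kind == "done":
            break
        elif kind == "e":
            # Errors can arrive with HTTP 200, so they must be raised here.
            raise StreamError(event.get("m", "stream error"), event.get("code"))
    return "".join(parts)
```

Passing `r.iter_lines(decode_unicode=True)` from the `requests` example would feed it live data; because it accepts any iterable of strings, it can also be exercised with canned lines.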

CORS: If you call the API from a browser on another domain, your DocuMind server must allow that origin (FastAPI middleware / reverse proxy). From a server-side app, CORS does not apply.
