DAN AI — FAQ and Troubleshooting

This page collects the questions we get most often, plus the symptoms that show up most often in production and how to fix them. For deeper coverage of a specific topic, follow the links into the rest of the documentation.

Tips and best practices

Pick the most specific document type you can

The invoice type will outperform general on the same file because the invoice prompt teaches the AI about line items, tax columns and totals. The same goes for bank_statement vs general on a statement, or medical_report vs general on a lab report. See the full list at Supported Document Types.

Contrast matters more than resolution

For scanned documents, a clean 200-DPI black-on-white scan usually beats a 400-DPI scan with a faded background. Re-scan rather than upscaling a low-contrast original.

Edit once, mark verified

If a field is consistently wrong on a particular vendor's documents, edit it once and click Mark Verified. The verified values are what flow into exports and webhook payloads — the raw AI response is still preserved in document_extractions.raw_response for audit.

Iterate prompts against the same document

When iterating on a custom prompt, re-extract the same document several times instead of re-uploading it. This saves time and gives you a clean run-to-run comparison. See Custom Fields and Templates.

Use the Activity log to spot recurring problems

Developer → Activity shows recent API calls with status codes and error messages. If you're seeing the same 4xx error on every call, it's almost always a client bug. The same 5xx error on every call for one doc_type usually means a provider issue; check Settings → AI Providers.

Frequently asked questions

Which file formats does DAN AI accept?

PDFs, PNGs and JPGs. A multi-page PDF is processed as a single document. PDFs containing scanned images are supported — DAN AI runs OCR transparently.

Is there a file-size limit?

Yes — a per-file size limit applies on every upload path. Very large multi-page PDFs may also be truncated by the AI provider's context window even if they're under the file limit. If you receive empty or partial fields on a long document, try splitting it.

Does DAN AI store my files?

Uploaded files are stored encrypted at rest so that the document detail page can preview them later. You can delete a document (and its underlying file) any time from the dashboard or via the API.

Are OAuth refresh tokens safe?

Yes. Refresh tokens from Gmail/Outlook OAuth are encrypted at rest using AES-256-GCM. Tokens are decrypted only in memory, and only while a sync runs.

Which AI providers does DAN AI use?

The default provider chain is Ollama → OpenRouter → Groq. Ollama is the primary provider; if it fails or times out, DAN AI automatically falls back to the next provider in the chain. Admins can re-order or disable providers under Settings → AI Providers.
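
The fallback behaviour described above follows a common pattern, sketched below in Python. This illustrates the general idea only and is not DAN AI's actual implementation; the function extract_with_fallback and the stand-in provider callables are hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical provider type: takes document bytes, returns extracted fields.
Provider = Callable[[bytes], dict]


def extract_with_fallback(
    document: bytes,
    providers: Sequence[tuple[str, Provider]],
) -> tuple[str, dict]:
    """Try each provider in order and return the first successful result."""
    errors = []
    for name, provider in providers:
        try:
            return name, provider(document)
        except Exception as exc:  # a failure or timeout moves on to the next provider
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def failing(_doc: bytes) -> dict:
    # Stand-in for a provider that times out.
    raise TimeoutError("timed out")


def working(_doc: bytes) -> dict:
    # Stand-in for a provider that answers; the field is made-up sample data.
    return {"total": "42.00"}


# The primary provider times out, so the second provider in the chain answers.
used, fields = extract_with_fallback(
    b"%PDF-", [("ollama", failing), ("openrouter", working), ("groq", working)]
)
print(used, fields)
```

If every provider in the list fails, the sketch raises an error that lists each failure in order.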

How long does a typical extraction take?

| File | Typical time |
| --- | --- |
| Single-page invoice / receipt | 2–6 s |
| Multi-page bank statement | 8–25 s |
| Long contract (10+ pages) | 15–60 s |
| Failed and retried via fallback provider | up to 90 s |

Do webhook deliveries come in order?

No — a fast extraction triggered after a slow one can land first. Use the timestamp field on the payload if you need ordering. See Webhooks.
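
If you buffer deliveries before processing them, sorting on the timestamp field restores the original order. The sketch below assumes the timestamp is an ISO 8601 string, which this page does not specify; the document_id field and the sample values are hypothetical.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    # Assumes ISO 8601; a trailing "Z" is rewritten so older Python versions can parse it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def in_event_order(payloads: list[dict]) -> list[dict]:
    """Return the payloads sorted by their timestamp field, oldest first."""
    return sorted(payloads, key=lambda p: parse_ts(p["timestamp"]))


# Arrival order: the fast extraction (doc_2) landed before the slow one (doc_1).
received = [
    {"document_id": "doc_2", "timestamp": "2024-05-01T10:00:05Z"},
    {"document_id": "doc_1", "timestamp": "2024-05-01T10:00:01Z"},
]
ordered = in_event_order(received)
print([p["document_id"] for p in ordered])
```

If the real timestamp turns out to be a unix number instead, only parse_ts needs to change.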

Can I export multiple documents at once?

Yes. From the Documents list, tick the documents you want and pick Export → JSON (zip) or Export → Excel (combined). Programmatically, iterate the documents list endpoint and pull each one individually.
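
The programmatic route can be sketched as a loop over list pages. The sketch deliberately leaves out real endpoint paths: list_page and get_document are hypothetical callables you would implement against the API reference. The cursor-style pagination and the items, id and next_cursor keys are assumptions, not documented behaviour.

```python
from typing import Callable, Iterator, Optional


def iter_documents(
    list_page: Callable[[Optional[str]], dict],
    get_document: Callable[[str], dict],
) -> Iterator[dict]:
    """Walk every page of the documents list and yield each full document."""
    cursor: Optional[str] = None
    while True:
        page = list_page(cursor)
        for item in page["items"]:
            yield get_document(item["id"])
        cursor = page.get("next_cursor")  # assumed pagination field
        if not cursor:
            break


# Stand-in data so the sketch runs without network access.
PAGES = {
    None: {"items": [{"id": "a"}, {"id": "b"}], "next_cursor": "p2"},
    "p2": {"items": [{"id": "c"}], "next_cursor": None},
}
docs = list(iter_documents(lambda c: PAGES[c], lambda i: {"id": i, "fields": {}}))
print([d["id"] for d in docs])
```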

Can I run DAN AI from my own infrastructure?

DAN AI is a hosted SaaS at dan.sdlccorp.com. For on-premise or VPC-deployed installations, contact SDLC Corp directly.

Troubleshooting

Stuck in processing for more than 5 minutes

Likely cause: the AI provider chain couldn't recover.

Fix: Open Settings → AI Providers and check each provider's status. Retry the extraction from the document detail page. If retries also stick, an admin may need to refresh provider API keys. The provider chain auto-repairs once a working provider is found again.

Confidence below 50 on every upload

Likely cause: scan quality, or the wrong doc_type.

Fix: Re-scan at higher contrast (not necessarily higher DPI). If that doesn't help, switch the document type to general for one run to see what the AI actually extracts — that often reveals what's going wrong.

Inbox sync not picking up new mail

Possible causes:

  • The OAuth refresh token expired or was revoked → reconnect the mailbox from Inboxes.
  • A rule's regex is invalid → open Inboxes → Manage Rules and dry-run the ruleset.
  • The sender / subject doesn't actually match any rule → dry-run with a sample to confirm.
  • The provider rate-limited DAN AI → wait for the next interval, no action needed.
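
For the invalid-regex case above, you can sanity-check a pattern locally before saving the rule. This sketch assumes the rule syntax is compatible with Python's re module, which this page does not state, so treat the dry-run in Manage Rules as the authoritative check. The helper check_rule and the sample subject are hypothetical.

```python
import re


def check_rule(pattern: str, sample: str) -> str:
    """Report whether a pattern compiles and whether it matches a sample subject."""
    try:
        compiled = re.compile(pattern)
    except re.error as exc:
        return f"invalid regex: {exc}"
    return "match" if compiled.search(sample) else "no match"


subject = "Your invoice #1042 is ready"
print(check_rule(r"invoice\s+#\d+", subject))  # compiles and matches the sample
print(check_rule(r"invoice (", subject))       # unbalanced parenthesis does not compile
print(check_rule(r"^receipt", subject))        # compiles but does not match
```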

401 Unauthorized on the API

Cause: missing, wrong, or revoked API key.

Fix: Generate a new key from Developer → API Keys. Confirm you're sending exactly one auth header: either X-API-Key: … or Authorization: Bearer …, not both.

413 Payload Too Large on the API

Cause: the file exceeds DAN AI's per-file size limit.

Fix: Split a multi-page PDF, or re-render the file at a lower DPI. For very long bank statements, splitting by month also improves extraction quality.

429 Too Many Requests on the API

Cause: API-key rate limit hit.

Fix: Back off until the X-RateLimit-Reset timestamp (unix seconds) in the response headers. Spread bursty workloads using async=true to queue extractions instead of running them synchronously.
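
The back-off delay can be computed straight from the header. This is a minimal sketch: the helper seconds_until_reset is hypothetical, and it assumes you pass in the response headers as a mapping with the exact key X-RateLimit-Reset (many HTTP clients provide a case-insensitive mapping, but a plain dict does not).

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before retrying, based on X-RateLimit-Reset (unix seconds)."""
    now = time.time() if now is None else now
    reset_at = float(headers["X-RateLimit-Reset"])
    # Never return a negative delay if the reset time has already passed.
    return max(0.0, reset_at - now)


# Example with a fixed clock: the window resets 12 seconds from "now".
delay = seconds_until_reset({"X-RateLimit-Reset": "1700000012"}, now=1700000000.0)
print(delay)  # 12.0
```

In a real client you would call time.sleep(delay) and then retry the request.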

Excel export missing line items

Cause: the document type has no line items (resumes, contracts, ID cards, shipping labels, etc.).

Fix: This is expected — the Line Items sheet is omitted entirely for non-tabular types. The Fields sheet still contains everything.

Webhook deliveries failing with 408 / timeout

Cause: your endpoint took longer than 15 seconds to respond.

Fix: Acknowledge the delivery with a 2xx first, then do the heavy work in a background job. DAN AI doesn't wait for your processing — it only cares that the delivery was received.
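
The acknowledge-first pattern looks roughly like this, independent of any web framework. The names handle_webhook and worker are hypothetical, and the in-memory queue is only for illustration; a durable job queue is usually a better fit in production because queued work here is lost on restart.

```python
import queue
import threading

jobs: "queue.Queue[dict]" = queue.Queue()
processed: list[dict] = []


def handle_webhook(payload: dict) -> int:
    """Called by the HTTP layer: enqueue the payload and acknowledge right away."""
    jobs.put(payload)
    return 200  # status code to send back; no heavy work happens before this point


def worker() -> None:
    """Background loop that does the slow processing outside the request cycle."""
    while True:
        payload = jobs.get()
        try:
            processed.append(payload)  # placeholder for the real, slow work
        finally:
            jobs.task_done()


threading.Thread(target=worker, daemon=True).start()

status = handle_webhook({"document_id": "doc_1"})  # hypothetical payload
jobs.join()  # only for this demo: wait until the worker has drained the queue
print(status, len(processed))
```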

Webhook signature verification fails

Cause: the body was parsed and re-serialized before the HMAC was computed, which changes whitespace, so the bytes you hash no longer match the bytes DAN AI signed.

Fix: Capture the raw request body as bytes before any JSON parsing. In Express, that's bodyParser.json({ verify: (req, res, buf) => { req.rawBody = buf; } }). In other frameworks, look for an equivalent "raw body" hook. See Webhooks for a Node.js example.
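
The same idea in Python, as a framework-neutral sketch. It assumes the signature is a hex-encoded HMAC-SHA256 of the raw body, which this page does not confirm; check Webhooks for the actual algorithm, encoding and header name. The helper verify_signature, the secret and the payload are all placeholders.

```python
import hashlib
import hmac
import json


def verify_signature(raw_body: bytes, received_signature: str, secret: str) -> bool:
    """Compare the received signature with an HMAC computed over the raw bytes."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode("utf-8"), received_signature.encode("utf-8"))


secret = "example-secret"  # placeholder secret for the demo
raw = b'{"document_id": "doc_1",  "status": "done"}'  # note the double space
signature = hmac.new(secret.encode("utf-8"), raw, hashlib.sha256).hexdigest()

ok_raw = verify_signature(raw, signature, secret)

# Parsing and re-serializing changes the bytes, so the same signature no longer matches.
reserialized = json.dumps(json.loads(raw)).encode("utf-8")
ok_reserialized = verify_signature(reserialized, signature, secret)

print(ok_raw, ok_reserialized)
```

The second check fails even though both bodies decode to the same JSON object, which is exactly the symptom described above.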

Still stuck?