feat: import DOCX/PDF/scanned templates via DeepSeek recognition
Backend pipeline:
- POST /api/templates/import (multipart, max 25 MB)
- extract.ts: DOCX→mammoth, PDF→pdf-parse, fallback to OCR via tesseract+poppler-utils
(pdftoppm renders pages to PNG, tesseract reads with rus+eng)
- deepseek.ts: chat completions client with strict JSON response_format
- recognize.ts: structured prompt that produces simplified DocBody (string text),
postprocessor wraps text in TipTap-compatible JSON, validates with zod schema
- prompt enforces placeholder substitution: {{customer.*}}, {{executor.*}},
{{contract.number}}, {{contract.date}}, {{today}}
- error codes: NO_OCR / NO_DEEPSEEK_KEY / UNSUPPORTED_MIME / INVALID_DOC_BODY
Dockerfile: apk add tesseract-ocr (+rus +eng data), poppler-utils, imagemagick
Frontend:
- Templates page: ⤴ Загрузить документ → file picker (.docx,.pdf,.png,.jpg)
- doc type selector (contract/invoice/act/upd)
- import-banner with spinner shows uploading→analyzing stages
- on success navigates to /templates/:id (TemplateEdit) for review
Reuses DEEPSEEK_API_KEY pattern from Hall-planer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -13,7 +13,12 @@ RUN apk add --no-cache \
|
||||
ca-certificates \
|
||||
ttf-dejavu \
|
||||
ttf-liberation \
|
||||
font-noto-cjk
|
||||
font-noto-cjk \
|
||||
tesseract-ocr \
|
||||
tesseract-ocr-data-rus \
|
||||
tesseract-ocr-data-eng \
|
||||
poppler-utils \
|
||||
imagemagick
|
||||
|
||||
ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium-browser
|
||||
ENV PUPPETEER_SKIP_DOWNLOAD=true
|
||||
|
||||
@@ -42,6 +42,9 @@ services:
|
||||
TOCHKA_JWT_KEY: ${TOCHKA_JWT_KEY:-}
|
||||
TOCHKA_WEBHOOK_SECRET: ${TOCHKA_WEBHOOK_SECRET:-}
|
||||
DADATA_API_KEY: ${DADATA_API_KEY:-}
|
||||
DEEPSEEK_API_KEY: ${DEEPSEEK_API_KEY:-}
|
||||
DEEPSEEK_BASE_URL: ${DEEPSEEK_BASE_URL:-https://api.deepseek.com}
|
||||
DEEPSEEK_MODEL: ${DEEPSEEK_MODEL:-deepseek-chat}
|
||||
DEFAULT_ORGANIZATION_ID: ${DEFAULT_ORGANIZATION_ID:-00000000-0000-0000-0000-000000000001}
|
||||
DEV_BYPASS_AUTH: "0"
|
||||
expose:
|
||||
|
||||
Reference in New Issue
Block a user