doc-manager/docker/Dockerfile.api
admin e768d30fb6 feat: import DOCX/PDF/scanned templates via DeepSeek recognition
Backend pipeline:
- POST /api/templates/import (multipart, max 25 MB)
- extract.ts: DOCX→mammoth, PDF→pdf-parse, fallback to OCR via tesseract+poppler-utils
  (pdftoppm renders pages to PNG, tesseract reads with rus+eng)
- deepseek.ts: chat completions client with strict JSON response_format
- recognize.ts: structured prompt that produces simplified DocBody (string text),
  postprocessor wraps text in TipTap-compatible JSON, validates with zod schema
- prompt enforces placeholder substitution: {{customer.*}}, {{executor.*}},
  {{contract.number}}, {{contract.date}}, {{today}}
- error codes: NO_OCR / NO_DEEPSEEK_KEY / UNSUPPORTED_MIME / INVALID_DOC_BODY
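
The recognize.ts postprocessing described above might look roughly like the following sketch. It is not the code from this commit: `SimpleDocBody`, `toTipTapDoc`, and `findUnknownPlaceholders` are hypothetical names, the simplified body is assumed to be a list of plain-text paragraphs, and the zod validation is replaced by a hand-written check so the snippet has no dependencies.

```typescript
// Hypothetical simplified DocBody: plain strings only, one per paragraph
// (an assumption for illustration, not the schema from the commit).
interface SimpleDocBody {
  paragraphs: string[];
}

// Minimal subset of the TipTap/ProseMirror JSON document shape.
interface TipTapNode {
  type: string;
  content?: TipTapNode[];
  text?: string;
}

// Placeholder families named in the commit message.
const ALLOWED_PLACEHOLDERS: RegExp[] = [
  /^customer\.[\w.]+$/,
  /^executor\.[\w.]+$/,
  /^contract\.(number|date)$/,
  /^today$/,
];

// Returns every {{name}} in the text that is outside the allowed families.
function findUnknownPlaceholders(text: string): string[] {
  const re = /\{\{\s*([A-Za-z0-9_.]+)\s*\}\}/g; // fresh regex, so lastIndex starts at 0
  const unknown: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    const name = m[1];
    if (!ALLOWED_PLACEHOLDERS.some((allowed) => allowed.test(name))) {
      unknown.push(name);
    }
  }
  return unknown;
}

// Wraps each plain-text paragraph in a TipTap paragraph node.
function toTipTapDoc(body: SimpleDocBody): TipTapNode {
  const content = body.paragraphs.map((p): TipTapNode =>
    // ProseMirror rejects empty text nodes, so a blank string becomes an empty paragraph.
    p.length > 0
      ? { type: "paragraph", content: [{ type: "text", text: p }] }
      : { type: "paragraph" }
  );
  return { type: "doc", content };
}
```

In a pipeline like this one, a non-empty result from `findUnknownPlaceholders` would be a natural trigger for the INVALID_DOC_BODY error, though how the real code maps failures to error codes is not shown in the commit message.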

Dockerfile: apk add tesseract-ocr (+rus +eng data), poppler-utils, imagemagick
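
For orientation, the OCR fallback that these packages enable can be sketched as two command invocations per scanned PDF. The helper names, the 300 dpi resolution, and the file paths are assumptions for illustration; only the tools and the rus+eng language pair come from the commit message.

```typescript
// Hypothetical helpers: build argv arrays for the two CLI tools so they can be
// handed to child_process.execFile without going through a shell.
interface Command {
  bin: string;
  args: string[];
}

// pdftoppm writes <outPrefix>-<page>.png per page (the page number may be zero-padded).
function renderPagesCommand(pdfPath: string, outPrefix: string, dpi: number = 300): Command {
  return { bin: "pdftoppm", args: ["-png", "-r", String(dpi), pdfPath, outPrefix] };
}

// Using "stdout" as the output base makes tesseract print the recognized text.
function ocrPageCommand(pngPath: string, langs: string[] = ["rus", "eng"]): Command {
  return { bin: "tesseract", args: [pngPath, "stdout", "-l", langs.join("+")] };
}
```

Each `Command` could then be run with Node's `execFile(cmd.bin, cmd.args, ...)`; a missing binary at that point is the kind of condition the NO_OCR error code presumably covers.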

Frontend:
- Templates page: ⤴ Upload document → file picker (.docx, .pdf, .png, .jpg)
- doc type selector (contract/invoice/act/upd)
- import-banner with spinner shows uploading→analyzing stages
- on success navigates to /templates/:id (TemplateEdit) for review
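
A client-side pre-check for the flow above could be sketched as follows. This is an illustration rather than the shipped component: `checkImportFile`, `bannerText`, and the stage names beyond uploading and analyzing are hypothetical, and the 25 MB limit is assumed to mean 25 × 1024 × 1024 bytes.

```typescript
// Assumed to mirror the backend limit of 25 MB on POST /api/templates/import.
const MAX_IMPORT_BYTES = 25 * 1024 * 1024;
// Extensions offered by the file picker, per the commit message.
const ACCEPTED_EXTENSIONS = [".docx", ".pdf", ".png", ".jpg"];

type ImportStage = "idle" | "uploading" | "analyzing" | "done" | "error";

// Returns a human-readable problem, or null when the file can be uploaded.
function checkImportFile(name: string, size: number): string | null {
  const dot = name.lastIndexOf(".");
  const ext = dot >= 0 ? name.slice(dot).toLowerCase() : "";
  if (ACCEPTED_EXTENSIONS.indexOf(ext) === -1) {
    return "Unsupported file type: " + (ext || "(none)");
  }
  if (size > MAX_IMPORT_BYTES) {
    return "File is larger than 25 MB";
  }
  return null;
}

// Text for the import banner; the spinner would be shown only for in-progress stages.
function bannerText(stage: ImportStage): string {
  switch (stage) {
    case "uploading":
      return "Uploading document...";
    case "analyzing":
      return "Analyzing document...";
    default:
      return "";
  }
}
```

Rejecting a file before upload only saves a round trip; the backend still has to enforce the size limit and the UNSUPPORTED_MIME check on its own.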

Reuses DEEPSEEK_API_KEY pattern from Hall-planer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 11:40:28 +03:00


FROM node:20-alpine
WORKDIR /app
# Chromium for Puppeteer (PDF rendering) + fonts for Cyrillic.
# nss/freetype/harfbuzz are needed by chromium itself for rendering; the ttf-* packages provide the text fonts.
RUN apk add --no-cache \
    openssl \
    tini \
    chromium \
    nss \
    freetype \
    harfbuzz \
    ca-certificates \
    ttf-dejavu \
    ttf-liberation \
    font-noto-cjk \
    tesseract-ocr \
    tesseract-ocr-data-rus \
    tesseract-ocr-data-eng \
    poppler-utils \
    imagemagick
ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium-browser
ENV PUPPETEER_SKIP_DOWNLOAD=true
COPY package.json package-lock.json* tsconfig.base.json ./
COPY apps/api/package.json apps/api/
COPY packages/shared/package.json packages/shared/
RUN npm install --include=dev
COPY apps/api ./apps/api
COPY packages/shared ./packages/shared
RUN cd apps/api && npx prisma generate
ENV NODE_ENV=production
WORKDIR /app/apps/api
EXPOSE 3030
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["sh", "-c", "npx prisma migrate deploy && npx tsx src/server.ts"]