e768d30fb6
Backend pipeline:
- POST /api/templates/import (multipart, max 25 MB)
- extract.ts: DOCX→mammoth, PDF→pdf-parse, fallback to OCR via tesseract+poppler-utils
  (pdftoppm renders pages to PNG, tesseract reads with rus+eng)
- deepseek.ts: chat completions client with strict JSON response_format
- recognize.ts: structured prompt that produces a simplified DocBody (plain string text);
  a postprocessor wraps the text in TipTap-compatible JSON and validates it with a zod schema
- prompt enforces placeholder substitution: {{customer.*}}, {{executor.*}},
  {{contract.number}}, {{contract.date}}, {{today}}
- error codes: NO_OCR / NO_DEEPSEEK_KEY / UNSUPPORTED_MIME / INVALID_DOC_BODY
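The recognize.ts postprocessing step above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the node shapes, `textToTipTapDoc`, `isTipTapDoc` and `findUnknownPlaceholders` are hypothetical names, and a hand-written type guard stands in for the zod schema so the sketch has no dependencies. It assumes one paragraph per line of model output.

```typescript
// Hypothetical shapes; the real DocBody and zod schema in the repo may differ.
type TextNode = { type: "text"; text: string };
type ParagraphNode = { type: "paragraph"; content?: TextNode[] };
type TipTapDoc = { type: "doc"; content: ParagraphNode[] };

// Placeholder names listed in the commit message.
const ALLOWED_PLACEHOLDER =
  /^(customer\.[\w.]+|executor\.[\w.]+|contract\.number|contract\.date|today)$/;

/** Returns placeholder names found in `text` that are not on the allowed list. */
function findUnknownPlaceholders(text: string): string[] {
  const re = /\{\{\s*([^{}]+?)\s*\}\}/g;
  const unknown: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    if (!ALLOWED_PLACEHOLDER.test(m[1])) unknown.push(m[1]);
  }
  return unknown;
}

/** Wraps plain text in a TipTap/ProseMirror-style doc, one paragraph per line. */
function textToTipTapDoc(text: string): TipTapDoc {
  const lines = text.replace(/\r\n/g, "\n").split("\n");
  return {
    type: "doc",
    content: lines.map((line): ParagraphNode =>
      // Text nodes must not be empty, so a blank line becomes an empty paragraph.
      line.length > 0
        ? { type: "paragraph", content: [{ type: "text", text: line }] }
        : { type: "paragraph" }
    ),
  };
}

/** Stand-in for the zod validation; a failure here would map to INVALID_DOC_BODY. */
function isTipTapDoc(value: unknown): value is TipTapDoc {
  if (typeof value !== "object" || value === null) return false;
  const doc = value as { type?: unknown; content?: unknown };
  if (doc.type !== "doc" || !Array.isArray(doc.content)) return false;
  return doc.content.every((node: unknown) => {
    if (typeof node !== "object" || node === null) return false;
    const p = node as { type?: unknown; content?: unknown };
    if (p.type !== "paragraph") return false;
    if (p.content === undefined) return true;
    return (
      Array.isArray(p.content) &&
      p.content.every((t: unknown) => {
        if (typeof t !== "object" || t === null) return false;
        const n = t as { type?: unknown; text?: unknown };
        return n.type === "text" && typeof n.text === "string" && n.text.length > 0;
      })
    );
  });
}
```

Keeping the model output as plain text and building the editor JSON in code means the model never has to emit nested TipTap structures, which is presumably why the prompt targets a simplified DocBody.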
Dockerfile: apk add tesseract-ocr (+rus +eng data), poppler-utils, imagemagick
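A rough sketch of how extract.ts might choose an extractor and drive the tools installed above. Everything named here (`pickExtractor`, `needsOcrFallback`, the 20-character threshold, the 300 DPI default, and the idea that images go straight to OCR) is an assumption for illustration. The functions only build argument lists, which could then be passed to Node's `child_process.execFile`.

```typescript
type Extractor = "mammoth" | "pdf-parse" | "ocr";
type Command = { bin: string; args: string[] };

/** Maps an upload's MIME type to an extractor; unknown types fail with UNSUPPORTED_MIME. */
function pickExtractor(mime: string): Extractor {
  switch (mime) {
    case "application/vnd.openxmlformats-officedocument.wordprocessingml.document":
      return "mammoth";
    case "application/pdf":
      return "pdf-parse"; // caller may still fall back to OCR, see needsOcrFallback
    case "image/png":
    case "image/jpeg":
      return "ocr"; // assumption: images skip text extraction entirely
    default:
      throw new Error("UNSUPPORTED_MIME");
  }
}

/** Heuristic (hypothetical threshold): a scanned PDF yields little or no text layer. */
function needsOcrFallback(extractedText: string, minChars = 20): boolean {
  return extractedText.trim().length < minChars;
}

/** pdftoppm writes one PNG per page, named from the given output prefix. */
function pdfToPngCommand(pdfPath: string, outPrefix: string, dpi = 300): Command {
  return { bin: "pdftoppm", args: ["-png", "-r", String(dpi), pdfPath, outPrefix] };
}

/** tesseract prints recognized text to stdout when the output base is "stdout". */
function tesseractCommand(imagePath: string, langs: string[] = ["rus", "eng"]): Command {
  return { bin: "tesseract", args: [imagePath, "stdout", "-l", langs.join("+")] };
}
```

If either binary is missing at runtime, the spawn error would be the natural place to raise the NO_OCR code mentioned in the commit.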
Frontend:
- Templates page: ⤴ Загрузить документ ("Upload document") → file picker (.docx,.pdf,.png,.jpg)
- doc type selector (contract/invoice/act/upd)
- import-banner with spinner shows uploading→analyzing stages
- on success navigates to /templates/:id (TemplateEdit) for review
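Before the upload stage starts, the page could reject files the backend would refuse anyway. The sketch below is an assumption about how such a check might look, not code from the commit: `validateUpload`, `FileLike` and the result shape are hypothetical, and the 25 MB limit is read as 25 × 1024 × 1024 bytes.

```typescript
// Limits taken from the commit message; the byte interpretation of "25 MB" is an assumption.
const MAX_UPLOAD_BYTES = 25 * 1024 * 1024;
const ACCEPTED_EXTENSIONS = [".docx", ".pdf", ".png", ".jpg"];

type DocType = "contract" | "invoice" | "act" | "upd";
type FileLike = { name: string; size: number }; // structurally compatible with a browser File
type UploadCheck = { ok: true } | { ok: false; reason: "extension" | "size" };

/** Checks the picked file against the accepted extensions and the size limit. */
function validateUpload(file: FileLike): UploadCheck {
  const dot = file.name.lastIndexOf(".");
  const ext = dot >= 0 ? file.name.slice(dot).toLowerCase() : "";
  if (ACCEPTED_EXTENSIONS.indexOf(ext) === -1) return { ok: false, reason: "extension" };
  if (file.size > MAX_UPLOAD_BYTES) return { ok: false, reason: "size" };
  return { ok: true };
}

/** Narrows a raw selector value to one of the four document types. */
function isDocType(value: string): value is DocType {
  return value === "contract" || value === "invoice" || value === "act" || value === "upd";
}
```

A client-side check like this only saves a round trip; the server-side limit and MIME check remain the authority.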
Reuses DEEPSEEK_API_KEY pattern from Hall-planer.
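For orientation, a request built by deepseek.ts might look like the sketch below. The endpoint URL, the `deepseek-chat` model name and the `json_object` response format are assumptions based on DeepSeek's OpenAI-compatible chat completions API, not details confirmed by this commit; `buildRecognizeRequest` is a hypothetical helper.

```typescript
type ChatMessage = { role: "system" | "user"; content: string };
type ChatRequest = {
  model: string;
  messages: ChatMessage[];
  response_format: { type: "json_object" };
};
type HttpRequest = { url: string; headers: Record<string, string>; body: string };

/** Builds the HTTP request; a missing key maps to the NO_DEEPSEEK_KEY error code. */
function buildRecognizeRequest(
  apiKey: string | undefined,
  systemPrompt: string,
  documentText: string
): HttpRequest {
  if (!apiKey) throw new Error("NO_DEEPSEEK_KEY");
  const body: ChatRequest = {
    model: "deepseek-chat", // assumed model name
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: documentText },
    ],
    response_format: { type: "json_object" }, // asks for a JSON-only reply
  };
  return {
    url: "https://api.deepseek.com/chat/completions", // assumed endpoint
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  };
}
```

A caller would pass `process.env.DEEPSEEK_API_KEY` as the first argument and hand the result to `fetch(req.url, { method: "POST", headers: req.headers, body: req.body })`.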
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
43 lines
1.0 KiB
Docker
FROM node:20-alpine
WORKDIR /app

# Chromium for Puppeteer (PDF rendering) + fonts for Cyrillic.
# chromium itself needs nss/freetype/harfbuzz for rendering; the ttf-* packages provide text fonts.
# tesseract-ocr (+rus/eng data), poppler-utils and imagemagick support the document import OCR fallback.
RUN apk add --no-cache \
    openssl \
    tini \
    chromium \
    nss \
    freetype \
    harfbuzz \
    ca-certificates \
    ttf-dejavu \
    ttf-liberation \
    font-noto-cjk \
    tesseract-ocr \
    tesseract-ocr-data-rus \
    tesseract-ocr-data-eng \
    poppler-utils \
    imagemagick

ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium-browser
ENV PUPPETEER_SKIP_DOWNLOAD=true

COPY package.json package-lock.json* tsconfig.base.json ./
COPY apps/api/package.json apps/api/
COPY packages/shared/package.json packages/shared/

RUN npm install --include=dev

COPY apps/api ./apps/api
COPY packages/shared ./packages/shared

RUN cd apps/api && npx prisma generate

ENV NODE_ENV=production
WORKDIR /app/apps/api
EXPOSE 3030

ENTRYPOINT ["/sbin/tini", "--"]
CMD ["sh", "-c", "npx prisma migrate deploy && npx tsx src/server.ts"]