Guillem Arias 38e950fe23 feat(ai): Anthropic native PDF processing (3/5)
Implements process_pdf and extract_bank_statement on Provider::Anthropic
using the native `document` content block — no rasterization, no text
pre-extraction.

- Provider::Anthropic::PdfProcessor classifies the document, summarizes
  it, and extracts statement metadata via a forced report_document_analysis
  tool whose input_schema mirrors the existing Provider::Openai output
  (document_type from Import::DOCUMENT_TYPES, summary, extracted_data).
- Provider::Anthropic::BankStatementExtractor returns the same
  { transactions, period, account_holder, account_number, bank_name,
  opening_balance, closing_balance } shape via report_bank_statement so
  downstream pdf_import code is provider-agnostic.
- Both attach the PDF as
  { type: "document", source: { type: "base64", media_type: "application/pdf", data: <b64> } }
  which Claude 3.5+ / 4.x models accept natively (up to 32 MB and 100 pages per request).
  No pdf-reader, no pdftoppm, no chunking for typical statements.
- supports_pdf_processing? (introduced in PR 1) already returns true for
  claude-* models, gating process_pdf with a clear error otherwise.
- Cost ledger rows are persisted via the shared UsageRecorder concern,
  including cache_creation/cache_read tokens.
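
For illustration, the request described in the list above might be assembled roughly as follows. This is a minimal sketch, not the code in this commit: the helper name build_pdf_request, the prompt text, the max_tokens value, and the exact input_schema fields are assumptions. Only the document block shape, the report_document_analysis tool name, and the forced tool_choice come from the description.

```ruby
require "base64"

# Hypothetical helper (not from the commit): builds a Messages API payload
# that attaches a PDF as a native document block and forces a single tool.
def build_pdf_request(pdf_content, model:, document_types:)
  # Fail before any client call when there is nothing to send.
  raise ArgumentError, "pdf_content is blank" if pdf_content.nil? || pdf_content.empty?

  {
    model: model,
    max_tokens: 4096, # assumed value
    tools: [
      {
        name: "report_document_analysis",
        description: "Report the document type, a summary, and extracted data.",
        input_schema: {
          type: "object",
          properties: {
            # document_types stands in for Import::DOCUMENT_TYPES
            document_type: { type: "string", enum: document_types },
            summary: { type: "string" },
            extracted_data: { type: "object" }
          },
          required: %w[document_type summary]
        }
      }
    ],
    # Forcing the tool means the reply should carry a tool_use block.
    tool_choice: { type: "tool", name: "report_document_analysis" },
    messages: [
      {
        role: "user",
        content: [
          {
            type: "document",
            source: {
              type: "base64",
              media_type: "application/pdf",
              data: Base64.strict_encode64(pdf_content)
            }
          },
          { type: "text", text: "Classify and summarize this document." } # assumed prompt
        ]
      }
    ]
  }
end
```

The PDF bytes go in untouched apart from base64 encoding, which is why no rasterization or text pre-extraction step appears in the sketch.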

Tests verify the document block shape, tool_choice forcing, normalized
document_type for unknown classifications, transaction normalization
(date / amount / reference → notes), and the missing-tool_use error
path. Blank pdf_content raises before any client call.
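
The response-side behavior the tests cover could look something like the sketch below. It assumes the parsed response is a Hash with string keys; MissingToolUseError, extract_tool_input, and normalize_transaction are hypothetical names, and the raw transaction keys are assumptions rather than the schema used in this commit.

```ruby
require "date"
require "bigdecimal"

# Hypothetical error class for the missing-tool_use path.
class MissingToolUseError < StandardError; end

# Returns the input of the forced tool call, or raises when the
# response contains no matching tool_use block.
def extract_tool_input(response, tool_name)
  block = Array(response["content"]).find do |b|
    b["type"] == "tool_use" && b["name"] == tool_name
  end
  raise MissingToolUseError, "no tool_use block for #{tool_name}" unless block

  block["input"]
end

# Normalizes one raw transaction: ISO 8601 date, decimal amount,
# and the bank reference carried over as notes.
def normalize_transaction(raw)
  {
    date: Date.parse(raw["date"].to_s).iso8601,
    amount: BigDecimal(raw["amount"].to_s),
    notes: raw["reference"]
  }
end
```

BigDecimal is used here instead of Float so monetary amounts are not subject to binary rounding; whether the actual extractor does the same is not stated in this message.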

Stacked on #1984 (PR 2/5). 4/5 pgvector RAG next.
2026-05-29 14:51:09 +02:00