Rebase PR #784 and fix OpenAI model/chat regressions (#1384)

* Wire conversation history through OpenAI responses API

* Fix RuboCop hash brace spacing in assistant tests

* Pipelock ignores

* Batch fixes

---------

Co-authored-by: sokiee <sokysrm@gmail.com>

Author: Juan José Mata
Date: 2026-04-15 18:45:24 +02:00
Committed by: GitHub
Parent: 53ea0375db
Commit: 7b2b1dd367
24 changed files with 937 additions and 90 deletions


@@ -28,8 +28,22 @@ TWELVE_DATA_API_KEY =
OPENAI_ACCESS_TOKEN =
OPENAI_URI_BASE =
OPENAI_MODEL =
# OPENAI_REQUEST_TIMEOUT: Request timeout in seconds (default: 60)
# OPENAI_SUPPORTS_PDF_PROCESSING: Set to false for endpoints without vision support (default: true)
# LLM token budget. Applies to ALL outbound LLM calls: chat history,
# auto-categorize, merchant detection, provider enhancer, PDF processing.
# Defaults to Ollama's historical 2048-token baseline so small local models
# work out of the box — raise explicitly for cloud or larger-context models.
# LLM_CONTEXT_WINDOW = 2048 # Total tokens the model will accept
# LLM_MAX_RESPONSE_TOKENS = 512 # Reserved for the model's reply
# LLM_MAX_HISTORY_TOKENS = # Derived if unset (context - response - system_reserve)
# LLM_SYSTEM_PROMPT_RESERVE = 256 # Tokens reserved for the system prompt
# LLM_MAX_ITEMS_PER_CALL = 25 # Upper bound on auto-categorize / merchant batches
# OpenAI-compatible capability flags (custom/self-hosted providers)
# OPENAI_REQUEST_TIMEOUT = 60 # HTTP timeout in seconds; raise for slow local models
# OPENAI_SUPPORTS_PDF_PROCESSING = true # Set to false for endpoints without vision support
# OPENAI_SUPPORTS_RESPONSES_ENDPOINT = # true to force Responses API on custom providers
# LLM_JSON_MODE = # auto | strict | json_object | none
# OpenAI-compatible API endpoint config (example: LM Studio reached from Docker)
# OPENAI_URI_BASE = http://host.docker.internal:1234/
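
The derivation noted in the comments above (LLM_MAX_HISTORY_TOKENS falls back to context minus response minus system reserve) can be written as a small Ruby sketch. This is an illustration only: the LlmBudget module and its method names are assumptions, not code from this commit; only the formula and the documented defaults (2048 / 512 / 256) come from the .env comments.

# Hypothetical sketch; names are assumed, the formula and defaults
# mirror the .env comments above.
module LlmBudget
  module_function

  def context_window
    ENV.fetch("LLM_CONTEXT_WINDOW", "2048").to_i
  end

  def max_response_tokens
    ENV.fetch("LLM_MAX_RESPONSE_TOKENS", "512").to_i
  end

  def system_prompt_reserve
    ENV.fetch("LLM_SYSTEM_PROMPT_RESERVE", "256").to_i
  end

  # LLM_MAX_HISTORY_TOKENS is derived when unset:
  # context - response - system_reserve (clamped at zero)
  def max_history_tokens
    explicit = ENV["LLM_MAX_HISTORY_TOKENS"]
    return explicit.to_i unless explicit.nil? || explicit.empty?

    [context_window - max_response_tokens - system_prompt_reserve, 0].max
  end
end

LlmBudget.max_history_tokens # => 1280 with the documented defaults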