* Expand AI docs: architecture, MCP, external assistant setup, troubleshooting
- Add architecture overview explaining two independent AI pipelines
(chat assistant vs auto-categorization)
- Document MCP callback endpoint (JSON-RPC 2.0, auth, available tools; example request below)
- Add OpenClaw gateway configuration example
- Add Kubernetes network policy guidance (targetPort vs servicePort)
- Add Pipelock notes (mcpToolPolicy, NO_PROXY behavior)
- Add troubleshooting for "Failed to generate response" with external assistant
- Fix stale function list (4 tools -> 7)
- Fix incorrect env-vs-UI precedence statement
- Fix em-dashes in existing content
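For reference, the documented MCP callback speaks plain JSON-RPC 2.0 over
HTTP. A minimal sketch of a tools/list call in Ruby (the /mcp path matches
the docs; the host and bearer-token auth shown here are illustrative, so
substitute your deployment's values):

    require "net/http"
    require "json"
    require "uri"

    # Hypothetical host and token; see docs/hosting/ai.md for the real setup.
    uri = URI("https://maybe.example.com/mcp")
    req = Net::HTTP::Post.new(uri, "Content-Type"  => "application/json",
                                   "Authorization" => "Bearer #{ENV["MCP_TOKEN"]}")
    # JSON-RPC 2.0 envelope asking the endpoint to list its available tools.
    req.body = { jsonrpc: "2.0", id: 1, method: "tools/list", params: {} }.to_json

    res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
    puts JSON.parse(res.body)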
* Fix troubleshooting curl to use pod env vars
Use sh -c so $EXTERNAL_ASSISTANT_TOKEN and $EXTERNAL_ASSISTANT_URL
expand inside the pod, not on the local shell.
* feat(helm): add Pipelock ConfigMap, scanning config, and consolidate compose
- Add ConfigMap template rendering DLP, response scanning, MCP input/tool
scanning, and forward proxy settings from values
- Mount ConfigMap as /etc/pipelock/pipelock.yaml volume in deployment
- Add checksum/config annotation for automatic pod restart on config change
- Gate HTTPS_PROXY/HTTP_PROXY env injection on forwardProxy.enabled (skip
in MCP-only mode)
- Use hasKey for all boolean values so Helm's `default` function doesn't swallow an explicit false
- Single source of truth for ports (forwardProxy.port/mcpProxy.port)
- Pipelock-specific imagePullSecrets with fallback to app secrets
- Merge standalone compose.example.pipelock.yml into compose.example.ai.yml
- Add pipelock.example.yaml for Docker Compose users
- Add exclude-paths to CI workflow for locale file false positives
* Add external assistant support (OpenAI-compatible SSE proxy)
Allow self-hosted instances to delegate chat to an external AI agent
via an OpenAI-compatible streaming endpoint. Configurable per family
through the Settings UI or the ASSISTANT_TYPE env override.
- Assistant::External::Client: SSE streaming HTTP client (no new gems; sketch below)
- Settings UI with type selector, env lock indicator, config status
- Helm chart and Docker Compose env var support
- 45 tests covering client, config, routing, controller, integration
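For orientation, the streaming loop in Assistant::External::Client is plain
Net::HTTP reading an OpenAI-compatible SSE body. A simplified sketch, not
the exact implementation (payload shape and variable names are
approximations):

    require "net/http"
    require "json"
    require "uri"

    uri = URI(ENV.fetch("EXTERNAL_ASSISTANT_URL"))
    req = Net::HTTP::Post.new(uri, "Content-Type"  => "application/json",
                                   "Accept"        => "text/event-stream",
                                   "Authorization" => "Bearer #{ENV["EXTERNAL_ASSISTANT_TOKEN"]}")
    req.body = { stream: true, messages: [{ role: "user", content: "Hello" }] }.to_json

    Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
      http.request(req) do |response|
        buffer = +""                              # mutable buffer for partial frames
        response.read_body do |chunk|
          buffer << chunk
          while (line = buffer.slice!(/.*\n/))    # consume only complete lines
            next unless line.start_with?("data: ")
            data = line.delete_prefix("data: ").strip
            break if data == "[DONE]"             # end-of-stream sentinel
            delta = JSON.parse(data).dig("choices", 0, "delta", "content")
            print delta unless delta.nil?
          end
        end
      end
    end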
* Add session key routing, email allowlist, and config plumbing
Route to the actual OpenClaw session via x-openclaw-session-key header
instead of creating isolated sessions. Gate external assistant access
behind an email allowlist (EXTERNAL_ASSISTANT_ALLOWED_EMAILS env var).
Plumb session_key and allowedEmails through Helm chart, compose, and
env template.
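In code terms the gate and the routing header are small; roughly (helper
names here are hypothetical, while the env var and header name come from
this change):

    # Comma-separated allowlist from EXTERNAL_ASSISTANT_ALLOWED_EMAILS;
    # with this naive check an empty or unset list denies everyone.
    def external_assistant_allowed?(email)
      allowed = ENV["EXTERNAL_ASSISTANT_ALLOWED_EMAILS"].to_s.split(",").map(&:strip)
      allowed.include?(email)
    end

    # Route the request into the caller's existing OpenClaw session instead
    # of letting the gateway open an isolated one.
    def session_headers(session_key)
      { "x-openclaw-session-key" => session_key }
    end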
* Add HTTPS_PROXY support to External::Client for Pipelock integration
Net::HTTP does not auto-read HTTPS_PROXY/HTTP_PROXY env vars (unlike
Faraday). Explicitly resolve proxy from environment in build_http so
outbound traffic to the external assistant routes through Pipelock's
forward proxy when enabled. Respects NO_PROXY for internal hosts.
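Roughly what build_http now does, sketched with a simplified NO_PROXY check
(exact structure and names differ in the real client):

    require "net/http"
    require "uri"

    def build_http(uri)
      no_proxy = ENV["NO_PROXY"].to_s.split(",").map(&:strip)
      bypass   = no_proxy.include?(uri.host)                  # simplified host match
      proxy    = uri.scheme == "https" ? ENV["HTTPS_PROXY"] : ENV["HTTP_PROXY"]

      http =
        if proxy && !bypass
          proxy_uri = URI(proxy)
          # Net::HTTP only proxies when told to explicitly.
          Net::HTTP.new(uri.host, uri.port, proxy_uri.host, proxy_uri.port)
        else
          Net::HTTP.new(uri.host, uri.port)
        end
      http.use_ssl = (uri.scheme == "https")
      http
    end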
* Add UI fields for external assistant config (Setting-backed with env fallback)
Follow the same pattern as OpenAI settings: database-backed Setting
fields with env var defaults. Self-hosters can now configure the
external assistant URL, token, and agent ID from the browser
(Settings > Self-Hosting > AI Assistant) instead of requiring env vars.
Fields disable when the corresponding env var is set.
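The resolution order is roughly the following (names are illustrative, and
this assumes the env var wins when set, as the lock behaviour suggests):

    require "active_support/core_ext/object/blank"

    # Env var, when present, locks the field and takes precedence; otherwise
    # the database-backed Setting edited in the UI applies.
    def external_assistant_url
      ENV["EXTERNAL_ASSISTANT_URL"].presence || Setting.external_assistant_url
    end

    def external_assistant_url_locked?
      ENV["EXTERNAL_ASSISTANT_URL"].present?
    end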
* Improve external assistant UI labels and add help text
Change placeholder to generic OpenAI-compatible URL pattern. Add help
text under each field explaining where the values come from: URL from
agent provider, token for authentication, agent ID for multi-agent
routing.
* Add external assistant docs and fix URL help text
Add External AI Assistant section to docs/hosting/ai.md covering setup
(UI and env vars), how it works, Pipelock security scanning, access
control, and Docker Compose example. Drop "chat completions" jargon
from URL help text.
* Harden external assistant: retry logic, disconnect UI, error handling, and test coverage
- Add retry with backoff for transient network errors (no retry after streaming starts; sketched below)
- Add disconnect button with confirmation modal in self-hosting settings
- Narrow rescue scope with fallback logging for unexpected errors
- Safe cleanup of partial responses on stream interruption
- Gate ai_available? on family assistant_type instead of OR-ing all providers
- Truncate conversation history to last 20 messages
- Proxy-aware HTTP client with NO_PROXY support
- Sanitize protocol to use generic headers (X-Agent-Id, X-Session-Key)
- Full test coverage for streaming, retries, proxy routing, config, and disconnect
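The retry wrapper looks roughly like this (error classes and limits are
illustrative); conversation history sent upstream is capped separately with
messages.last(20):

    require "net/http"

    MAX_RETRIES = 2
    RETRYABLE   = [Errno::ECONNREFUSED, Errno::ECONNRESET, Net::OpenTimeout, Net::ReadTimeout]

    # `request` is a callable that performs the SSE call and yields each chunk.
    def with_retries(request)
      attempts = 0
      streamed = false
      begin
        request.call { |chunk| streamed = true; yield chunk }
      rescue *RETRYABLE
        raise if streamed || attempts >= MAX_RETRIES   # never retry mid-stream
        attempts += 1
        sleep(2**attempts)                             # exponential backoff
        retry
      end
    end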
* Exclude external assistant client from Pipelock scan-diff
False positive: `@token` instance variable flagged as "Credential in URL".
Temporary workaround until Pipelock supports inline suppression.
* Address review feedback: NO_PROXY boundary fix, SSE done flag, design tokens
- Fix NO_PROXY matching to require a domain boundary (exact match or .suffix),
case-insensitive; see the sketch below. Prevents badexample.com from
matching example.com.
- Add done flag to SSE streaming so read_body stops after [DONE]
- Move MAX_CONVERSATION_MESSAGES to class level
- Use bg-success/bg-destructive design tokens for status indicators
- Add rationale comment for pipelock scan exclusion
- Update docs last-updated date
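The boundary rule in code, roughly (method name is illustrative):

    # An entry matches only on a domain boundary: exact host match or a
    # dot-separated suffix, compared case-insensitively.
    def bypass_proxy?(host)
      host = host.to_s.downcase
      ENV["NO_PROXY"].to_s.split(",").any? do |entry|
        entry = entry.strip.downcase.delete_prefix(".")
        next false if entry.empty?
        host == entry || host.end_with?(".#{entry}")
      end
    end

    # With NO_PROXY=example.com:
    #   bypass_proxy?("api.example.com") # => true
    #   bypass_proxy?("badexample.com")  # => false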
* Address second round of review feedback
- Allowlist email comparison is now case-insensitive and nil-safe
- Cap SSE buffer at 1 MB to prevent memory blowup from malformed streams
- Don't expose upstream HTTP response body in user-facing errors (log it instead)
- Fix frozen string warning on buffer initialization
- Fix "builtin" typo in docs (should be "built-in")
* Protect completed responses from cleanup, sanitize error messages
- Don't destroy a fully streamed assistant message if the post-stream
metadata update fails (only clean up partial responses)
- Log raw connection/HTTP errors internally, show generic messages
to users to avoid leaking network/proxy details
- Update test assertions for new error message wording
* Fix SSE content guard and NO_PROXY test correctness
Use nil check instead of present? for SSE delta content to preserve
whitespace-only chunks (newlines, spaces) that can occur in code output.
Fix NO_PROXY test to use HTTP_PROXY matching the http:// client URL so
the proxy resolution and NO_PROXY bypass logic are actually exercised.
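The guard in question, as a tiny self-contained example:

    require "json"

    event  = JSON.parse('{"choices":[{"delta":{"content":"\n"}}]}')
    delta  = event.dig("choices", 0, "delta", "content")
    buffer = +""
    buffer << delta unless delta.nil?   # `if delta.present?` would drop the newline chunk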
* Forward proxy credentials to Net::HTTP
Pass proxy_uri.user and proxy_uri.password to Net::HTTP.new so
authenticated proxies (http://user:pass@host:port) work correctly.
Without this, credentials parsed from the proxy URL were silently
dropped. Nil values are safe as positional args when no creds exist.
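For example (the target host and proxy URL here are hypothetical):

    require "net/http"
    require "uri"

    proxy = URI("http://user:pass@proxy.internal:3128")

    # Proxy credentials must be handed to Net::HTTP explicitly as positional
    # args; nil user/password are harmless for unauthenticated proxies.
    http = Net::HTTP.new("assistant.example.com", 443,
                         proxy.host, proxy.port, proxy.user, proxy.password)
    http.use_ssl = true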
* Update pipelock integration to v0.3.1 with full scanning config
Bump Helm image tag from 0.2.7 to 0.3.1. Add missing security
sections to both the Helm ConfigMap and compose example config:
mcp_tool_policy, mcp_session_binding, and tool_chain_detection.
These protect the /mcp endpoint against tool injection, session
hijacking, and multi-step exfiltration chains.
Add version and mode fields to config files. Enable include_defaults
for DLP and response scanning to merge user patterns with the 35
built-in patterns. Remove redundant --mode CLI flag from the Helm
deployment template since mode is now in the config file.
* Add SearchFamilyImportedFiles assistant function with vector store support
Implement per-Family document search using OpenAI vector stores, allowing
the AI assistant to search through uploaded financial documents (tax returns,
statements, contracts, etc.). The architecture is modular with a
provider-agnostic VectorStoreConcept interface so other RAG backends can be added.
Key components:
- Assistant::Function::SearchFamilyImportedFiles - tool callable from any LLM
- Provider::VectorStoreConcept - abstract vector store interface
- Provider::Openai vector store methods (create, upload, search, delete)
- Family::VectorSearchable concern with document management
- FamilyDocument model for tracking uploaded files
- Migration adding vector_store_id to families and family_documents table
https://claude.ai/code/session_01TSkKc7a9Yu2ugm1RvSf4dh
* Extract VectorStore adapter layer for swappable backends
Replace the Provider::VectorStoreConcept mixin with a standalone adapter
architecture under VectorStore::. This cleanly separates vector store
concerns from the LLM provider and makes it trivial to swap backends.
Components:
- VectorStore::Base - abstract interface (create/delete/upload/remove/search)
- VectorStore::Openai - uses ruby-openai gem's native vector_stores.search
- VectorStore::Pgvector - skeleton for local pgvector + embedding model
- VectorStore::Qdrant - skeleton for Qdrant vector DB
- VectorStore::Registry - resolves adapter from VECTOR_STORE_PROVIDER env
- VectorStore::Response - success/failure wrapper (like Provider::Response)
Consumers updated to go through VectorStore.adapter:
- Family::VectorSearchable
- Assistant::Function::SearchFamilyImportedFiles
- FamilyDocument
Removed: Provider::VectorStoreConcept, vector store methods from Provider::Openai
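Shape of the seam, sketched (method signatures are guesses, adapter bodies
elided, and the default provider shown here is an assumption):

    module VectorStore
      # Abstract interface every backend implements.
      class Base
        def create(family)        = raise NotImplementedError
        def delete(family)        = raise NotImplementedError
        def upload(family, file)  = raise NotImplementedError
        def remove(family, file)  = raise NotImplementedError
        def search(family, query) = raise NotImplementedError
      end

      class Openai   < Base; end
      class Pgvector < Base; end
      class Qdrant   < Base; end

      # Registry behaviour: resolve the adapter from VECTOR_STORE_PROVIDER.
      def self.adapter
        case ENV.fetch("VECTOR_STORE_PROVIDER", "openai")
        when "openai"   then Openai.new
        when "pgvector" then Pgvector.new
        when "qdrant"   then Qdrant.new
        else raise ArgumentError, "unknown VECTOR_STORE_PROVIDER"
        end
      end
    end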
https://claude.ai/code/session_01TSkKc7a9Yu2ugm1RvSf4dh
* Add Vector Store configuration docs to ai.md
Documents how to configure the document search feature, covering all
three supported backends (OpenAI, pgvector, Qdrant), environment
variables, Docker Compose examples, supported file types, and privacy
considerations.
https://claude.ai/code/session_01TSkKc7a9Yu2ugm1RvSf4dh
* No need to specify `imported` in code
* Missed a couple more places
* Tiny reordering for the human OCD
* Update app/models/assistant/function/search_family_files.rb
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Juan José Mata <jjmata@jjmata.com>
* PR comments
* More PR comments
---------
Signed-off-by: Juan José Mata <jjmata@jjmata.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* feat: Add PDF import with AI-powered document analysis
This enhances the import functionality to support PDF files with AI-powered
document analysis. When a PDF is uploaded, it is processed by AI to:
- Identify the document type (bank statement, credit card statement, etc.)
- Generate a summary of the document contents
- Extract key metadata (institution, dates, balances, transaction count)
After processing, an email is sent to the user asking for next steps.
Key changes:
- Add PdfImport model for handling PDF document imports
- Add Provider::Openai::PdfProcessor for AI document analysis
- Add ProcessPdfJob for async PDF processing
- Add PdfImportMailer for user notification emails
- Update imports controller to detect and handle PDF uploads
- Add PDF import option to the new import page
- Add i18n translations for all new strings
- Add comprehensive tests for the new functionality
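In broad strokes the async flow looks like this (class names follow this
change; method names and field handling are illustrative only):

    # Hypothetical sketch of the job wiring, not the exact implementation.
    class ProcessPdfJob < ApplicationJob
      queue_as :default

      def perform(pdf_import)
        analysis = Provider::Openai::PdfProcessor.new.process(pdf_import)  # classify + summarize

        pdf_import.update!(
          document_type: analysis[:document_type],   # e.g. "bank_statement"
          ai_summary:    analysis[:summary]
        )

        # Ask the user what to do next with the analysed document.
        PdfImportMailer.with(pdf_import: pdf_import).next_steps.deliver_later
      end
    end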
* Add bank statement import with AI extraction
- Create ImportBankStatement assistant function for MCP
- Add BankStatementExtractor with chunked processing for small context windows
- Register function in assistant configurable
- Make PdfImport#pdf_file_content public for extractor access
- Increase OpenAI request timeout to 600s for slow local models
- Increase DB connection pool to 20 for concurrent operations
Tested with M-Pesa bank statement via remote Ollama (qwen3:8b):
- Successfully extracted 18 transactions
- Generated CSV and created TransactionImport
- Works with 3000 char chunks for small context windows
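The chunking itself is simple; a cut-down view (the real extractor also has
to reconcile rows that repeat across consecutive chunks):

    require "pdf-reader"

    CHUNK_SIZE = 3_000   # characters, sized for small context windows

    def chunk_statement(pdf_path)
      text = PDF::Reader.new(pdf_path).pages.map(&:text).join("\n")
      text.scan(/.{1,#{CHUNK_SIZE}}/m)   # naive fixed-size character chunks
    end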
* Add pdf-reader gem dependency
The BankStatementExtractor uses PDF::Reader to parse bank statement
PDFs, but the gem was not properly declared in the Gemfile. This would
cause a NameError in production when processing bank statements.
Added pdf-reader ~> 2.12 to Gemfile dependencies.
* Fix transaction deduplication to preserve legitimate duplicates
The previous deduplication logic removed ALL duplicate transactions based
on [date, amount, name], which would drop legitimate same-day duplicates
like multiple ATM withdrawals or card authorizations.
Changed to only deduplicate transactions that appear in consecutive chunks
(chunking artifacts) while preserving all legitimate duplicates within the
same chunk or non-adjacent chunks.
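Sketched, with rows keyed on [date, amount, name] (method name is
illustrative):

    # Drop a row only if the same key appeared in the immediately preceding
    # chunk (an artifact of overlapping chunks); duplicates within one chunk
    # or in non-adjacent chunks are kept.
    def dedupe_chunk_overlap(chunks)
      previous_keys = []
      chunks.flat_map do |rows|
        keys = rows.map { |r| r.values_at(:date, :amount, :name) }
        kept = rows.reject { |r| previous_keys.include?(r.values_at(:date, :amount, :name)) }
        previous_keys = keys
        kept
      end
    end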
* Refactor bank statement extraction to use public provider method
Address code review feedback:
- Add public extract_bank_statement method to Provider::Openai
- Remove direct access to private client via send(:client)
- Update ImportBankStatement to use new public method
- Add require 'set' to BankStatementExtractor
- Remove PII-sensitive content from error logs
- Add defensive check for nil response.error
- Handle oversized PDF pages in chunking logic
- Remove unused process_native and process_generic methods
- Update email copy to reflect feature availability
- Add guard for nil document_type in email template
- Document pdf-reader gem rationale in Gemfile
Tested with both OpenAI (gpt-4o) and Ollama (qwen3:8b):
- OpenAI: 49 transactions extracted in 30s
- Ollama: 40 transactions extracted in 368s
- All encapsulation and error handling working correctly
* Update schema.rb with ai_summary and document_type columns
* Address PR #808 review comments
- Rename :csv_file to :import_file across controllers/views/tests
- Add PDF test fixture (sample_bank_statement.pdf)
- Add supports_pdf_processing? method for graceful degradation
- Revert unrelated database.yml pool change (600->3)
- Remove month_start_day schema bleed from other PR
- Fix PdfProcessor: use .strip instead of .strip_heredoc
- Add server-side PDF magic byte validation (sketch below)
- Conditionally show PDF import option when AI provider available
- Fix ProcessPdfJob: sanitize errors, handle update failure
- Move pdf_file attachment from Import to PdfImport
- Document deduplication logic limitations
- Fix ImportBankStatement: catch specific exceptions only
- Remove unnecessary require 'set'
- Remove dead json_schema method from PdfProcessor
- Reduce default OpenAI timeout from 600s to 60s
- Fix nil guard in text mailer template
- Add require 'csv' to ImportBankStatement
- Remove Gemfile pdf-reader comment
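The magic byte validation mentioned above is essentially this check
(method name is illustrative): a real PDF begins with "%PDF-", regardless
of the MIME type the browser claims.

    def pdf_magic_bytes?(io)
      io.rewind
      io.read(5) == "%PDF-"
    ensure
      io.rewind
    end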
* Fix RuboCop indentation in ProcessPdfJob
* Refactor PDF import check to use model predicate method
Replace is_a?(PdfImport) type check with requires_csv_workflow? predicate
that leverages STI inheritance for cleaner controller logic.
* Fix missing 'unknown' locale key and schema version mismatch
- Add 'unknown: Unknown Document' to document_types locale
- Fix schema version to match latest migration (2026_01_24_180211)
* Document OPENAI_REQUEST_TIMEOUT env variable
Added to .env.local.example and docs/hosting/ai.md
* Rename ALLOWED_MIME_TYPES to ALLOWED_CSV_MIME_TYPES for clarity
* Add comment explaining requires_csv_workflow? predicate
* Remove redundant required_column_keys from PdfImport
Base class already returns [] by default
* Add ENV toggle to disable PDF processing for non-vision endpoints
OPENAI_SUPPORTS_PDF_PROCESSING=false can be used for OpenAI-compatible
endpoints (e.g., Ollama) that don't support vision/PDF processing.
* Wire up transaction extraction for PDF bank statements
- Add extracted_data JSONB column to imports
- Add extract_transactions method to PdfImport
- Call extraction in ProcessPdfJob for bank statements
- Store transactions in extracted_data for later review
* Fix ProcessPdfJob retry logic, sanitize and localize errors
- Allow retries after partial success (classification ok, extraction failed)
- Log sanitized error message instead of raw message to avoid data leakage
- Use i18n for user-facing error messages
* Add vision-capable model validation for PDF processing
* Fix drag-and-drop test to use correct field name csv_file
* Schema bleedover from another branch
* Fix drag-drop import form field name to match controller
* Add vision capability guard to process_pdf method
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: mkdev11 <jaysmth689+github@users.noreply.github.com>
Co-authored-by: Juan José Mata <jjmata@jjmata.com>
* Add comprehensive AI/LLM configuration documentation
* Fix Chat.start! to use default model when model is nil or empty
* Ensure all controllers use Chat.default_model for consistency
* Move AI doc inside `hosting/`
* Probably too much error handling
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: jjmata <187772+jjmata@users.noreply.github.com>
Co-authored-by: Juan José Mata <juanjo.mata@gmail.com>