diff --git a/docs/hosting/ai.md b/docs/hosting/ai.md index a1f8829c6..d32bd1b6e 100644 --- a/docs/hosting/ai.md +++ b/docs/hosting/ai.md @@ -11,6 +11,30 @@ Sure includes an AI assistant that can help users understand their financial dat > 👉 Help us by taking a structured approach to your issue reporting. 🙏 +## Architecture: Two AI Pipelines + +Sure has **two separate AI systems** that operate independently. Understanding this is important because they have different configuration requirements. + +### 1. Chat Assistant (conversational) + +The interactive chat where users ask questions about their finances. Routes through one of two backends: + +- **Builtin** (default): Uses the OpenAI-compatible provider configured via `OPENAI_ACCESS_TOKEN` / `OPENAI_URI_BASE` / `OPENAI_MODEL`. Calls Sure's function tools directly (get_accounts, get_transactions, etc.). +- **External**: Delegates the entire conversation to a remote AI agent. The agent calls back to Sure via MCP to access financial data. Set `ASSISTANT_TYPE=external` as a global override, or configure each family's assistant type in Settings. + +### 2. Auto-Categorization and Merchant Detection (background) + +Background jobs that classify transactions and detect merchants. These **always** use the OpenAI-compatible provider (`OPENAI_ACCESS_TOKEN`), regardless of what the chat assistant uses. They rely on structured function calling with JSON schemas, not conversational chat. + +### What this means in practice + +| Setting | Chat assistant | Auto-categorization | +|---------|---------------|---------------------| +| `ASSISTANT_TYPE=builtin` (default) | Uses OpenAI provider | Uses OpenAI provider | +| `ASSISTANT_TYPE=external` | Uses external agent | Still uses OpenAI provider | + +If you use an external agent for chat, you still need `OPENAI_ACCESS_TOKEN` set for auto-categorization and merchant detection to work. The two systems are fully independent. + ## Quickstart: OpenAI Token The easiest way to get started with AI features in Sure is to use OpenAI: @@ -288,7 +312,7 @@ For self-hosted deployments, you can configure AI settings through the web inter - **OpenAI URI Base** - Custom endpoint (leave blank for OpenAI) - **OpenAI Model** - Model name (required for custom endpoints) -**Note:** Settings in the UI override environment variables. If you change settings in the UI, those values take precedence. +**Note:** Environment variables take precedence over UI settings. When an env var is set, the corresponding UI field is disabled. ## External AI Assistant @@ -299,28 +323,35 @@ This is useful when: - You want to use a non-OpenAI-compatible model (the agent translates) - You want to keep LLM credentials and logic outside Sure entirely +> [!IMPORTANT] +> **Set `ASSISTANT_TYPE=external` to route all users to the external agent.** Without it, routing falls back to each family's `assistant_type` DB column (configurable per-family in the Settings UI), then defaults to `"builtin"`. If you want a global override that applies to every family regardless of their UI setting, set the env var. If you only want specific families to use the external agent, skip the env var and configure it per-family in Settings. + +> [!NOTE] +> The external assistant handles **chat only**. Auto-categorization and merchant detection still use the OpenAI-compatible provider (`OPENAI_ACCESS_TOKEN`). See [Architecture: Two AI Pipelines](#architecture-two-ai-pipelines) for details. + ### How It Works -1. Sure sends the chat conversation to your agent's API endpoint -2. Your agent processes it (using whatever LLM, tools, or context it needs) -3. Your agent can call Sure's `/mcp` endpoint for financial data (accounts, transactions, balance sheet) -4. Your agent streams the response back to Sure via Server-Sent Events (SSE) +1. User sends a message in the Sure chat UI +2. Sure sends the conversation to your agent's API endpoint (OpenAI chat completions format) +3. Your agent processes it using whatever LLM, tools, or context it needs +4. Your agent can call Sure's `/mcp` endpoint for financial data (accounts, transactions, balance sheet, holdings) +5. Your agent streams the response back to Sure via Server-Sent Events (SSE) -The agent's API must be **OpenAI chat completions compatible** — accept `POST` with `messages` array, return SSE with `delta.content` chunks. +The agent's API must be **OpenAI chat completions compatible**: accept `POST` with a `messages` array, return SSE with `delta.content` chunks. ### Configuration Configure via the UI or environment variables: **Settings UI:** -1. Go to **Settings** → **Self-Hosting** +1. Go to **Settings** -> **Self-Hosting** 2. Set **Assistant type** to "External (remote agent)" 3. Enter the **Endpoint URL** and **API Token** from your agent provider 4. Optionally set an **Agent ID** if the provider hosts multiple agents **Environment variables:** ```bash -ASSISTANT_TYPE=external # Force all families to use external +ASSISTANT_TYPE=external # Global override (or set per-family in UI) EXTERNAL_ASSISTANT_URL=https://your-agent/v1/chat/completions EXTERNAL_ASSISTANT_TOKEN=your-api-token EXTERNAL_ASSISTANT_AGENT_ID=main # Optional, defaults to "main" @@ -330,14 +361,138 @@ EXTERNAL_ASSISTANT_ALLOWED_EMAILS=user@example.com # Optional, comma-separated When environment variables are set, the corresponding UI fields are disabled (env takes precedence). +### MCP Callback Endpoint + +Sure exposes a [Model Context Protocol](https://modelcontextprotocol.io/) (MCP) endpoint at `/mcp` so your external agent can call back and query financial data. This is how the agent accesses accounts, transactions, balance sheets, and other user data. + +**Protocol:** JSON-RPC 2.0 over HTTP POST + +**Authentication:** Bearer token via `Authorization` header + +**Environment variables:** +```bash +MCP_API_TOKEN=your-secret-token # Bearer token the agent sends to authenticate +MCP_USER_EMAIL=user@example.com # Email of the Sure user the agent acts as +``` + +The agent must send requests to `https://your-sure-instance/mcp` with: +``` +Authorization: Bearer +Content-Type: application/json +``` + +**Supported methods:** + +| Method | Description | +|--------|-------------| +| `initialize` | Handshake, returns server info and capabilities | +| `tools/list` | Lists available tools with names, descriptions, and input schemas | +| `tools/call` | Calls a specific tool by name with arguments | + +**Available tools** (exposed via `tools/list`): + +| Tool | Description | +|------|-------------| +| `get_accounts` | Retrieve account information | +| `get_transactions` | Query transaction history | +| `get_holdings` | Investment holdings data | +| `get_balance_sheet` | Current financial position | +| `get_income_statement` | Income and expenses | +| `import_bank_statement` | Import bank statement data | +| `search_family_files` | Search uploaded documents | + +**Example: list tools** +```bash +curl -X POST https://your-sure-instance/mcp \ + -H "Authorization: Bearer $MCP_API_TOKEN" \ + -H "Content-Type: application/json" \ + -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' +``` + +**Example: call a tool** +```bash +curl -X POST https://your-sure-instance/mcp \ + -H "Authorization: Bearer $MCP_API_TOKEN" \ + -H "Content-Type: application/json" \ + -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"get_accounts","arguments":{}}}' +``` + +### OpenClaw Gateway Example + +[OpenClaw](https://github.com/luckyPipewrench/openclaw) is an AI agent gateway that exposes agents as OpenAI-compatible endpoints. If your agent runs behind OpenClaw, configure it like this: + +```bash +ASSISTANT_TYPE=external +EXTERNAL_ASSISTANT_URL=http://your-openclaw-host:18789/v1/chat/completions +EXTERNAL_ASSISTANT_TOKEN=your-gateway-token +EXTERNAL_ASSISTANT_AGENT_ID=your-agent-name +``` + +**OpenClaw setup requirements:** +- The gateway must have `chatCompletions.enabled: true` in its config +- The agent's MCP config must point to Sure's `/mcp` endpoint with the correct `MCP_API_TOKEN` +- The URL format is always `/v1/chat/completions` (OpenAI-compatible) + +**Kubernetes in-cluster example** (agent in a different namespace): +```bash +# URL uses Kubernetes DNS: ..svc.cluster.local: +EXTERNAL_ASSISTANT_URL=http://my-agent.my-namespace.svc.cluster.local:18789/v1/chat/completions +``` + ### Security with Pipelock When [Pipelock](https://github.com/luckyPipewrench/pipelock) is enabled (`pipelock.enabled=true` in Helm, or the `pipelock` service in Docker Compose), all traffic between Sure and the external agent is scanned: -- **Outbound** (Sure → agent): routed through Pipelock's forward proxy via `HTTPS_PROXY` -- **Inbound** (agent → Sure /mcp): routed through Pipelock's MCP reverse proxy (port 8889) +- **Outbound** (Sure -> agent): routed through Pipelock's forward proxy via `HTTPS_PROXY` +- **Inbound** (agent -> Sure /mcp): routed through Pipelock's MCP reverse proxy (port 8889) -Pipelock scans for prompt injection, DLP violations, and tool poisoning. The external agent does not need Pipelock installed — Sure's Pipelock handles both directions. +Pipelock scans for prompt injection, DLP violations, and tool poisoning. The external agent does not need Pipelock installed. Sure's Pipelock handles both directions. + +**`NO_PROXY` behavior (Helm/Kubernetes only):** The Helm chart's env template sets `NO_PROXY` to include `.svc.cluster.local` and other internal domains. This means in-cluster agent URLs (like `http://agent.namespace.svc.cluster.local:18789`) bypass the forward proxy and go directly. If your agent is in-cluster, its traffic won't be forward-proxy scanned (but MCP callbacks from the agent are still scanned by the reverse proxy). Docker Compose deployments use a different `NO_PROXY` set; check your compose file for the exact values. + +**`mcpToolPolicy` note:** The Helm chart's `pipelock.mcpToolPolicy.enabled` defaults to `true`. If you haven't defined any policy rules, disable it: + +```yaml +# Helm values +pipelock: + mcpToolPolicy: + enabled: false +``` + +See the [Pipelock documentation](https://github.com/luckyPipewrench/pipelock) for tool policy configuration details. + +### Network Policies (Kubernetes) + +If you use Kubernetes NetworkPolicies (and you should), both Sure and the agent's namespace need rules to allow traffic in both directions. + +> [!WARNING] +> **Port number gotcha:** Kubernetes network policies evaluate **after** kube-proxy DNAT. This means egress rules must use the pod's `targetPort`, not the service port. If your agent's Service maps port 18789 to targetPort 18790, the network policy must allow port **18790**. + +**Sure namespace egress** (Sure calling the agent): +```yaml +# Allow Sure -> agent namespace +- to: + - namespaceSelector: + matchLabels: + kubernetes.io/metadata.name: agent-namespace + ports: + - protocol: TCP + port: 18790 # targetPort, not service port! +``` + +**Sure namespace ingress** (agent calling Sure's pipelock MCP reverse proxy): +```yaml +# Allow agent -> Sure pipelock MCP reverse proxy +- from: + - namespaceSelector: + matchLabels: + kubernetes.io/metadata.name: agent-namespace + ports: + - protocol: TCP + port: 8889 +``` + +**Agent namespace** needs the reverse: egress to Sure on port 8889, ingress from Sure on its listening port. ### Access Control @@ -350,9 +505,11 @@ x-rails-env: &rails_env ASSISTANT_TYPE: external EXTERNAL_ASSISTANT_URL: https://your-agent/v1/chat/completions EXTERNAL_ASSISTANT_TOKEN: your-api-token + MCP_API_TOKEN: your-mcp-token # For agent callback + MCP_USER_EMAIL: user@example.com # User the agent acts as ``` -Or configure via the Settings UI after startup (no env vars needed). +Or configure the assistant via the Settings UI after startup (MCP env vars are still required for callback). ## AI Cache Management @@ -653,6 +810,42 @@ ollama pull model-name # Install a model 3. Restart Sure after changing environment variables 4. Check logs for specific error messages +### "Failed to generate response" with External Assistant + +**Symptom:** Chat shows "Failed to generate response" when expecting the external assistant + +**Check in order:** + +1. **Is external routing active?** Sure uses external mode when `ASSISTANT_TYPE=external` is set as an env var, OR when the family's `assistant_type` is set to "external" in Settings. Check what the pod sees: + ```bash + kubectl exec deploy/sure-web -c rails -- env | grep ASSISTANT_TYPE + kubectl exec deploy/sure-worker -c sidekiq -- env | grep ASSISTANT_TYPE + ``` + If the env var is unset, check the family setting in the database or Settings UI. + +2. **Can Sure reach the agent?** Test from inside the worker pod (use `sh -c` so the env var expands inside the pod, not locally): + ```bash + kubectl exec deploy/sure-worker -c sidekiq -- \ + sh -c 'curl -s -o /dev/null -w "%{http_code}" \ + -H "Authorization: Bearer $EXTERNAL_ASSISTANT_TOKEN" \ + -H "Content-Type: application/json" \ + -d "{\"model\":\"test\",\"messages\":[{\"role\":\"user\",\"content\":\"ping\"}]}" \ + $EXTERNAL_ASSISTANT_URL' + ``` + - **Exit code 7 (connection refused):** Network policy is blocking. Check egress rules, and remember to use the `targetPort`, not the service port. + - **HTTP 401/403:** Token mismatch between Sure's `EXTERNAL_ASSISTANT_TOKEN` and the agent's expected token. + - **HTTP 404:** Wrong URL path. Must be `/v1/chat/completions`. + +3. **Check worker logs** for the actual error: + ```bash + kubectl logs deploy/sure-worker -c sidekiq --tail=50 | grep -i "external\|assistant\|error" + ``` + +4. **If using Pipelock:** Check pipelock sidecar logs. A crashed pipelock can block outbound requests: + ```bash + kubectl logs deploy/sure-worker -c pipelock --tail=20 + ``` + ### High Costs **Symptom:** Unexpected bills from cloud provider @@ -672,7 +865,7 @@ ollama pull model-name # Install a model ### Custom System Prompts -Sure's AI assistant uses a system prompt that defines its behavior. The prompt is defined in `app/models/assistant/configurable.rb`. +The builtin AI assistant uses a system prompt that defines its behavior. The prompt is defined in `app/models/assistant/configurable.rb`. This does not apply to external assistants, which manage their own prompts. To customize: 1. Fork the repository @@ -692,8 +885,11 @@ The assistant uses OpenAI's function calling (tool use) to access user data: **Available functions:** - `get_transactions` - Retrieve transaction history - `get_accounts` - Get account information +- `get_holdings` - Investment holdings data - `get_balance_sheet` - Current financial position - `get_income_statement` - Income and expenses +- `import_bank_statement` - Import bank statement data +- `search_family_files` - Search uploaded documents These are defined in `app/models/assistant/function/`. @@ -722,7 +918,7 @@ Sure's AI assistant can search documents that have been uploaded to a family's v No extra configuration is needed. If you already have `OPENAI_ACCESS_TOKEN` set for the AI assistant, document search works automatically. OpenAI manages chunking, embedding, and retrieval. ```bash -# Already set for AI chat — document search uses the same token +# Already set for AI chat - document search uses the same token OPENAI_ACCESS_TOKEN=sk-proj-... ```