sure/app/models/setting.rb
LPW 84bfe5b7ab Add external AI assistant with Pipelock security proxy (#1069)
* feat(helm): add Pipelock ConfigMap, scanning config, and consolidate compose

- Add ConfigMap template rendering DLP, response scanning, MCP input/tool
  scanning, and forward proxy settings from values
- Mount ConfigMap as /etc/pipelock/pipelock.yaml volume in deployment
- Add checksum/config annotation for automatic pod restart on config change
- Gate HTTPS_PROXY/HTTP_PROXY env injection on forwardProxy.enabled (skip
  in MCP-only mode)
- Use hasKey for all boolean values to prevent Helm default swallowing false
- Single source of truth for ports (forwardProxy.port/mcpProxy.port)
- Pipelock-specific imagePullSecrets with fallback to app secrets
- Merge standalone compose.example.pipelock.yml into compose.example.ai.yml
- Add pipelock.example.yaml for Docker Compose users
- Add exclude-paths to CI workflow for locale file false positives

* Add external assistant support (OpenAI-compatible SSE proxy)

Allow self-hosted instances to delegate chat to an external AI agent
via an OpenAI-compatible streaming endpoint. Configurable per-family
through Settings UI or ASSISTANT_TYPE env override.

- Assistant::External::Client: SSE streaming HTTP client (no new gems)
- Settings UI with type selector, env lock indicator, config status
- Helm chart and Docker Compose env var support
- 45 tests covering client, config, routing, controller, integration

* Add session key routing, email allowlist, and config plumbing

Route to the actual OpenClaw session via x-openclaw-session-key header
instead of creating isolated sessions. Gate external assistant access
behind an email allowlist (EXTERNAL_ASSISTANT_ALLOWED_EMAILS env var).
Plumb session_key and allowedEmails through Helm chart, compose, and
env template.

* Add HTTPS_PROXY support to External::Client for Pipelock integration

Net::HTTP's default environment lookup only consults http_proxy; it
ignores HTTPS_PROXY (unlike Faraday). Explicitly resolve the proxy from
the environment in build_http so outbound traffic to the external
assistant routes through Pipelock's forward proxy when enabled.
Respects NO_PROXY for internal hosts.
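
A minimal sketch of the scheme-based proxy lookup described above. The helper name `proxy_uri_for` and the injectable `env` hash are assumptions for illustration, not the actual `build_http` code; NO_PROXY handling is left out here.

```ruby
require "uri"

# Hypothetical helper: pick the proxy URL that matches the target's scheme.
# `env` defaults to ENV but accepts any Hash-like object, which keeps the
# lookup easy to exercise in isolation.
def proxy_uri_for(target, env = ENV)
  keys = target.scheme == "https" ? %w[HTTPS_PROXY https_proxy] : %w[HTTP_PROXY http_proxy]
  raw = keys.map { |key| env[key] }.find { |value| value && !value.strip.empty? }
  return nil if raw.nil?

  URI.parse(raw.strip)
rescue URI::InvalidURIError
  # Assumption: a malformed proxy URL is treated as "no proxy" rather than fatal.
  nil
end

env = { "HTTPS_PROXY" => "http://pipelock:8888" }
proxy = proxy_uri_for(URI("https://agent.example.test/v1/chat"), env)
puts proxy.host
```

The resulting URI can then be split into host and port for `Net::HTTP.new`.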

* Add UI fields for external assistant config (Setting-backed with env fallback)

Follow the same pattern as OpenAI settings: database-backed Setting
fields with env var defaults. Self-hosters can now configure the
external assistant URL, token, and agent ID from the browser
(Settings > Self-Hosting > AI Assistant) instead of requiring env vars.
Fields disable when the corresponding env var is set.

* Improve external assistant UI labels and add help text

Change placeholder to generic OpenAI-compatible URL pattern. Add help
text under each field explaining where the values come from: URL from
agent provider, token for authentication, agent ID for multi-agent
routing.

* Add external assistant docs and fix URL help text

Add External AI Assistant section to docs/hosting/ai.md covering setup
(UI and env vars), how it works, Pipelock security scanning, access
control, and Docker Compose example. Drop "chat completions" jargon
from URL help text.

* Harden external assistant: retry logic, disconnect UI, error handling, and test coverage

- Add retry with backoff for transient network errors (no retry after streaming starts)
- Add disconnect button with confirmation modal in self-hosting settings
- Narrow rescue scope with fallback logging for unexpected errors
- Safe cleanup of partial responses on stream interruption
- Gate ai_available? on family assistant_type instead of OR-ing all providers
- Truncate conversation history to last 20 messages
- Proxy-aware HTTP client with NO_PROXY support
- Sanitize protocol to use generic headers (X-Agent-Id, X-Session-Key)
- Full test coverage for streaming, retries, proxy routing, config, and disconnect
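
The retry rule in the first bullet (back off on transient network errors, never retry once streaming has started) can be sketched roughly as follows. The method name `with_retries`, the error list, the delays, and the `sleeper` hook are assumptions for illustration, not the client's actual values.

```ruby
require "timeout"

# Assumed set of "transient" errors; the real client may use a different list.
TRANSIENT_ERRORS = [Errno::ECONNREFUSED, Errno::ECONNRESET, Timeout::Error].freeze

# Hypothetical helper: yields a callable the block invokes once the first
# chunk has been streamed. After that point a failure is re-raised instead
# of retried, so the user never sees duplicated partial output.
def with_retries(max_attempts: 3, base_delay: 0.5, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  streaming_started = false
  mark_started = -> { streaming_started = true }

  begin
    attempt += 1
    yield mark_started
  rescue *TRANSIENT_ERRORS
    raise if streaming_started || attempt >= max_attempts

    # Exponential backoff: base_delay, then 2x, 4x, ...
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end
```

Passing the sleeper as an argument is only a convenience so the backoff can be observed without real waiting.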

* Exclude external assistant client from Pipelock scan-diff

False positive: `@token` instance variable flagged as "Credential in URL".
Temporary workaround until Pipelock supports inline suppression.

* Address review feedback: NO_PROXY boundary fix, SSE done flag, design tokens

- Fix NO_PROXY matching to require domain boundary (exact match or .suffix),
  case-insensitive. Prevents badexample.com matching example.com.
- Add done flag to SSE streaming so read_body stops after [DONE]
- Move MAX_CONVERSATION_MESSAGES to class level
- Use bg-success/bg-destructive design tokens for status indicators
- Add rationale comment for pipelock scan exclusion
- Update docs last-updated date
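
The domain-boundary rule for NO_PROXY in the first bullet can be illustrated with a short sketch. `no_proxy_match?` is a hypothetical name, and this is one plausible reading of "exact match or .suffix, case-insensitive", not the code from the client.

```ruby
# Hypothetical helper: true when `host` is covered by a comma-separated
# NO_PROXY list. An entry matches on an exact hostname or on a dot-delimited
# suffix, so "example.com" covers "api.example.com" but not "badexample.com".
def no_proxy_match?(host, no_proxy)
  return false if host.nil? || no_proxy.nil?

  host = host.downcase
  no_proxy.split(",").any? do |entry|
    pattern = entry.strip.downcase.delete_prefix(".")
    next false if pattern.empty?

    host == pattern || host.end_with?(".#{pattern}")
  end
end

puts no_proxy_match?("api.example.com", "localhost,example.com")
puts no_proxy_match?("badexample.com", "localhost,example.com")
```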

* Address second round of review feedback

- Allowlist email comparison is now case-insensitive and nil-safe
- Cap SSE buffer at 1 MB to prevent memory blowup from malformed streams
- Don't expose upstream HTTP response body in user-facing errors (log it instead)
- Fix frozen string warning on buffer initialization
- Fix "builtin" typo in docs (should be "built-in")

* Protect completed responses from cleanup, sanitize error messages

- Don't destroy a fully streamed assistant message if post-stream
  metadata update fails (only cleanup partial responses)
- Log raw connection/HTTP errors internally, show generic messages
  to users to avoid leaking network/proxy details
- Update test assertions for new error message wording

* Fix SSE content guard and NO_PROXY test correctness

Use nil check instead of present? for SSE delta content to preserve
whitespace-only chunks (newlines, spaces) that can occur in code output.

Fix NO_PROXY test to use HTTP_PROXY matching the http:// client URL so
the proxy resolution and NO_PROXY bypass logic are actually exercised.
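
The SSE handling described across the last few commits (done flag, 1 MB buffer cap, nil check on delta content) might look roughly like this. The class name `SseAccumulator` and its interface are assumptions for illustration; the payload shape follows the usual OpenAI-style `choices[0].delta.content` layout.

```ruby
require "json"

# Hypothetical accumulator fed with raw chunks from a streaming response body.
class SseAccumulator
  MAX_BUFFER_BYTES = 1_048_576 # 1 MB cap against malformed streams
  class BufferOverflow < StandardError; end

  attr_reader :text

  def initialize
    @buffer = +"" # unary plus yields a mutable string under frozen_string_literal
    @text = +""
    @done = false
  end

  def done?
    @done
  end

  def feed(chunk)
    return if @done # ignore anything that arrives after [DONE]

    @buffer << chunk
    if @buffer.bytesize > MAX_BUFFER_BYTES
      raise BufferOverflow, "SSE buffer exceeded #{MAX_BUFFER_BYTES} bytes"
    end

    # Consume complete lines only; a partial line stays buffered.
    while !@done && (newline = @buffer.index("\n"))
      handle_line(@buffer.slice!(0..newline).strip)
    end
  end

  private

  def handle_line(line)
    return unless line.start_with?("data:")

    payload = line.delete_prefix("data:").strip
    if payload == "[DONE]"
      @done = true
      return
    end

    event = JSON.parse(payload)
    content = event.dig("choices", 0, "delta", "content") if event.is_a?(Hash)
    # nil check rather than a blank check, so "\n" or " " chunks survive
    @text << content unless content.nil?
  rescue JSON::ParserError
    nil # assumption: malformed data lines are skipped
  end
end
```

A caller would pass each chunk from `read_body` to `feed` and stop reading once `done?` returns true.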

* Forward proxy credentials to Net::HTTP

Pass proxy_uri.user and proxy_uri.password to Net::HTTP.new so
authenticated proxies (http://user:pass@host:port) work correctly.
Without this, credentials parsed from the proxy URL were silently
dropped. Nil values are safe as positional args when no creds exist.
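
As a sketch of the credential forwarding described above: `Net::HTTP.new` accepts the proxy address, port, user, and password as positional arguments after the target host and port. The helper name `build_proxied_http` and the hostnames are illustrative assumptions.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: build a Net::HTTP object for `target`, optionally
# routed through `proxy_uri`. user/password are nil when the proxy URL
# carries no credentials, which Net::HTTP accepts.
def build_proxied_http(target, proxy_uri = nil)
  http =
    if proxy_uri
      Net::HTTP.new(target.host, target.port,
                    proxy_uri.host, proxy_uri.port,
                    proxy_uri.user, proxy_uri.password)
    else
      # An explicit nil proxy address disables Net::HTTP's own env lookup.
      Net::HTTP.new(target.host, target.port, nil)
    end
  http.use_ssl = (target.scheme == "https")
  http
end

http = build_proxied_http(
  URI("https://agent.example.test/v1/chat"),
  URI("http://user:pass@proxy.example.test:8888")
)
puts http.proxy_user
```

No connection is opened until a request is made, so constructing the object is safe to do ahead of time.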

* Update pipelock integration to v0.3.1 with full scanning config

Bump Helm image tag from 0.2.7 to 0.3.1. Add missing security
sections to both the Helm ConfigMap and compose example config:
mcp_tool_policy, mcp_session_binding, and tool_chain_detection.
These protect the /mcp endpoint against tool injection, session
hijacking, and multi-step exfiltration chains.

Add version and mode fields to config files. Enable include_defaults
for DLP and response scanning to merge user patterns with the 35
built-in patterns. Remove redundant --mode CLI flag from the Helm
deployment template since mode is now in the config file.
2026-03-03 15:47:51 +01:00

193 lines
7.1 KiB
Ruby

# Dynamic settings the user can change within the app (helpful for self-hosting)
class Setting < RailsSettings::Base
  class ValidationError < StandardError; end

  cache_prefix { "v1" }

  # Third-party API keys
  field :twelve_data_api_key, type: :string, default: ENV["TWELVE_DATA_API_KEY"]
  field :openai_access_token, type: :string, default: ENV["OPENAI_ACCESS_TOKEN"]
  field :openai_uri_base, type: :string, default: ENV["OPENAI_URI_BASE"]
  field :openai_model, type: :string, default: ENV["OPENAI_MODEL"]
  field :openai_json_mode, type: :string, default: ENV["LLM_JSON_MODE"]
  field :external_assistant_url, type: :string, default: ENV["EXTERNAL_ASSISTANT_URL"]
  field :external_assistant_token, type: :string, default: ENV["EXTERNAL_ASSISTANT_TOKEN"]
  field :external_assistant_agent_id, type: :string, default: ENV.fetch("EXTERNAL_ASSISTANT_AGENT_ID", "main")
  field :brand_fetch_client_id, type: :string, default: ENV["BRAND_FETCH_CLIENT_ID"]
  field :brand_fetch_high_res_logos, type: :boolean, default: ENV.fetch("BRAND_FETCH_HIGH_RES_LOGOS", "false") == "true"

  BRAND_FETCH_LOGO_SIZE_STANDARD = 40
  BRAND_FETCH_LOGO_SIZE_HIGH_RES = 120
  BRAND_FETCH_URL_PATTERN = %r{(https://cdn\.brandfetch\.io/[^/]+/icon/fallback/lettermark/)w/\d+/h/\d+(\?c=.+)}

  def self.brand_fetch_logo_size
    brand_fetch_high_res_logos ? BRAND_FETCH_LOGO_SIZE_HIGH_RES : BRAND_FETCH_LOGO_SIZE_STANDARD
  end

  # Transforms a stored Brandfetch URL to use the current logo size setting
  def self.transform_brand_fetch_url(url)
    return url unless url.present? && url.match?(BRAND_FETCH_URL_PATTERN)

    size = brand_fetch_logo_size
    url.gsub(BRAND_FETCH_URL_PATTERN, "\\1w/#{size}/h/#{size}\\2")
  end

  # Provider selection
  field :exchange_rate_provider, type: :string, default: ENV.fetch("EXCHANGE_RATE_PROVIDER", "twelve_data")
  field :securities_provider, type: :string, default: ENV.fetch("SECURITIES_PROVIDER", "twelve_data")

  # Sync settings - check both provider env vars for default
  # Only defaults to true if neither provider explicitly disables pending
  SYNCS_INCLUDE_PENDING_DEFAULT = begin
    simplefin = ENV.fetch("SIMPLEFIN_INCLUDE_PENDING", "1") == "1"
    plaid = ENV.fetch("PLAID_INCLUDE_PENDING", "1") == "1"
    simplefin && plaid
  end
  field :syncs_include_pending, type: :boolean, default: SYNCS_INCLUDE_PENDING_DEFAULT
  field :auto_sync_enabled, type: :boolean, default: ENV.fetch("AUTO_SYNC_ENABLED", "1") == "1"
  field :auto_sync_time, type: :string, default: ENV.fetch("AUTO_SYNC_TIME", "02:22")
  field :auto_sync_timezone, type: :string, default: ENV.fetch("AUTO_SYNC_TIMEZONE", "UTC")

  AUTO_SYNC_TIME_FORMAT = /\A([01]?\d|2[0-3]):([0-5]\d)\z/

  def self.valid_auto_sync_time?(time_str)
    return false if time_str.blank?

    AUTO_SYNC_TIME_FORMAT.match?(time_str.to_s.strip)
  end

  def self.valid_auto_sync_timezone?(timezone_str)
    return false if timezone_str.blank?

    ActiveSupport::TimeZone[timezone_str].present?
  end

  # Dynamic fields are now stored as individual entries with "dynamic:" prefix
  # This prevents race conditions and ensures each field is independently managed

  # Onboarding and app settings
  ONBOARDING_STATES = %w[open closed invite_only].freeze
  DEFAULT_ONBOARDING_STATE = begin
    env_value = ENV["ONBOARDING_STATE"].to_s.presence || "open"
    ONBOARDING_STATES.include?(env_value) ? env_value : "open"
  end

  field :onboarding_state, type: :string, default: DEFAULT_ONBOARDING_STATE
  field :require_invite_for_signup, type: :boolean, default: false
  field :require_email_confirmation, type: :boolean, default: ENV.fetch("REQUIRE_EMAIL_CONFIRMATION", "true") == "true"

  def self.validate_onboarding_state!(state)
    return if ONBOARDING_STATES.include?(state)

    raise ValidationError, I18n.t("settings.hostings.update.invalid_onboarding_state")
  end

  class << self
    alias_method :raw_onboarding_state, :onboarding_state
    alias_method :raw_onboarding_state=, :onboarding_state=
    alias_method :raw_openai_model, :openai_model
    alias_method :raw_openai_model=, :openai_model=

    def onboarding_state
      value = raw_onboarding_state
      return "invite_only" if value.blank? && require_invite_for_signup

      value.presence || DEFAULT_ONBOARDING_STATE
    end

    def onboarding_state=(state)
      validate_onboarding_state!(state)
      self.require_invite_for_signup = state == "invite_only"
      self.raw_onboarding_state = state
    end

    def openai_model=(value)
      old_value = raw_openai_model
      self.raw_openai_model = value

      if old_value != value && old_value.present?
        Rails.logger.info("OpenAI model changed from #{old_value} to #{value}, clearing AI cache for all families")
        Family.find_each do |family|
          ClearAiCacheJob.perform_later(family)
        end
      end
    end

    # Support dynamic field access via bracket notation
    # First checks if it's a declared field, then falls back to individual dynamic entries
    def [](key)
      key_str = key.to_s

      # Check if it's a declared field first
      if respond_to?(key_str)
        public_send(key_str)
      else
        # Fall back to individual dynamic entry lookup
        find_by(var: dynamic_key_name(key_str))&.value
      end
    end

    def []=(key, value)
      key_str = key.to_s

      # If it's a declared field, use the setter
      if respond_to?("#{key_str}=")
        public_send("#{key_str}=", value)
      else
        # Store as individual dynamic entry
        dynamic_key = dynamic_key_name(key_str)
        if value.nil?
          where(var: dynamic_key).destroy_all
          clear_cache
        else
          # Use upsert for atomic insert/update to avoid race conditions
          upsert({ var: dynamic_key, value: value.to_yaml }, unique_by: :var)
          clear_cache
        end
      end
    end

    # Check if a dynamic field exists (useful to distinguish nil value vs missing key)
    def key?(key)
      key_str = key.to_s
      return true if respond_to?(key_str)

      # Check if dynamic entry exists
      where(var: dynamic_key_name(key_str)).exists?
    end

    # Delete a dynamic field
    def delete(key)
      key_str = key.to_s
      return nil if respond_to?(key_str) # Can't delete declared fields

      dynamic_key = dynamic_key_name(key_str)
      value = self[key_str]
      where(var: dynamic_key).destroy_all
      clear_cache
      value
    end

    # List all dynamic field keys (excludes declared fields)
    def dynamic_keys
      where("var LIKE ?", "dynamic:%").pluck(:var).map { |var| var.sub(/^dynamic:/, "") }
    end

    private

      def dynamic_key_name(key_str)
        "dynamic:#{key_str}"
      end
  end

  # Validates OpenAI configuration requires model when custom URI base is set
  def self.validate_openai_config!(uri_base: nil, model: nil)
    # Use provided values or current settings
    uri_base_value = uri_base.nil? ? openai_uri_base : uri_base
    model_value = model.nil? ? openai_model : model

    # If custom URI base is set, model must also be set
    if uri_base_value.present? && model_value.blank?
      raise ValidationError, "OpenAI model is required when custom URI base is configured"
    end
  end
end