mirror of https://github.com/we-promise/sure.git
synced 2026-06-04 18:29:02 +00:00
* feat(ai): add Anthropic provider with chat parity (1/5)
Introduces Provider::Anthropic alongside Provider::Openai, implementing
the LlmConcept chat_response contract over the official anthropic Ruby
SDK. Batch ops, PDF, and RAG land in follow-up PRs.
- Provider::Anthropic uses Messages API for sync and streaming responses
- ChatConfig builds requests with ephemeral prompt-cache markers on the
system prompt and the last tool definition
- MessageFormatter reconstructs multi-turn history (text + tool_use +
tool_result blocks) from raw Message records, including the paired
user-role tool_result turn Anthropic requires after every tool_use
- ChatParser maps Anthropic Message into the shared ChatResponse Data
- Registry, Setting, User, Chat default model wired for ANTHROPIC_*
envs and Setting.anthropic_*; LLM_PROVIDER selects between providers
- Responder forwards raw conversation_history (Array<Message>) so
providers without hosted conversation state can rebuild context
- OpenAI provider accepts and ignores the new kwarg (no behavior change)
Tests cover provider init, model gating, MessageFormatter for all turn
shapes, ChatConfig request building (max_tokens, system cache, tool
conversion), ChatParser for text / tool_use / mixed blocks, Registry
discovery, and mocked chat_response success / error / function_request
paths. Live VCR cassettes recorded in a follow-up with a real key.
Stacked PRs: 2/5 batch ops + cost ledger, 3/5 PDF, 4/5 pgvector RAG,
5/5 settings UI + disclosure.
* fix(ai): address PR review on Anthropic provider foundation
Surface fixes raised by Codex + CodeRabbit on PR 1/5:
- Provider::Anthropic#chat_response now accepts (and ignores) a
`messages:` kwarg. Assistant::Responder passes both `messages:`
(OpenAI-shape) and `conversation_history:` (raw Message records) for
cross-provider parity, so the previous signature raised
ArgumentError on the first chat turn through the Anthropic provider.
- Provider::Anthropic#supports_model? bypasses the `claude` prefix
gate when a custom base_url is configured, mirroring the OpenAI
provider. Bedrock-shaped IDs like
`anthropic.claude-sonnet-4-5-20250929-v1:0` and
`claude-opus-4@20250514` are otherwise rejected by
Assistant::Provided#get_model_provider and the chat dies.
- Setting.anthropic_access_token is now in
EncryptedSettingFields::ENCRYPTED_FIELDS so the Anthropic API key
is encrypted at rest like every other provider secret. Previously it
was stored as plaintext while its siblings (openai_access_token,
twelve_data_api_key, external_assistant_token) were stored as ciphertext.
- Chat.default_model falls back to whichever provider is actually
configured. Previously, with LLM_PROVIDER=anthropic but no
Anthropic credentials, the default model resolved to a Claude ID
that no registered provider supported, so chats failed even when
OpenAI was fully configured. Adds Provider::{Anthropic,Openai}#configured?
class methods for the readable callsite.
- Provider::Anthropic.effective_model uses
`ENV["ANTHROPIC_MODEL"].presence || Setting.anthropic_model` so the
Setting lookup is only performed when the env var is absent — the
previous `ENV.fetch(KEY, default)` evaluated the default arg
eagerly on every call.
- Provider::Anthropic::ChatConfig#anthropic_input_schema strips both
`:strict` and `"strict"` keys so JSON-decoded schemas with string
keys cannot leak the OpenAI-only flag through to Anthropic.
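
The strict-flag stripping described in the last bullet can be sketched in plain Ruby. This is a minimal illustration, not the actual body of `ChatConfig#anthropic_input_schema`: the helper name `strip_strict_flag` is hypothetical, and it only handles a top-level key.

```ruby
# Hypothetical helper; the real ChatConfig#anthropic_input_schema may differ.
# Comparing key.to_s covers both :strict (symbol) and "strict" (string) keys,
# which is the case that matters for JSON-decoded schemas.
def strip_strict_flag(schema)
  # Hash#reject returns a new Hash, so the caller's schema is not mutated.
  schema.reject { |key, _value| key.to_s == "strict" }
end

symbol_keyed = { type: "object", properties: {}, strict: true }
string_keyed = { "type" => "object", "properties" => {}, "strict" => true }

p strip_strict_flag(symbol_keyed)
p strip_strict_flag(string_keyed)
```

Both calls return the schema without the flag, regardless of how the keys were produced.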
Test coverage added: supports_model? bypass on custom endpoints,
chat_response messages: kwarg compatibility, default_model fallback
in the three credential combinations, configured? against ENV +
Setting, strict-flag stripping for both key types, and a
`Setting.expects(:anthropic_model).never` assertion proving the
ENV-precedence test now exercises the lazy path.
All 4365 tests pass (1 pre-existing libvips env error unrelated).
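
The eager-versus-lazy difference behind the `effective_model` fix can be reproduced without Rails. In this sketch a plain Hash stands in for `ENV` (both expose the same `fetch` and `[]` semantics), `setting_model` is a hypothetical stand-in for `Setting.anthropic_model`, and `presence` approximates ActiveSupport's `String#presence`.

```ruby
# Counts how often the (hypothetical) Setting lookup runs.
$setting_lookups = 0

def setting_model
  $setting_lookups += 1
  "claude-from-setting"
end

# Plain-Ruby approximation of ActiveSupport's String#presence.
def presence(value)
  value.nil? || value.strip.empty? ? nil : value
end

def eager_model(env)
  # The default argument is evaluated before fetch runs, even if the key exists.
  env.fetch("ANTHROPIC_MODEL", setting_model)
end

def lazy_model(env)
  # `||` short-circuits: setting_model runs only when the env value is blank.
  presence(env["ANTHROPIC_MODEL"]) || setting_model
end

env = { "ANTHROPIC_MODEL" => "claude-haiku-4-5" }

eager_model(env)
puts "eager lookups: #{$setting_lookups}"

$setting_lookups = 0
lazy_model(env)
puts "lazy lookups: #{$setting_lookups}"
```

The lazy form also treats an empty-string env value as absent, which `fetch` with a default would not.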
* test(chat): make default_model tests resilient to ENV model overrides
CodeRabbit flagged on PR review: the new default_model tests asserted
against Provider::*::DEFAULT_MODEL, but Chat.default_model actually
returns Provider::*.effective_model.presence (which reads
OPENAI_MODEL / ANTHROPIC_MODEL from the environment). With either env
var set, the tests would fail intermittently even though routing was
correct.
- New default_model tests now assert against the provider's
effective_model directly, so they verify the routing decision
(which provider's value wins) without coupling to the constant.
- Pre-existing "creates with default model" assertions had the same
brittleness; switch them to compare against Chat.default_model so
the chosen model is whatever the env / Setting cascade resolves to.
Verified by running `ANTHROPIC_MODEL=claude-haiku-4-5 OPENAI_MODEL=gpt-4o
bin/rails test test/models/chat_test.rb` — 16 runs, 0 failures
(previously 2 pre-existing failures + 0 from the new tests).
* fix(ai): address local review on Anthropic foundation
- Provider::Anthropic#supports_pdf_processing? bypasses prefix gate for
custom endpoints, mirroring supports_model?
- Provider::Anthropic#initialize raises Error when custom_endpoint? AND
model.blank?, parity with Provider::Openai
- stream_chat_response captures partial usage on mid-stream errors and
records it via the new on_partial callback so chat_response can skip
the duplicate error row in the outer rescue
- safe_accumulated_message swallows the secondary failure when the SDK
cannot reconstruct a snapshot
- langfuse_client memoizes properly (||= instead of =) so repeated calls
don't churn Langfuse instances
- MessageFormatter sorts tool_calls by created_at then id so the
message array is deterministic across replays; skips tool_calls
missing both provider_call_id and provider_id rather than sending
`id: nil` and getting rejected by Anthropic
- Setting.anthropic_access_token default falls back through
ENV["ANTHROPIC_API_KEY"].presence (was missing .presence, so an
empty-string env value bled through)
- User#openai_configured? / #anthropic_configured? delegate to the
Provider::* class methods — single source of truth
- Assistant::Responder renames the OpenAI-shape history builder
conversation_history → openai_messages_payload so the kwarg name
matches the local method name (messages: openai_messages_payload,
conversation_history: chat_message_records)
- Assistant::Builtin stale-history comment updated to reference both
builders
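
The deterministic ordering and ID filtering described for MessageFormatter might look roughly like the following. The `ToolCallRow` struct and `replayable_tool_calls` helper are hypothetical stand-ins for the ActiveRecord-backed code, and `nil?` is used here where the real code may use a blankness check.

```ruby
# Hypothetical row shape; the real records are ActiveRecord ToolCall objects.
ToolCallRow = Struct.new(:id, :created_at, :provider_call_id, :provider_id, keyword_init: true)

def replayable_tool_calls(rows)
  rows
    # Drop rows with no usable ID rather than sending `id: nil` upstream.
    .reject { |row| row.provider_call_id.nil? && row.provider_id.nil? }
    # Sort by created_at, using id as the tie-breaker for identical timestamps.
    .sort_by { |row| [ row.created_at, row.id ] }
end

now = Time.at(1_700_000_000)
rows = [
  ToolCallRow.new(id: 3, created_at: now, provider_call_id: "toolu_b"),
  ToolCallRow.new(id: 2, created_at: now, provider_call_id: "toolu_a"),
  ToolCallRow.new(id: 4, created_at: now + 1),                       # no IDs: skipped
  ToolCallRow.new(id: 1, created_at: now + 1, provider_id: "legacy_1")
]

p replayable_tool_calls(rows).map(&:id)
```

Rows 3 and 2 share a timestamp, so the id tie-breaker decides their order; row 4 is dropped because it carries neither identifier.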
Adds a streaming chat_response test using ad-hoc subclasses of the
SDK event types so the case/when dispatch matches via is_a? without
stubbing class-level === behavior.
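
The reason ad-hoc subclasses work is that `case/when` calls `Class#===`, which is true for instances of subclasses. A minimal sketch follows, with hypothetical `TextEvent` and `MessageStopEvent` classes standing in for the SDK's event types.

```ruby
# Hypothetical stand-ins for the SDK event classes.
class TextEvent; end
class MessageStopEvent; end

def dispatch(event)
  # `when TextEvent` evaluates TextEvent === event, i.e. event.is_a?(TextEvent).
  case event
  when TextEvent then :text
  when MessageStopEvent then :stop
  else :ignored
  end
end

# Anonymous subclass with a test-friendly constructor, as in the test file.
fake_text = Class.new(TextEvent) do
  attr_reader :text

  def initialize(text:)
    @text = text
  end
end

p dispatch(fake_text.new(text: "Hello "))
p dispatch(Class.new(MessageStopEvent).new)
p dispatch(Object.new)
```

Because the fakes inherit from the dispatched classes, no stubbing of `===` is needed and the production dispatch code runs unchanged.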
* test(ai): add Anthropic tool_use round-trip + multi-tool turn coverage
Addresses @jjmata's "worth confirming" note on PR #1983: tool-use turns
from prior assistant messages must round-trip correctly when retrieved
from the database.
- New `ChatParser → ToolCall::Function → MessageFormatter` test walks
the full path: Anthropic response with a tool_use block →
ChatFunctionRequest → ToolCall::Function.from_function_request →
persisted on the AssistantMessage → MessageFormatter rebuild on the
next turn. Asserts the original `tool_use.id` is preserved end-to-end
as both `tool_use.id` and the paired `tool_result.tool_use_id`, and
that the original `input` hash and serialized result content survive.
- New multi-tool assistant turn test confirms two tool_use blocks on a
single assistant message render as two tool_use blocks followed by
two paired tool_result blocks in a single user-role follow-up,
matching Anthropic's required alternation.
Both tests exercise the existing PR1 code without behavior changes.
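
The multi-tool layout asserted by the second test can be illustrated with plain hashes. The `tool_turns` helper and its input shape are hypothetical, and serializing results with `JSON.generate` is an assumption; only the `tool_use` / `tool_result` block fields follow the Messages API shape named in the commit text.

```ruby
require "json"

# Hypothetical input shape; the real MessageFormatter reads Message records.
def tool_turns(tool_calls)
  assistant_turn = {
    role: "assistant",
    content: tool_calls.map do |call|
      { type: "tool_use", id: call[:id], name: call[:name], input: call[:input] }
    end
  }
  # All results go into ONE user-role follow-up, each paired by tool_use_id.
  user_turn = {
    role: "user",
    content: tool_calls.map do |call|
      { type: "tool_result", tool_use_id: call[:id], content: JSON.generate(call[:result]) }
    end
  }
  [ assistant_turn, user_turn ]
end

turns = tool_turns([
  { id: "toolu_1", name: "get_net_worth", input: { currency: "USD" }, result: { amount: 100 } },
  { id: "toolu_2", name: "get_accounts", input: {}, result: { count: 2 } }
])

puts JSON.pretty_generate(turns)
```

The ID emitted on each `tool_use` block is reused verbatim as the `tool_use_id` of its result, which is the end-to-end property the round-trip test checks.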
* test(ai): require "ostruct" explicitly in Anthropic provider tests
OpenStruct is being demoted from a default gem to a bundled gem (warning
in Ruby 3.4+, no longer a default gem from 3.5+). Tests work today because
ActiveSupport transitively
loads it, but that's incidental. Match the existing convention in
test/controllers/settings/hostings_controller_test.rb which explicitly
requires ostruct for the same reason.
* fix(ai): sanitize Langfuse warn logs, normalize tool_use.input, dedup history fetch
Addresses three open CodeRabbit findings on PR #1983.
- Provider::Anthropic Langfuse rescue branches no longer include
`e.full_message` in `Rails.logger.warn`. `full_message` bundles the
backtrace + cause chain and on some SDK error types includes the
serialized request/response payload (prompt, model output). Logs
now report `#{e.class}: #{e.message}` only. Three sites:
create_langfuse_trace, log_langfuse_generation, upsert_langfuse_trace.
Note: Provider::Openai has the same pattern (copy-pasted source) —
harmonization deferred to a follow-up cleanup PR; this commit fixes
only the Anthropic provider to keep PR scope tight.
- MessageFormatter#parse_arguments now coerces any non-Hash parsed
result to `{}`. Anthropic's Messages API requires `tool_use.input`
to be a JSON object (map); a stored ToolCall::Function record whose
arguments parse to a scalar, bool, or array (corrupt row, legacy
data, cross-provider bleed) would otherwise produce a payload the
API rejects. Normal flow stores Hash arguments end-to-end so the
fix is defensive — adds 2 tests covering scalar/array JSON strings
and non-String non-Hash inputs.
- Assistant::Responder dedups the chat-history fetch. The previous
layout fired two near-identical `chat.messages.where(...).includes(
:tool_calls).ordered` queries per LLM turn (one for the OpenAI-shape
payload, one for the raw-records kwarg). A new memoized
`complete_chat_messages` fetches once; `chat_message_records` filters
out the current message via `Array#reject`, `openai_messages_payload`
iterates the cached array unchanged. This gives one SQL query per turn
instead of two. The memoization is scoped to a single Responder instance
(one per LLM call), so cache invalidation is not a concern.
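
The fetch-once pattern from the last bullet can be sketched without ActiveRecord. `ResponderSketch` is a hypothetical class: an in-memory array stands in for the `chat.messages` query, and `query_count` exists only to make the single fetch observable.

```ruby
class ResponderSketch
  attr_reader :query_count

  def initialize(stored_messages, current_message_id)
    @stored_messages = stored_messages
    @current_message_id = current_message_id
    @query_count = 0
  end

  # Raw records for providers that rebuild context themselves.
  def chat_message_records
    complete_chat_messages.reject { |message| message[:id] == @current_message_id }
  end

  # OpenAI-shaped payload built from the same cached array.
  def openai_messages_payload
    complete_chat_messages.map { |message| { role: message[:role], content: message[:content] } }
  end

  private

  def complete_chat_messages
    @complete_chat_messages ||= begin
      @query_count += 1 # stands in for the one SQL query
      @stored_messages.dup
    end
  end
end

responder = ResponderSketch.new(
  [
    { id: 1, role: "user", content: "What is my net worth?" },
    { id: 2, role: "user", content: "And last month?" }
  ],
  2
)
responder.chat_message_records
responder.openai_messages_payload
puts responder.query_count
```

Since the cache lives in an instance variable, it disappears with the instance, which is why no explicit invalidation is needed.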
All 4370 tests pass (1 pre-existing libvips env error unrelated).
Rubocop + brakeman clean.
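
The `parse_arguments` coercion from this commit can be approximated as follows. This is a sketch under stated assumptions rather than the actual MessageFormatter method: in particular, rescuing `JSON::ParserError` for malformed strings is an assumption not stated in the commit text.

```ruby
require "json"

# Hypothetical reimplementation; the real MessageFormatter#parse_arguments may differ.
def parse_arguments(raw)
  parsed = raw.is_a?(String) ? JSON.parse(raw) : raw
  # tool_use.input must be a JSON object, so anything else becomes {}.
  parsed.is_a?(Hash) ? parsed : {}
rescue JSON::ParserError
  # Assumption: malformed JSON is also treated as "no arguments".
  {}
end

p parse_arguments('{"currency":"USD"}')
p parse_arguments("42")
p parse_arguments("[1, 2]")
p parse_arguments(true)
```

Only the first call yields a non-empty Hash; scalars, arrays, and non-String non-Hash inputs all collapse to `{}`.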
* fix(ci): replace sk-ant- prefixed test placeholders
Pipelock secret scanner pattern-matches `sk-ant-*` as a real Anthropic
API key and fails the PR security-scan check. Test stubs and
ClimateControl env values used `sk-ant-test`, `sk-ant-from-setting`,
`sk-ant-x`, `sk-ant-y` as obvious placeholders, but the scanner does
not care about value entropy.
Switched to `fake-anthropic-key-*` / `fake-token-*` strings so the
scanner stops flagging them. No production code touched, no behavior
change — Provider::Anthropic still accepts any non-blank token.
260 lines
9.0 KiB
Ruby
require "test_helper"
require "ostruct"

class Provider::AnthropicTest < ActiveSupport::TestCase
  include LLMInterfaceTest

  setup do
    @subject = @anthropic = Provider::Anthropic.new(
      ENV.fetch("ANTHROPIC_API_KEY", "test-anthropic-token")
    )
    @subject_model = "claude-sonnet-4-6"
  end

  test "provider_name returns Anthropic for standard provider" do
    assert_equal "Anthropic", @subject.provider_name
  end

  test "provider_name returns custom info for custom base_url" do
    custom = Provider::Anthropic.new(
      "test-token",
      base_url: "https://bedrock.example.com/anthropic",
      model: "claude-opus-4-7"
    )

    assert_equal "Custom Anthropic-compatible (https://bedrock.example.com/anthropic)", custom.provider_name
  end

  test "supports_model? returns true for claude prefix" do
    assert @subject.supports_model?("claude-sonnet-4-6")
    assert @subject.supports_model?("claude-opus-4-7")
    assert @subject.supports_model?("claude-haiku-4-5")
    assert_not @subject.supports_model?("gpt-4.1")
  end

  test "supports_model? bypasses the prefix gate for custom endpoints" do
    custom = Provider::Anthropic.new(
      "test-token",
      base_url: "https://bedrock.example.com/anthropic",
      model: "anthropic.claude-sonnet-4-5-20250929-v1:0"
    )

    # Bedrock-shaped IDs start with "anthropic", not "claude"; they would fail the
    # default prefix check, but custom endpoints must accept any model.
    assert custom.supports_model?("anthropic.claude-sonnet-4-5-20250929-v1:0")
    assert custom.supports_model?("claude-opus-4@20250514")
    assert custom.supports_model?("any-string-the-endpoint-accepts")
  end

  test "supported_models_description returns prefixes for standard provider" do
    assert_equal "models starting with: claude", @subject.supported_models_description
  end

  test "supports_pdf_processing? true for claude models" do
    assert @subject.supports_pdf_processing?(model: "claude-sonnet-4-6")
    assert_not @subject.supports_pdf_processing?(model: "gpt-4o")
  end

  test "effective_model defers to ENV when set without consulting Setting" do
    ClimateControl.modify("ANTHROPIC_MODEL" => "claude-haiku-4-5") do
      Setting.expects(:anthropic_model).never
      assert_equal "claude-haiku-4-5", Provider::Anthropic.effective_model
    end
  end

  test "configured? reflects ENV and Setting presence" do
    ClimateControl.modify("ANTHROPIC_ACCESS_TOKEN" => nil, "ANTHROPIC_API_KEY" => nil) do
      Setting.stubs(:anthropic_access_token).returns(nil)
      assert_not Provider::Anthropic.configured?

      Setting.stubs(:anthropic_access_token).returns("fake-token-1")
      assert Provider::Anthropic.configured?
    end

    ClimateControl.modify("ANTHROPIC_API_KEY" => "fake-token-2") do
      Setting.stubs(:anthropic_access_token).returns(nil)
      assert Provider::Anthropic.configured?
    end
  end

  test "effective_model falls back to default when nothing set" do
    ClimateControl.modify("ANTHROPIC_MODEL" => nil) do
      Setting.stubs(:anthropic_model).returns(nil)
      assert_equal Provider::Anthropic::DEFAULT_MODEL, Provider::Anthropic.effective_model
    end
  end

  test "chat_response wraps Anthropic SDK errors in Provider::Anthropic::Error" do
    fake_client = mock
    @subject.instance_variable_set(:@client, fake_client)
    messages = mock
    fake_client.stubs(:messages).returns(messages)
    messages.expects(:create).raises(StandardError.new("rate limit exceeded"))

    response = @subject.chat_response("hi", model: @subject_model)

    assert_not response.success?
    assert_kind_of Provider::Anthropic::Error, response.error
    assert_match(/rate limit/i, response.error.message)
  end

  test "chat_response accepts messages: kwarg passed by Responder without raising" do
    # The OpenAI-shaped `messages:` array is passed alongside `conversation_history:`
    # for cross-provider parity. Anthropic ignores it but must still accept it as
    # a keyword argument; this guards a historical regression that broke the first chat turn.
    fake_client = stub_anthropic_client_with(
      build_anthropic_message(
        id: "msg_kw",
        model: @subject_model,
        text_blocks: [ "ok" ],
        tool_use_blocks: [],
        usage: { input_tokens: 1, output_tokens: 1 }
      )
    )
    @subject.instance_variable_set(:@client, fake_client)

    response = @subject.chat_response(
      "hi",
      model: @subject_model,
      messages: [ { role: "user", content: "hi" } ],
      conversation_history: []
    )

    assert response.success?
  end

  test "chat_response returns parsed ChatResponse on success" do
    fake_client = stub_anthropic_client_with(
      build_anthropic_message(
        id: "msg_abc",
        model: @subject_model,
        text_blocks: [ "Hello there." ],
        tool_use_blocks: [],
        usage: { input_tokens: 12, output_tokens: 5 }
      )
    )
    @subject.instance_variable_set(:@client, fake_client)

    response = @subject.chat_response("hi", model: @subject_model)

    assert response.success?
    assert_equal "msg_abc", response.data.id
    assert_equal @subject_model, response.data.model
    assert_equal 1, response.data.messages.size
    assert_equal "Hello there.", response.data.messages.first.output_text
    assert_empty response.data.function_requests
  end

  test "chat_response streams text deltas and emits a final response chunk" do
    final_message = build_anthropic_message(
      id: "msg_stream",
      model: @subject_model,
      text_blocks: [ "Hello world" ],
      tool_use_blocks: [],
      usage: { input_tokens: 7, output_tokens: 3 }
    )
    # Use ad-hoc subclasses of the SDK event types so the case/when dispatch
    # inside `stream_chat_response` matches them via `is_a?` without needing
    # to stub class-level `===` behavior.
    text_event_cls = Class.new(::Anthropic::Streaming::TextEvent) do
      def initialize(text:, snapshot:)
        @text = text
        @snapshot = snapshot
      end
      attr_reader :text, :snapshot
    end
    stop_event_cls = Class.new(::Anthropic::Streaming::MessageStopEvent) do
      def initialize(message:)
        @message = message
      end
      attr_reader :message
    end
    events = [
      text_event_cls.new(text: "Hello ", snapshot: "Hello "),
      text_event_cls.new(text: "world", snapshot: "Hello world"),
      stop_event_cls.new(message: final_message)
    ]

    fake_stream = mock
    fake_stream.stubs(:each).multiple_yields(*events.map { |e| [ e ] })
    fake_stream.stubs(:accumulated_message).returns(final_message)

    messages = mock
    messages.stubs(:stream).returns(fake_stream)
    client = mock
    client.stubs(:messages).returns(messages)
    @subject.instance_variable_set(:@client, client)

    collected = []
    response = @subject.chat_response(
      "hi",
      model: @subject_model,
      streamer: ->(chunk) { collected << chunk }
    )

    assert response.success?
    text_chunks = collected.select { |c| c.type == "output_text" }
    response_chunks = collected.select { |c| c.type == "response" }

    assert_equal 2, text_chunks.size
    assert_equal [ "Hello ", "world" ], text_chunks.map(&:data)
    assert_equal 1, response_chunks.size
    assert_equal "msg_stream", response_chunks.first.data.id
    assert_equal 10, response_chunks.first.usage["total_tokens"]
  end

  test "chat_response surfaces tool_use blocks as function_requests" do
    fake_client = stub_anthropic_client_with(
      build_anthropic_message(
        id: "msg_xyz",
        model: @subject_model,
        text_blocks: [],
        tool_use_blocks: [ { id: "toolu_1", name: "get_net_worth", input: { currency: "USD" } } ],
        usage: { input_tokens: 20, output_tokens: 8 }
      )
    )
    @subject.instance_variable_set(:@client, fake_client)

    response = @subject.chat_response(
      "What is my net worth?",
      model: @subject_model,
      functions: [ {
        name: "get_net_worth",
        description: "Gets a user's net worth",
        params_schema: { type: "object", properties: {}, required: [], additionalProperties: false },
        strict: true
      } ]
    )

    assert response.success?
    assert_equal 1, response.data.function_requests.size

    req = response.data.function_requests.first
    assert_equal "toolu_1", req.call_id
    assert_equal "get_net_worth", req.function_name
    assert_equal({ currency: "USD" }.to_json, req.function_args)
  end

  private
    def stub_anthropic_client_with(message)
      messages = mock
      messages.stubs(:create).returns(message)
      client = mock
      client.stubs(:messages).returns(messages)
      client
    end

    def build_anthropic_message(id:, model:, text_blocks:, tool_use_blocks:, usage:)
      OpenStruct.new(
        id: id,
        model: model,
        content: text_blocks.map { |t| OpenStruct.new(type: :text, text: t) } +
                 tool_use_blocks.map { |t| OpenStruct.new(type: :tool_use, id: t[:id], name: t[:name], input: t[:input]) },
        usage: OpenStruct.new(
          input_tokens: usage[:input_tokens],
          output_tokens: usage[:output_tokens]
        )
      )
    end
end