feat(ai): add Anthropic provider with chat parity (1/5) (#1983)

* feat(ai): add Anthropic provider with chat parity (1/5)

Introduces Provider::Anthropic alongside Provider::Openai, implementing the LlmConcept chat_response contract over the official anthropic Ruby SDK. Batch ops, PDF, and RAG land in follow-up PRs.

- Provider::Anthropic uses Messages API for sync and streaming responses
- ChatConfig builds requests with ephemeral prompt-cache markers on the system prompt and the last tool definition
- MessageFormatter reconstructs multi-turn history (text + tool_use + tool_result blocks) from raw Message records, including the paired user-role tool_result turn Anthropic requires after every tool_use
- ChatParser maps Anthropic Message into the shared ChatResponse Data
- Registry, Setting, User, Chat default model wired for ANTHROPIC_* envs and Setting.anthropic_*; LLM_PROVIDER selects between providers
- Responder forwards raw conversation_history (Array<Message>) so providers without hosted conversation state can rebuild context
- OpenAI provider accepts and ignores the new kwarg (no behavior change)

Tests cover provider init, model gating, MessageFormatter for all turn shapes, ChatConfig request building (max_tokens, system cache, tool conversion), ChatParser for text / tool_use / mixed blocks, Registry discovery, and mocked chat_response success / error / function_request paths. Live VCR cassettes recorded in a follow-up with a real key.

Stacked PRs: 2/5 batch ops + cost ledger, 3/5 PDF, 4/5 pgvector RAG, 5/5 settings UI + disclosure.

* fix(ai): address PR review on Anthropic provider foundation

Surface fixes raised by Codex + CodeRabbit on PR 1/5:

- Provider::Anthropic#chat_response now accepts (and ignores) a `messages:` kwarg. Assistant::Responder passes both `messages:` (OpenAI-shape) and `conversation_history:` (raw Message records) for cross-provider parity, so the previous signature raised ArgumentError on the first chat turn through the Anthropic provider.
- Provider::Anthropic#supports_model? bypasses the `claude` prefix gate when a custom base_url is configured, mirroring the OpenAI provider. Bedrock-shaped IDs like `anthropic.claude-sonnet-4-5-20250929-v1:0` and `claude-opus-4@20250514` are otherwise rejected by Assistant::Provided#get_model_provider and the chat dies.
- Setting.anthropic_access_token is now in EncryptedSettingFields::ENCRYPTED_FIELDS so the Anthropic API key is encrypted at rest like every other provider secret. Previously plaintext while siblings (openai_access_token, twelve_data_api_key, external_assistant_token) were ciphertext.
- Chat.default_model falls back to whichever provider is actually configured. Previously, with LLM_PROVIDER=anthropic but no Anthropic credentials, the default model resolved to a Claude ID that no registered provider supported, so chats failed even when OpenAI was fully configured. Adds Provider::{Anthropic,Openai}#configured? class methods for the readable callsite.
- Provider::Anthropic.effective_model uses `ENV["ANTHROPIC_MODEL"].presence || Setting.anthropic_model` so the Setting lookup is only performed when the env var is absent; the previous `ENV.fetch(KEY, default)` evaluated the default arg eagerly on every call.
- Provider::Anthropic::ChatConfig#anthropic_input_schema strips both `:strict` and `"strict"` keys so JSON-decoded schemas with string keys cannot leak the OpenAI-only flag through to Anthropic.

Test coverage added: supports_model? bypass on custom endpoints, chat_response messages: kwarg compatibility, default_model fallback in the three credential combinations, configured? against ENV + Setting, strict-flag stripping for both key types, and a `Setting.expects(:anthropic_model).never` assertion proving the ENV-precedence test now exercises the lazy path.

All 4365 tests pass (1 pre-existing libvips env error unrelated).

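
The effective_model fix above rests on an ordinary Ruby evaluation rule: a method's arguments are evaluated before the call, so the default passed to `ENV.fetch(key, default)` is computed even when the key is present, while `||` short-circuits. A minimal sketch in plain Ruby, assuming a hypothetical `FakeSetting` class in place of `Setting` and a hand-rolled `presence` helper in place of ActiveSupport's; neither name comes from this PR:

```ruby
# Hypothetical stand-in for Setting; counts how often the lookup runs.
class FakeSetting
  @lookups = 0

  class << self
    attr_reader :lookups

    def anthropic_model
      @lookups += 1
      "claude-from-setting"
    end
  end
end

# Plain-Ruby approximation of ActiveSupport's presence:
# returns nil for nil, empty, or whitespace-only strings.
def presence(value)
  value.nil? || value.strip.empty? ? nil : value
end

def eager_model(env)
  # The default argument is evaluated before fetch runs,
  # so the Setting lookup happens even when the key exists.
  env.fetch("ANTHROPIC_MODEL", FakeSetting.anthropic_model)
end

def lazy_model(env)
  # `||` short-circuits: the Setting lookup runs only when
  # the env value is missing or blank.
  presence(env["ANTHROPIC_MODEL"]) || FakeSetting.anthropic_model
end
```

A plain Hash is passed in place of `ENV` so the sketch has no dependency on the process environment. Both methods return the same model; only the number of Setting lookups differs, which is what the `Setting.expects(:anthropic_model).never` assertion checks.
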
* test(chat): make default_model tests resilient to ENV model overrides

CodeRabbit flagged on PR review: the new default_model tests asserted against Provider::*::DEFAULT_MODEL, but Chat.default_model actually returns Provider::*.effective_model.presence (which reads OPENAI_MODEL / ANTHROPIC_MODEL from the environment). With either env var set, the tests would fail intermittently even though routing was correct.

- New default_model tests now assert against the provider's effective_model directly, so they verify the routing decision (which provider's value wins) without coupling to the constant.
- Pre-existing "creates with default model" assertions had the same brittleness; switch them to compare against Chat.default_model so the chosen model is whatever the env / Setting cascade resolves to.

Verified by running `ANTHROPIC_MODEL=claude-haiku-4-5 OPENAI_MODEL=gpt-4o bin/rails test test/models/chat_test.rb`: 16 runs, 0 failures (previously 2 pre-existing failures + 0 from the new tests).

* fix(ai): address local review on Anthropic foundation

- Provider::Anthropic#supports_pdf_processing? bypasses prefix gate for custom endpoints, mirroring supports_model?
- Provider::Anthropic#initialize raises Error when custom_endpoint? AND model.blank?, parity with Provider::Openai
- stream_chat_response captures partial usage on mid-stream errors and records it via the new on_partial callback so chat_response can skip the duplicate error row in the outer rescue
- safe_accumulated_message swallows the secondary failure when the SDK cannot reconstruct a snapshot
- langfuse_client memoizes properly (||= instead of =) so repeated calls don't churn Langfuse instances
- MessageFormatter sorts tool_calls by created_at then id so the message array is deterministic across replays; skips tool_calls missing both provider_call_id and provider_id rather than sending `id: nil` and getting rejected by Anthropic
- Setting.anthropic_access_token default falls back through ENV["ANTHROPIC_API_KEY"].presence (was missing .presence, so an empty-string env value bled through)
- User#openai_configured? / #anthropic_configured? delegate to the Provider::* class methods (single source of truth)
- Assistant::Responder renames the OpenAI-shape history builder conversation_history → openai_messages_payload so the kwarg name matches the local method name (messages: openai_messages_payload, conversation_history: chat_message_records)
- Assistant::Builtin stale-history comment updated to reference both builders

Adds a streaming chat_response test using ad-hoc subclasses of the SDK event types so the case/when dispatch matches via is_a? without stubbing class-level === behavior.

* test(ai): add Anthropic tool_use round-trip + multi-tool turn coverage

Addresses @jjmata's "worth confirming" note on PR #1983: tool-use turns from prior assistant messages must round-trip correctly when retrieved from the database.

- New `ChatParser → ToolCall::Function → MessageFormatter` test walks the full path: Anthropic response with a tool_use block → ChatFunctionRequest → ToolCall::Function.from_function_request → persisted on the AssistantMessage → MessageFormatter rebuild on the next turn. Asserts the original `tool_use.id` is preserved end-to-end as both `tool_use.id` and the paired `tool_result.tool_use_id`, and that the original `input` hash and serialized result content survive.
- New multi-tool assistant turn test confirms two tool_use blocks on a single assistant message render as two tool_use blocks followed by two paired tool_result blocks in a single user-role follow-up, matching Anthropic's required alternation.

Both tests exercise the existing PR1 code without behavior changes.

* test(ai): require "ostruct" explicitly in Anthropic provider tests

OpenStruct is moving out of Ruby's default load path (warning in 3.4+, removed in 3.5+). Tests work today because ActiveSupport transitively loads it, but that's incidental. Match the existing convention in test/controllers/settings/hostings_controller_test.rb which explicitly requires ostruct for the same reason.

* fix(ai): sanitize Langfuse warn logs, normalize tool_use.input, dedup history fetch

Addresses three open CodeRabbit findings on PR #1983.

- Provider::Anthropic Langfuse rescue branches no longer include `e.full_message` in `Rails.logger.warn`. `full_message` bundles the backtrace + cause chain and on some SDK error types includes the serialized request/response payload (prompt, model output). Logs now report `#{e.class}: #{e.message}` only. Three sites: create_langfuse_trace, log_langfuse_generation, upsert_langfuse_trace. Note: Provider::Openai has the same pattern (copy-pasted source); harmonization deferred to a follow-up cleanup PR; this commit fixes only the Anthropic provider to keep PR scope tight.
- MessageFormatter#parse_arguments now coerces any non-Hash parsed result to `{}`. Anthropic's Messages API requires `tool_use.input` to be a JSON object (map); a stored ToolCall::Function record whose arguments parse to a scalar, bool, or array (corrupt row, legacy data, cross-provider bleed) would otherwise produce a payload the API rejects. Normal flow stores Hash arguments end-to-end so the fix is defensive; adds 2 tests covering scalar/array JSON strings and non-String non-Hash inputs.
- Assistant::Responder dedups the chat-history fetch. The previous layout fired two near-identical `chat.messages.where(...).includes(:tool_calls).ordered` queries per LLM turn (one for the OpenAI-shape payload, one for the raw-records kwarg). A new memoized `complete_chat_messages` fetches once; `chat_message_records` filters out the current message via `Array#reject`, `openai_messages_payload` iterates the cached array unchanged. One SQL query per turn instead of two. Memoization scope = single Responder instance (per LLM call), so cache invalidation is not a concern.

All 4370 tests pass (1 pre-existing libvips env error unrelated). Rubocop + brakeman clean.

* fix(ci): replace sk-ant- prefixed test placeholders

Pipelock secret scanner pattern-matches `sk-ant-*` as a real Anthropic API key and fails the PR security-scan check. Test stubs and ClimateControl env values used `sk-ant-test`, `sk-ant-from-setting`, `sk-ant-x`, `sk-ant-y` as obvious placeholders, but the scanner does not care about value entropy. Switched to `fake-anthropic-key-*` / `fake-token-*` strings so the scanner stops flagging them. No production code touched, no behavior change: Provider::Anthropic still accepts any non-blank token.
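
The `tool_use.input` normalization described above can be sketched in plain Ruby. This is a rough illustration under stated assumptions, not the code in MessageFormatter#parse_arguments: `coerce_tool_input` is a hypothetical name, and returning `{}` for unparseable JSON is an assumption that the commit message does not confirm.

```ruby
require "json"

# Hypothetical helper: always returns a Hash suitable for tool_use.input.
def coerce_tool_input(arguments)
  parsed =
    case arguments
    when Hash
      arguments
    when String
      begin
        # Blank strings carry no arguments; everything else is parsed as JSON.
        arguments.strip.empty? ? {} : JSON.parse(arguments)
      rescue JSON::ParserError
        # Assumption: unparseable input is treated like missing input.
        {}
      end
    else
      # nil, Array, Integer, and other non-String, non-Hash values.
      {}
    end

  # JSON scalars, booleans, and arrays parse successfully but are not
  # objects, so they are replaced with an empty Hash.
  parsed.is_a?(Hash) ? parsed : {}
end
```

With this shape, `coerce_tool_input('{"currency":"USD"}')` yields a Hash with a `"currency"` key, while inputs such as `"[1,2,3]"`, `"true"`, or `[ 1, 2, 3 ]` all collapse to `{}`, the same cases the two new formatter tests cover.
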
2026-06-07 11:49:02 +00:00 · 2026-05-31 16:11:28 +02:00
parent cc070853b7
commit 8251b7e4d6
20 changed files with 1512 additions and 20 deletions
--- a/test/models/chat_test.rb
+++ b/test/models/chat_test.rb
@@ -47,7 +47,7 @@ class ChatTest < ActiveSupport::TestCase
      chat = @user.chats.start!(prompt, model: nil)

      assert_equal 2, chat.messages.count
-      assert_equal Provider::Openai::DEFAULT_MODEL, chat.messages.find_by!(type: "UserMessage").ai_model
+      assert_equal Chat.default_model, chat.messages.find_by!(type: "UserMessage").ai_model
    end
  end

@@ -58,10 +58,37 @@ class ChatTest < ActiveSupport::TestCase
      chat = @user.chats.start!(prompt, model: "")

      assert_equal 2, chat.messages.count
-      assert_equal Provider::Openai::DEFAULT_MODEL, chat.messages.find_by!(type: "UserMessage").ai_model
+      assert_equal Chat.default_model, chat.messages.find_by!(type: "UserMessage").ai_model
    end
  end

+  # These three tests assert routing (which provider's effective_model wins),
+  # not the constant value itself — the assertion side reads through
+  # Provider::*.effective_model so ENV overrides like ANTHROPIC_MODEL /
+  # OPENAI_MODEL don't make the tests flake.
+  test "default_model returns Anthropic's effective_model when LLM_PROVIDER=anthropic and Anthropic is configured" do
+    Provider::Anthropic.stubs(:configured?).returns(true)
+    Setting.stubs(:llm_provider).returns("anthropic")
+
+    assert_equal Provider::Anthropic.effective_model, Chat.default_model
+  end
+
+  test "default_model falls back to OpenAI's effective_model when Anthropic is preferred but unconfigured" do
+    Provider::Anthropic.stubs(:configured?).returns(false)
+    Provider::Openai.stubs(:configured?).returns(true)
+    Setting.stubs(:llm_provider).returns("anthropic")
+
+    assert_equal Provider::Openai.effective_model, Chat.default_model
+  end
+
+  test "default_model uses Anthropic's effective_model when OpenAI is unconfigured" do
+    Provider::Anthropic.stubs(:configured?).returns(true)
+    Provider::Openai.stubs(:configured?).returns(false)
+    Setting.stubs(:llm_provider).returns("openai")
+
+    assert_equal Provider::Anthropic.effective_model, Chat.default_model
+  end
+
  test "creates with configured model when OPENAI_MODEL env is set" do
    prompt = "Test prompt"

--- a/test/models/provider/anthropic/chat_config_test.rb
+++ b/test/models/provider/anthropic/chat_config_test.rb
@@ -0,0 +1,94 @@
+require "test_helper"
+
+class Provider::Anthropic::ChatConfigTest < ActiveSupport::TestCase
+  test "builds request with default max_tokens and prompt message" do
+    config = Provider::Anthropic::ChatConfig.new(prompt: "hello")
+
+    req = config.build_request(model: "claude-sonnet-4-6")
+
+    assert_equal "claude-sonnet-4-6", req[:model]
+    assert_equal 4096, req[:max_tokens]
+    assert_equal [ { role: "user", content: "hello" } ], req[:messages]
+    assert_nil req[:system_]
+    assert_nil req[:tools]
+  end
+
+  test "honors caller-provided default_max_tokens" do
+    config = Provider::Anthropic::ChatConfig.new(prompt: "hi", default_max_tokens: 8192)
+
+    req = config.build_request(model: "claude-sonnet-4-6")
+
+    assert_equal 8192, req[:max_tokens]
+  end
+
+  test "wraps instructions as cacheable system block" do
+    config = Provider::Anthropic::ChatConfig.new(prompt: "hi", instructions: "Be terse.")
+
+    req = config.build_request(model: "claude-sonnet-4-6")
+
+    assert_equal [ {
+      type: "text",
+      text: "Be terse.",
+      cache_control: { type: "ephemeral" }
+    } ], req[:system_]
+  end
+
+  test "converts function definitions to Anthropic tool blocks and caches the last one" do
+    config = Provider::Anthropic::ChatConfig.new(
+      prompt: "hi",
+      functions: [
+        {
+          name: "get_net_worth",
+          description: "Returns net worth",
+          params_schema: { type: "object", properties: {}, required: [], additionalProperties: false },
+          strict: true
+        },
+        {
+          name: "get_accounts",
+          description: "Returns accounts",
+          params_schema: { type: "object", properties: {}, required: [], additionalProperties: false },
+          strict: true
+        }
+      ]
+    )
+
+    req = config.build_request(model: "claude-sonnet-4-6")
+
+    assert_equal 2, req[:tools].size
+    assert_equal "get_net_worth", req[:tools][0][:name]
+    assert_equal "Returns net worth", req[:tools][0][:description]
+    assert_equal({ type: "object", properties: {}, required: [], additionalProperties: false }, req[:tools][0][:input_schema])
+    assert_nil req[:tools][0][:cache_control]
+
+    assert_equal({ type: "ephemeral" }, req[:tools][1][:cache_control])
+
+    # Anthropic schemas must not carry the OpenAI-specific `strict` flag.
+    req[:tools].each { |t| assert_not t[:input_schema].key?(:strict) }
+  end
+
+  test "strips both symbol and string-keyed `strict` flags from input_schema" do
+    config = Provider::Anthropic::ChatConfig.new(
+      prompt: "hi",
+      functions: [
+        {
+          name: "fn_with_string_strict",
+          description: "schema arrived from JSON.parse with string keys",
+          params_schema: {
+            "type" => "object",
+            "properties" => {},
+            "required" => [],
+            "additionalProperties" => false,
+            "strict" => true
+          },
+          strict: true
+        }
+      ]
+    )
+
+    req = config.build_request(model: "claude-sonnet-4-6")
+
+    schema = req[:tools].first[:input_schema]
+    assert_not schema.key?(:strict)
+    assert_not schema.key?("strict")
+  end
+end
--- a/test/models/provider/anthropic/chat_parser_test.rb
+++ b/test/models/provider/anthropic/chat_parser_test.rb
@@ -0,0 +1,85 @@
+require "test_helper"
+require "ostruct"
+
+class Provider::Anthropic::ChatParserTest < ActiveSupport::TestCase
+  test "parses text-only message into ChatResponse with single output_text" do
+    raw = build_message(
+      id: "msg_1",
+      model: "claude-sonnet-4-6",
+      content: [
+        OpenStruct.new(type: :text, text: "Hello"),
+        OpenStruct.new(type: :text, text: "world")
+      ]
+    )
+
+    parsed = Provider::Anthropic::ChatParser.new(raw).parsed
+
+    assert_equal "msg_1", parsed.id
+    assert_equal "claude-sonnet-4-6", parsed.model
+    assert_equal 1, parsed.messages.size
+    assert_equal "Hello\nworld", parsed.messages.first.output_text
+    assert_empty parsed.function_requests
+  end
+
+  test "parses tool_use blocks into ChatFunctionRequest" do
+    raw = build_message(
+      id: "msg_2",
+      model: "claude-sonnet-4-6",
+      content: [
+        OpenStruct.new(
+          type: :tool_use,
+          id: "toolu_abc",
+          name: "get_transactions",
+          input: { "page" => 1, "order" => "asc" }
+        )
+      ]
+    )
+
+    parsed = Provider::Anthropic::ChatParser.new(raw).parsed
+
+    assert_empty parsed.messages
+    assert_equal 1, parsed.function_requests.size
+    req = parsed.function_requests.first
+    assert_equal "toolu_abc", req.id
+    assert_equal "toolu_abc", req.call_id
+    assert_equal "get_transactions", req.function_name
+    assert_equal({ "page" => 1, "order" => "asc" }.to_json, req.function_args)
+  end
+
+  test "parses mixed content blocks" do
+    raw = build_message(
+      id: "msg_3",
+      model: "claude-sonnet-4-6",
+      content: [
+        OpenStruct.new(type: :text, text: "Looking up your transactions..."),
+        OpenStruct.new(type: :tool_use, id: "toolu_42", name: "get_transactions", input: {})
+      ]
+    )
+
+    parsed = Provider::Anthropic::ChatParser.new(raw).parsed
+
+    assert_equal 1, parsed.messages.size
+    assert_equal "Looking up your transactions...", parsed.messages.first.output_text
+    assert_equal 1, parsed.function_requests.size
+    assert_equal "toolu_42", parsed.function_requests.first.call_id
+  end
+
+  test "accepts hash-shaped content blocks" do
+    raw = OpenStruct.new(
+      id: "msg_4",
+      model: "claude-sonnet-4-6",
+      content: [
+        { type: :text, text: "from hash" }
+      ]
+    )
+
+    parsed = Provider::Anthropic::ChatParser.new(raw).parsed
+
+    assert_equal "from hash", parsed.messages.first.output_text
+  end
+
+  private
+    def build_message(id:, model:, content:)
+      OpenStruct.new(id: id, model: model, content: content)
+    end
+end
--- a/test/models/provider/anthropic/message_formatter_test.rb
+++ b/test/models/provider/anthropic/message_formatter_test.rb
@@ -0,0 +1,237 @@
+require "test_helper"
+require "ostruct"
+
+class Provider::Anthropic::MessageFormatterTest < ActiveSupport::TestCase
+  test "builds a single user turn from prompt alone" do
+    formatter = Provider::Anthropic::MessageFormatter.new(prompt: "hi")
+
+    messages = formatter.build
+
+    assert_equal 1, messages.size
+    assert_equal({ role: "user", content: "hi" }, messages.first)
+  end
+
+  test "skips empty content from history" do
+    history = [ stub_user_message("") ]
+
+    messages = Provider::Anthropic::MessageFormatter.new(prompt: "next", conversation_history: history).build
+
+    assert_equal [ { role: "user", content: "next" } ], messages
+  end
+
+  test "renders text-only assistant history with no tool calls" do
+    history = [
+      stub_user_message("first question"),
+      stub_assistant_message("first answer")
+    ]
+
+    messages = Provider::Anthropic::MessageFormatter.new(prompt: "second question", conversation_history: history).build
+
+    assert_equal({ role: "user", content: "first question" }, messages[0])
+    assert_equal "assistant", messages[1][:role]
+    assert_equal [ { type: "text", text: "first answer" } ], messages[1][:content]
+    assert_equal({ role: "user", content: "second question" }, messages[2])
+  end
+
+  test "renders assistant tool_call history with paired tool_result turn" do
+    tool_call = stub_tool_call(
+      id: "toolu_1",
+      name: "get_net_worth",
+      arguments: { "currency" => "USD" },
+      result: { "amount" => 12345, "currency" => "USD" }
+    )
+    assistant = stub_assistant_message("Your net worth is $12,345.", tool_calls: [ tool_call ])
+    history = [ stub_user_message("net worth?"), assistant ]
+
+    messages = Provider::Anthropic::MessageFormatter.new(prompt: "anything else?", conversation_history: history).build
+
+    assert_equal({ role: "user", content: "net worth?" }, messages[0])
+    assert_equal "assistant", messages[1][:role]
+    assert_equal "tool_use", messages[1][:content].first[:type]
+    assert_equal "toolu_1", messages[1][:content].first[:id]
+    assert_equal "get_net_worth", messages[1][:content].first[:name]
+    assert_equal({ "currency" => "USD" }, messages[1][:content].first[:input])
+    assert_equal "text", messages[1][:content].last[:type]
+
+    assert_equal "user", messages[2][:role]
+    assert_equal "tool_result", messages[2][:content].first[:type]
+    assert_equal "toolu_1", messages[2][:content].first[:tool_use_id]
+    assert_equal({ "amount" => 12345, "currency" => "USD" }.to_json, messages[2][:content].first[:content])
+
+    assert_equal({ role: "user", content: "anything else?" }, messages[3])
+  end
+
+  test "renders in-flight function_results as assistant tool_use + user tool_result" do
+    formatter = Provider::Anthropic::MessageFormatter.new(
+      prompt: "what is my net worth?",
+      function_results: [ {
+        call_id: "toolu_42",
+        name: "get_net_worth",
+        arguments: { "currency" => "USD" }.to_json,
+        output: { amount: 99, currency: "USD" }
+      } ]
+    )
+
+    messages = formatter.build
+
+    assert_equal({ role: "user", content: "what is my net worth?" }, messages[0])
+    assert_equal "assistant", messages[1][:role]
+    assert_equal "tool_use", messages[1][:content].first[:type]
+    assert_equal "toolu_42", messages[1][:content].first[:id]
+    assert_equal({ "currency" => "USD" }, messages[1][:content].first[:input])
+
+    assert_equal "user", messages[2][:role]
+    assert_equal "tool_result", messages[2][:content].first[:type]
+    assert_equal "toolu_42", messages[2][:content].first[:tool_use_id]
+    assert_includes messages[2][:content].first[:content], "99"
+  end
+
+  # Confirms the round-trip flagged in PR #1983 review: an Anthropic tool_use
+  # block returned by the model → ChatFunctionRequest → ToolCall::Function
+  # persisted on the AssistantMessage → MessageFormatter rebuild on the next
+  # turn produces an Anthropic-compatible history where tool_use_id pairs back
+  # to the original block.
+  test "ChatParser → ToolCall::Function → MessageFormatter round-trips tool_use_id" do
+    anthropic_response = OpenStruct.new(
+      id: "msg_abc",
+      model: "claude-sonnet-4-6",
+      content: [
+        OpenStruct.new(type: :tool_use, id: "toolu_round_trip", name: "get_net_worth", input: { "currency" => "USD" })
+      ]
+    )
+
+    parsed = Provider::Anthropic::ChatParser.new(anthropic_response).parsed
+    function_request = parsed.function_requests.first
+
+    persisted_tool_call = ToolCall::Function.from_function_request(
+      function_request,
+      { "amount" => 12345, "currency" => "USD" }
+    )
+
+    assistant = stub_assistant_message("Your net worth is $12,345.", tool_calls: [ persisted_tool_call ])
+    history = [ stub_user_message("net worth?"), assistant ]
+
+    rebuilt = Provider::Anthropic::MessageFormatter.new(prompt: "follow-up", conversation_history: history).build
+
+    tool_use_block = rebuilt[1][:content].find { |b| b[:type] == "tool_use" }
+    tool_result_block = rebuilt[2][:content].first
+
+    assert_equal "toolu_round_trip", tool_use_block[:id]
+    assert_equal "toolu_round_trip", tool_result_block[:tool_use_id]
+    assert_equal({ "currency" => "USD" }, tool_use_block[:input])
+    assert_equal({ "amount" => 12345, "currency" => "USD" }.to_json, tool_result_block[:content])
+  end
+
+  test "renders multi-tool assistant turn with all pairings preserved" do
+    tool_a = stub_tool_call(
+      id: "toolu_a",
+      name: "get_accounts",
+      arguments: {},
+      result: [ { "id" => 1, "name" => "Checking" } ]
+    )
+    tool_b = stub_tool_call(
+      id: "toolu_b",
+      name: "get_holdings",
+      arguments: {},
+      result: [ { "ticker" => "VTI", "qty" => 10 } ]
+    )
+    assistant = stub_assistant_message("Looked up your accounts and holdings.", tool_calls: [ tool_a, tool_b ])
+
+    messages = Provider::Anthropic::MessageFormatter.new(
+      prompt: "follow-up",
+      conversation_history: [ stub_user_message("accounts and holdings?"), assistant ]
+    ).build
+
+    tool_uses = messages[1][:content].select { |b| b[:type] == "tool_use" }
+    tool_results = messages[2][:content]
+
+    assert_equal 2, tool_uses.size
+    assert_equal 2, tool_results.size
+    assert_equal [ "toolu_a", "toolu_b" ], tool_uses.map { |b| b[:id] }
+    assert_equal [ "toolu_a", "toolu_b" ], tool_results.map { |b| b[:tool_use_id] }
+    # Anthropic requires the user turn to follow the assistant turn that used tools
+    assert_equal "assistant", messages[1][:role]
+    assert_equal "user", messages[2][:role]
+  end
+
+  test "parses string arguments and nil outputs gracefully" do
+    formatter = Provider::Anthropic::MessageFormatter.new(
+      prompt: "go",
+      function_results: [ {
+        call_id: "toolu_x",
+        name: "noop",
+        arguments: "",
+        output: nil
+      } ]
+    )
+
+    messages = formatter.build
+
+    assert_equal({}, messages[1][:content].first[:input])
+    assert_equal "", messages[2][:content].first[:content]
+  end
+
+  # Anthropic's tool_use.input MUST be a JSON object (map). If a stored
+  # ToolCall::Function record carries arguments that parse to a scalar or
+  # array (corrupt row, legacy data, OpenAI cross-bleed), the formatter
+  # must coerce them to `{}` so we don't ship an invalid payload.
+  test "coerces non-Hash parsed arguments to empty Hash" do
+    [ '"hello"', "123", "true", "[1,2,3]" ].each do |non_object_json|
+      formatter = Provider::Anthropic::MessageFormatter.new(
+        prompt: "go",
+        function_results: [ {
+          call_id: "toolu_x",
+          name: "noop",
+          arguments: non_object_json,
+          output: nil
+        } ]
+      )
+
+      messages = formatter.build
+
+      assert_equal({}, messages[1][:content].first[:input],
+        "expected empty Hash for arguments=#{non_object_json.inspect}")
+    end
+  end
+
+  test "coerces non-Hash non-String arguments to empty Hash" do
+    formatter = Provider::Anthropic::MessageFormatter.new(
+      prompt: "go",
+      function_results: [ {
+        call_id: "toolu_x",
+        name: "noop",
+        arguments: [ 1, 2, 3 ],
+        output: nil
+      } ]
+    )
+
+    messages = formatter.build
+
+    assert_equal({}, messages[1][:content].first[:input])
+  end
+
+  private
+    def stub_user_message(content)
+      msg = UserMessage.new(content: content, ai_model: "claude-sonnet-4-6")
+      msg.id = SecureRandom.uuid
+      msg
+    end
+
+    def stub_assistant_message(content, tool_calls: [])
+      msg = AssistantMessage.new(content: content, ai_model: "claude-sonnet-4-6")
+      msg.id = SecureRandom.uuid
+      msg.stubs(:tool_calls).returns(tool_calls)
+      msg
+    end
+
+    def stub_tool_call(id:, name:, arguments:, result:)
+      tc = ToolCall::Function.new(
+        function_name: name,
+        function_arguments: arguments,
+        function_result: result
+      )
+      tc.stubs(:provider_call_id).returns(id)
+      tc.stubs(:provider_id).returns(id)
+      tc
+    end
+end
--- a/test/models/provider/anthropic_test.rb
+++ b/test/models/provider/anthropic_test.rb
@@ -0,0 +1,259 @@
+require "test_helper"
+require "ostruct"
+
+class Provider::AnthropicTest < ActiveSupport::TestCase
+  include LLMInterfaceTest
+
+  setup do
+    @subject = @anthropic = Provider::Anthropic.new(
+      ENV.fetch("ANTHROPIC_API_KEY", "test-anthropic-token")
+    )
+    @subject_model = "claude-sonnet-4-6"
+  end
+
+  test "provider_name returns Anthropic for standard provider" do
+    assert_equal "Anthropic", @subject.provider_name
+  end
+
+  test "provider_name returns custom info for custom base_url" do
+    custom = Provider::Anthropic.new(
+      "test-token",
+      base_url: "https://bedrock.example.com/anthropic",
+      model: "claude-opus-4-7"
+    )
+
+    assert_equal "Custom Anthropic-compatible (https://bedrock.example.com/anthropic)", custom.provider_name
+  end
+
+  test "supports_model? returns true for claude prefix" do
+    assert @subject.supports_model?("claude-sonnet-4-6")
+    assert @subject.supports_model?("claude-opus-4-7")
+    assert @subject.supports_model?("claude-haiku-4-5")
+    assert_not @subject.supports_model?("gpt-4.1")
+  end
+
+  test "supports_model? bypasses the prefix gate for custom endpoints" do
+    custom = Provider::Anthropic.new(
+      "test-token",
+      base_url: "https://bedrock.example.com/anthropic",
+      model: "anthropic.claude-sonnet-4-5-20250929-v1:0"
+    )
+
+    # Bedrock-shaped IDs start with "anthropic", not "claude" — would fail the
+    # default prefix check, but custom endpoints must accept any model.
+    assert custom.supports_model?("anthropic.claude-sonnet-4-5-20250929-v1:0")
+    assert custom.supports_model?("claude-opus-4@20250514")
+    assert custom.supports_model?("any-string-the-endpoint-accepts")
+  end
+
+  test "supported_models_description returns prefixes for standard provider" do
+    assert_equal "models starting with: claude", @subject.supported_models_description
+  end
+
+  test "supports_pdf_processing? true for claude models" do
+    assert @subject.supports_pdf_processing?(model: "claude-sonnet-4-6")
+    assert_not @subject.supports_pdf_processing?(model: "gpt-4o")
+  end
+
+  test "effective_model defers to ENV when set without consulting Setting" do
+    ClimateControl.modify("ANTHROPIC_MODEL" => "claude-haiku-4-5") do
+      Setting.expects(:anthropic_model).never
+      assert_equal "claude-haiku-4-5", Provider::Anthropic.effective_model
+    end
+  end
+
+  test "configured? reflects ENV and Setting presence" do
+    ClimateControl.modify("ANTHROPIC_ACCESS_TOKEN" => nil, "ANTHROPIC_API_KEY" => nil) do
+      Setting.stubs(:anthropic_access_token).returns(nil)
+      assert_not Provider::Anthropic.configured?
+
+      Setting.stubs(:anthropic_access_token).returns("fake-token-1")
+      assert Provider::Anthropic.configured?
+    end
+
+    ClimateControl.modify("ANTHROPIC_API_KEY" => "fake-token-2") do
+      Setting.stubs(:anthropic_access_token).returns(nil)
+      assert Provider::Anthropic.configured?
+    end
+  end
+
+  test "effective_model falls back to default when nothing set" do
+    ClimateControl.modify("ANTHROPIC_MODEL" => nil) do
+      Setting.stubs(:anthropic_model).returns(nil)
+      assert_equal Provider::Anthropic::DEFAULT_MODEL, Provider::Anthropic.effective_model
+    end
+  end
+
+  test "chat_response wraps Anthropic SDK errors in Provider::Anthropic::Error" do
+    fake_client = mock
+    @subject.instance_variable_set(:@client, fake_client)
+    messages = mock
+    fake_client.stubs(:messages).returns(messages)
+    messages.expects(:create).raises(StandardError.new("rate limit exceeded"))
+
+    response = @subject.chat_response("hi", model: @subject_model)
+
+    assert_not response.success?
+    assert_kind_of Provider::Anthropic::Error, response.error
+    assert_match(/rate limit/i, response.error.message)
+  end
+
+  test "chat_response accepts messages: kwarg passed by Responder without raising" do
+    # Responder passes the OpenAI-shaped `messages:` array alongside `conversation_history:`
+    # for cross-provider parity. Anthropic ignores it but must still accept the keyword:
+    # an earlier signature without it raised ArgumentError on the first chat turn.
+    fake_client = stub_anthropic_client_with(
+      build_anthropic_message(
+        id: "msg_kw",
+        model: @subject_model,
+        text_blocks: [ "ok" ],
+        tool_use_blocks: [],
+        usage: { input_tokens: 1, output_tokens: 1 }
+      )
+    )
+    @subject.instance_variable_set(:@client, fake_client)
+
+    response = @subject.chat_response(
+      "hi",
+      model: @subject_model,
+      messages: [ { role: "user", content: "hi" } ],
+      conversation_history: []
+    )
+
+    assert response.success?
+  end
+
+  test "chat_response returns parsed ChatResponse on success" do
+    fake_client = stub_anthropic_client_with(
+      build_anthropic_message(
+        id: "msg_abc",
+        model: @subject_model,
+        text_blocks: [ "Hello there." ],
+        tool_use_blocks: [],
+        usage: { input_tokens: 12, output_tokens: 5 }
+      )
+    )
+    @subject.instance_variable_set(:@client, fake_client)
+
+    response = @subject.chat_response("hi", model: @subject_model)
+
+    assert response.success?
+    assert_equal "msg_abc", response.data.id
+    assert_equal @subject_model, response.data.model
+    assert_equal 1, response.data.messages.size
+    assert_equal "Hello there.", response.data.messages.first.output_text
+    assert_empty response.data.function_requests
+  end
+
+  test "chat_response streams text deltas and emits a final response chunk" do
+    final_message = build_anthropic_message(
+      id: "msg_stream",
+      model: @subject_model,
+      text_blocks: [ "Hello world" ],
+      tool_use_blocks: [],
+      usage: { input_tokens: 7, output_tokens: 3 }
+    )
+    # Use ad-hoc subclasses of the SDK event types so the case/when dispatch
+    # inside `stream_chat_response` matches them via `is_a?` without needing
+    # to stub class-level `===` behavior.
+    text_event_cls = Class.new(::Anthropic::Streaming::TextEvent) do
+      def initialize(text:, snapshot:)
+        @text = text
+        @snapshot = snapshot
+      end
+      attr_reader :text, :snapshot
+    end
+    stop_event_cls = Class.new(::Anthropic::Streaming::MessageStopEvent) do
+      def initialize(message:)
+        @message = message
+      end
+      attr_reader :message
+    end
+    events = [
+      text_event_cls.new(text: "Hello ", snapshot: "Hello "),
+      text_event_cls.new(text: "world", snapshot: "Hello world"),
+      stop_event_cls.new(message: final_message)
+    ]
+
+    fake_stream = mock
+    fake_stream.stubs(:each).multiple_yields(*events.map { |e| [ e ] })
+    fake_stream.stubs(:accumulated_message).returns(final_message)
+
+    messages = mock
+    messages.stubs(:stream).returns(fake_stream)
+    client = mock
+    client.stubs(:messages).returns(messages)
+    @subject.instance_variable_set(:@client, client)
+
+    collected = []
+    response = @subject.chat_response(
+      "hi",
+      model: @subject_model,
+      streamer: ->(chunk) { collected << chunk }
+    )
+
+    assert response.success?
+    text_chunks = collected.select { |c| c.type == "output_text" }
+    response_chunks = collected.select { |c| c.type == "response" }
+
+    assert_equal 2, text_chunks.size
+    assert_equal [ "Hello ", "world" ], text_chunks.map(&:data)
+    assert_equal 1, response_chunks.size
+    assert_equal "msg_stream", response_chunks.first.data.id
+    assert_equal 10, response_chunks.first.usage["total_tokens"]
+  end
+
+  test "chat_response surfaces tool_use blocks as function_requests" do
+    fake_client = stub_anthropic_client_with(
+      build_anthropic_message(
+        id: "msg_xyz",
+        model: @subject_model,
+        text_blocks: [],
+        tool_use_blocks: [ { id: "toolu_1", name: "get_net_worth", input: { currency: "USD" } } ],
+        usage: { input_tokens: 20, output_tokens: 8 }
+      )
+    )
+    @subject.instance_variable_set(:@client, fake_client)
+
+    response = @subject.chat_response(
+      "What is my net worth?",
+      model: @subject_model,
+      functions: [ {
+        name: "get_net_worth",
+        description: "Gets a user's net worth",
+        params_schema: { type: "object", properties: {}, required: [], additionalProperties: false },
+        strict: true
+      } ]
+    )
+
+    assert response.success?
+    assert_equal 1, response.data.function_requests.size
+
+    req = response.data.function_requests.first
+    assert_equal "toolu_1", req.call_id
+    assert_equal "get_net_worth", req.function_name
+    assert_equal({ currency: "USD" }.to_json, req.function_args)
+  end
+
+  private
+    def stub_anthropic_client_with(message)
+      messages = mock
+      messages.stubs(:create).returns(message)
+      client = mock
+      client.stubs(:messages).returns(messages)
+      client
+    end
+
+    def build_anthropic_message(id:, model:, text_blocks:, tool_use_blocks:, usage:)
+      OpenStruct.new(
+        id: id,
+        model: model,
+        content: text_blocks.map { |t| OpenStruct.new(type: :text, text: t) } +
+                 tool_use_blocks.map { |t| OpenStruct.new(type: :tool_use, id: t[:id], name: t[:name], input: t[:input]) },
+        usage: OpenStruct.new(
+          input_tokens: usage[:input_tokens],
+          output_tokens: usage[:output_tokens]
+        )
+      )
+    end
+end
--- a/test/models/provider/registry_test.rb
+++ b/test/models/provider/registry_test.rb
@@ -2,9 +2,14 @@ require "test_helper"

 class Provider::RegistryTest < ActiveSupport::TestCase
  test "providers filters out nil values when provider is not configured" do
-    # Ensure OpenAI is not configured
-    ClimateControl.modify("OPENAI_ACCESS_TOKEN" => nil) do
+    # Ensure no LLM provider is configured
+    ClimateControl.modify(
+      "OPENAI_ACCESS_TOKEN" => nil,
+      "ANTHROPIC_ACCESS_TOKEN" => nil,
+      "ANTHROPIC_API_KEY" => nil
+    ) do
      Setting.stubs(:openai_access_token).returns(nil)
+      Setting.stubs(:anthropic_access_token).returns(nil)

      registry = Provider::Registry.for_concept(:llm)

@@ -45,6 +50,44 @@ class Provider::RegistryTest < ActiveSupport::TestCase
    end
  end

+  test "anthropic provider returns nil when no credentials are configured" do
+    ClimateControl.modify(
+      "ANTHROPIC_ACCESS_TOKEN" => nil,
+      "ANTHROPIC_API_KEY" => nil
+    ) do
+      Setting.stubs(:anthropic_access_token).returns(nil)
+
+      assert_nil Provider::Registry.get_provider(:anthropic)
+    end
+  end
+
+  test "anthropic provider initializes from ANTHROPIC_API_KEY env" do
+    ClimateControl.modify("ANTHROPIC_API_KEY" => "fake-anthropic-key-for-tests", "ANTHROPIC_ACCESS_TOKEN" => nil) do
+      Setting.stubs(:anthropic_access_token).returns(nil)
+
+      provider = Provider::Registry.get_provider(:anthropic)
+
+      assert_instance_of Provider::Anthropic, provider
+    end
+  end
+
+  test "anthropic provider falls back to Setting when ENV is empty" do
+    ClimateControl.modify(
+      "ANTHROPIC_ACCESS_TOKEN" => "",
+      "ANTHROPIC_API_KEY" => "",
+      "ANTHROPIC_BASE_URL" => "",
+      "ANTHROPIC_MODEL" => ""
+    ) do
+      Setting.stubs(:anthropic_access_token).returns("fake-anthropic-key-from-setting")
+      Setting.stubs(:anthropic_base_url).returns(nil)
+      Setting.stubs(:anthropic_model).returns(nil)
+
+      provider = Provider::Registry.get_provider(:anthropic)
+
+      assert_instance_of Provider::Anthropic, provider
+    end
+  end
+
  test "openai provider falls back to Setting when ENV is empty string" do
    # Mock ENV to return empty string (common in Docker/env files)
    # Use stub_env helper which properly stubs ENV access