mirror of
https://github.com/we-promise/sure.git
synced 2026-05-30 07:49:01 +00:00
Implements auto_categorize, auto_detect_merchants, and
enhance_provider_merchants on Provider::Anthropic via forced tool calls,
plus the cost-ledger plumbing they need.
- Provider::Anthropic::AutoCategorizer, AutoMerchantDetector,
ProviderMerchantEnhancer each define a single output tool whose
input_schema mirrors the desired output, then force the model to call
it via tool_choice: { type: "tool", name: ..., disable_parallel_tool_use: true }.
Forcing the call means the result arrives as structured tool_use.input
shaped by the schema, so there is no JSON parsing fragility, no <think>
tag stripping, and no json_object/json_schema fallback ladders.
- Concerns::UsageRecorder mirrors the OpenAI sibling but persists
cache_creation_input_tokens / cache_read_input_tokens to dedicated
columns instead of metadata.
- Migration adds cache_creation_tokens, cache_read_tokens (nullable
integers) to llm_usages. OpenAI rows leave them null.
- LlmUsage::PRICING gains Claude 4.x rows (opus-4-7 $15/$75, sonnet-4-6
$3/$15, haiku-4-5 $1/$5 per MTok). infer_provider returns "anthropic"
for claude-* via the existing exact/prefix lookup.
- Provider::Anthropic#chat_response now persists cache columns directly
rather than stashing them in metadata.
- 25-transaction batch cap mirrors the OpenAI provider so the cost
ledger sees the same shape regardless of which provider ran a batch.
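
The forced-tool-call pattern from the first bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the tool name `categorize_transactions`, the schema fields, and the helpers `build_request` and `extract_tool_input` are hypothetical, and the response is a hand-written hash shaped like a Messages API reply rather than the result of a real API call.

```ruby
# Hypothetical output tool: its input_schema describes the result we want back.
CATEGORIZE_TOOL = {
  name: "categorize_transactions",
  description: "Report one category per transaction",
  input_schema: {
    type: "object",
    properties: {
      categorizations: {
        type: "array",
        items: {
          type: "object",
          properties: {
            transaction_id: { type: "string" },
            category_name: { type: ["string", "null"] }
          },
          required: ["transaction_id", "category_name"]
        }
      }
    },
    required: ["categorizations"]
  }
}.freeze

# Build a request body that forces the model to call the single output tool.
def build_request(model:, prompt:)
  {
    model: model,
    max_tokens: 1024,
    tools: [CATEGORIZE_TOOL],
    tool_choice: { type: "tool", name: CATEGORIZE_TOOL[:name], disable_parallel_tool_use: true },
    messages: [{ role: "user", content: prompt }]
  }
end

# Pull the structured input out of the tool_use block, or fail loudly if it is missing.
def extract_tool_input(response, tool_name)
  block = response.fetch("content", []).find do |b|
    b["type"] == "tool_use" && b["name"] == tool_name
  end
  raise ArgumentError, "no tool_use block for #{tool_name}" unless block

  block["input"]
end

# Hand-written sample reply (assumed shape), used only to exercise the helper.
sample_response = {
  "content" => [
    { "type" => "tool_use", "name" => "categorize_transactions",
      "input" => { "categorizations" => [{ "transaction_id" => "t1", "category_name" => "Groceries" }] } }
  ]
}

result = extract_tool_input(sample_response, "categorize_transactions")
```

Raising when no tool_use block is present corresponds to the missing-tool_use error path mentioned below; the real provider may use a different error class.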
Tests cover the forced-tool-call path, null/None normalization,
case-insensitive merchant matching, the missing-tool_use error path,
and Anthropic-specific pricing + provider inference on LlmUsage.
Stacked on #1983 (PR 1/5). 3/5 PDF + vision next.
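
For readers unfamiliar with the null/None normalization and case-insensitive merchant matching that the tests mention, a minimal sketch might look like this. The helper names `normalize_blank` and `match_merchant` are hypothetical and the exact rules in the PR may differ; the sketch only assumes that placeholder strings such as "null" or "None" should be treated as no answer.

```ruby
# Treat nil, blank strings, and literal "null"/"None" placeholders as "no answer".
def normalize_blank(value)
  return nil if value.nil?

  text = value.to_s.strip
  return nil if text.empty? || %w[null none].include?(text.downcase)

  text
end

# Hypothetical lookup: match a model-returned merchant name against known
# names while ignoring case. Returns the known spelling, or nil if no match.
def match_merchant(name, known_names)
  wanted = normalize_blank(name)
  return nil unless wanted

  known_names.find { |known| known.casecmp?(wanted) }
end

known = ["Amazon", "Whole Foods"]
matched = match_merchant("whole foods", known)
```

Returning the known spelling rather than the model's spelling keeps stored merchant names consistent; that choice is an assumption of this sketch.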
36 lines
1.3 KiB
Ruby
require "test_helper"

class LlmUsageTest < ActiveSupport::TestCase
  test "infer_provider returns anthropic for claude models" do
    assert_equal "anthropic", LlmUsage.infer_provider("claude-sonnet-4-6")
    assert_equal "anthropic", LlmUsage.infer_provider("claude-opus-4-7")
    assert_equal "anthropic", LlmUsage.infer_provider("claude-haiku-4-5")
  end

  test "infer_provider still returns openai for gpt models" do
    assert_equal "openai", LlmUsage.infer_provider("gpt-4.1")
    assert_equal "openai", LlmUsage.infer_provider("gpt-5")
  end

  test "calculate_cost returns Anthropic pricing for Claude models" do
    cost = LlmUsage.calculate_cost(model: "claude-sonnet-4-6", prompt_tokens: 1_000_000, completion_tokens: 100_000)

    # 1M input * $3/MTok + 100K output * $15/MTok = $3.00 + $1.50 = $4.50
    assert_in_delta 4.5, cost, 0.0001
  end

  test "calculate_cost uses higher pricing for Opus" do
    cost = LlmUsage.calculate_cost(model: "claude-opus-4-7", prompt_tokens: 1_000_000, completion_tokens: 0)

    # 1M input * $15/MTok = $15.00
    assert_in_delta 15.0, cost, 0.0001
  end

  test "calculate_cost uses lower pricing for Haiku" do
    cost = LlmUsage.calculate_cost(model: "claude-haiku-4-5", prompt_tokens: 1_000_000, completion_tokens: 1_000_000)

    # 1M input * $1/MTok + 1M output * $5/MTok = $1.00 + $5.00 = $6.00
    assert_in_delta 6.0, cost, 0.0001
  end
end
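
The arithmetic in the test comments follows from per-million-token pricing with an exact-then-prefix model lookup. A standalone sketch of that calculation is shown here; `SKETCH_PRICING`, `sketch_rates_for`, and `sketch_cost` are hypothetical names, and only the rates themselves come from the commit message. The real LlmUsage.calculate_cost may be structured differently.

```ruby
# Rates in USD per million tokens, as listed in the commit message.
SKETCH_PRICING = {
  "claude-opus-4-7"   => { prompt: 15.0, completion: 75.0 },
  "claude-sonnet-4-6" => { prompt: 3.0,  completion: 15.0 },
  "claude-haiku-4-5"  => { prompt: 1.0,  completion: 5.0 }
}.freeze

# Exact match first, then the longest key that prefixes the model id
# (so a dated variant of a model id would still resolve).
def sketch_rates_for(model)
  return SKETCH_PRICING[model] if SKETCH_PRICING.key?(model)

  prefix = SKETCH_PRICING.keys.select { |key| model.start_with?(key) }.max_by(&:length)
  prefix && SKETCH_PRICING[prefix]
end

# Cost in USD; returns nil when the model has no known pricing.
def sketch_cost(model:, prompt_tokens:, completion_tokens:)
  rates = sketch_rates_for(model)
  return nil unless rates

  (prompt_tokens * rates[:prompt] + completion_tokens * rates[:completion]) / 1_000_000.0
end

sonnet_cost = sketch_cost(model: "claude-sonnet-4-6", prompt_tokens: 1_000_000, completion_tokens: 100_000)
```

With these rates the Sonnet call above works out to $3.00 for input plus $1.50 for output, matching the 4.5 asserted in the test.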