* feat(ai): add Anthropic provider with chat parity (1/5)
Introduces Provider::Anthropic alongside Provider::Openai, implementing
the LlmConcept chat_response contract over the official anthropic Ruby
SDK. Batch ops, PDF, and RAG land in follow-up PRs.
- Provider::Anthropic uses Messages API for sync and streaming responses
- ChatConfig builds requests with ephemeral prompt-cache markers on the
system prompt and the last tool definition
- MessageFormatter reconstructs multi-turn history (text + tool_use +
tool_result blocks) from raw Message records, including the paired
user-role tool_result turn Anthropic requires after every tool_use
- ChatParser maps Anthropic Message into the shared ChatResponse Data
- Registry, Setting, User, Chat default model wired for ANTHROPIC_*
envs and Setting.anthropic_*; LLM_PROVIDER selects between providers
- Responder forwards raw conversation_history (Array<Message>) so
providers without hosted conversation state can rebuild context
- OpenAI provider accepts and ignores the new kwarg (no behavior change)
Tests cover provider init, model gating, MessageFormatter for all turn
shapes, ChatConfig request building (max_tokens, system cache, tool
conversion), ChatParser for text / tool_use / mixed blocks, Registry
discovery, and mocked chat_response success / error / function_request
paths. Live VCR cassettes recorded in a follow-up with a real key.
Stacked PRs: 2/5 batch ops + cost ledger, 3/5 PDF, 4/5 pgvector RAG,
5/5 settings UI + disclosure.
* fix(ai): address PR review on Anthropic provider foundation
Surface fixes raised by Codex + CodeRabbit on PR 1/5:
- Provider::Anthropic#chat_response now accepts (and ignores) a
`messages:` kwarg. Assistant::Responder passes both `messages:`
(OpenAI-shape) and `conversation_history:` (raw Message records) for
cross-provider parity, so the previous signature raised
ArgumentError on the first chat turn through the Anthropic provider.
- Provider::Anthropic#supports_model? bypasses the `claude` prefix
gate when a custom base_url is configured, mirroring the OpenAI
provider. Bedrock-shaped IDs like
`anthropic.claude-sonnet-4-5-20250929-v1:0` and
`claude-opus-4@20250514` are otherwise rejected by
Assistant::Provided#get_model_provider and the chat dies.
- Setting.anthropic_access_token is now in
EncryptedSettingFields::ENCRYPTED_FIELDS so the Anthropic API key
is encrypted at rest like every other provider secret. Previously
plaintext while siblings (openai_access_token, twelve_data_api_key,
external_assistant_token) were ciphertext.
- Chat.default_model falls back to whichever provider is actually
configured. Previously, with LLM_PROVIDER=anthropic but no
Anthropic credentials, the default model resolved to a Claude ID
that no registered provider supported, so chats failed even when
OpenAI was fully configured. Adds Provider::{Anthropic,Openai}#configured?
class methods for the readable callsite.
- Provider::Anthropic.effective_model uses
`ENV["ANTHROPIC_MODEL"].presence || Setting.anthropic_model` so the
Setting lookup is only performed when the env var is absent — the
previous `ENV.fetch(KEY, default)` evaluated the default arg
eagerly on every call.
- Provider::Anthropic::ChatConfig#anthropic_input_schema strips both
`:strict` and `"strict"` keys so JSON-decoded schemas with string
keys cannot leak the OpenAI-only flag through to Anthropic.
Test coverage added: supports_model? bypass on custom endpoints,
chat_response messages: kwarg compatibility, default_model fallback
in the three credential combinations, configured? against ENV +
Setting, strict-flag stripping for both key types, and a
`Setting.expects(:anthropic_model).never` assertion proving the
ENV-precedence test now exercises the lazy path.
All 4365 tests pass (1 pre-existing libvips env error unrelated).
* test(chat): make default_model tests resilient to ENV model overrides
CodeRabbit flagged on PR review: the new default_model tests asserted
against Provider::*::DEFAULT_MODEL, but Chat.default_model actually
returns Provider::*.effective_model.presence (which reads
OPENAI_MODEL / ANTHROPIC_MODEL from the environment). With either env
var set, the tests would fail intermittently even though routing was
correct.
- New default_model tests now assert against the provider's
effective_model directly, so they verify the routing decision
(which provider's value wins) without coupling to the constant.
- Pre-existing "creates with default model" assertions had the same
brittleness; switch them to compare against Chat.default_model so
the chosen model is whatever the env / Setting cascade resolves to.
Verified by running `ANTHROPIC_MODEL=claude-haiku-4-5 OPENAI_MODEL=gpt-4o
bin/rails test test/models/chat_test.rb` — 16 runs, 0 failures
(previously 2 pre-existing failures + 0 from the new tests).
* fix(ai): address local review on Anthropic foundation
- Provider::Anthropic#supports_pdf_processing? bypasses prefix gate for
custom endpoints, mirroring supports_model?
- Provider::Anthropic#initialize raises Error when custom_endpoint? AND
model.blank?, parity with Provider::Openai
- stream_chat_response captures partial usage on mid-stream errors and
records it via the new on_partial callback so chat_response can skip
the duplicate error row in the outer rescue
- safe_accumulated_message swallows the secondary failure when the SDK
cannot reconstruct a snapshot
- langfuse_client memoizes properly (||= instead of =) so repeated calls
don't churn Langfuse instances
- MessageFormatter sorts tool_calls by created_at then id so the
message array is deterministic across replays; skips tool_calls
missing both provider_call_id and provider_id rather than sending
`id: nil` and getting rejected by Anthropic
- Setting.anthropic_access_token default falls back through
ENV["ANTHROPIC_API_KEY"].presence (was missing .presence, so an
empty-string env value bled through)
- User#openai_configured? / #anthropic_configured? delegate to the
Provider::* class methods — single source of truth
- Assistant::Responder renames the OpenAI-shape history builder
conversation_history → openai_messages_payload so the kwarg name
matches the local method name (messages: openai_messages_payload,
conversation_history: chat_message_records)
- Assistant::Builtin stale-history comment updated to reference both
builders
Adds a streaming chat_response test using ad-hoc subclasses of the
SDK event types so the case/when dispatch matches via is_a? without
stubbing class-level === behavior.
* test(ai): add Anthropic tool_use round-trip + multi-tool turn coverage
Addresses @jjmata's "worth confirming" note on PR #1983: tool-use turns
from prior assistant messages must round-trip correctly when retrieved
from the database.
- New `ChatParser → ToolCall::Function → MessageFormatter` test walks
the full path: Anthropic response with a tool_use block →
ChatFunctionRequest → ToolCall::Function.from_function_request →
persisted on the AssistantMessage → MessageFormatter rebuild on the
next turn. Asserts the original `tool_use.id` is preserved end-to-end
as both `tool_use.id` and the paired `tool_result.tool_use_id`, and
that the original `input` hash and serialized result content survive.
- New multi-tool assistant turn test confirms two tool_use blocks on a
single assistant message render as two tool_use blocks followed by
two paired tool_result blocks in a single user-role follow-up,
matching Anthropic's required alternation.
Both tests exercise the existing PR1 code without behavior changes.
* test(ai): require "ostruct" explicitly in Anthropic provider tests
OpenStruct is moving out of Ruby's default load path (warning in 3.4+,
removed in 3.5+). Tests work today because ActiveSupport transitively
loads it, but that's incidental. Match the existing convention in
test/controllers/settings/hostings_controller_test.rb which explicitly
requires ostruct for the same reason.
* fix(ai): sanitize Langfuse warn logs, normalize tool_use.input, dedup history fetch
Addresses three open CodeRabbit findings on PR #1983.
- Provider::Anthropic Langfuse rescue branches no longer include
`e.full_message` in `Rails.logger.warn`. `full_message` bundles the
backtrace + cause chain and on some SDK error types includes the
serialized request/response payload (prompt, model output). Logs
now report `#{e.class}: #{e.message}` only. Three sites:
create_langfuse_trace, log_langfuse_generation, upsert_langfuse_trace.
Note: Provider::Openai has the same pattern (copy-pasted source) —
harmonization deferred to a follow-up cleanup PR; this commit fixes
only the Anthropic provider to keep PR scope tight.
- MessageFormatter#parse_arguments now coerces any non-Hash parsed
result to `{}`. Anthropic's Messages API requires `tool_use.input`
to be a JSON object (map); a stored ToolCall::Function record whose
arguments parse to a scalar, bool, or array (corrupt row, legacy
data, cross-provider bleed) would otherwise produce a payload the
API rejects. Normal flow stores Hash arguments end-to-end so the
fix is defensive — adds 2 tests covering scalar/array JSON strings
and non-String non-Hash inputs.
- Assistant::Responder dedups the chat-history fetch. The previous
layout fired two near-identical `chat.messages.where(...).includes(
:tool_calls).ordered` queries per LLM turn (one for the OpenAI-shape
payload, one for the raw-records kwarg). A new memoized
`complete_chat_messages` fetches once; `chat_message_records` filters
out the current message via `Array#reject`, `openai_messages_payload`
iterates the cached array unchanged. One SQL query per turn instead
of two. Memoization scope = single Responder instance (per LLM call),
so cache invalidation is not a concern.
All 4370 tests pass (1 pre-existing libvips env error unrelated).
Rubocop + brakeman clean.
* fix(ci): replace sk-ant- prefixed test placeholders
Pipelock secret scanner pattern-matches `sk-ant-*` as a real Anthropic
API key and fails the PR security-scan check. Test stubs and
ClimateControl env values used `sk-ant-test`, `sk-ant-from-setting`,
`sk-ant-x`, `sk-ant-y` as obvious placeholders, but the scanner does
not care about value entropy.
Switched to `fake-anthropic-key-*` / `fake-token-*` strings so the
scanner stops flagging them. No production code touched, no behavior
change — Provider::Anthropic still accepts any non-blank token.
* feat(ai): add Anthropic batch ops + LLM cost ledger (2/5)
Implements auto_categorize, auto_detect_merchants, and
enhance_provider_merchants on Provider::Anthropic via forced tool calls,
plus the cost-ledger plumbing they need.
- Provider::Anthropic::AutoCategorizer, AutoMerchantDetector,
ProviderMerchantEnhancer each define a single output tool whose
input_schema mirrors the desired output, then force the model to call
it via tool_choice: { type: "tool", name: ..., disable_parallel_tool_use: true }.
Anthropic guarantees the tool_use.input matches the schema, so there
is no JSON parsing fragility, no <think> tag stripping, and no
json_object/json_schema fallback ladders.
- Concerns::UsageRecorder mirrors the OpenAI sibling but persists
cache_creation_input_tokens / cache_read_input_tokens to dedicated
columns instead of metadata.
- Migration adds cache_creation_tokens, cache_read_tokens (nullable
integers) to llm_usages. OpenAI rows leave them null.
- LlmUsage::PRICING gains Claude 4.x rows (opus-4-7 $15/$75, sonnet-4-6
$3/$15, haiku-4-5 $1/$5 per MTok). infer_provider returns "anthropic"
for claude-* via the existing exact/prefix lookup.
- Provider::Anthropic#chat_response now persists cache columns directly
rather than stashing them in metadata.
- 25-transaction batch cap mirrors the OpenAI provider so the cost
ledger sees the same shape regardless of which provider ran a batch.
Tests cover the forced-tool-call path, null/None normalization,
case-insensitive merchant matching, the missing-tool_use error path,
and Anthropic-specific pricing + provider inference on LlmUsage.
Stacked on #1983 (PR 1/5). 3/5 PDF + vision next.
* fix(ai): attribute Bedrock model IDs to anthropic + clean nil enum
- LlmUsage.infer_provider now returns "anthropic" for Bedrock /
Vertex shaped IDs (anthropic.* and anthropic/*), so cost-ledger
filtering by provider stays correct even when no per-MTok rate is
stored. Previously these IDs fell through to the "openai" default.
- AutoCategorizer drops the redundant nil sentinel from the
category_name enum — the union type [string, null] already permits
null, and some JSON Schema validators reject nil literals inside
enum arrays.
* test(ai): require "ostruct" in Anthropic batch op tests
Same rationale as the PR1 ostruct fix — explicit require so the tests
don't depend on ActiveSupport's transitive load when Ruby 3.5+ removes
OpenStruct from the default load path.
* feat(ai): Anthropic native PDF processing (3/5)
Implements process_pdf and extract_bank_statement on Provider::Anthropic
using the native `document` content block — no rasterization, no text
pre-extraction.
- Provider::Anthropic::PdfProcessor classifies the document, summarizes
it, and extracts statement metadata via a forced report_document_analysis
tool whose input_schema mirrors the existing Provider::Openai output
(document_type from Import::DOCUMENT_TYPES, summary, extracted_data).
- Provider::Anthropic::BankStatementExtractor returns the same
{ transactions, period, account_holder, account_number, bank_name,
opening_balance, closing_balance } shape via report_bank_statement so
downstream pdf_import code is provider-agnostic.
- Both attach the PDF as
{ type: "document", source: { type: "base64", media_type: "application/pdf", data: <b64> } }
— Claude 3.5+ / 4.x accept this natively (up to 32MB / 100 pages).
No pdf-reader, no pdftoppm, no chunking for typical statements.
- supports_pdf_processing? (introduced in PR 1) already returns true for
claude-* models, gating process_pdf with a clear error otherwise.
- Cost ledger rows are persisted via the shared UsageRecorder concern,
including cache_creation/cache_read tokens.
Tests verify the document block shape, tool_choice forcing, normalized
document_type for unknown classifications, transaction normalization
(date / amount / reference → notes), and the missing-tool_use error
path. Blank pdf_content raises before any client call.
Stacked on #1984 (PR 2/5). 4/5 pgvector RAG next.
* fix(ai): guard PDF size + surface bank-statement truncation
- PdfProcessor and BankStatementExtractor raise upfront when
pdf_content.bytesize exceeds MAX_PDF_BYTES (32 MB, matching
Anthropic's hard limit). Previously a 100 MB PDF would be
base64-encoded (~133 MB) and packed into the JSON body before
the API rejected it — peak heap ~270 MB per Sidekiq worker.
- BankStatementExtractor inspects response.stop_reason; when the
model hit max_tokens it logs a warning and flags result[:truncated]
so downstream callers know the transaction list may be incomplete.
- ISO date pattern added to statement_period_start/end schema in
PdfProcessor so the model can't return "March 2026" — Anthropic
enforces the regex via the tool's input_schema.
Tests cover the size guard (raises before any client.messages call),
truncated-result flagging, and the warning log path.
* test(ai): require "ostruct" in Anthropic PDF tests
Match the explicit ostruct require added in PR1/PR2 — same Ruby 3.5+
load-path reason.
* fix(llm-usage): include Anthropic cache tokens in estimated_cost
calculate_cost only priced prompt + completion tokens, so estimated_cost
under-reported every cached call — the cache_creation/cache_read columns this PR
added were tracked but never billed. Verified against the Anthropic dashboard: a
cached chat turn billed $0.05 but the ledger recorded $0.038; the gap was exactly
the unpriced cache tokens.
Price them relative to the input rate (Anthropic: cache write 1.25x, read 0.1x)
and thread the cache counts from both recorders (chat + batch). OpenAI rows leave
the columns null (treated as 0), so they're unaffected. Ledger now reproduces the
dashboard ($0.054 for the test turn).
* chore(ai): guard chat usage double-record; flag deferred Anthropic batch wiring
- Hardening: guard the success-path record_llm_usage with
`unless partial_usage_recorded` so a future change that emits partial usage on
a normal stream can't silently double-bill (the symptom investigated in the
#1984 review). No behavior change today — on_partial only fires from the
mid-stream-error rescue, which re-raises past this line.
- Notice: the family auto-categorize / merchant-detect / merchant-enhance flows
still hardcode get_provider(:openai). Provider::Anthropic now implements those
batch ops but they aren't wired into the family flows yet — documented with
TODOs at each site for the follow-up.
* chore(ai): point family-flow TODOs at tracking issue #2113
* chore(ai): flag deferred Anthropic PDF wiring (TODO #2113)
The PDF import + bank-statement-extract flows hardcode get_provider(:openai).
Provider::Anthropic implements process_pdf / extract_bank_statement (this PR)
but they aren't wired into these paths yet — documented with TODOs at each
site. Tracked in #2113 (broadened to cover batch ops + PDF).
* chore(anthropic-pdf): drop redundant strip_heredoc; document no-dedup
- The squiggly heredoc (<<~) already strips indentation, so the trailing
.strip_heredoc was a no-op in both PDF extractors.
- Document why BankStatementExtractor intentionally does NOT deduplicate (unlike
the OpenAI extractor): we send the whole PDF as one native document block, so
there are no overlapping-chunk artifacts to dedupe, and deduping would wrongly
merge legitimate same-day, same-amount transactions.
* fix(anthropic): cap PDF bytes below the base64-encoded request limit
Anthropic's 32 MB limit is on the Messages request body, and the PDF is sent
base64-encoded (~4/3 larger) alongside the JSON envelope, so a 32 MB raw PDF
encodes to ~42 MB and is rejected. Cap the raw bytes at 3/4 of the request
budget minus a 1 MB envelope reserve (~23 MiB). Addresses Codex review on #1985.
Deutsch | Español | Français | 日本語 | 한국어 | Português | Русский | 中文
Sure: The personal finance app for everyone
Get involved: Discord • Website • Issues
Important
This repository is a community fork of the now-abandoned Maybe Finance project.
Learn more in their final release doc.
Backstory
The Maybe Finance (archived/abandoned repo) team spent most of 2021–2022 building a full-featured personal finance and wealth management app. It even included an “Ask an Advisor” feature that connected users with a real CFP/CFA — all included with your subscription.
The business end of things didn't work out, and so they stopped developing the app in mid-2023.
After spending nearly $1 million on development (employees, contractors, data providers, infra, etc.), the team open-sourced the app. Their goal was to let users self-host it for free — and eventually launch a hosted version for a small fee.
They actually did launch that hosted version … briefly.
That also didn’t work out — at least not as a sustainable B2C business — so now here we are: hosting a community-maintained fork to keep the codebase alive and see where this can go next.
Join us!
Hosting Sure
Sure is a fully working personal finance app that can be self hosted with Docker.
Forking and Attribution
This repo is a community fork of the archived Maybe Finance repo. You’re free to fork it under the AGPLv3 license — but we’d love it if you stuck around and contributed here instead.
To stay compliant and avoid trademark issues:
- Be sure to include the original AGPLv3 license and clearly state in your README that your fork is based on Maybe Finance but is not affiliated with or endorsed by Maybe Finance Inc.
- "Maybe" is a trademark of Maybe Finance Inc. and therefore, use of it is NOT allowed in forked repositories (or the logo)
Performance Issues
With data-heavy apps, inevitably, there are performance issues. We've set up a public dashboard showing the problematic requests seen on the demo site, along with the stacktraces to help debug them.
https://www.skylight.io/app/applications/s6PEZSKwcklL/recent/6h/endpoints
Any contributions that help improve performance are very much welcome.
Local Development Setup
If you are trying to self-host the app, read this guide to get started.
The instructions below are for developers to get started with contributing to the app.
Requirements
- See
.ruby-versionfile for required Ruby version - PostgreSQL >9.3 (latest stable version recommended)
- Redis > 5.4 (latest stable version recommended)
Getting Started
cd sure
cp .env.local.example .env.local
bin/setup
bin/dev
# Optionally, load demo data
rake demo_data:default
Visit http://localhost:3000 to view the app.
If you loaded the optional demo data, log in with these credentials:
- Email:
user@example.com - Password:
Password1!
For further instructions, see guides below.
Setup Guides
- Mac dev setup
- Linux dev setup
- Windows dev setup
- Dev containers - visit this guide
One-click Install
Managed OpenClaw for Sure Finances
License and Trademarks
Maybe and Sure are both distributed under an AGPLv3 license.
- "Maybe" is a trademark of Maybe Finance, Inc.
- "Sure" is not, and refers to this community fork.
