* Initial implementation
* FIX keys
* Add langfuse evals support
* FIX trace upload
* Delete .claude/settings.local.json
Signed-off-by: soky srm <sokysrm@gmail.com>
* Update client.rb
* Small LLMs improvements
* Keep batch size normal
* Update categorizer
* FIX json mode
* Add reasonable alternative to matching
* FIX thinking blocks for llms
* Implement json mode support with AUTO mode
* Make auto default for everyone
* FIX linter
* Address review
* Allow export manual categories
* FIX user export
* FIX oneshot example pollution
* Update categorization_golden_v1.yml
* Update categorization_golden_v1.yml
* Trim to 100 items
* Update auto_categorizer.rb
* FIX for auto retry in auto mode
* Separate the Eval Logic from the Auto-Categorizer
The expected_null_count parameter conflates eval-specific logic with production categorization logic.
* Force json mode on evals
* Introduce a more mixed dataset
150 items, performance from a local model:
By Difficulty:
easy: 93.22% accuracy (55/59)
medium: 93.33% accuracy (42/45)
hard: 92.86% accuracy (26/28)
edge_case: 100.0% accuracy (18/18)
* Improve datasets
Remove Data leakage from prompts
* Create eval runs as "pending"
---------
Signed-off-by: soky srm <sokysrm@gmail.com>
Signed-off-by: Juan José Mata <juanjo.mata@gmail.com>
Co-authored-by: Juan José Mata <juanjo.mata@gmail.com>
* Implement support for generic OpenAI api
- Implements support to route requests to any openAI capable provider ( Deepsek, Qwen, VLLM, LM Studio, Ollama ).
- Keeps support for pure OpenAI and uses the new better responses api
- Uses the /chat/completions api for the generic providers
- If uri_base is not set, uses default implementation.
* Fix json handling and indentation
* Fix linter error indent
* Fix tests to set env vars
* Fix updating settings
* Change to prefix checking for OAI models
* FIX check model if custom uri is set
* Change chat to sync calls
Some local models don't support streaming. Revert to sync calls for generic OAI api
* Fix tests
* Fix tests
* Fix for gpt5 message extraction
- Finds the message output by filtering for "type" == "message" instead of assuming it's at index 0
- Safely extracts the text using safe navigation operators (&.)
- Raises a clear error if no message content is found
- Parses the JSON as before
* Add more langfuse logging
- Add Langfuse to auto categorizer and merchant detector
- Fix monitoring on streaming chat responses
- Add Langfuse traces also for model errors now
* Update app/models/provider/openai.rb
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: soky srm <sokysrm@gmail.com>
* handle nil function results explicitly
* Exposing some config vars.
* Linter and nitpick comments
* Drop back to `gpt-4.1` as default for now
* Linter
* Fix for strict tool schema in Gemini
- This fixes tool calling in Gemini OpenAI api
- Fix for getTransactions function, page size is not used.
---------
Signed-off-by: soky srm <sokysrm@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Juan José Mata <juanjo.mata@gmail.com>