sure/app/models
soky srm 88952e4714 Small llms improvements (#400)
* Initial implementation

* FIX keys

* Add langfuse evals support

* FIX trace upload

* Delete .claude/settings.local.json

Signed-off-by: soky srm <sokysrm@gmail.com>

* Update client.rb

* Small LLMs improvements

* Keep batch size normal

* Update categorizer

* FIX json mode

* Add reasonable alternative to matching

* FIX thinking blocks for llms

* Implement json mode support with AUTO mode

* Make auto default for everyone

* FIX linter

* Address review

* Allow export manual categories

* FIX user export

* FIX oneshot example pollution

* Update categorization_golden_v1.yml

* Update categorization_golden_v1.yml

* Trim to 100 items

* Update auto_categorizer.rb

* FIX for auto retry in auto mode

* Separate the Eval Logic from the Auto-Categorizer

The expected_null_count parameter conflates eval-specific logic with production categorization logic.

* Force json mode on evals

* Introduce a more mixed dataset

150 items, performance from a local model:

By Difficulty:
  easy: 93.22% accuracy (55/59)
  medium: 93.33% accuracy (42/45)
  hard: 92.86% accuracy (26/28)
  edge_case: 100.0% accuracy (18/18)
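
A minimal sketch of how a per-difficulty breakdown like the one above could be computed. The Result struct and the accuracy_by_difficulty helper are hypothetical names used for illustration and are not taken from the repository's eval code; only the counts come from the figures above.

```ruby
# Hypothetical per-item eval result: a difficulty label plus whether the
# predicted category matched the expected one.
Result = Struct.new(:difficulty, :correct)

# Groups results by difficulty and reports hits, totals and accuracy (%).
def accuracy_by_difficulty(results)
  results.group_by(&:difficulty).transform_values do |group|
    hits = group.count(&:correct)
    { correct: hits, total: group.size, accuracy: (100.0 * hits / group.size).round(2) }
  end
end

# Rebuild the counts reported above (55/59, 42/45, 26/28, 18/18).
counts = { "easy" => [55, 59], "medium" => [42, 45], "hard" => [26, 28], "edge_case" => [18, 18] }
results = counts.flat_map do |difficulty, (hits, total)|
  Array.new(total) { |i| Result.new(difficulty, i < hits) }
end

accuracy_by_difficulty(results).each do |difficulty, stats|
  puts "  #{difficulty}: #{stats[:accuracy]}% accuracy (#{stats[:correct]}/#{stats[:total]})"
end
```

Summing the same counts gives 141 correct out of 150 items.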

* Improve datasets

Remove Data leakage from prompts

* Create eval runs as "pending"

---------

Signed-off-by: soky srm <sokysrm@gmail.com>
Signed-off-by: Juan José Mata <juanjo.mata@gmail.com>
Co-authored-by: Juan José Mata <juanjo.mata@gmail.com>
2025-12-07 18:11:34 +01:00