Small LLMs improvements (#400)

* Initial implementation

* FIX keys

* Add langfuse evals support

* FIX trace upload

* Delete .claude/settings.local.json

Signed-off-by: soky srm <sokysrm@gmail.com>

* Update client.rb

* Small LLMs improvements

* Keep batch size normal

* Update categorizer

* FIX json mode

* Add reasonable alternative to matching

* FIX thinking blocks for LLMs

* Implement json mode support with AUTO mode

* Make auto default for everyone

* FIX linter

* Address review

* Allow exporting manual categories

* FIX user export

* FIX oneshot example pollution

* Update categorization_golden_v1.yml

* Update categorization_golden_v1.yml

* Trim to 100 items

* Update auto_categorizer.rb

* FIX for auto retry in auto mode

* Separate the Eval Logic from the Auto-Categorizer

The expected_null_count parameter conflates eval-specific logic with production categorization logic.

* Force json mode on evals

* Introduce a more mixed dataset

150 items, performance from a local model:

By Difficulty:
  easy: 93.22% accuracy (55/59)
  medium: 93.33% accuracy (42/45)
  hard: 92.86% accuracy (26/28)
  edge_case: 100.0% accuracy (18/18)

* Improve datasets

Remove data leakage from prompts

* Create eval runs as "pending"

---------

Signed-off-by: soky srm <sokysrm@gmail.com>
Signed-off-by: Juan José Mata <juanjo.mata@gmail.com>
Co-authored-by: Juan José Mata <juanjo.mata@gmail.com>
soky srm
2025-12-07 18:11:34 +01:00
committed by GitHub
parent bf90cad9a0
commit 88952e4714
34 changed files with 11027 additions and 42 deletions

@@ -47,5 +47,20 @@
      inputmode: "text",
      disabled: ENV["OPENAI_MODEL"].present?,
      data: { "auto-submit-form-target": "auto" } %>
    <%= form.select :openai_json_mode,
      options_for_select(
        [
          [t(".json_mode_auto"), ""],
          [t(".json_mode_strict"), "strict"],
          [t(".json_mode_none"), "none"],
          [t(".json_mode_json_object"), "json_object"]
        ],
        Setting.openai_json_mode
      ),
      { label: t(".json_mode_label") },
      { disabled: ENV["LLM_JSON_MODE"].present?,
        data: { "auto-submit-form-target": "auto" } } %>
    <p class="text-xs text-secondary mt-1"><%= t(".json_mode_help") %></p>
  <% end %>
</div>
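
The select above stores a blank value for the auto option and is disabled whenever `ENV["LLM_JSON_MODE"]` is set. The hunk does not show how the two sources are combined at request time, so the following is only a minimal sketch of one plausible resolution order. The module name `JsonModeResolver`, its methods, and the rule that unknown values fall back to auto are all assumptions for illustration, not code from this commit.

```ruby
# Hypothetical helper, not taken from the commit. It illustrates one way an
# ENV override and a stored setting could be combined into a single mode.
module JsonModeResolver
  # Explicit values offered by the select; a blank value stands for auto.
  EXPLICIT_MODES = %w[strict none json_object].freeze
  AUTO = "auto"

  # env_value:     e.g. ENV["LLM_JSON_MODE"] (may be nil or blank)
  # setting_value: e.g. Setting.openai_json_mode (may be nil or blank)
  #
  # Assumption: the ENV value wins when it is non-blank, mirroring the
  # disabled select. Anything blank or unrecognized resolves to auto.
  def self.resolve(env_value:, setting_value:)
    candidate = [env_value, setting_value]
      .map { |value| value.to_s.strip }
      .find { |value| !value.empty? }

    return AUTO if candidate.nil?

    EXPLICIT_MODES.include?(candidate) ? candidate : AUTO
  end

  # True when the UI control should be locked because ENV decides the mode.
  def self.locked_by_env?(env_value)
    !env_value.to_s.strip.empty?
  end
end

puts JsonModeResolver.resolve(env_value: nil, setting_value: "")
puts JsonModeResolver.resolve(env_value: "strict", setting_value: "none")
```

In this sketch a caller would pass `ENV["LLM_JSON_MODE"]` and `Setting.openai_json_mode` directly. Keeping the resolution in one place means the form's `disabled:` check and the request-time behaviour cannot drift apart, although the actual project may organize this differently.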