* feat: add SSL_CA_FILE and SSL_VERIFY environment variables to support self-signed certificates in self-hosted environments
* fix: NoMethodError by defining SSL helper methods before configure block executes
* refactor: use shared SslConfigurable module in SessionsController and remove redundant checks from the SSL initializer
* refactor: improve SSL configuration robustness and error detection accuracy
* fix: HTTParty SSL options, add file validation guards, prevent Tempfile GC, and redact URLs in error logs
* fix: correct SSL concern indentation and stub SimpleFin POST correctly in tests
* fix: normalize ssl_verify to always return boolean instead of nil
* fix: solve failing SimpleFin test
* refactor: trim unused error-handling code from SslConfigurable, replace Tempfile with fixed-path CA bundle, fix namespace pollution in initializers, and add unit tests for core SSL configuration and Langfuse CRL callback.
* fix: add require fileutils in the initializer and require ostruct in the test file.
* fix: solve autoload conflict that broke provider loading, validate all certs in PEM bundles, and add missing requires.
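A minimal sketch of the behavior described in the SSL commits above, for readers who want a concrete picture. The module name `SslEnvSketch`, its method names, and the list of strings treated as "false" are all assumptions made for illustration; they are not taken from the actual implementation.

```ruby
require "openssl"

# Hypothetical helper; names and the falsy-value list are illustrative only.
module SslEnvSketch
  FALSY = %w[0 false no off].freeze

  # Always returns true or false, never nil. Verification stays enabled
  # unless SSL_VERIFY is explicitly set to a falsy string (assumed convention).
  def self.ssl_verify?(env = ENV)
    !FALSY.include?(env["SSL_VERIFY"].to_s.strip.downcase)
  end

  # Returns the SSL_CA_FILE path only when it is set and points to a
  # readable regular file; otherwise returns nil.
  def self.ssl_ca_file(env = ENV)
    path = env["SSL_CA_FILE"].to_s.strip
    return nil if path.empty?

    File.file?(path) && File.readable?(path) ? path : nil
  end

  # Splits a PEM bundle into certificate blocks and parses each one, so an
  # invalid certificate anywhere in the bundle raises instead of being skipped.
  def self.parse_pem_bundle(pem)
    blocks = pem.scan(/-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----/m)
    blocks.map { |block| OpenSSL::X509::Certificate.new(block) }
  end
end
```

Passing the environment as an argument keeps the helpers easy to exercise in unit tests without mutating the real `ENV`.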
* Initial implementation
* FIX keys
* Add langfuse evals support
* FIX trace upload
* Delete .claude/settings.local.json
Signed-off-by: soky srm <sokysrm@gmail.com>
* Update client.rb
* Small LLM improvements
* Keep batch size normal
* Update categorizer
* FIX json mode
* Add reasonable alternative to matching
* FIX thinking blocks for LLMs
* Implement json mode support with AUTO mode
* Make auto default for everyone
* FIX linter
* Address review
* Allow export manual categories
* FIX user export
* FIX oneshot example pollution
* Update categorization_golden_v1.yml
* Update categorization_golden_v1.yml
* Trim to 100 items
* Update auto_categorizer.rb
* FIX for auto retry in auto mode
* Separate the Eval Logic from the Auto-Categorizer
The expected_null_count parameter conflates eval-specific logic with production categorization logic.
* Force json mode on evals
* Introduce a more mixed dataset
150 items, performance from a local model:
By Difficulty:
easy: 93.22% accuracy (55/59)
medium: 93.33% accuracy (42/45)
hard: 92.86% accuracy (26/28)
edge_case: 100.0% accuracy (18/18)
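The per-difficulty figures above can be re-derived, and an overall figure computed, with a short script. This is only a sketch that recomputes the percentages from the counts listed in this message; the variable and method names are illustrative and not part of the eval code.

```ruby
# Per-difficulty results copied from the figures above: [correct, total].
results = {
  "easy"      => [55, 59],
  "medium"    => [42, 45],
  "hard"      => [26, 28],
  "edge_case" => [18, 18]
}

# Accuracy as a percentage rounded to two decimals.
def accuracy(correct, total)
  (100.0 * correct / total).round(2)
end

per_bucket  = results.transform_values { |(correct, total)| accuracy(correct, total) }
correct_sum = results.values.sum { |correct, _| correct }
total_sum   = results.values.sum { |_, total| total }
overall     = accuracy(correct_sum, total_sum)

puts per_bucket
puts "overall: #{overall}% (#{correct_sum}/#{total_sum})"
# prints "overall: 94.0% (141/150)"
```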
* Improve datasets
Remove data leakage from prompts
* Create eval runs as "pending"
---------
Signed-off-by: soky srm <sokysrm@gmail.com>
Signed-off-by: Juan José Mata <juanjo.mata@gmail.com>
Co-authored-by: Juan José Mata <juanjo.mata@gmail.com>