* feat(ci): improve LLM eval visibility in GitHub Actions
- Add step summary output for each eval run (visible in the GitHub Actions UI)
- Add new 'summarize_evals' job that aggregates results from all matrix runs
- Generate markdown table with accuracy, cost, and duration for all evals
- Add threshold checking (fails the workflow if accuracy < 70%)
- Include status icons (✅/❌) for quick visual assessment
- Show overall pass/fail status at the end of summary
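The per-run summary and threshold check described above can be sketched roughly as follows. This is a hypothetical illustration, not the workflow's actual script: the eval name and the `ACCURACY`, `COST_USD`, and `DURATION_S` values are placeholders assumed to come from the eval step. `$GITHUB_STEP_SUMMARY` is the file GitHub Actions renders as the step summary; outside Actions the sketch falls back to a local file so it still runs.

```shell
#!/bin/sh
# Hypothetical values; in the real workflow these would come from the eval run.
EVAL_NAME="example-eval"
ACCURACY=82
COST_USD=0.14
DURATION_S=31

# Outside GitHub Actions, fall back to a local file.
: "${GITHUB_STEP_SUMMARY:=/tmp/step_summary.md}"

# Status icon based on the 70% accuracy threshold.
if [ "$ACCURACY" -ge 70 ]; then STATUS="✅"; else STATUS="❌"; fi

# Append a markdown table row; GitHub renders this in the run's summary page.
{
  echo "| Eval | Accuracy | Cost | Duration | Status |"
  echo "| --- | --- | --- | --- | --- |"
  echo "| ${EVAL_NAME} | ${ACCURACY}% | \$${COST_USD} | ${DURATION_S}s | ${STATUS} |"
} >> "$GITHUB_STEP_SUMMARY"

# Exiting non-zero fails the step, and with it the workflow.
[ "$ACCURACY" -ge 70 ] || exit 1
```

The aggregating `summarize_evals` job would do the same kind of append, but over the collected results of every matrix run.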
* Fix LLM eval workflow summary
---------
Co-authored-by: SureBot <sure-bot@we-promise.com>
Co-authored-by: Juan José Mata <juanjo.mata@gmail.com>