Evaluation Automations

Automate AI skill and codebase improvement by iteratively testing, updating, and refining files using objective metrics ...

Skill Quality Evaluator - Assess and score AI agent skill output quality. Trigger on: 'evaluate', 'quality check', 'scor...

Production-grade coding workflow, execution scaffolding, and tuning skills for OpenClaw agents, part of the MyClaw.ai ec...

A decision-oriented deep research skill for OpenClaw / Codex agents, emphasizing task routing, traceable evidence, current-state verification, counter-evidence constraints, and auditable delivery.

Fellow: experimentation engine — runs controlled benchmark experiments to validate skill improvements.

Tag: evaluation