🐙 GitHub Detail
jfrog/agent-belt
By jfrog
Reproducible evaluation for AI coding agents. Run multi-turn scenarios against Claude Code, Codex, Copilot, Cursor, Gemini CLI, Goose, OpenCode, or any custom agent you plug in; verify behavior with rule checks, workspace diffs, and multi-judge LLM consensus; and pin down reliability with pass^k variance across trials. Uses Git worktrees, with an optional Docker sandbox.
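The description mentions pass^k without defining it. A minimal sketch follows, assuming the common definition in which pass^k is the probability that k independent trials of the same scenario all pass, estimated from n recorded trials with c passes as C(c, k) / C(n, k). The function name `pass_hat_k` and its parameters are hypothetical and are not taken from agent-belt; the project may compute the metric differently.

```python
from math import comb


def pass_hat_k(num_trials: int, num_passed: int, k: int) -> float:
    """Estimate the chance that k trials all pass, given observed results.

    Hypothetical helper, not part of agent-belt. Assumes the estimator
    C(num_passed, k) / C(num_trials, k): the fraction of size-k subsets
    of the recorded trials in which every trial passed.
    """
    if not 1 <= k <= num_trials:
        raise ValueError("k must be between 1 and num_trials")
    if not 0 <= num_passed <= num_trials:
        raise ValueError("num_passed must be between 0 and num_trials")
    # math.comb returns 0 when k > num_passed, so the estimate is 0.0
    # whenever fewer than k trials passed.
    return comb(num_passed, k) / comb(num_trials, k)


if __name__ == "__main__":
    # Example: 8 trials of one scenario, 6 of which passed.
    for k in (1, 2, 4, 8):
        print(f"pass^{k} = {pass_hat_k(8, 6, k):.4f}")
```

With k = 1 the estimate reduces to the plain pass rate. Larger k penalizes flaky scenarios more heavily, which is why the metric is useful as a reliability measure rather than a best-case one.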
Live Snapshot
Stars
15
Forks
1
License
Apache License 2.0
Language
Python
About this open-source project
Live information fetched from GitHub.
Default Branch
main
Open Issues
4
Watchers
15