GAIA Architecture Comparison Skill

Compare ruflo's GAIA benchmark harness against the Princeton HAL reference implementation and other open-source harnesses to understand capability gaps and prioritize improvements.

When to use

- Planning the next iteration of GAIA work
- Evaluating which architectural change has the highest pass-rate ROI
- Onboarding a new contributor to the benchmark codebase

Architecture overview

ruflo harness (current)

HAL reference (Princeton)

HAL uses a similar loop but with:

- OpenAI function calling as the tool interface
- BrowserBase / Playwright for real browser aut…