ML Training Debugger Version : 1.0.0 Type : Agent-based skill with SDK implementation Domain : Machine learning training diagnostics Description Diagnose machine learning training failures including loss divergence, mode collapse, gradient issues, architecture problems, and optimization failures. This skill spawns a specialist ML debugging agent that systematically analyzes training artifacts to identify root causes and propose evidence-based fixes. Use this skill when encountering training failures, when loss curves exhibit pathological behavior, when models produce degenerate outputs, when…