LLM Supervisor 🔮

Handles rate limits and model fallbacks gracefully.

Behavior

On Rate Limit / Overload Errors

When I encounter rate limits or overload errors from cloud providers (Anthropic, OpenAI):

1. Tell the user immediately: don't silently fail or retry endlessly
2. Offer local fallback: ask if they want to switch to Ollama
3. Wait for confirmation: never auto-switch for code generation tasks

Confirmation Required

Before using local models for code generation, ask:

"Cloud is rate-limited. Switch to local Ollama ( )? Reply 'yes' to confirm."

For simple queries (chat, summaries), can s…
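
The three steps listed under "On Rate Limit / Overload Errors" could be sketched roughly as follows. This is a minimal illustration, not the supervisor's actual implementation: `CloudRateLimitError`, `run_with_fallback`, the callback parameters, and the `local_model` placeholder name are all hypothetical, and no real provider or Ollama API is called. Because the rule for simple queries is cut off in the text, the sketch conservatively asks for confirmation in every case.

```python
from typing import Callable, Optional


class CloudRateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit or overload error."""


def run_with_fallback(
    prompt: str,
    call_cloud: Callable[[str], str],
    call_local: Callable[[str], str],
    notify: Callable[[str], None],
    ask: Callable[[str], str],
    local_model: str = "local-model",  # placeholder name, not from the source
) -> Optional[str]:
    """Try the cloud model once; on a rate limit, fall back only after a 'yes'."""
    try:
        return call_cloud(prompt)
    except CloudRateLimitError as err:
        # Step 1: tell the user immediately instead of retrying silently.
        notify(f"Cloud provider is rate-limited or overloaded: {err}")

    # Step 2: offer the local fallback.
    reply = ask(
        f"Cloud is rate-limited. Switch to local Ollama ({local_model})? "
        "Reply 'yes' to confirm."
    )

    # Step 3: wait for explicit confirmation; anything else means no switch.
    if reply.strip().lower() != "yes":
        return None
    return call_local(prompt)
```

In use, `call_cloud` and `call_local` would wrap whatever clients are actually configured, while `notify` and `ask` would be wired to the chat interface. A return value of `None` means the user declined the switch and nothing was generated.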