Local LLM Provider

Connect to local LLM endpoints (Ollama, llama.cpp, vLLM) with automatic fallback to cloud providers. This skill lets the agent use local GPU/CPU inference while staying reliable through that fallback.

When to Use

- Running LLM inference locally for privacy (data never leaves your machine)
- Using models not available via cloud APIs (e.g., fine-tuned models, Llama variants)
- Reducing API costs for high-volume tasks
- Working offline or with intermittent connectivity
- Needing low-latency responses for interactive tasks

Setup

No additional setup require…
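
To make the "local first, cloud as fallback" idea from the overview concrete, here is a minimal sketch in Python. It is not this skill's actual implementation. The helper names `ollama_generate` and `complete_with_fallback`, the model name `llama3`, and the timeout are assumptions chosen for illustration. The only real API used is Ollama's non-streaming `/api/generate` endpoint on its default port 11434.

```python
import json
import urllib.request
from typing import Callable, List, Tuple

# A provider is a (name, callable) pair; the callable maps a prompt to text.
Provider = Tuple[str, Callable[[str], str]]


def ollama_generate(prompt: str, model: str = "llama3",
                    base_url: str = "http://localhost:11434",
                    timeout: float = 30.0) -> str:
    """Send one non-streaming request to a local Ollama server.

    The model name and timeout are illustrative assumptions.
    """
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


def complete_with_fallback(prompt: str,
                           providers: List[Provider]) -> Tuple[str, str]:
    """Try providers in order; return (name, text) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # e.g. connection refused, timeout, bad JSON
            errors.append(f"{name}: {exc}")
    # Every provider failed, so surface all collected errors at once.
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

A caller would list the local provider first and a cloud provider second, for example `complete_with_fallback(prompt, [("ollama", ollama_generate), ("cloud", call_cloud)])`, where `call_cloud` is a hypothetical function wrapping whichever cloud API is configured. Returning the provider name alongside the text makes it easy to log when a fallback actually happened.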