LLM Caching

Cut LLM costs and latency with exact match, semantic, and provider-side caching layers.

When to Use This Skill

Use this skill when:

- The same or similar queries are asked repeatedly (FAQ bots, support tools)
- LLM API costs are growing and you need immediate savings
- Serving high request volumes where repeated queries cause bottlenecks
- Implementing prompt caching for long system prompts (Anthropic/OpenAI)
- Building offline-capable AI features that need response persistence

Caching Layers

Layer 1: Exact Match Cache (Redis)

Layer 2: Semantic Cache (GPTCache)

Custom Semantic Cac…
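
For reference, the exact match layer (Layer 1) named above could be sketched roughly as follows. This is a minimal illustration, not this skill's own implementation: `ExactMatchCache`, `cached_completion`, and the `call_llm` callback are hypothetical names, and an in-memory dict stands in for Redis so the example runs without a server. With redis-py, the dict reads and writes would typically become `get` and `setex` calls on a client.

```python
import hashlib
import json
import time


class ExactMatchCache:
    """In-memory stand-in for a Redis-backed exact match cache (hypothetical helper)."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self._store = {}  # key -> (expires_at, response)
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping

    @staticmethod
    def make_key(model, prompt):
        # Hash model + prompt together so a change to either yields a new key.
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return "llm:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, model, prompt):
        key = self.make_key(model, prompt)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return response

    def set(self, model, prompt, response):
        key = self.make_key(model, prompt)
        self._store[key] = (self._clock() + self._ttl, response)


def cached_completion(cache, model, prompt, call_llm):
    """Return a cached response if present; otherwise call the LLM and store the result."""
    hit = cache.get(model, prompt)
    if hit is not None:
        return hit
    response = call_llm(model, prompt)  # call_llm is an assumed caller-supplied function
    cache.set(model, prompt, response)
    return response
```

Only byte-identical (model, prompt) pairs hit this cache; even a whitespace difference produces a new key, which is the gap a semantic layer is meant to cover. The TTL bounds how long a stale answer can be served.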