# Perplexity Load & Scale

## Overview

Load testing and capacity planning for the Perplexity Sonar API. Key constraint: Perplexity rate-limits requests to 50 RPM on the default tier, and every request performs a live web search with variable latency. Load tests must stay within these limits to avoid burning through credits.

## Capacity Constraints

| Constraint | Default Limit | Impact |
|-----------|--------------|--------|
| RPM (requests per minute) | 50 | Hard ceiling on throughput |
| Context window | 127K tokens | Limits conversation history |
| latency | 1-3s | Throughput: 20-50 concurrent |
| latency | 3-8s | Thro…