LangChain Cost Tuning (Python) Overview An engineer shipped a new research agent Tuesday. By Friday the Anthropic bill had grown 6x while traffic grew 1.4x. The cost dashboard — wired to — showed spend up maybe 2x. Reconciling against the provider console on Monday surfaced two compounding bugs: (1) the agent's fallback kept the default , so each logical call billed as up to 7 requests (P30); (2) retry middleware was registered below token accounting, so every retry fired twice — the aggregator summed both emissions while LangSmith deduped them by generation ID, undercounting the dashboard by…