# Podium Webhook Reliability

## Overview

Receive Podium webhooks in production without forged events, double-charged AI side-effects, lost notifications, or out-of-order conversation events. This is not an introductory webhook walkthrough; it is the receiver code your integration runs when Podium retries a 5xx response six times over 24 hours, when a leaked secret lets an attacker POST forged events, when a batch delivery arrives with `conversation.deleted` ahead of `conversation.created`, and when on-call needs to drain and replay 800 failed events without re-firing the ones that already succeeded.

The six production failures this skill pr…
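As a rough orientation, the checks named in the overview (signature, replay window, dedup) can be sketched in a few lines of standard-library Python. This is a minimal sketch, not the shipped receiver: the `t=<unix_ts>,v1=<hex_hmac>` header format is the one the skill's scripts assume and should be verified against the Podium developer docs, and the names `sign`, `check`, and the in-memory `seen` set are hypothetical stand-ins for the real Redis-backed dedup.

```python
from __future__ import annotations

import hashlib
import hmac
import time

REPLAY_WINDOW_SECONDS = 300  # 5-minute window, as described in the overview


def sign(secret: bytes, body: bytes, ts: int) -> str:
    """Build a header value in the assumed 't=<unix_ts>,v1=<hex_hmac>' format."""
    mac = hmac.new(secret, f"{ts}.".encode("utf-8") + body, hashlib.sha256)
    return f"t={ts},v1={mac.hexdigest()}"


def check(secret: bytes, body: bytes, header: str, seen: set[str],
          event_id: str, now: float | None = None) -> str:
    """Run signature, replay-window, and dedup checks in that order."""
    parts = dict(p.split("=", 1) for p in header.split(",") if "=" in p)
    ts, sig = parts.get("t"), parts.get("v1")
    if not ts or not sig or not ts.isdigit():
        return "rejected_malformed"
    expected = hmac.new(secret, f"{ts}.".encode("utf-8") + body,
                        hashlib.sha256).hexdigest()
    # Compare as bytes so a non-ASCII signature cannot raise TypeError.
    if not hmac.compare_digest(expected.encode("utf-8"), sig.encode("utf-8")):
        return "rejected_forged"
    now = time.time() if now is None else now
    if abs(now - int(ts)) > REPLAY_WINDOW_SECONDS:
        return "rejected_stale"
    if event_id in seen:  # hypothetical in-memory stand-in for SET NX EX
        return "duplicate"
    seen.add(event_id)
    return "ok"
```

Passing `now` explicitly keeps the replay-window check deterministic in tests; a production receiver would use the wall clock and a shared dedup store.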

|| true)\n[ -z \"$STAGED\" ] && exit 0\n\n# Match \"==\" within 3 lines of \"hmac\" or \"signature\" (case-insensitive).\nif echo \"$STAGED\" | xargs -r grep -nE \"(hmac|signature)\" 2>/dev/null \\\n | awk -F: '{print $1}' | sort -u \\\n | xargs -r grep -nE \"^[^#]*\\b==\\b\" \\\n | grep -iE \"(hmac|signature|sig)\" >/dev/null; then\n echo \"ERR_WHK_002 risk: '==' compare detected near hmac/signature in staged files.\"\n echo \"Use hmac.compare_digest() (Python) or crypto.timingSafeEqual() (Node).\"\n exit 1\nfi\n```\n\nInstall with: `chmod +x .git/hooks/pre-commit`. The grep is conservative (false positives possible on incidental matches); the engineer's intent is to surface every site that needs human review.\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":7579,"content_sha256":"72e916b21e845932d57018cb657de5d8ed03fceee4c0706c2444cd8f59761123"},{"filename":"references/implementation.md","content":"# Implementation Reference — podium-webhook-reliability\n\nLanguage-portability layer plus Redis schema plus DLQ backend choices plus operator workflow.\n\n## Node.js / TypeScript port\n\nThe Python receiver translates to Node + Express + ioredis. 
The two notable differences: `hmac.compare_digest` becomes `crypto.timingSafeEqual` (which requires the two Buffers to be the same length — wrap with an explicit length check), and raw body access requires the `express.raw()` middleware because `express.json()` consumes the body before signature verification can run.\n\n```typescript\nimport express from \"express\";\nimport * as crypto from \"crypto\";\nimport IORedis from \"ioredis\";\n\nconst SECRET = Buffer.from(process.env.PODIUM_WEBHOOK_SECRET!, \"utf-8\");\nconst R = new IORedis(process.env.REDIS_URL || \"redis://localhost:6379/0\");\nconst app = express();\n\n// CRITICAL: raw middleware MUST come before json — signature is on raw bytes.\napp.post(\"/webhooks/podium\",\n express.raw({ type: \"application/json\", limit: \"512kb\" }),\n async (req, res) => {\n const raw = req.body as Buffer; // raw bytes; do not re-encode\n const header = req.header(\"X-Podium-Signature\") || \"\";\n\n const parts: Record\u003cstring, string> = {};\n for (const p of header.split(\",\")) {\n const [k, v] = p.split(\"=\", 2);\n if (k && v) parts[k] = v;\n }\n const ts = parts.t, sig = parts.v1;\n if (!ts || !sig) return res.status(401).send(\"missing signature parts\");\n\n const signedPayload = Buffer.concat([Buffer.from(`${ts}.`, \"utf-8\"), raw]);\n const expected = crypto.createHmac(\"sha256\", SECRET).update(signedPayload).digest(\"hex\");\n\n // Constant-time compare. 
Buffers MUST be the same length or timingSafeEqual throws.\n const recvBuf = Buffer.from(sig, \"utf-8\");\n const expBuf = Buffer.from(expected, \"utf-8\");\n if (recvBuf.length !== expBuf.length || !crypto.timingSafeEqual(recvBuf, expBuf)) {\n return res.status(401).send(\"signature mismatch\");\n }\n\n if (Math.abs(Date.now() / 1000 - parseInt(ts, 10)) > 300) {\n return res.status(401).send(\"replay window exceeded\");\n }\n\n const event = JSON.parse(raw.toString(\"utf-8\"));\n const claimed = await R.set(`podium:evt:${event.id}`, \"1\", \"EX\", 86400, \"NX\");\n if (claimed !== \"OK\") return res.status(200).json({ status: \"duplicate\", event_id: event.id });\n\n try {\n await dispatch(event);\n return res.status(200).json({ status: \"ok\", event_id: event.id });\n } catch (e: any) {\n await R.lpush(\"podium:dlq\", JSON.stringify({\n event_id: event.id, raw_body: raw.toString(\"utf-8\"),\n signature_header: header, received_at: Date.now() / 1000,\n exception: `${e?.name}: ${e?.message}`,\n }));\n return res.status(500).send(\"dispatch failed\");\n }\n }\n);\n```\n\n## Redis schema\n\n| Key pattern | Type | TTL | Purpose |\n|---|---|---|---|\n| `podium:evt:{event_id}` | string | 86400s (24h) | Dedup claim. Value is `\"1\"`, only existence matters. |\n| `podium:dlq` | list | none | Failed event entries. LPUSH on persist, LRANGE for inspection, BRPOPLPUSH for atomic drain. |\n| `podium:dlq:replayed` | list | 30 days | Audit trail of replayed entries. Append after successful replay. |\n| `podium:rate:{client_ip}` | string | 60s | Optional inbound rate-limit counter (defense in depth against probing). 
|\n\n**ACL recommendation**: a dedicated Redis user for the receiver with only the commands it needs:\n\n- `SET`, `GET`, `EXPIRE` (dedup)\n- `LPUSH`, `LRANGE`, `LLEN`, `BRPOPLPUSH` (DLQ)\n- `INCR`, `EXPIRE` (rate-limit, if used)\n\nNever grant `FLUSHDB`, `KEYS`, or `CONFIG`.\n\n## DLQ backend choice matrix\n\n| Backend | Throughput ceiling | Durability | When to choose |\n|---|---|---|---|\n| Redis list | ~50k/s LPUSH; bounded by Redis persistence config | AOF + replicas | Default for prod. Pair with an hourly archiver to S3/GCS. |\n| SQLite (WAL mode) | ~1k/s sustained writes | fsync per commit | Single-node deployments; dev. Backup the file. |\n| JSONL file | ~10k/s with buffered writes | Depends on FS flush | Fallback when nothing else is available. Easy to grep, ugly to drain. |\n| Cloud queues (SQS / Pub/Sub) | ~unbounded | Provider SLA | Skip — adds latency to the failure path and is not needed at this scale. |\n\n### Hourly archive (Redis → S3)\n\n```python\nimport boto3, redis, json, gzip, os, time\nfrom io import BytesIO\n\nR = redis.from_url(os.environ[\"REDIS_URL\"])\nS3 = boto3.client(\"s3\")\nBUCKET = \"podium-dlq-archive\"\n\ndef archive_once() -> int:\n pipe = R.pipeline()\n pipe.lrange(\"podium:dlq\", 0, -1)\n pipe.delete(\"podium:dlq\")\n entries, _ = pipe.execute()\n if not entries:\n return 0\n buf = BytesIO()\n with gzip.GzipFile(fileobj=buf, mode=\"wb\") as gz:\n for entry in entries:\n gz.write(entry + b\"\\n\")\n key = f\"dlq/{int(time.time())}-{len(entries)}.jsonl.gz\"\n S3.put_object(Bucket=BUCKET, Key=key, Body=buf.getvalue())\n return len(entries)\n```\n\nNote: redis-py pipelines default to `transaction=True`, so `LRANGE` + `DEL` run inside one `MULTI`/`EXEC` and no entry can slip in between them. The remaining risk is the window after `DEL`: if `put_object` fails, the removed entries are gone. For prod, use `BRPOPLPUSH` into a `podium:dlq:archiving` list, archive that, then delete it only after the upload succeeds. 
A crash between upload and delete re-archives the same entries on the next run, which duplicates entries in the archive but loses none.\n\n## In-memory dedup fallback (dev only)\n\nFor local dev without Redis, the receiver falls back to an in-process `dict` + lock + periodic eviction. Documented constraints:\n\n- **Not shared across processes.** Two uvicorn workers will each have their own cache and duplicates can leak.\n- **Lost on restart.** Within the first 24h after restart, replays may be processed twice.\n- **Bounded size.** Hard-cap at 100k entries; eviction drops the oldest 10% when over the cap.\n\nProduction deployments MUST use Redis or SQLite. The in-memory backend is for `pytest` and `uvicorn --reload`, nothing else.\n\n## Operator workflow\n\n### Drain the DLQ after a handler fix\n\n```bash\n# 1. Verify the fix is deployed\ncurl -fsS https://your-receiver.example.com/healthz\n\n# 2. Inspect the DLQ to confirm what will replay\nredis-cli LLEN podium:dlq\nredis-cli LRANGE podium:dlq 0 4 | jq .\n\n# 3. Drain at a conservative rate that the downstream handler can absorb\npython3 scripts/dlq_replay.py \\\n --target-url https://your-receiver.example.com/webhooks/podium \\\n --batch-size 25 --rate-per-sec 10\n\n# 4. Confirm the DLQ is drained\nredis-cli LLEN podium:dlq\n\n# 5. Spot-check a replayed event landed correctly in the downstream system\n```\n\n### Signature-failure incident response\n\n```bash\n# 1. Identify the source IP(s) of failed signatures from access logs\n# 2. Compute the rate — if > a few per minute, it's a probe\n# 3. Add an iptables / Caddy / nginx-level block for the source IP(s)\n# 4. Confirm the legitimate Podium delivery IP ranges are still allowed\n# 5. 
Audit recent rotations — if the signing secret was rotated and a\n# Podium-side webhook config was NOT updated, all current deliveries fail.\n```\n\n## Library packaging notes\n\nThis skill ships the receiver inline in `scripts/webhook_server.py` rather than as a separate pip package. The rationale: the receiver is ~200 lines, every integration needs a custom dispatch function and secret-store binding, and an extracted package would require versioning that adds maintenance overhead without enabling reuse. If three concrete callers depend on identical behavior, promote to `@intentsolutions/podium-webhook` on npm or `intent-podium-webhook` on PyPI.\n\n## Testing matrix (what `tests/` should cover when this skill is integrated)\n\n| Test | Type | What it proves |\n|---|---|---|\n| `test_signature_verify_happy_path` | unit | Correct body + secret + header → True |\n| `test_signature_verify_mutated_body` | unit | One byte changed → False |\n| `test_signature_verify_uses_compare_digest` | unit | Verify via mocking that `==` is NOT used |\n| `test_replay_window_skew_positive` | unit | ts > now + 300s → False |\n| `test_replay_window_skew_negative` | unit | ts \u003c now - 300s → False |\n| `test_dedup_claim_atomic` | integration | 100 concurrent claims of same id → exactly 1 succeeds |\n| `test_dedup_ttl_matches_retry_ceiling` | unit | TTL is exactly 86400s |\n| `test_dlq_persist_before_5xx` | integration | Handler raise → DLQ entry exists BEFORE response sent |\n| `test_dlq_replay_skips_duplicates` | integration | Replay 100 entries; 80 already processed → only 20 dispatch |\n| `test_batch_sort_by_occurred_at` | unit | Out-of-order input → sorted output |\n| `test_out_of_order_defers_to_dlq` | integration | delete-before-create defers, replay succeeds |\n| `test_fail_closed_on_redis_outage` | chaos | Redis killed → receiver returns 503, not 200 |\n","content_type":"text/markdown; 
charset=utf-8","language":"markdown","size":8673,"content_sha256":"ed7a5c5d22b17c5b10bc950ef81e9d3a164e9e9d2c8091ebda1fa67064056c84"},{"filename":"scripts/dedup_check.py","content":"#!/usr/bin/env python3\n\"\"\"dedup_check.py — check whether an event_id is already in the Podium webhook dedup cache.\n\nRead-only inspection. Does NOT mutate the cache. Useful for confirming whether\na specific event_id was already processed (and would be rejected as a duplicate\non replay) before a manual replay.\n\nUsage:\n dedup_check.py --event-id evt_\u003cyour-event-identifier> \\\\\n [--redis-url redis://localhost:6379/0] \\\\\n [--key-prefix podium:evt:]\n\nExit codes:\n 0 first sight — event_id is NOT cached; a replay would be dispatched\n 1 duplicate — event_id IS cached; a replay would be rejected as duplicate\n 2 backend unreachable / configuration error\n\"\"\"\n\nfrom __future__ import annotations\nimport argparse\nimport os\nimport sys\n\n\ndef main() -> int:\n ap = argparse.ArgumentParser(\n description=__doc__,\n formatter_class=argparse.RawDescriptionHelpFormatter,\n )\n ap.add_argument(\"--event-id\", required=True)\n ap.add_argument(\"--redis-url\", default=os.environ.get(\"REDIS_URL\", \"redis://localhost:6379/0\"))\n ap.add_argument(\"--key-prefix\", default=\"podium:evt:\")\n args = ap.parse_args()\n\n try:\n import redis # type: ignore\n except ImportError:\n print(\"ERR_WHK_CFG redis package not installed — `pip install redis`\", file=sys.stderr)\n return 2\n\n try:\n r = redis.from_url(args.redis_url, decode_responses=True)\n r.ping()\n except Exception as e:\n print(f\"ERR_WHK_006 dedup_backend_unavailable: {e}\", file=sys.stderr)\n return 2\n\n key = f\"{args.key_prefix}{args.event_id}\"\n try:\n exists = r.exists(key)\n ttl = r.ttl(key) if exists else None\n except Exception as e:\n print(f\"ERR_WHK_006 dedup query failed: {e}\", file=sys.stderr)\n return 2\n\n if exists:\n print(f\"duplicate: key={key} ttl_remaining={ttl}s\", file=sys.stderr)\n return 1\n 
print(f\"first_sight: key={key} (not cached)\", file=sys.stderr)\n return 0\n\n\nif __name__ == \"__main__\":\n sys.exit(main())\n","content_type":"text/x-python; charset=utf-8","language":"python","size":2048,"content_sha256":"ec17a390260bbcfb630bac6436bf7ffa918715d66950d9aacba531ae9e1a3814"},{"filename":"scripts/dlq_replay.py","content":"#!/usr/bin/env python3\n\"\"\"dlq_replay.py — drain the Podium webhook DLQ and re-POST entries to a target receiver.\n\nEach DLQ entry carries the original raw_body and signature_header captured at\npersist time. Replay re-POSTs them as-is so the receiver's signature verification,\nreplay-window check, and dedup all run on the replayed delivery — events that\nhave already been processed are correctly rejected as duplicates.\n\nUsage:\n dlq_replay.py \\\\\n --target-url https://your-receiver.example.com/webhooks/podium \\\\\n [--redis-url redis://localhost:6379/0] \\\\\n [--dlq-key podium:dlq] \\\\\n [--batch-size 25] \\\\\n [--rate-per-sec 10] \\\\\n [--max-events 0] # 0 = drain entire queue\n [--ignore-replay-window] # add header to bypass replay check at receiver\n\nExit codes:\n 0 drain complete — all available entries replayed (or max-events reached)\n 1 one or more replays returned non-2xx — see stderr for the count\n 2 configuration error (missing redis, missing url)\n 3 DLQ empty at start (nothing to do)\n\"\"\"\n\nfrom __future__ import annotations\nimport argparse\nimport json\nimport os\nimport sys\nimport time\nimport urllib.request\nimport urllib.error\n\n\ndef post(target: str, body: bytes, signature_header: str, timeout: float = 10.0) -> int:\n req = urllib.request.Request(\n target,\n data=body,\n method=\"POST\",\n headers={\n \"Content-Type\": \"application/json\",\n \"X-Podium-Signature\": signature_header,\n \"X-Podium-Replay\": \"1\",\n },\n )\n try:\n with urllib.request.urlopen(req, timeout=timeout) as resp:\n return resp.status\n except urllib.error.HTTPError as e:\n return e.code\n except 
urllib.error.URLError as e:\n print(f\" transport error: {e}\", file=sys.stderr)\n return 0\n\n\ndef main() -> int:\n ap = argparse.ArgumentParser(\n description=__doc__,\n formatter_class=argparse.RawDescriptionHelpFormatter,\n )\n ap.add_argument(\"--target-url\", required=True)\n ap.add_argument(\"--redis-url\", default=os.environ.get(\"REDIS_URL\", \"redis://localhost:6379/0\"))\n ap.add_argument(\"--dlq-key\", default=\"podium:dlq\")\n ap.add_argument(\"--batch-size\", type=int, default=25)\n ap.add_argument(\"--rate-per-sec\", type=float, default=10.0)\n ap.add_argument(\"--max-events\", type=int, default=0)\n args = ap.parse_args()\n\n try:\n import redis # type: ignore\n except ImportError:\n print(\"ERR_WHK_CFG redis package not installed\", file=sys.stderr)\n return 2\n\n try:\n r = redis.from_url(args.redis_url, decode_responses=True)\n r.ping()\n except Exception as e:\n print(f\"ERR_WHK_006 dedup_backend_unavailable: {e}\", file=sys.stderr)\n return 2\n\n total = r.llen(args.dlq_key)\n if total == 0:\n print(\"DLQ empty — nothing to replay\", file=sys.stderr)\n return 3\n\n cap = args.max_events if args.max_events > 0 else total\n cap = min(cap, total)\n print(f\"draining up to {cap} of {total} DLQ entries from {args.dlq_key}\", file=sys.stderr)\n\n sleep_per = 1.0 / max(args.rate_per_sec, 0.01)\n drained = 0\n succeeded = 0\n duplicate = 0\n failed = 0\n\n while drained \u003c cap:\n batch = []\n # Use RPOP to drain from oldest end; entries were LPUSHed at persist time.\n for _ in range(min(args.batch_size, cap - drained)):\n raw = r.rpop(args.dlq_key)\n if raw is None:\n break\n batch.append(raw)\n if not batch:\n break\n\n for raw_entry in batch:\n drained += 1\n try:\n entry = json.loads(raw_entry)\n except json.JSONDecodeError as e:\n print(f\" skip: malformed DLQ entry: {e}\", file=sys.stderr)\n failed += 1\n continue\n\n body = entry.get(\"raw_body\", \"\").encode(\"utf-8\")\n sig = entry.get(\"signature_header\", \"\")\n if not body or not 
sig:\n print(f\" skip: entry missing body or signature (event_id={entry.get('event_id')})\", file=sys.stderr)\n failed += 1\n continue\n\n status = post(args.target_url, body, sig)\n if 200 \u003c= status \u003c 300:\n # Distinguish duplicate (expected) from genuine ok via a heuristic\n # — both are 2xx; we count both as success here. The receiver's\n # JSON response carries the explicit status: \"duplicate\" tag.\n succeeded += 1\n # Crude duplicate count via response body would require parsing;\n # left to the caller's log review.\n else:\n failed += 1\n print(f\" fail event_id={entry.get('event_id')} status={status}\", file=sys.stderr)\n\n time.sleep(sleep_per)\n\n print(\n json.dumps(\n {\n \"drained\": drained,\n \"succeeded_2xx\": succeeded,\n \"failed_non_2xx\": failed,\n \"duplicate_inferred\": duplicate,\n \"remaining\": r.llen(args.dlq_key),\n },\n indent=2,\n )\n )\n return 0 if failed == 0 else 1\n\n\nif __name__ == \"__main__\":\n sys.exit(main())\n","content_type":"text/x-python; charset=utf-8","language":"python","size":5248,"content_sha256":"f3d1c08ecb8eefe5343ae12a005be4638aa620f4bc615b8acbbf05acdd779267"},{"filename":"scripts/signature_verify.py","content":"#!/usr/bin/env python3\n\"\"\"signature_verify.py — verify a captured Podium webhook payload against a signing secret.\n\nUse this for incident forensics: was a captured POST genuine, or forged?\nReads the body from a file (preserving raw bytes) and parses the signature header\ninto (t, v1), then runs the same HMAC-SHA256 + constant-time compare the receiver uses.\n\nUsage:\n signature_verify.py \\\\\n --body-file /tmp/captured_webhook_body.json \\\\\n --signature-header \"t=\u003cunix_ts>,v1=\u003chex_hmac>\" \\\\\n --secret-env PODIUM_WEBHOOK_SECRET \\\\\n [--replay-window-seconds 300] \\\\\n [--ignore-replay-window]\n\nExit codes:\n 0 signature valid AND within replay window\n 1 signature mismatch\n 2 signature valid BUT replay window exceeded\n 3 configuration error (missing env, missing 
file, malformed header)\n\"\"\"\n\nfrom __future__ import annotations\nimport argparse\nimport hashlib\nimport hmac\nimport os\nimport sys\nimport time\nfrom pathlib import Path\n\n\ndef parse_header(header_value: str) -> dict[str, str]:\n out: dict[str, str] = {}\n for p in header_value.split(\",\"):\n if \"=\" in p:\n k, v = p.split(\"=\", 1)\n out[k.strip()] = v.strip()\n return out\n\n\ndef main() -> int:\n ap = argparse.ArgumentParser(\n description=__doc__,\n formatter_class=argparse.RawDescriptionHelpFormatter,\n )\n ap.add_argument(\"--body-file\", required=True, type=Path)\n ap.add_argument(\"--signature-header\", required=True, help='Full header value, e.g. \"t=\u003cunix_ts>,v1=\u003chex_hmac>\"')\n ap.add_argument(\"--secret-env\", required=True, help=\"Env var name holding the webhook signing secret\")\n ap.add_argument(\"--replay-window-seconds\", type=int, default=300)\n ap.add_argument(\n \"--ignore-replay-window\", action=\"store_true\", help=\"Verify signature only; do not check the timestamp window\"\n )\n args = ap.parse_args()\n\n secret = os.environ.get(args.secret_env)\n if not secret:\n print(f\"ERR_WHK_CFG missing env var {args.secret_env}\", file=sys.stderr)\n return 3\n\n if not args.body_file.exists():\n print(f\"ERR_WHK_CFG body file not found: {args.body_file}\", file=sys.stderr)\n return 3\n\n raw = args.body_file.read_bytes()\n parts = parse_header(args.signature_header)\n ts, sig = parts.get(\"t\"), parts.get(\"v1\")\n if not ts or not sig:\n print(f\"ERR_WHK_012 signature_format_invalid — got keys: {list(parts)}\", file=sys.stderr)\n return 3\n\n signed_payload = f\"{ts}.\".encode(\"utf-8\") + raw\n expected = hmac.new(secret.encode(\"utf-8\"), signed_payload, hashlib.sha256).hexdigest()\n\n if not hmac.compare_digest(expected, sig):\n print(\"ERR_WHK_002 signature_mismatch\", file=sys.stderr)\n print(f\" expected: {expected[:8]}... ({len(expected)} hex chars)\", file=sys.stderr)\n print(f\" received: {sig[:8]}... 
({len(sig)} hex chars)\", file=sys.stderr)\n return 1\n\n if not args.ignore_replay_window:\n try:\n ts_int = int(ts)\n except ValueError:\n print(f\"ERR_WHK_012 timestamp not an integer: {ts!r}\", file=sys.stderr)\n return 3\n skew = abs(time.time() - ts_int)\n if skew > args.replay_window_seconds:\n print(\n f\"ERR_WHK_003 replay_window_exceeded — skew={skew:.0f}s > {args.replay_window_seconds}s\",\n file=sys.stderr,\n )\n return 2\n\n print(\"ok: signature valid and within window\", file=sys.stderr)\n return 0\n\n\nif __name__ == \"__main__\":\n sys.exit(main())\n","content_type":"text/x-python; charset=utf-8","language":"python","size":3520,"content_sha256":"39b3ccd61af2495fe9dab3c9a44108767248d4a83277a813cc735ee8f70ca229"},{"filename":"scripts/webhook_server.py","content":"#!/usr/bin/env python3\n\"\"\"webhook_server.py — FastAPI receiver for Podium webhooks with the full reliability pipeline.\n\nPipeline: signature verify → replay window → JSON parse → dedup claim → batch sort →\nsafe dispatch (try/except → DLQ).\n\nRun:\n export PODIUM_WEBHOOK_SECRET={your-webhook-secret}\n export REDIS_URL=redis://localhost:6379/0\n uvicorn scripts.webhook_server:app --host 0.0.0.0 --port 8080\n\nEndpoints:\n POST /webhooks/podium — receive a webhook delivery\n GET /healthz — health probe (returns dedup backend status)\n\"\"\"\n\nfrom __future__ import annotations\nimport hmac\nimport hashlib\nimport json\nimport os\nimport sys\nimport time\nimport logging\nfrom typing import Any\n\nfrom fastapi import FastAPI, Request, HTTPException, Header\n\ntry:\n import redis.asyncio as redis_async\n\n _HAS_REDIS = True\nexcept ImportError:\n _HAS_REDIS = False\n\nREPLAY_WINDOW_SECONDS = int(os.environ.get(\"PODIUM_REPLAY_WINDOW\", \"300\"))\nDEDUP_TTL_SECONDS = int(os.environ.get(\"PODIUM_DEDUP_TTL\", \"86400\"))\nKEY_PREFIX = \"podium:evt:\"\nDLQ_KEY = \"podium:dlq\"\n\nSECRET_BYTES: bytes = os.environ.get(\"PODIUM_WEBHOOK_SECRET\", \"\").encode(\"utf-8\")\nif not SECRET_BYTES:\n 
print(\"FATAL: PODIUM_WEBHOOK_SECRET not set\", file=sys.stderr)\n # Do not exit at import time — let healthz expose the misconfiguration\n\nREDIS_URL = os.environ.get(\"REDIS_URL\", \"redis://localhost:6379/0\")\n\nlogging.basicConfig(level=logging.INFO, format='{\"ts\": \"%(asctime)s\", \"lvl\": \"%(levelname)s\", \"msg\": \"%(message)s\"}')\nlog = logging.getLogger(\"podium-webhook\")\n\napp = FastAPI()\n\n_redis_client: Any = None\n_memory_dedup: dict[str, float] = {}\n_memory_dlq: list[dict] = []\n\n\nasync def _get_redis():\n global _redis_client\n if _redis_client is None and _HAS_REDIS:\n _redis_client = redis_async.from_url(REDIS_URL, decode_responses=True)\n return _redis_client\n\n\ndef verify_signature(body: bytes, header_value: str) -> tuple[bool, str | None]:\n \"\"\"Returns (is_valid, timestamp_str). Constant-time compare via hmac.compare_digest.\"\"\"\n if not header_value or not SECRET_BYTES:\n return False, None\n parts: dict[str, str] = {}\n for p in header_value.split(\",\"):\n if \"=\" in p:\n k, v = p.split(\"=\", 1)\n parts[k.strip()] = v.strip()\n ts, sig = parts.get(\"t\"), parts.get(\"v1\")\n if not ts or not sig:\n return False, None\n signed_payload = f\"{ts}.\".encode(\"utf-8\") + body\n expected = hmac.new(SECRET_BYTES, signed_payload, hashlib.sha256).hexdigest()\n return hmac.compare_digest(expected, sig), ts\n\n\ndef within_replay_window(ts_str: str | None, window: int = REPLAY_WINDOW_SECONDS) -> bool:\n if not ts_str:\n return False\n try:\n ts = int(ts_str)\n except (TypeError, ValueError):\n return False\n return abs(time.time() - ts) \u003c= window\n\n\nasync def claim_event(event_id: str) -> bool:\n \"\"\"Atomic dedup claim. 
Returns True on first sight, False on duplicate.\n Raises on backend unavailable — caller decides fail-open vs fail-closed.\"\"\"\n r = await _get_redis()\n if r is not None:\n try:\n res = await r.set(f\"{KEY_PREFIX}{event_id}\", \"1\", nx=True, ex=DEDUP_TTL_SECONDS)\n return bool(res)\n except Exception as e:\n log.error(f\"ERR_WHK_006 dedup_backend_unavailable: {e}\")\n raise\n # Memory fallback — dev only.\n now = time.time()\n expired = [k for k, v in _memory_dedup.items() if v \u003c now]\n for k in expired:\n _memory_dedup.pop(k, None)\n if event_id in _memory_dedup:\n return False\n _memory_dedup[event_id] = now + DEDUP_TTL_SECONDS\n return True\n\n\nasync def dlq_persist(entry: dict) -> None:\n entry[\"dlq_persisted_at\"] = time.time()\n r = await _get_redis()\n if r is not None:\n try:\n await r.lpush(DLQ_KEY, json.dumps(entry))\n return\n except Exception as e:\n log.error(f\"ERR_WHK_008 dlq_persist_failed (redis): {e}\")\n # fall through to memory backup\n _memory_dlq.append(entry)\n\n\nasync def dispatch(event: dict) -> None:\n \"\"\"Application-specific. 
Override in your deployment.\n Default no-op logs the event type.\"\"\"\n log.info(f\"dispatch event_type={event.get('type')} event_id={event.get('id')}\")\n\n\nasync def safe_dispatch(event: dict, raw: bytes, sig_header: str) -> None:\n try:\n await dispatch(event)\n except Exception as e:\n await dlq_persist(\n {\n \"event_id\": event.get(\"id\"),\n \"event_type\": event.get(\"type\"),\n \"raw_body\": raw.decode(\"utf-8\", errors=\"replace\"),\n \"signature_header\": sig_header,\n \"occurred_at\": event.get(\"occurred_at\"),\n \"received_at\": time.time(),\n \"exception\": f\"{type(e).__name__}: {e}\",\n }\n )\n raise\n\n\n@app.post(\"/webhooks/podium\")\nasync def receive(request: Request, x_podium_signature: str | None = Header(default=None)):\n raw = await request.body()\n if not x_podium_signature:\n raise HTTPException(401, \"ERR_WHK_001 missing signature header\")\n\n ok, ts = verify_signature(raw, x_podium_signature)\n if not ok:\n raise HTTPException(401, \"ERR_WHK_002 signature mismatch\")\n if not within_replay_window(ts):\n raise HTTPException(401, \"ERR_WHK_003 replay window exceeded\")\n\n try:\n body = json.loads(raw)\n except json.JSONDecodeError:\n raise HTTPException(400, \"ERR_WHK_004 body not parseable\")\n\n events = body.get(\"events\") if isinstance(body, dict) and \"events\" in body else [body]\n if not isinstance(events, list):\n events = [body]\n\n # Within-batch ordering by (occurred_at, id) for stable causal order.\n events.sort(key=lambda e: (e.get(\"occurred_at\", 0), e.get(\"id\", \"\")))\n\n results = []\n for event in events:\n event_id = event.get(\"id\")\n if not event_id:\n results.append({\"status\": \"skipped_no_id\"})\n continue\n try:\n first_sight = await claim_event(event_id)\n except Exception:\n raise HTTPException(503, \"ERR_WHK_006 dedup backend unavailable\")\n if not first_sight:\n results.append({\"event_id\": event_id, \"status\": \"duplicate\"})\n continue\n await safe_dispatch(event, raw, 
x_podium_signature)\n results.append({\"event_id\": event_id, \"status\": \"ok\"})\n return {\"results\": results}\n\n\n@app.get(\"/healthz\")\nasync def healthz():\n r = await _get_redis()\n backend = \"redis\" if r else \"memory\"\n redis_ok: bool | None = None\n if r is not None:\n try:\n await r.ping()\n redis_ok = True\n except Exception:\n redis_ok = False\n return {\n \"ok\": True,\n \"secret_loaded\": bool(SECRET_BYTES),\n \"dedup_backend\": backend,\n \"redis_ok\": redis_ok,\n }\n","content_type":"text/x-python; charset=utf-8","language":"python","size":6876,"content_sha256":"950e7ef50199c37bdd652e818cd911cccfd3d836a7d5dc3f9d38042435d043ac"}],"content_json":{"type":"doc","content":[{"type":"heading","attrs":{"level":1},"content":[{"text":"Podium Webhook Reliability","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Overview","type":"text"}]},{"type":"paragraph","content":[{"text":"Receive Podium webhooks in production without forged events, double-charged AI side-effects, lost notifications, or out-of-order conversation events. 
This is not an introductory webhook walkthrough — it is the receiver code your integration runs when Podium retries a 5xx response six times over 24 hours, when a leaked secret lets an attacker POST forged events, when a batch delivery arrives with ","type":"text"},{"text":"conversation.deleted","type":"text","marks":[{"type":"code_inline"}]},{"text":" ahead of ","type":"text"},{"text":"conversation.created","type":"text","marks":[{"type":"code_inline"}]},{"text":", and when on-call needs to drain and replay 800 failed events without re-firing the ones that already succeeded.","type":"text"}]},{"type":"paragraph","content":[{"text":"The six production failures this skill prevents:","type":"text"}]},{"type":"ordered_list","attrs":{"order":1,"listStyle":"number"},"content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Missing signature verification","type":"text","marks":[{"type":"strong"}]},{"text":" — a webhook endpoint that accepts any POST will accept forged events. An attacker who learns the URL can create phantom contacts, fire phantom review requests, or impersonate a real customer in a webchat. HMAC-SHA256 over the raw request body is non-optional and must run before any handler logic.","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Replay attacks against a stateless handler","type":"text","marks":[{"type":"strong"}]},{"text":" — a valid signed event POSTed twice (or 1000 times) re-runs every side effect each time. Signature validity alone is not enough — the receiver must reject events whose timestamp falls outside a 5-minute window AND whose nonce has already been seen.","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Duplicate event processing from Podium retries","type":"text","marks":[{"type":"strong"}]},{"text":" — Podium retries webhook delivery on 5xx for up to 24 hours. 
Without an idempotency cache, every retry re-runs the handler (writes the contact again, fires the review request again, double-charges an AI call). ","type":"text"},{"text":"SET NX EX 86400","type":"text","marks":[{"type":"code_inline"}]},{"text":" on the event_id is the cheapest fix that exists.","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Lost events without a dead-letter queue","type":"text","marks":[{"type":"strong"}]},{"text":" — if a handler raises and Podium retries six times and gives up, the event is gone. On-call has nothing to replay. Every handler exception must persist the raw signed payload to a DLQ before the response returns 5xx, so the event is recoverable independent of Podium's retry clock.","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Batch event reordering","type":"text","marks":[{"type":"strong"}]},{"text":" — Podium can deliver multiple events in one POST and ordering across deliveries is not guaranteed. A naive handler processes ","type":"text"},{"text":"conversation.deleted","type":"text","marks":[{"type":"code_inline"}]},{"text":" before ","type":"text"},{"text":"conversation.created","type":"text","marks":[{"type":"code_inline"}]},{"text":" and the system observes a delete on a contact that does not exist. 
Within a batch, sort by ","type":"text"},{"text":"occurred_at","type":"text","marks":[{"type":"code_inline"}]},{"text":" before dispatch; across batches, gate causally-dependent handlers on the precondition existing.","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Timing-attack vulnerability on signature compare","type":"text","marks":[{"type":"strong"}]},{"text":" — ","type":"text"},{"text":"received_sig == computed_sig","type":"text","marks":[{"type":"code_inline"}]},{"text":" with ","type":"text"},{"text":"==","type":"text","marks":[{"type":"code_inline"}]},{"text":" short-circuits on the first byte mismatch. An attacker measures response latency to recover the signature byte-by-byte over a few thousand probes. Always use ","type":"text"},{"text":"hmac.compare_digest","type":"text","marks":[{"type":"code_inline"}]},{"text":", which is constant-time over the longer of the two inputs.","type":"text"}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Prerequisites","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Python 3.10+ with ","type":"text"},{"text":"fastapi","type":"text","marks":[{"type":"code_inline"}]},{"text":", ","type":"text"},{"text":"uvicorn","type":"text","marks":[{"type":"code_inline"}]},{"text":", ","type":"text"},{"text":"httpx","type":"text","marks":[{"type":"code_inline"}]},{"text":", and ","type":"text"},{"text":"redis","type":"text","marks":[{"type":"code_inline"}]},{"text":" (in-memory fallback for dev is provided)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Podium account with an OAuth app authorized for webhook delivery: Settings → Developer → Apps → Webhooks","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"The webhook signing secret from the app's Webhooks tab (saved to a secret store — never 
committed)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"A receiver URL reachable from Podium (publicly resolvable HTTPS endpoint with valid cert)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Redis 6+ for production dedup + DLQ; an in-memory dict + SQLite file fallback exists for dev","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"A ","type":"text"},{"text":"podium-auth","type":"text","marks":[{"type":"code_inline"}]},{"text":" instance if your handler needs to call back into the Podium API after processing","type":"text"}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Instructions","type":"text"}]},{"type":"paragraph","content":[{"text":"Build in this order. Each section neutralizes one production failure mode.","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"1. HMAC-SHA256 signature verification on the raw body (neutralizes forgery)","type":"text"}]},{"type":"paragraph","content":[{"text":"Verify the signature against the ","type":"text"},{"text":"raw, unparsed","type":"text","marks":[{"type":"strong"}]},{"text":" request body. Any framework middleware that JSON-decodes-and-re-encodes before signature check will fail because whitespace and key ordering change. 
Read the body once, verify, then parse:","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"python"},"content":[{"text":"import hashlib\nimport hmac\nimport os\n\nfrom fastapi import FastAPI, Request, HTTPException, Header\n\napp = FastAPI()\nSIGNING_SECRET = os.environ[\"PODIUM_WEBHOOK_SECRET\"].encode(\"utf-8\")\n\n@app.post(\"/webhooks/podium\")\nasync def receive(request: Request, x_podium_signature: str = Header(None)):\n    raw = await request.body()  # bytes — DO NOT decode/re-encode\n    if not x_podium_signature:\n        raise HTTPException(401, \"missing X-Podium-Signature\")\n    if not verify_signature(raw, x_podium_signature):\n        raise HTTPException(401, \"signature mismatch\")\n    # ... continue with replay/dedup/dispatch","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"python"},"content":[{"text":"def verify_signature(body: bytes, header_value: str) -> bool:\n    # Podium signature header format: \"t=\u003cunix_ts>,v1=\u003chex_hmac>\"\n    # Adapt to current spec — verify against the Podium developer docs at integration time.\n    parts = dict(p.split(\"=\", 1) for p in header_value.split(\",\") if \"=\" in p)\n    ts, sig = parts.get(\"t\"), parts.get(\"v1\")\n    if not ts or not sig:\n        return False\n    signed_payload = f\"{ts}.\".encode(\"utf-8\") + body\n    expected = hmac.new(SIGNING_SECRET, signed_payload, hashlib.sha256).hexdigest()\n    return hmac.compare_digest(expected, sig)  # constant-time, byte-by-byte safe","type":"text"}]},{"type":"paragraph","content":[{"text":"The ","type":"text"},{"text":"t=","type":"text","marks":[{"type":"code_inline"}]},{"text":" timestamp is what makes the next mitigation possible. A signature alone with no timestamp is replayable forever.","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"2. 
Replay-attack window (neutralizes timestamp replay)","type":"text"}]},{"type":"paragraph","content":[{"text":"Reject any event whose signed timestamp is more than 5 minutes from now (in either direction — clock skew goes both ways). This bounds the replay window an attacker has even if they capture a valid signed event off the wire:","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"python"},"content":[{"text":"import time\n\nREPLAY_WINDOW_SECONDS = 300  # 5 minutes; tune to your clock-skew tolerance\n\ndef within_replay_window(ts_str: str) -> bool:\n    try:\n        ts = int(ts_str)\n    except (TypeError, ValueError):\n        return False\n    return abs(time.time() - ts) \u003c= REPLAY_WINDOW_SECONDS","type":"text"}]},{"type":"paragraph","content":[{"text":"Wire ","type":"text"},{"text":"within_replay_window(parts[\"t\"])","type":"text","marks":[{"type":"code_inline"}]},{"text":" immediately after signature verification. A failed window check is a 401 — do not return 200, do not enqueue, do not log the body (the attacker is probing).","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"3. Idempotent dedup with ","type":"text"},{"text":"SET NX EX 86400","type":"text","marks":[{"type":"code_inline"}]},{"text":" (neutralizes duplicate processing)","type":"text"}]},{"type":"paragraph","content":[{"text":"Every Podium webhook carries an ","type":"text"},{"text":"event_id","type":"text","marks":[{"type":"code_inline"}]},{"text":" (or equivalent unique identifier — verify against the current schema). Reject any event whose ","type":"text"},{"text":"event_id","type":"text","marks":[{"type":"code_inline"}]},{"text":" is already in the dedup cache. 
Use Redis ","type":"text"},{"text":"SET key value NX EX 86400","type":"text","marks":[{"type":"code_inline"}]},{"text":" so the check and the claim are atomic; 86400 seconds matches Podium's 24-hour retry ceiling:","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"python"},"content":[{"text":"import os\n\nimport redis.asyncio as redis\n\nREDIS = redis.from_url(os.environ.get(\"REDIS_URL\", \"redis://localhost:6379/0\"))\n\nasync def claim_event(event_id: str) -> bool:\n    # Returns True if this process is the first to see this event_id.\n    # Returns False if the event_id is already in the cache (duplicate).\n    # redis-py returns None (not False) when NX prevents the write, so coerce to bool.\n    return bool(await REDIS.set(f\"podium:evt:{event_id}\", \"1\", nx=True, ex=86400))","type":"text"}]},{"type":"paragraph","content":[{"text":"In the handler:","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"python"},"content":[{"text":"event = json.loads(raw)\nevent_id = event[\"id\"]\nif not await claim_event(event_id):\n    return {\"status\": \"duplicate\", \"event_id\": event_id}  # 200 — Podium stops retrying","type":"text"}]},{"type":"paragraph","content":[{"text":"Returning 200 on duplicate is correct — Podium has correctly delivered, the receiver has correctly identified it as already processed. The handler is idempotent by construction. Note that the key is claimed before dispatch, so if the handler then raises, Podium's retry of the same event_id is also answered as a duplicate; recovery in that case relies on the DLQ entry from step 4 unless the key is deleted on failure.","type":"text"}]},{"type":"paragraph","content":[{"text":"For dev / smoke environments without Redis, fall back to an in-memory ","type":"text"},{"text":"set()","type":"text","marks":[{"type":"code_inline"}]},{"text":" with a periodic eviction loop. Documented in ","type":"text"},{"text":"references/implementation.md","type":"text","marks":[{"type":"code_inline"}]},{"text":".","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"4. Dead-letter queue before responding 5xx (neutralizes silent event loss)","type":"text"}]},{"type":"paragraph","content":[{"text":"Wrap every handler invocation in a try/except. 
On any exception, persist the ","type":"text"},{"text":"raw signed payload plus the timestamp plus the signature","type":"text","marks":[{"type":"strong"}]},{"text":" to the DLQ before letting the exception bubble. The DLQ entry is the recovery anchor — ","type":"text"},{"text":"dlq_replay.py","type":"text","marks":[{"type":"code_inline"}]},{"text":" can re-POST it to the handler later:","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"python"},"content":[{"text":"async def safe_dispatch(event: dict, raw: bytes, sig_header: str):\n    try:\n        await dispatch(event)\n    except Exception as e:\n        await dlq_persist({\n            \"event_id\": event.get(\"id\"),\n            \"event_type\": event.get(\"type\"),\n            \"raw_body\": raw.decode(\"utf-8\", errors=\"replace\"),\n            \"signature_header\": sig_header,\n            \"occurred_at\": event.get(\"occurred_at\"),\n            \"received_at\": time.time(),\n            \"exception\": f\"{type(e).__name__}: {e}\",\n        })\n        raise  # let FastAPI return 5xx; Podium will retry","type":"text"}]},{"type":"paragraph","content":[{"text":"DLQ backend options (in priority order):","type":"text"}]},{"type":"table","attrs":{"layout":null},"content":[{"type":"tr","content":[{"type":"th","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Backend","type":"text"}]}]},{"type":"th","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"When","type":"text"}]}]}]},{"type":"tr","content":[{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Redis list ","type":"text"},{"text":"LPUSH podium:dlq","type":"text","marks":[{"type":"code_inline"}]},{"text":" + scheduled archiver to S3/GCS","type":"text"}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Default for 
prod","type":"text"}]}]}]},{"type":"tr","content":[{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"SQLite file at ","type":"text"},{"text":"/var/lib/podium-dlq.sqlite","type":"text","marks":[{"type":"code_inline"}]}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Single-node deployments, dev","type":"text"}]}]}]},{"type":"tr","content":[{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Append-only JSONL at ","type":"text"},{"text":"/var/log/podium-dlq.jsonl","type":"text","marks":[{"type":"code_inline"}]}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Fallback when nothing else is available — durable, parseable, ugly","type":"text"}]}]}]}]},{"type":"paragraph","content":[{"text":"The DLQ is durable independent of the Redis dedup cache. If Redis dies, dedup is degraded but events are still recoverable.","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"5. Batch event ordering by ","type":"text"},{"text":"occurred_at","type":"text","marks":[{"type":"code_inline"}]},{"text":" (neutralizes reordering)","type":"text"}]},{"type":"paragraph","content":[{"text":"Podium can deliver multiple events in one POST. Within the batch, sort by ","type":"text"},{"text":"occurred_at","type":"text","marks":[{"type":"code_inline"}]},{"text":" ascending before dispatch. 
Across batches, do not assume earlier-timestamped events arrived first — guard causally-dependent handlers with an existence check:","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"python"},"content":[{"text":"async def dispatch_batch(events: list[dict]):\n    events.sort(key=lambda e: (e.get(\"occurred_at\", 0), e.get(\"id\", \"\")))\n    for event in events:\n        await safe_dispatch_one(event)\n\nasync def handle_conversation_deleted(event: dict):\n    convo_id = event[\"data\"][\"conversation_id\"]\n    # Guard: if the create event hasn't been processed yet, defer this delete.\n    if not await convo_exists(convo_id):\n        await dlq_persist({\n            \"reason\": \"out_of_order_delete_before_create\",\n            \"event_id\": event[\"id\"],\n            \"raw_body\": json.dumps(event),  # re-serialized event, not the signed bytes\n            \"received_at\": time.time(),\n        })\n        return\n    await delete_conversation_locally(convo_id)","type":"text"}]},{"type":"paragraph","content":[{"text":"Sorting within a batch is cheap and correct. Cross-batch ordering cannot be reconstructed from the receiver side — the DLQ + replay path is the recovery mechanism when out-of-order delivery violates a precondition.","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"6. Constant-time HMAC compare (neutralizes signature-byte timing leak)","type":"text"}]},{"type":"paragraph","content":[{"text":"One of the most common implementation bugs in webhook receivers is ","type":"text"},{"text":"received == expected","type":"text","marks":[{"type":"code_inline"}]},{"text":" with ","type":"text"},{"text":"==","type":"text","marks":[{"type":"code_inline"}]},{"text":". 
Python string ","type":"text"},{"text":"==","type":"text","marks":[{"type":"code_inline"}]},{"text":" short-circuits on the first differing byte; an attacker who can measure response latency precisely can, in principle, reconstruct the signature byte by byte over many probes.","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"python"},"content":[{"text":"# WRONG — leaks signature byte-by-byte via timing\nif received_sig == expected_sig:\n    return True\n\n# CORRECT — timing does not depend on where the inputs differ\nif hmac.compare_digest(received_sig, expected_sig):\n    return True","type":"text"}]},{"type":"paragraph","content":[{"text":"hmac.compare_digest","type":"text","marks":[{"type":"code_inline"}]},{"text":" is the only acceptable comparison. The same rule applies to Node (","type":"text"},{"text":"crypto.timingSafeEqual","type":"text","marks":[{"type":"code_inline"}]},{"text":"), Go (","type":"text"},{"text":"hmac.Equal","type":"text","marks":[{"type":"code_inline"}]},{"text":"), and Rust (","type":"text"},{"text":"subtle::ConstantTimeEq","type":"text","marks":[{"type":"code_inline"}]},{"text":").","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Error Handling","type":"text"}]},{"type":"table","attrs":{"layout":null},"content":[{"type":"tr","content":[{"type":"th","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"HTTP returned","type":"text"}]}]},{"type":"th","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Internal condition","type":"text"}]}]},{"type":"th","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Caller (Podium) behavior","type":"text"}]}]}]},{"type":"tr","content":[{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"401 
Unauthorized","type":"text","marks":[{"type":"code_inline"}]}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Signature mismatch, missing header, replay window failed","type":"text"}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Podium does NOT retry — log + audit","type":"text"}]}]}]},{"type":"tr","content":[{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"400 Bad Request","type":"text","marks":[{"type":"code_inline"}]}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Body is not parseable JSON post-signature-verify","type":"text"}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Podium does NOT retry — investigate Podium-side payload","type":"text"}]}]}]},{"type":"tr","content":[{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"200 OK (duplicate)","type":"text","marks":[{"type":"code_inline"}]}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"event_id","type":"text","marks":[{"type":"code_inline"}]},{"text":" already in dedup cache","type":"text"}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Podium stops retrying — system is idempotent","type":"text"}]}]}]},{"type":"tr","content":[{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"200 OK 
(processed)","type":"text","marks":[{"type":"code_inline"}]}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Handler dispatched successfully","type":"text"}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Podium stops retrying — normal path","type":"text"}]}]}]},{"type":"tr","content":[{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"200 OK (deferred)","type":"text","marks":[{"type":"code_inline"}]}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Out-of-order event written to DLQ; will resolve via replay","type":"text"}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Podium stops retrying — recovery is internal","type":"text"}]}]}]},{"type":"tr","content":[{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"500 Internal Server Error","type":"text","marks":[{"type":"code_inline"}]}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Handler raised; DLQ entry persisted","type":"text"}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Podium retries with exponential backoff up to 24h","type":"text"}]}]}]},{"type":"tr","content":[{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"503 Service 
Unavailable","type":"text","marks":[{"type":"code_inline"}]}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Redis dedup unreachable; handler refuses","type":"text"}]}]},{"type":"td","attrs":{"colspan":1,"rowspan":1,"colwidth":null,"alignment":""},"content":[{"type":"paragraph","content":[{"text":"Podium retries — fail-closed is the safe default","type":"text"}]}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Examples","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Verify a captured webhook payload from the command line","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"# Use the CLI bundled with the skill to verify a captured payload + header against the secret.\npython3 scripts/signature_verify.py \\\n --body-file /tmp/captured_webhook_body.json \\\n --signature-header \"t={your-timestamp},v1={your-podium-signature}\" \\\n --secret-env PODIUM_WEBHOOK_SECRET\n# exit 0 = valid; exit 1 = signature mismatch; exit 2 = replay window exceeded","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Manually check if an event_id has been seen","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"python3 scripts/dedup_check.py --event-id evt_{your-event-identifier} --redis-url redis://localhost:6379/0\n# exit 0 = first sight (would be processed); exit 1 = duplicate (would be rejected)","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Drain the DLQ and replay events through the handler","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"# After a handler bug is fixed, replay DLQ entries through the receiver.\n# The replay path goes through the SAME endpoint as Podium, so signature + dedup still apply.\npython3 scripts/dlq_replay.py \\\n --target-url 
https://your-receiver.example.com/webhooks/podium \\\n --secret-env PODIUM_WEBHOOK_SECRET \\\n --batch-size 25 \\\n --rate-per-sec 10","type":"text"}]},{"type":"paragraph","content":[{"text":"The replay script reuses the original signature header captured at DLQ-persist time. That header was computed with the same signing secret the replayer is configured with, so no re-signing is required for events captured within the secret's lifetime. The captured t= timestamp will usually be older than the 5-minute replay window from step 2, and a failed event's event_id may still be in the dedup cache from step 3, so confirm how the replay path treats both checks before a large drain.","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Boot the receiver locally for development","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"export PODIUM_WEBHOOK_SECRET={your-webhook-secret}\nexport REDIS_URL=redis://localhost:6379/0 # or unset to use in-memory fallback\nuvicorn scripts.webhook_server:app --host 0.0.0.0 --port 8080 --reload","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Output","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"FastAPI receiver with HMAC verification on the raw body, replay window, dedup, DLQ, and batch ordering","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Signature verifier CLI (","type":"text"},{"text":"signature_verify.py","type":"text","marks":[{"type":"code_inline"}]},{"text":") for incident forensics on captured payloads","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Dedup-cache checker CLI (","type":"text"},{"text":"dedup_check.py","type":"text","marks":[{"type":"code_inline"}]},{"text":") for confirming a specific event was already processed","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"DLQ replayer CLI (","type":"text"},{"text":"dlq_replay.py","type":"text","marks":[{"type":"code_inline"}]},{"text":") for draining persisted failures after a handler 
fix","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Redis-backed dedup with 24h TTL aligned to Podium's retry ceiling","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"DLQ persistence with multiple backend options (Redis list, SQLite, JSONL)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":".gitignore","type":"text","marks":[{"type":"code_inline"}]},{"text":" rules covering the webhook secret + captured payload files","type":"text"}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Resources","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Podium API docs — Webhooks","type":"text","marks":[{"type":"link","attrs":{"href":"https://docs.podium.com/reference/webhooks","title":null}}]}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Podium API docs — Webhook signatures","type":"text","marks":[{"type":"link","attrs":{"href":"https://docs.podium.com/reference/webhook-signatures","title":null}}]}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"config/settings.yaml","type":"text","marks":[{"type":"link","attrs":{"href":"config/settings.yaml","title":null}}]},{"text":" — replay window, dedup TTL, DLQ backend selection, batch sizing","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"references/errors.md","type":"text","marks":[{"type":"link","attrs":{"href":"references/errors.md","title":null}}]},{"text":" — ERR_WHK_* codes with cause + solution","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"references/examples.md","type":"text","marks":[{"type":"link","attrs":{"href":"references/examples.md","title":null}}]},{"text":" — 10 worked examples (single handler, batch, multi-tenant, 
replay)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"references/implementation.md","type":"text","marks":[{"type":"link","attrs":{"href":"references/implementation.md","title":null}}]},{"text":" — Node.js port, Redis schema, DLQ backends, in-memory fallback","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scripts/webhook_server.py","type":"text","marks":[{"type":"link","attrs":{"href":"scripts/webhook_server.py","title":null}}]},{"text":" — FastAPI receiver with the full pipeline wired","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scripts/signature_verify.py","type":"text","marks":[{"type":"link","attrs":{"href":"scripts/signature_verify.py","title":null}}]},{"text":" — CLI: verify a captured payload + signature","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scripts/dedup_check.py","type":"text","marks":[{"type":"link","attrs":{"href":"scripts/dedup_check.py","title":null}}]},{"text":" — CLI: check if an event_id is already cached","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scripts/dlq_replay.py","type":"text","marks":[{"type":"link","attrs":{"href":"scripts/dlq_replay.py","title":null}}]},{"text":" — CLI: drain the DLQ and re-POST to the 
receiver","type":"text"}]}]}]},{"type":"hr","attrs":{"markup":"---"}}]},"metadata":{"date":"2026-06-05","name":"podium-webhook-reliability","tags":["podium","webhooks","hmac","idempotency","dlq","security"],"author":"@skillopedia","source":{"stars":2275,"repo_name":"claude-code-plugins-plus-skills","origin_url":"https://github.com/jeremylongshore/claude-code-plugins-plus-skills/blob/HEAD/plugins/saas-packs/podium-pack/skills/podium-webhook-reliability/SKILL.md","repo_owner":"jeremylongshore","body_sha256":"6d869c9e2e52c833c0d3535a1ac271be983629f5ae0cdd5dba199b2fbae4890c","cluster_key":"b28b2148f8a7cba4bc451bdf12ddc761cb233ea379ab4d694e30b6dd78eaf5de","clean_bundle":{"format":"clean-skill-bundle-v1","source":"jeremylongshore/claude-code-plugins-plus-skills/plugins/saas-packs/podium-pack/skills/podium-webhook-reliability/SKILL.md","attachments":[{"id":"cf6d9806-e60e-5e69-ab6c-df588b7d97ea","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/cf6d9806-e60e-5e69-ab6c-df588b7d97ea/attachment.md","path":"ARD.md","size":15259,"sha256":"e7dedf22939e0ee9cfd5aa529d527b13c29c0f3f1c872aa261df958c6751ff37","contentType":"text/markdown; charset=utf-8"},{"id":"2695c548-6d9a-5ec4-b2ba-2865976618ea","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/2695c548-6d9a-5ec4-b2ba-2865976618ea/attachment.md","path":"PRD.md","size":12297,"sha256":"e921a451073f9b5d0e6b96692045571eb8a7196470e0ebb689c2f83acaa53da5","contentType":"text/markdown; charset=utf-8"},{"id":"7a6699a7-245e-5895-9ddd-94e65a61d13b","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/7a6699a7-245e-5895-9ddd-94e65a61d13b/attachment.yaml","path":"config/settings.yaml","size":4156,"sha256":"59c797049452d1410825a8c7a8621bc7b2392a7073eeb733a8888f2118605a57","contentType":"application/yaml; 
charset=utf-8"},{"id":"3a63805a-b377-5593-94ea-4ba5826f3ac4","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/3a63805a-b377-5593-94ea-4ba5826f3ac4/attachment.md","path":"references/errors.md","size":6546,"sha256":"5396f01b228eda3e94c39c22eeac1da01fc735054d96d295fad7b22765e91ba3","contentType":"text/markdown; charset=utf-8"},{"id":"dd695b57-29f8-57d2-b366-36f932509637","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/dd695b57-29f8-57d2-b366-36f932509637/attachment.md","path":"references/examples.md","size":7579,"sha256":"72e916b21e845932d57018cb657de5d8ed03fceee4c0706c2444cd8f59761123","contentType":"text/markdown; charset=utf-8"},{"id":"6a5010ce-2e98-5eff-8b7f-2794594becc0","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/6a5010ce-2e98-5eff-8b7f-2794594becc0/attachment.md","path":"references/implementation.md","size":8673,"sha256":"ed7a5c5d22b17c5b10bc950ef81e9d3a164e9e9d2c8091ebda1fa67064056c84","contentType":"text/markdown; charset=utf-8"},{"id":"d65f578f-e91a-5a30-b2b3-7679943fe7d5","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/d65f578f-e91a-5a30-b2b3-7679943fe7d5/attachment.py","path":"scripts/dedup_check.py","size":2048,"sha256":"ec17a390260bbcfb630bac6436bf7ffa918715d66950d9aacba531ae9e1a3814","contentType":"text/x-python; charset=utf-8"},{"id":"c1da4b3e-ccfa-52e0-8c2f-8ae76bb16d9b","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/c1da4b3e-ccfa-52e0-8c2f-8ae76bb16d9b/attachment.py","path":"scripts/dlq_replay.py","size":5248,"sha256":"f3d1c08ecb8eefe5343ae12a005be4638aa620f4bc615b8acbbf05acdd779267","contentType":"text/x-python; charset=utf-8"},{"id":"6414b317-c444-5080-89f0-d26a2a3d79ad","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/6414b317-c444-5080-89f0-d26a2a3d79ad/attachment.py","path":"scripts/signature_verify.py","size":3520,"sha256":"39b3ccd61af2495fe9dab3c9a44108767248d4a83277a813cc735ee8f70ca229","contentType":"text/x-python; 
charset=utf-8"},{"id":"2f5e6e0c-fac7-515c-b1f4-9b0a15bbc0fe","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/2f5e6e0c-fac7-515c-b1f4-9b0a15bbc0fe/attachment.py","path":"scripts/webhook_server.py","size":6876,"sha256":"950e7ef50199c37bdd652e818cd911cccfd3d836a7d5dc3f9d38042435d043ac","contentType":"text/x-python; charset=utf-8"}],"bundle_sha256":"d6ac98d46403ae64a764d0ad0f8af69d30722b8a34740f7228f26b2e5b824123","attachment_count":10,"text_attachments":10,"attachment_storage":"skillopedia-attachments-v1","binary_attachments":0,"excluded_attachments":[]},"cluster_size":1,"skill_md_path":"plugins/saas-packs/podium-pack/skills/podium-webhook-reliability/SKILL.md","import_metadata":{"date":"2026-06-05","author":"@skillopedia","version":"v1","category":"security","category_label":"Security"},"exact_dupes_collapsed_into_this":0},"license":"MIT","version":"v1","category":"security","import_tag":"clean-skills-v1","description":"Operate a Podium webhook receiver that survives the delivery-side failures — forged events without signature verification, replay attacks against a stateless handler, duplicate processing from Podium's 24h retry policy, lost events with no dead-letter queue, out-of-order batch deliveries, and timing-attack-vulnerable HMAC compares. Use when building a webhook endpoint for call transcripts, webchat events, conversation lifecycle, or review notifications; hardening an existing handler that processes events twice or drops them silently; or wiring a DLQ + replay path before the on-call rotation starts. Trigger with \"podium webhook\", \"podium hmac\", \"podium signature\", \"podium webhook idempotency\", \"podium webhook replay\", \"podium dlq\", \"podium webhook retries\".","allowed-tools":"Read, Write, Edit, Bash(curl:*), Bash(jq:*), Bash(python3:*), Bash(redis-cli:*), Grep","compatibility":"Designed for Claude Code"}},"renderedAt":1782980149337}
