Entity Resolution for RAG

Unstructured ingestion creates the same entity many times: "Acme Corp", "Acme, Inc.", "ACME CORPORATION", "acme". Without resolution, your graph has disconnected islands and your retrieval scatters facts across near-duplicate nodes. ER merges these into a canonical entity with a stable ID and a set of aliases.

Pipeline Overview

- Normalize: lowercase, strip legal suffixes, punctuation, Unicode NFC.
- Block: reduce O(n²) comparisons to O(n·k) via blocks (e.g., first 3 chars, phonetic code, embedding cluster).
- Score: string similarity, embedding cosine, LLM judgem…