Genomics and Epigenomics Data Processing ⚠️ TOP-OF-MIND RULE: long-format methylation CSV — count ROWS, not unique positions When the input is a long-format methylation CSV (one row per e.g. columns ), "how many sites are removed when filtering" almost always means rows removed , NOT unique-position removals. The two answers differ by a factor of ≈ . | Question phrasing | What it means | |---|---| | "how many sites are removed when filtering …" | rows removed (= samples × positions failing the filter) | | "how many unique CpG sites pass filter" | unique positions (dedupe by then filter) | ❌ W…