databricks-mlflow-evaluation

MLflow 3 GenAI Evaluation

Scope vs upstream

The OSS repo ships and related skills ( , , , ) that cover the generic MLflow GenAI evaluation workflow: , scorers/judges, datasets, tracing setup, and the 5-step evaluation loop. This skill layers Databricks-specific patterns on top of that workflow rather than restating it.

Use this skill when you need any of:

- Unity Catalog trace ingestion: production traces written into UC tables, log-based monitoring ( ).
- MemAlign judge alignment via UC SME labeling sessions: aligning custom judges against domain-expert feedback collected in Databricks (…