Azure Databricks Platform & Governance
SECTION 1: UNITY CATALOG (1.5 hours)
Q1: What is Unity Catalog? What is the object hierarchy?
Simple Explanation: In Databricks, you can have hundreds of tables, ML models, files, and functions. Unity Catalog is the single place that governs all of them: who can access what (access control), where data came from (lineage), how it is organized (catalogs and schemas), and who accessed or changed it (auditing).
Think of Unity Catalog as the security guard + librarian + receptionist of your entire data platform:
- Security guard: Controls who can access which tables (access control)
- Librarian: Organizes all data into catalogs/schemas so you can find it easily (discovery)
- Receptionist: Keeps a log of who accessed what and when (auditing)
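As a rough sketch of what those three roles look like in practice, the Python below only builds the SQL strings you might run for each one. The catalog, schema, table, and group names (main, sales, orders, analysts) are made up for illustration, and the audit query assumes the system.access.audit system table is enabled in your account.

```python
# Illustrative sketch: builds SQL strings only, nothing is executed here.
# All object and group names passed in are hypothetical examples.

def quote(identifier: str) -> str:
    """Backtick-quote an identifier, escaping any embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def full_name(catalog: str, schema: str, table: str) -> str:
    """Return the three-part name catalog.schema.table."""
    return ".".join(quote(part) for part in (catalog, schema, table))


def grant_select(table_name: str, principal: str) -> str:
    """Security guard: let a user or group read one table."""
    return f"GRANT SELECT ON TABLE {table_name} TO {quote(principal)}"


def list_tables(catalog: str, schema: str) -> str:
    """Librarian: list the tables registered in one schema."""
    return f"SHOW TABLES IN {quote(catalog)}.{quote(schema)}"


def recent_audit_events(limit: int = 10) -> str:
    """Receptionist: most recent audit events (assumes system tables are enabled)."""
    return (
        "SELECT event_time, user_identity.email, action_name "
        "FROM system.access.audit "
        "ORDER BY event_time DESC "
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    orders = full_name("main", "sales", "orders")
    for statement in (
        grant_select(orders, "analysts"),
        list_tables("main", "sales"),
        recent_audit_events(5),
    ):
        print(statement)
```

In a Databricks notebook, each of these strings could be passed to spark.sql(...), provided you hold the privileges the statement needs.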
Why do we need it?
- Without Unity Catalog: Each team creates tables in random locations, no one knows who has access to what, PII data leaks because there's no control, and you can't trace where data came from.
- With Unity Catalog: One central place to govern everything: tables, ML models, files, and permissions.
The hierarchy (how data is organized):