Azure Databricks Interview Question Bank
TOPIC 1: DELTA LAKE (Internals, Transaction Log, ACID, MERGE, OPTIMIZE, VACUUM, Z-ORDER, Liquid Clustering)
L1 — Direct / Simple Questions
- What is Delta Lake and why was it created?
- What file format does Delta Lake use under the hood?
- What is the `_delta_log` directory and what does it contain?
- What are the four ACID properties and how does Delta Lake guarantee them?
- What is a checkpoint file in the Delta transaction log?
- What is schema enforcement in Delta Lake?
- What is schema evolution and how do you enable it?
- What is Time Travel in Delta Lake? How do you query an older version?
- What is the VACUUM command and what does it do?
- What is the default retention period for VACUUM?
- What does the OPTIMIZE command do?
- What is Z-ORDER and what problem does it solve?
- What are Deletion Vectors in Delta Lake?
- What is the difference between Delta Lake and Apache Parquet?
- What is the DESCRIBE HISTORY command used for?
- What is Change Data Feed (CDF) in Delta Lake?
- What is Liquid Clustering in Delta Lake?
- What is Predictive Optimization in Databricks?
- What are table constraints in Delta Lake (CHECK, NOT NULL)?
- What is the RESTORE command in Delta Lake?
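Several of the questions above (`_delta_log`, checkpoints, Time Travel, RESTORE) reduce to one mechanism: a table's state at any version is the replay of `add`/`remove` actions in the commit log. A minimal pure-Python sketch, with invented file names and a heavily simplified action format (real commits also carry `commitInfo`, protocol, and statistics):

```python
# Hypothetical, heavily simplified commit entries, standing in for the JSON
# files under _delta_log/ (00000000000000000000.json, ...).
commits = {
    0: [{"add": {"path": "part-000.parquet"}}],
    1: [{"add": {"path": "part-001.parquet"}}],
    2: [{"remove": {"path": "part-000.parquet"}},
        {"add": {"path": "part-002.parquet"}}],
}

def files_at_version(version):
    """Replay the log up to `version` to find the live data files (Time Travel)."""
    live = set()
    for v in sorted(commits):
        if v > version:
            break
        for action in commits[v]:
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return sorted(live)

print(files_at_version(1))  # ['part-000.parquet', 'part-001.parquet']
print(files_at_version(2))  # ['part-001.parquet', 'part-002.parquet']
```

This also shows why VACUUM breaks Time Travel: `part-000.parquet` is "removed" at version 2 but must stay on disk for version 1 to remain queryable.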
L2 — Mid-Level Questions
- Explain the anatomy of a Delta Lake transaction — what happens when you write to a Delta table?
- How does optimistic concurrency control work in Delta Lake? What happens during write conflicts?
- Compare Z-ORDER vs Liquid Clustering — when would you use each?
- Explain data skipping in Delta Lake. How does it use min/max statistics?
- What is the difference between OPTIMIZE and VACUUM? Can you run them together?
- Explain how MERGE INTO works internally. What are the performance implications of a full table scan in MERGE?
- How does the Delta transaction log handle concurrent writes from multiple clusters?
- Compare schema enforcement vs schema evolution — give an example where each is appropriate.
- What happens if you run VACUUM with a retention of 0 hours? What are the risks?
- Explain the difference between `OPTIMIZE WHERE` and partition-level OPTIMIZE.
- How does Delta Lake handle small file compaction? What is the "small file problem"?
- Explain the difference between Copy-on-Write and Merge-on-Read in Delta Lake.
- How do Deletion Vectors improve UPDATE/DELETE performance compared to the traditional approach?
- What is the relationship between file statistics, data skipping, and Z-ORDER?
- Explain how Time Travel works internally — what is stored in each JSON commit file?
- Compare Change Data Feed (CDF) vs reading the transaction log directly for CDC.
- What is the difference between managed and external Delta tables?
- How does Liquid Clustering handle incremental clustering vs Z-ORDER which requires full rewrite?
- What are the trade-offs of over-partitioning a Delta table?
- Explain newer Delta Lake features such as UniForm (Universal Format). Why do they matter for interoperability?
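Several of the questions above (data skipping, file statistics, Z-ORDER) share one idea: readers prune files whose per-column min/max ranges cannot match the filter. A toy sketch with invented file names and statistics; Z-ORDER and Liquid Clustering matter precisely because they make these per-file ranges narrow and disjoint, so more files are skippable:

```python
# Each Delta data file's footer stats (min/max per column) are recorded in the
# transaction log; a reader consults them before opening any file.
file_stats = [
    {"path": "f1.parquet", "min": {"date": "2024-01-01"}, "max": {"date": "2024-01-31"}},
    {"path": "f2.parquet", "min": {"date": "2024-02-01"}, "max": {"date": "2024-02-29"}},
    {"path": "f3.parquet", "min": {"date": "2024-03-01"}, "max": {"date": "2024-03-31"}},
]

def files_to_read(column, value):
    """Return only files whose [min, max] range could contain `value`."""
    return [f["path"] for f in file_stats
            if f["min"][column] <= value <= f["max"][column]]

print(files_to_read("date", "2024-02-15"))  # ['f2.parquet'] (f1 and f3 skipped)
```

On an unclustered table the ranges overlap heavily and nothing can be skipped; clustering is what makes this pruning effective.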
L3 — Scenario-Based Questions
- MERGE Optimization: Your MERGE INTO statement takes 45 minutes on a 2 TB Delta table. Walk me through how you would diagnose and optimize this.
- Transaction Log Corruption: A developer accidentally ran `VACUUM` with 0-hour retention and now Time Travel queries fail. What happened and how do you recover?
- Small File Problem: Your Bronze table has 50,000 small Parquet files (avg 2 MB each). How do you fix this and prevent it from recurring?
- Concurrent Writes: Two Databricks jobs write to the same Delta table simultaneously and one fails with a `ConcurrentAppendException`. Explain why and how you fix it.
- Z-ORDER Strategy: You have a 10 TB Delta table queried by `country`, `date`, and `customer_id`. Design the partitioning and Z-ORDER strategy.
- Liquid Clustering Migration: Your team wants to migrate from Z-ORDER to Liquid Clustering on a production table. What is your migration plan? Any risks?
- Schema Evolution Crisis: A source system added 5 new columns overnight and your streaming pipeline failed. How do you design for schema evolution in Auto Loader + Delta Lake?
- Time Travel for Audit: Your compliance team needs to prove what data looked like on a specific date 30 days ago. How do you implement this with Delta Lake Time Travel? What are the limitations?
- VACUUM vs Storage Costs: Your Delta table consumes 5x the expected storage due to retained old versions. Design a VACUUM strategy that balances cost vs Time Travel needs.
- CDC with Delta CDF: Design a pipeline where downstream consumers only process changed records from a Silver Delta table. How do you use Change Data Feed?
- Table Restore Scenario: A bad ETL job corrupted your Gold table at 3 AM. It is now 9 AM and 6 versions have been written since. Walk through the recovery process.
- Partition Evolution: Your table was partitioned by `year/month/day` but queries now filter primarily by `region`. How do you restructure without downtime?
- MERGE with SCD Type 2: Implement a MERGE strategy for SCD Type 2 on a customer dimension table where you need to close old records and insert new ones atomically.
- Delta Sharing: An external partner needs read access to a subset of your Delta table. How do you implement this securely using Delta Sharing?
- Deletion Vectors in Production: After enabling Deletion Vectors, read performance on certain queries degraded. Explain why and how you would resolve this.
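The SCD Type 2 scenario above asks for closing old records and inserting new versions atomically. In Delta that is one `MERGE INTO`; the row-level logic it applies can be sketched in pure Python. Column names (`valid_from`, `valid_to`, `is_current`) are common conventions, not anything Delta mandates:

```python
from datetime import date

# Existing dimension rows; is_current marks the open SCD Type 2 record.
dim = [{"customer_id": 1, "city": "Madrid", "valid_from": date(2023, 1, 1),
        "valid_to": None, "is_current": True}]

def scd2_apply(dim_rows, updates, today):
    """Model of one atomic SCD2 MERGE: close the changed current row
    (WHEN MATCHED), then insert the new version (WHEN NOT MATCHED)."""
    out = [dict(r) for r in dim_rows]
    for upd in updates:
        cur = next((r for r in out
                    if r["customer_id"] == upd["customer_id"] and r["is_current"]), None)
        if cur is not None and cur["city"] == upd["city"]:
            continue                    # no attribute change: nothing to do
        if cur is not None:
            cur["is_current"] = False   # close the old record
            cur["valid_to"] = today
        out.append({"customer_id": upd["customer_id"], "city": upd["city"],
                    "valid_from": today, "valid_to": None, "is_current": True})
    return out

result = scd2_apply(dim, [{"customer_id": 1, "city": "Lisbon"}], date(2024, 6, 1))
print([(r["city"], r["is_current"]) for r in result])
# [('Madrid', False), ('Lisbon', True)]
```

The common trick interviewers look for is staging the change set so each source row appears once as an "update" (to close the old row) and once as an "insert" (the new version), since a single MERGE clause cannot both update and insert for the same matched key.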
TOPIC 2: ETL PIPELINES (Medallion Architecture, SCD Type 2, CDC, Auto Loader, DLT/Lakeflow)
L1 — Direct / Simple Questions
- What is the Medallion Architecture (Bronze/Silver/Gold)?
- What is Auto Loader in Databricks?
- What is the difference between Auto Loader and COPY INTO?
- What is Delta Live Tables (DLT)?
- What is Lakeflow and how does it relate to DLT?
- What is SCD Type 1 vs SCD Type 2?
- What is Change Data Capture (CDC)?
- What is a streaming table vs a materialized view in DLT?
- What are DLT expectations (data quality constraints)?
- What is the difference between `cloudFiles` and `spark.readStream` on Delta?
- What is Structured Streaming in Databricks?
- What is a checkpoint in Spark Structured Streaming?
- What is the trigger mode `availableNow` vs `processingTime`?
- What is idempotency and why is it important in ETL pipelines?
- What is the difference between batch and streaming ETL?
- What is an ETL pipeline vs an ELT pipeline?
- What are the three DLT expectation actions: `warn`, `drop`, `fail`?
- What is the `foreachBatch` sink in Structured Streaming?
- What is event-time processing vs processing-time in streaming?
- What is watermarking in Spark Structured Streaming?
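The watermarking and event-time questions above come down to one rule: track the maximum event time seen, subtract the allowed lateness, and drop (or stop updating state for) anything older than that bound. A toy single-stream sketch with invented timestamps:

```python
from datetime import datetime, timedelta

class Watermark:
    """Minimal model of Structured Streaming's withWatermark bound."""
    def __init__(self, delay_minutes):
        self.delay = timedelta(minutes=delay_minutes)
        self.max_event_time = datetime.min

    def accept(self, event_time):
        # The watermark only ever advances: max event time seen minus delay.
        self.max_event_time = max(self.max_event_time, event_time)
        return event_time >= self.max_event_time - self.delay

wm = Watermark(delay_minutes=10)
t0 = datetime(2024, 1, 1, 12, 0)
print(wm.accept(t0))                          # True: on time
print(wm.accept(t0 + timedelta(minutes=30)))  # True: advances the watermark to 12:20
print(wm.accept(t0 + timedelta(minutes=5)))   # False: 25 min late, beyond the 10-min bound
```

This is why a 48-hour-late amendment (a later scenario in this bank) needs either a generous watermark or a separate batch reconciliation path: a tight watermark silently discards it.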
L2 — Mid-Level Questions
- Explain how Auto Loader's file notification mode works vs directory listing mode. When do you use each?
- How do you handle schema evolution with Auto Loader (`cloudFiles.schemaEvolutionMode`)?
- Compare Delta Live Tables (DLT) vs hand-coded Structured Streaming pipelines — trade-offs?
- Explain how to implement SCD Type 2 using MERGE INTO with Delta Lake. What are the key columns?
- How does DLT handle pipeline failures and retries? What is the concept of "idempotent recomputation"?
- Compare `trigger(availableNow=True)` vs `trigger(processingTime='5 minutes')` — when to use each?
- Explain the role of Bronze, Silver, and Gold layers in terms of data quality, latency, and consumers.
- How does watermarking work in Structured Streaming? What happens to late-arriving data?
- Compare CDC patterns: log-based CDC (Debezium/Kafka) vs query-based CDC vs timestamp-based CDC.
- How do you handle exactly-once semantics in a Databricks streaming pipeline?
- Explain `foreachBatch` — when would you use it over a standard Delta sink?
- How do DLT expectations compare to Great Expectations or other data quality frameworks?
- What are the different ways to orchestrate dependent DLT pipelines?
- Explain how Auto Loader handles file deduplication. What is the RocksDB state store?
- How do you test ETL pipelines in Databricks? What frameworks do you use?
- Explain the difference between a complete output mode, append mode, and update mode in streaming.
- How do you monitor and alert on streaming pipeline lag in Databricks?
- What is the `APPLY CHANGES INTO` syntax in DLT and when do you use it?
- How do you handle out-of-order events in a Medallion Architecture?
- Explain incremental data loading patterns: append-only vs upsert vs full refresh.
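The exactly-once and deduplication questions above share a pattern: at-least-once delivery plus an idempotent sink equals exactly-once results. Auto Loader keeps processed file paths in a RocksDB-backed state store; a Silver dedup step does the equivalent with MERGE on a business key. A toy sketch of that idempotent-sink idea:

```python
class DedupSink:
    """Toy idempotent sink: redelivered events with a known key are no-ops."""
    def __init__(self):
        self.seen = set()    # stands in for the checkpointed state store
        self.rows = []       # stands in for the Silver table

    def upsert(self, event_id, payload):
        if event_id in self.seen:
            return False     # duplicate delivery, safely ignored
        self.seen.add(event_id)
        self.rows.append(payload)
        return True

sink = DedupSink()
print(sink.upsert("booking-1", {"amount": 100}))  # True
print(sink.upsert("booking-1", {"amount": 100}))  # False (redelivered, skipped)
print(len(sink.rows))                             # 1
```

The interview follow-up is usually where this state lives and how it survives restarts: in Structured Streaming it must be in the checkpoint (or derivable from the target table via MERGE), never in driver memory.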
L3 — Scenario-Based Questions
- Oracle CDC Pipeline: Design a CDC pipeline from Oracle to Delta Lake for Amadeus booking data. Oracle does not support log-based CDC natively. What approach do you take?
- Late-Arriving Data: Flight booking amendments arrive 48 hours after the original booking. Design a pipeline that correctly handles these late-arriving events in the Medallion Architecture.
- SCD Type 2 at Scale: You need to maintain SCD Type 2 on a customer dimension table with 500M rows, receiving 2M updates daily. Design the MERGE strategy for performance.
- DLT Pipeline Failure: Your DLT pipeline fails at the Silver layer due to a data quality expectation violation at 2 AM. 50,000 records were dropped. How do you investigate, recover, and prevent recurrence?
- Auto Loader Schema Drift: A source system renames columns from `camelCase` to `snake_case` overnight. Your Auto Loader pipeline breaks. Design a resilient schema evolution strategy.
- Multi-Source Medallion: You have 20 source systems feeding Bronze. Some are batch (daily files), some are streaming (Kafka). Design the Medallion Architecture to unify them.
- Streaming Backpressure: Your streaming pipeline is processing 100K events/sec but the source is producing 500K events/sec. The lag keeps growing. How do you diagnose and fix this?
- GDPR Delete Pipeline: A GDPR deletion request arrives for a passenger. You need to delete their data across Bronze, Silver, and Gold layers in a Lakehouse. Design the process.
- Deduplication Strategy: Your Kafka source sends duplicate booking events. Design a deduplication strategy at the Bronze and Silver layers that guarantees exactly-once processing.
- Testing & Validation: How would you set up automated testing for a Databricks DLT pipeline? Include unit tests, integration tests, and data quality assertions.
- Hybrid Batch-Streaming: You need near-real-time dashboards (5-minute latency) but also end-of-day reconciliation reports. Design a single pipeline architecture.
- Multi-Hop Streaming: Design a streaming pipeline with three hops (Bronze->Silver->Gold) where each layer applies different transformations. How do you manage checkpoints and failure recovery?
- Slowly Changing Dimension with Deletes: Your source system sends hard deletes (records simply disappear). How do you detect and handle these in an SCD Type 2 pipeline?
- Cost-Efficient Ingestion: You ingest 10 TB/day of raw JSON from ADLS Gen2. Design an ingestion pipeline that minimizes compute cost while maintaining <15 min latency.
- Pipeline Dependency Management: You have 50 DLT pipelines with complex dependencies. Some must run sequentially, others can be parallel. How do you orchestrate this?
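Several scenarios above (schema drift, cost-efficient ingestion) hinge on Auto Loader configuration. A sketch of a typical Bronze-layer option set; the storage path is hypothetical, while the `cloudFiles.*` option names are the documented Auto Loader ones. In a real notebook this dict would be passed to `spark.readStream.format("cloudFiles").options(**...)`:

```python
# Hypothetical Bronze ingestion config for the Auto Loader source.
autoloader_options = {
    "cloudFiles.format": "json",
    "cloudFiles.schemaLocation": "abfss://bronze@lake.dfs.core.windows.net/_schemas/bookings",
    "cloudFiles.schemaEvolutionMode": "addNewColumns",  # new source columns widen the schema instead of failing
    "cloudFiles.useNotifications": "true",              # file-notification mode scales beyond directory listing
}

def reader_conf(options):
    """Assemble the (format, options) pair a streaming read would use."""
    assert options["cloudFiles.format"] in {"json", "csv", "parquet", "avro"}
    return ("cloudFiles", options)

fmt, opts = reader_conf(autoloader_options)
print(fmt)  # cloudFiles
```

Note that `addNewColumns` handles added columns, not renames: the `camelCase` to `snake_case` scenario above still needs an explicit mapping layer (or `rescuedDataColumn` review) in Silver.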
TOPIC 3: AZURE PLATFORM & GOVERNANCE (Unity Catalog, Photon, Serverless, ADLS Gen2, GDPR)
L1 — Direct / Simple Questions
- What is Unity Catalog in Databricks?
- What is the three-level namespace in Unity Catalog (catalog.schema.table)?
- What is a metastore in Unity Catalog?
- What is the difference between managed and external tables in Unity Catalog?
- What is a storage credential in Unity Catalog?
- What is an external location in Unity Catalog?
- What is data lineage in Unity Catalog?
- What is Photon engine in Databricks?
- What is Serverless compute in Databricks?
- What is ADLS Gen2 and how does Databricks connect to it?
- What is the difference between a Databricks workspace and a metastore?
- What is row-level security in Unity Catalog?
- What is column masking in Unity Catalog?
- What is a service principal in Databricks on Azure?
- What is the difference between instance profiles and storage credentials?
- What are tags and labels in Unity Catalog for data classification?
- What is the system tables feature in Unity Catalog?
- What is Databricks SQL (DBSQL)?
- What is a SQL Warehouse (Serverless vs Pro vs Classic)?
- What is GDPR and what does it mean for data engineering?
L2 — Mid-Level Questions
- Explain the Unity Catalog hierarchy: metastore -> catalog -> schema -> table/view/function. How do permissions cascade?
- Compare Unity Catalog vs legacy Hive Metastore — what are the key differences and migration challenges?
- How does Photon engine accelerate queries? What workloads benefit most from Photon?
- Compare Serverless SQL Warehouses vs Classic SQL Warehouses — cost, startup time, scaling.
- Explain how ADLS Gen2 integrates with Databricks — authentication methods (OAuth, service principals, access keys).
- How do you implement GDPR "Right to be Forgotten" in a Lakehouse architecture?
- Explain dynamic views in Unity Catalog for row-level and column-level security.
- How does Unity Catalog handle cross-workspace data sharing?
- Compare Azure Databricks vs Azure Synapse Analytics — when would you recommend each?
- Explain how audit logging works in Unity Catalog. What events are captured?
- How do you implement data classification (PII tagging) using Unity Catalog?
- What are the networking options for Databricks on Azure (VNet injection, Private Link, NSGs)?
- Explain the difference between account-level and workspace-level identity in Databricks.
- How does Unity Catalog system tables help with cost monitoring and query auditing?
- Compare managed identity vs service principal vs access key for ADLS Gen2 access — pros/cons.
- Explain how Databricks handles encryption at rest and in transit on Azure.
- How do you design a multi-region Databricks deployment on Azure?
- What is the role of Azure Key Vault in Databricks? How do you manage secrets?
- Explain the difference between GRANT, DENY, and REVOKE in Unity Catalog's permission model.
- How does Unity Catalog's data lineage differ from tools like Apache Atlas or Purview?
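The dynamic-view and column-masking questions above reduce to a CASE expression evaluated per query: members of a privileged group see the raw column, everyone else a mask (in Databricks SQL the membership check is `is_account_group_member(...)`). A pure-Python sketch of that logic, with invented group, user, and column names:

```python
# Toy stand-in for group membership that a dynamic view would check at query time.
GROUPS = {"pii_readers": {"alice"}}

def mask_passport(user, passport_number):
    """Raw value for pii_readers; partial mask for everyone else."""
    if user in GROUPS["pii_readers"]:
        return passport_number
    return "****" + passport_number[-2:]

print(mask_passport("alice", "X1234567"))  # X1234567
print(mask_passport("bob", "X1234567"))    # ****67
```

The governance point interviewers probe: because the check happens at read time in the view (or column mask function), one physical table serves both audiences, with no duplicated "masked copy" to keep in sync.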
L3 — Scenario-Based Questions
- Unity Catalog Migration: Your organization has 500 tables in Hive Metastore across 3 workspaces. Design a migration plan to Unity Catalog with zero downtime.
- GDPR Compliance Pipeline: Amadeus handles passenger PII (names, passport numbers, emails) across 100 countries. Design a GDPR-compliant data architecture using Unity Catalog, column masking, and deletion pipelines.
- Multi-Team Governance: You have Data Engineering, Data Science, and BI teams sharing a single Databricks deployment. Design the Unity Catalog structure (catalogs, schemas, permissions) for proper isolation and collaboration.
- Cost Optimization: Your Azure Databricks bill is $150K/month. 60% is compute. Design a cost reduction strategy using Serverless, autoscaling, spot instances, and cluster policies.
- Secure External Sharing: A partner airline needs read access to specific Gold tables but must NOT see PII columns. Design this using Unity Catalog, Delta Sharing, and dynamic views.
- Photon Decision: Your team is deciding whether to enable Photon on all clusters. Some workloads are Python UDF-heavy, others are SQL-heavy. How do you evaluate and decide?
- Network Security: Your security team requires all Databricks traffic to stay within the Azure virtual network and never traverse the public internet. Design the network architecture.
- Disaster Recovery: Design a DR strategy for Databricks on Azure. RTO = 4 hours, RPO = 1 hour. Consider metastore, Delta tables, notebooks, and cluster configurations.
- Audit & Compliance: The compliance team needs a report showing who accessed PII data in the last 90 days, what queries they ran, and what data they exported. Design this using system tables.
- ADLS Gen2 Organization: You have 50 data products across 5 business domains. Design the ADLS Gen2 storage layout (containers, folders) and the corresponding Unity Catalog structure.
- Data Mesh on Databricks: Leadership wants to adopt a Data Mesh approach. How would you structure Unity Catalog catalogs, schemas, and ownership to enable domain-oriented data products?
- Cross-Cloud Access: A team in AWS needs to read data from your Azure Databricks Lakehouse. Design the architecture using Delta Sharing.
- PII Detection & Tagging: You inherit 2,000 tables with no documentation. Design an automated PII detection and tagging pipeline using Unity Catalog tags and Databricks notebooks.
- Serverless Migration: Your team runs 200 interactive clusters daily. The CFO wants to move to Serverless. What is your evaluation and migration plan?
- Regulatory Audit: A regulator asks you to prove data lineage from source (Oracle) to final report (Power BI). How do you use Unity Catalog lineage to demonstrate this end-to-end?
TOPIC 4: PRODUCTION & CI/CD (Workflows, Asset Bundles, Cost Management, Debugging)
L1 — Direct / Simple Questions
- What is Databricks Workflows (formerly Jobs)?
- What is the difference between a Task and a Job in Databricks Workflows?
- What are Databricks Asset Bundles (DABs)?
- What is the Databricks CLI?
- What is a job cluster vs an all-purpose (interactive) cluster?
- What is cluster autoscaling and how does it work?
- What is a cluster policy in Databricks?
- What are spot instances and how do they reduce cost?
- What is the Databricks REST API used for?
- What is a Databricks repo (Git integration)?
- What are Databricks Notebooks vs IDE-based development?
- What is the difference between a wheel file and a notebook task in a Workflow?
- What is the Ganglia UI / Spark UI used for in debugging?
- What is a driver log vs an executor log?
- What is the Databricks DBU (Databricks Unit) and how is pricing calculated?
- What are init scripts and when would you use them?
- What is a multi-task workflow (DAG) in Databricks?
- What is the `dbutils` library and what are its key modules?
- What are widgets in Databricks notebooks?
- What is the difference between `%run` and `dbutils.notebook.run()`?
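The DBU pricing question above is usually answered with simple arithmetic: cost = DBUs/hour x hours x $/DBU, where the DBU rate depends on instance type and the $/DBU on the SKU (Jobs, All-Purpose, SQL, Serverless), cloud, and region. The rates below are hypothetical, for illustration only:

```python
def job_cost(dbu_per_hour, hours, usd_per_dbu):
    """Back-of-envelope Databricks cost: DBUs/hour x hours x $/DBU.
    Excludes the separate Azure VM cost for provisioned (non-serverless) compute."""
    return round(dbu_per_hour * hours * usd_per_dbu, 2)

# e.g. a 10-node job cluster at ~2 DBU/node/hour running 3 hours,
# at a hypothetical $0.30/DBU Jobs-compute rate:
print(job_cost(dbu_per_hour=20, hours=3, usd_per_dbu=0.30))  # 18.0
```

The follow-up worth knowing: on provisioned clusters you pay Databricks (DBUs) and Azure (VMs) separately, which is why job clusters at the cheaper Jobs-compute DBU rate beat idle all-purpose clusters on cost.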
L2 — Mid-Level Questions
- Explain Databricks Asset Bundles (DABs) — how do they enable CI/CD for Databricks projects?
- Compare DABs vs Terraform for Databricks infrastructure management — when to use each?
- How do you implement a CI/CD pipeline for Databricks using Azure DevOps?
- Explain how Databricks Workflows handles task dependencies, retries, and conditional execution.
- Compare job clusters vs all-purpose clusters — cost, startup time, use cases.
- How do you debug an OOM (Out of Memory) error in a Databricks Spark job?
- Explain how to read and interpret the Spark UI: stages, tasks, shuffle read/write, spill.
- How do you implement blue-green or canary deployments for Databricks ETL pipelines?
- Explain cluster pool strategy — how do pools reduce cluster startup time and cost?
- How do you manage secrets and environment-specific configurations across dev/staging/prod?
- Compare Databricks Repos (Git integration) vs external CI/CD tools for version control.
- How do you implement data pipeline monitoring and alerting in Databricks?
- Explain the cost implications of spot instances vs on-demand for different workload types.
- How do you diagnose data skew in a Spark job using the Spark UI?
- What is the recommended project structure for a Databricks DABs project?
- How do you implement parameterized jobs with dynamic values in Workflows?
- Explain the difference between task values (`dbutils.jobs.taskValues`) and widget parameters.
- How do you handle failing tasks in a DAG — retry policies, timeout, conditional logic?
- Compare Serverless jobs vs provisioned clusters for job execution — cost breakeven analysis.
- How do you implement logging and observability for production Databricks pipelines?
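The retry-policy questions above follow one standard shape: retry transient failures with exponential backoff and give up after a fixed number of attempts. Workflows configures this declaratively per task; a generic sketch of the same behavior (function and parameter names are illustrative, not a Databricks API):

```python
import time

def run_with_retries(task, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Run `task`, retrying transient failures with exponential backoff
    (1s, 2s, 4s, ...); re-raise after max_retries attempts fail."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise
            sleep(base_delay * (2 ** attempt))

calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient")
    return "ok"

print(run_with_retries(flaky, sleep=lambda s: None))  # ok (succeeds on the 3rd attempt)
```

The design caveat to mention in an interview: retries only help if the task is idempotent; a non-idempotent write that half-succeeded before the retry will double-apply, which loops back to the MERGE/dedup patterns earlier in this bank.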
L3 — Scenario-Based Questions
- CI/CD Pipeline Design: Design an end-to-end CI/CD pipeline for a Databricks project. Include Git branching, testing, deployment to dev/staging/prod, and rollback strategy. Use DABs + Azure DevOps.
- Production Incident: Your nightly ETL job has been failing intermittently for 3 nights with `SparkException: Job aborted due to stage failure`. Walk through your debugging process step by step.
- Cost Reduction: Your team's Databricks spend increased 3x in 3 months. You need to reduce it by 40% without impacting SLAs. What do you analyze and what changes do you make?
- Cluster Strategy: You have 50 data engineers, 20 data scientists, and 10 BI analysts. Design the cluster strategy: interactive clusters, job clusters, pools, and policies.
- Job Orchestration: You have 30 ETL jobs. 10 run hourly, 15 run daily, 5 run weekly. Some have dependencies. Design the orchestration using Databricks Workflows.
- OOM Debugging: A Spark job processing a 5 TB dataset fails with OOM after running for 3 hours. You have 30 minutes to fix it before the business deadline. Walk through your approach.
- Migration from Airflow: Your team currently uses Apache Airflow for orchestration. Management wants to migrate to Databricks Workflows. Design the migration plan and address the gaps.
- Multi-Environment Deployment: Design a deployment strategy where the same code deploys to dev (small data, small clusters), staging (prod-like), and prod (full scale) using DABs.
- Data Pipeline SLA: Your Gold table must be refreshed by 6 AM every day. The pipeline takes 2-4 hours depending on data volume. Design the reliability strategy: monitoring, alerting, retry, fallback.
- Runaway Costs: A data scientist launched an interactive cluster with 100 nodes and forgot to terminate it. It ran for 72 hours. How do you prevent this from happening again?
- Spark Debugging: A join between two large tables is taking 6 hours instead of the expected 30 minutes. The Spark UI shows massive shuffle spill to disk. Diagnose and fix.
- Notebook to Production: A data scientist built a prototype in a notebook. You need to productionize it. Describe the steps: refactoring, testing, CI/CD, monitoring.
- DABs Project Setup: You are starting a new project with 3 DLT pipelines, 10 Workflows, and shared libraries. Design the DABs project structure, bundle configuration, and deployment targets.
- Rollback Strategy: A production deployment introduced a bug that corrupted the Silver layer. Design the rollback process: code rollback, data recovery, and communication plan.
- Monitoring Dashboard: Design a production monitoring dashboard for 50 Databricks pipelines. What metrics do you track? What alerting thresholds do you set? What tools do you use?
BONUS: CROSS-CUTTING / FREQUENTLY ASKED "REAL INTERVIEW" QUESTIONS
These are questions that appeared repeatedly across Glassdoor reports, Medium articles, and interview forums for 2025-2026 Senior Data Engineer roles.
Top 20 Most Frequently Asked
- What is the Lakehouse architecture and how does it differ from a Data Lake and a Data Warehouse? (L1)
- Explain the Delta Lake transaction log and how it provides ACID guarantees. (L2)
- Design a Medallion Architecture for [specific domain]. Walk through Bronze, Silver, Gold. (L3)
- How does MERGE INTO work in Delta Lake? What are its performance pitfalls? (L2)
- Implement SCD Type 2 using MERGE INTO. (L3)
- What is Unity Catalog and how does it improve governance over Hive Metastore? (L2)
- How do you handle CDC from a legacy database (Oracle/SQL Server) into Delta Lake? (L3)
- Compare Auto Loader vs COPY INTO — when do you use each? (L2)
- How do you optimize a slow-running Spark job? Walk through your debugging steps. (L3)
- What is Z-ORDER and when would you use it? How does Liquid Clustering improve on it? (L2)
- Design a GDPR-compliant data deletion pipeline in a Lakehouse. (L3)
- How do you implement CI/CD for Databricks? (L2)
- What is Photon and when should you enable it? (L1)
- Explain the small file problem and how to solve it in Delta Lake. (L2)
- How do you handle schema evolution in a streaming pipeline? (L2)
- Your pipeline is failing intermittently in production. Walk through your debugging process. (L3)
- How do you manage costs in Databricks? What strategies have you used? (L2)
- What is Delta Live Tables and how does it compare to manual Structured Streaming? (L2)
- Design a real-time analytics pipeline on Databricks. (L3)
- How do you handle data quality in a Lakehouse architecture? (L2)
Emerging 2025-2026 Topics (Newer Questions)
- What is Lakeflow Connect and how does it simplify ingestion? (L1)
- Explain Databricks Apps — what are they and when would you use them? (L1)
- What is Genie (natural language to SQL) and how does it fit into the Databricks ecosystem? (L1)
- How do you build and deploy an LLM-powered application using Databricks? (L3)
- What is the Databricks Marketplace and how do you publish/consume data products? (L2)
- How does Mosaic AI integrate with the Lakehouse for ML/AI workflows? (L2)
- What is UniForm in Delta Lake and why does it matter for interoperability? (L2)
- How do you use Databricks system tables for cost monitoring and optimization? (L3)
- Explain Serverless compute for jobs — how does it differ from provisioned clusters? (L2)
- What is Predictive Optimization and how does it automate OPTIMIZE/VACUUM/ANALYZE? (L2)
QUESTION COUNT SUMMARY
| Topic | L1 (Direct) | L2 (Mid-Level) | L3 (Scenario) | Total |
|---|---|---|---|---|
| Delta Lake | 20 | 20 | 15 | 55 |
| ETL Pipelines | 20 | 20 | 15 | 55 |
| Azure Platform & Governance | 20 | 20 | 15 | 55 |
| Production & CI/CD | 20 | 20 | 15 | 55 |
| Bonus (Cross-Cutting) | — | — | — | 30 |
| TOTAL | 80 | 80 | 60 | 250 |
SOURCES
- DataCamp — Top 37 Azure Data Engineering Interview Questions for 2026
- InterviewBit — Azure Databricks Interview Questions and Answers (2025)
- Simplilearn — 30 Azure Databricks Interview Questions and Answers (2026)
- Data Vidhya — Top 50 Databricks Data Engineering Interview Questions
- WeCreateProblems — 100+ Databricks Interview Questions and Answers (2026)
- Glassdoor — Databricks Interview Experience & Questions (2026)
- CrackIT Interview — Databricks Scenario-Based Questions 2025
- AccentFuture — Databricks Interview Questions for Data Engineers
- Medium (Karthik) — 50 Shades of Databricks Interview Questions
- Medium (AccentFuture) — Real-Time Databricks Interview Questions with Scenarios
- Analytics Vidhya — Top Delta Lake Interview Questions
- LinkJob.ai — My Azure Databricks Interview in 2025: Real Questions
- LinkJob.ai — My 2025 Databricks System Design Interview
- Medium (Unity Catalog) — Databricks Data Engineering 21 Interview Questions
- SriniMF — Unity Catalog in Databricks: Key MCQs
- ProjectPro — Top 15 Azure Databricks Interview Questions for 2025
- LinkedIn (TSandesh) — Unity Catalog Most Asked Interview Questions