Question Bank

Databricks Interview Questions

All 250 questions — full access

1 What is Delta Lake and why was it created?

See the study guide for the detailed answer →

2 What file format does Delta Lake use under the hood?

See the study guide for the detailed answer →

3 What is the `_delta_log` directory and what does it contain?

See the study guide for the detailed answer →

4 What are the four ACID properties and how does Delta Lake guarantee them?

See the study guide for the detailed answer →

5 What is a checkpoint file in the Delta transaction log?

See the study guide for the detailed answer →

6 What is schema enforcement in Delta Lake?

See the study guide for the detailed answer →

7 What is schema evolution and how do you enable it?

See the study guide for the detailed answer →

8 What is Time Travel in Delta Lake? How do you query an older version?

See the study guide for the detailed answer →
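
For reference, a minimal syntax sketch of Time Travel queries (the table name `events` is illustrative, not from the study guide):

```sql
-- Query the table as of a specific commit version
SELECT * FROM events VERSION AS OF 12;
-- Shorthand form: SELECT * FROM events@v12;

-- Query the table as of a point in time
SELECT * FROM events TIMESTAMP AS OF '2024-06-01T00:00:00';
```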

9 What is the VACUUM command and what does it do?

See the study guide for the detailed answer →

10 What is the default retention period for VACUUM?

See the study guide for the detailed answer →
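
A hedged sketch of VACUUM usage covering questions 9 and 10 (`events` is an illustrative table name):

```sql
-- Preview which unreferenced files would be deleted, without removing anything
VACUUM events DRY RUN;

-- Delete files no longer referenced by the log and older than the
-- default 7-day (168-hour) retention window
VACUUM events;

-- Explicit retention; values below the configured default trigger a safety check
VACUUM events RETAIN 168 HOURS;
```

Note that files removed by VACUUM are no longer available to Time Travel, which is why short retention windows are risky.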

11 What does the OPTIMIZE command do?

See the study guide for the detailed answer →

12 What is Z-ORDER and what problem does it solve?

See the study guide for the detailed answer →
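
A minimal sketch of OPTIMIZE and Z-ORDER syntax (table and column names are illustrative):

```sql
-- Compact small files across the whole table
OPTIMIZE events;

-- Restrict compaction to recent data and co-locate rows by a
-- high-cardinality column that queries frequently filter on
OPTIMIZE events
WHERE event_date >= '2024-06-01'
ZORDER BY (customer_id);
```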

13 What are Deletion Vectors in Delta Lake?

See the study guide for the detailed answer →

14 What is the difference between Delta Lake and Apache Parquet?

See the study guide for the detailed answer →

15 What is the DESCRIBE HISTORY command used for?

See the study guide for the detailed answer →

16 What is Change Data Feed (CDF) in Delta Lake?

See the study guide for the detailed answer →
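
A sketch of enabling and reading the Change Data Feed (`silver_orders` is an illustrative table name):

```sql
-- Enable CDF on an existing table
ALTER TABLE silver_orders
SET TBLPROPERTIES (delta.enableChangeDataFeed = true);

-- Read row-level changes between two commit versions
SELECT * FROM table_changes('silver_orders', 5, 10);
-- Adds metadata columns: _change_type (insert / update_preimage /
-- update_postimage / delete), _commit_version, _commit_timestamp
```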

17 What is Liquid Clustering in Delta Lake?

See the study guide for the detailed answer →
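
A minimal Liquid Clustering sketch (table and key names are illustrative):

```sql
-- Clustering keys take the place of both PARTITIONED BY and ZORDER BY
CREATE TABLE bookings (
  booking_id   BIGINT,
  country      STRING,
  booking_date DATE
) CLUSTER BY (country, booking_date);

-- Keys can be changed later without rewriting existing data up front
ALTER TABLE bookings CLUSTER BY (booking_date);

-- Clustering is applied incrementally by OPTIMIZE
OPTIMIZE bookings;
```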

18 What is Predictive Optimization in Databricks?

See the study guide for the detailed answer →

19 What are table constraints in Delta Lake (CHECK, NOT NULL)?

See the study guide for the detailed answer →
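
A sketch of the two constraint types (names are illustrative):

```sql
-- NOT NULL constraint on an existing column
ALTER TABLE bookings ALTER COLUMN booking_id SET NOT NULL;

-- Named CHECK constraint; a write violating it fails the whole transaction
ALTER TABLE bookings ADD CONSTRAINT valid_date
  CHECK (booking_date >= '2000-01-01');
```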

20 What is the RESTORE command in Delta Lake?

See the study guide for the detailed answer →
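
A sketch of inspecting history and rolling back (`bookings` and the version number are illustrative):

```sql
-- Find the last known-good version: one row per commit with version,
-- timestamp, operation, operationParameters, and operationMetrics
DESCRIBE HISTORY bookings;

-- Roll the table back; RESTORE is recorded as a new commit,
-- so the restore itself is reversible
RESTORE TABLE bookings TO VERSION AS OF 42;
-- or: RESTORE TABLE bookings TO TIMESTAMP AS OF '2024-06-01T03:00:00';
```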

21 Explain the anatomy of a Delta Lake transaction — what happens when you write to a Delta table?

See the study guide for the detailed answer →

22 How does optimistic concurrency control work in Delta Lake? What happens during write conflicts?

See the study guide for the detailed answer →

23 Compare Z-ORDER vs Liquid Clustering — when would you use each?

See the study guide for the detailed answer →

24 Explain data skipping in Delta Lake. How does it use min/max statistics?

See the study guide for the detailed answer →

25 What is the difference between OPTIMIZE and VACUUM? Can you run them together?

See the study guide for the detailed answer →

26 Explain how MERGE INTO works internally. What are the performance implications of a full table scan in MERGE?

See the study guide for the detailed answer →

27 How does the Delta transaction log handle concurrent writes from multiple clusters?

See the study guide for the detailed answer →

28 Compare schema enforcement vs schema evolution — give an example where each is appropriate.

See the study guide for the detailed answer →

29 What happens if you run VACUUM with a retention of 0 hours? What are the risks?

See the study guide for the detailed answer →

30 Explain the difference between `OPTIMIZE WHERE` and partition-level OPTIMIZE.

See the study guide for the detailed answer →

31 How does Delta Lake handle small file compaction? What is the "small file problem"?

See the study guide for the detailed answer →

32 Explain the difference between Copy-on-Write and Merge-on-Read in Delta Lake.

See the study guide for the detailed answer →

33 How do Deletion Vectors improve UPDATE/DELETE performance compared to the traditional approach?

See the study guide for the detailed answer →

34 What is the relationship between file statistics, data skipping, and Z-ORDER?

See the study guide for the detailed answer →

35 Explain how Time Travel works internally — what is stored in each JSON commit file?

See the study guide for the detailed answer →

36 Compare Change Data Feed (CDF) vs reading the transaction log directly for CDC.

See the study guide for the detailed answer →

37 What is the difference between managed and external Delta tables?

See the study guide for the detailed answer →

38 How does Liquid Clustering handle incremental clustering vs Z-ORDER which requires full rewrite?

See the study guide for the detailed answer →

39 What are the trade-offs of over-partitioning a Delta table?

See the study guide for the detailed answer →

40 Explain Delta Lake UniForm (Universal Format). Why does it matter?

See the study guide for the detailed answer →

41 **MERGE Optimization**: Your MERGE INTO statement takes 45 minutes on a 2TB Delta table. Walk me through how you would diagnose and optimize this.

See the study guide for the detailed answer →

42 **Transaction Log Corruption**: A developer accidentally ran `VACUUM` with 0-hour retention and now Time Travel queries fail. What happened and how do you recover?

See the study guide for the detailed answer →

43 **Small File Problem**: Your Bronze table has 50,000 small Parquet files (avg 2MB each). How do you fix this and prevent it from recurring?

See the study guide for the detailed answer →

44 **Concurrent Writes**: Two Databricks jobs write to the same Delta table simultaneously and one fails with a `ConcurrentAppendException`. Explain why and how you fix it.

See the study guide for the detailed answer →

45 **Z-ORDER Strategy**: You have a 10TB Delta table queried by `country`, `date`, and `customer_id`. Design the partitioning and Z-ORDER strategy.

See the study guide for the detailed answer →

46 **Liquid Clustering Migration**: Your team wants to migrate from Z-ORDER to Liquid Clustering on a production table. What is your migration plan? Any risks?

See the study guide for the detailed answer →

47 **Schema Evolution Crisis**: A source system added 5 new columns overnight and your streaming pipeline failed. How do you design for schema evolution in Auto Loader + Delta Lake?

See the study guide for the detailed answer →

48 **Time Travel for Audit**: Your compliance team needs to prove what data looked like on a specific date 30 days ago. How do you implement this with Delta Lake Time Travel? What are the limitations?

See the study guide for the detailed answer →

49 **VACUUM vs Storage Costs**: Your Delta table consumes 5x the expected storage due to retained old versions. Design a VACUUM strategy that balances cost vs Time Travel needs.

See the study guide for the detailed answer →

50 **CDC with Delta CDF**: Design a pipeline where downstream consumers only process changed records from a Silver Delta table. How do you use Change Data Feed?

See the study guide for the detailed answer →

51 **Table Restore Scenario**: A bad ETL job corrupted your Gold table at 3 AM. It is now 9 AM and 6 versions have been written since. Walk through the recovery process.

See the study guide for the detailed answer →

52 **Partition Evolution**: Your table was partitioned by `year/month/day` but queries now filter primarily by `region`. How do you restructure without downtime?

See the study guide for the detailed answer →

53 **MERGE with SCD Type 2**: Implement a MERGE strategy for SCD Type 2 on a customer dimension table where you need to close old records and insert new ones atomically.

See the study guide for the detailed answer →

54 **Delta Sharing**: An external partner needs read access to a subset of your Delta table. How do you implement this securely using Delta Sharing?

See the study guide for the detailed answer →

55 **Deletion Vectors in Production**: After enabling Deletion Vectors, read performance on certain queries degraded. Explain why and how you would resolve this.

See the study guide for the detailed answer →

56 What is the Medallion Architecture (Bronze/Silver/Gold)?

See the study guide for the detailed answer →

57 What is Auto Loader in Databricks?

See the study guide for the detailed answer →
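
A hedged SQL-flavored Auto Loader sketch using a streaming table (the storage path and table name are placeholders):

```sql
-- Incrementally ingest only new files from cloud storage into Bronze;
-- already-processed files are tracked in the stream's state
CREATE OR REFRESH STREAMING TABLE bronze_bookings AS
SELECT
  *,
  _metadata.file_path AS source_file   -- capture provenance per row
FROM STREAM read_files(
  'abfss://landing@mystorage.dfs.core.windows.net/bookings/',
  format => 'json'
);
```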

58 What is the difference between Auto Loader and COPY INTO?

See the study guide for the detailed answer →

59 What is Delta Live Tables (DLT)?

See the study guide for the detailed answer →

60 What is Lakeflow and how does it relate to DLT?

See the study guide for the detailed answer →

61 What is SCD Type 1 vs SCD Type 2?

See the study guide for the detailed answer →

62 What is Change Data Capture (CDC)?

See the study guide for the detailed answer →

63 What is a streaming table vs a materialized view in DLT?

See the study guide for the detailed answer →

64 What are DLT expectations (data quality constraints)?

See the study guide for the detailed answer →

65 What is the difference between `cloudFiles` and `spark.readStream` on Delta?

See the study guide for the detailed answer →

66 What is structured streaming in Databricks?

See the study guide for the detailed answer →

67 What is a checkpoint in Spark Structured Streaming?

See the study guide for the detailed answer →

68 What is the trigger mode `availableNow` vs `processingTime`?

See the study guide for the detailed answer →

69 What is idempotency and why is it important in ETL pipelines?

See the study guide for the detailed answer →

70 What is the difference between batch and streaming ETL?

See the study guide for the detailed answer →

71 What is an ETL pipeline vs an ELT pipeline?

See the study guide for the detailed answer →

72 What are the three DLT expectation actions: `warn`, `drop`, `fail`?

See the study guide for the detailed answer →
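
A sketch showing all three expectation actions on one DLT streaming table (table and column names are illustrative):

```sql
CREATE OR REFRESH STREAMING TABLE silver_bookings (
  -- warn (default): keep the row, record the violation as a metric
  CONSTRAINT fresh_ts    EXPECT (event_ts > '2020-01-01'),
  -- drop: discard violating rows, record the count
  CONSTRAINT has_id      EXPECT (booking_id IS NOT NULL) ON VIOLATION DROP ROW,
  -- fail: abort the update on the first violation
  CONSTRAINT valid_total EXPECT (total_amount >= 0) ON VIOLATION FAIL UPDATE
) AS SELECT * FROM STREAM(LIVE.bronze_bookings);
```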

73 What is the `foreachBatch` sink in Structured Streaming?

See the study guide for the detailed answer →

74 What is event-time processing vs processing-time in streaming?

See the study guide for the detailed answer →

75 What is watermarking in Spark Structured Streaming?

See the study guide for the detailed answer →

76 Explain how Auto Loader's file notification mode works vs directory listing mode. When do you use each?

See the study guide for the detailed answer →

77 How do you handle schema evolution with Auto Loader (`cloudFiles.schemaEvolutionMode`)?

See the study guide for the detailed answer →

78 Compare Delta Live Tables (DLT) vs hand-coded Structured Streaming pipelines — trade-offs?

See the study guide for the detailed answer →

79 Explain how to implement SCD Type 2 using MERGE INTO with Delta Lake. What are the key columns?

See the study guide for the detailed answer →
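
A hedged two-step SCD Type 2 sketch; `dim_customer`, `staged_updates`, and the `attr_hash` change-detection column are illustrative assumptions:

```sql
-- Step 1: close out current rows whose attributes changed
MERGE INTO dim_customer AS t
USING staged_updates AS s
ON t.customer_id = s.customer_id AND t.is_current = true
WHEN MATCHED AND t.attr_hash <> s.attr_hash THEN
  UPDATE SET t.is_current = false, t.end_date = s.effective_date;

-- Step 2: insert the new current version for changed and brand-new keys
INSERT INTO dim_customer
SELECT s.customer_id, s.name, s.address, s.attr_hash,
       s.effective_date       AS start_date,
       CAST(NULL AS DATE)     AS end_date,
       true                   AS is_current
FROM staged_updates s
LEFT JOIN dim_customer t
  ON t.customer_id = s.customer_id AND t.is_current = true
WHERE t.customer_id IS NULL OR t.attr_hash <> s.attr_hash;
```

When the close-and-insert must be atomic (as in question 53), the usual trick is to collapse both steps into a single MERGE by unioning the updates with a synthetic copy whose merge key is NULL, forcing the NOT MATCHED insert branch.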

80 How does DLT handle pipeline failures and retries? What is the concept of "idempotent recomputation"?

See the study guide for the detailed answer →

81 Compare `trigger(availableNow=True)` vs `trigger(processingTime='5 minutes')` — when to use each?

See the study guide for the detailed answer →

82 Explain the role of Bronze, Silver, and Gold layers in terms of data quality, latency, and consumers.

See the study guide for the detailed answer →

83 How does watermarking work in Structured Streaming? What happens to late-arriving data?

See the study guide for the detailed answer →

84 Compare CDC patterns: log-based CDC (Debezium/Kafka) vs query-based CDC vs timestamp-based CDC.

See the study guide for the detailed answer →

85 How do you handle exactly-once semantics in a Databricks streaming pipeline?

See the study guide for the detailed answer →

86 Explain `foreachBatch` — when would you use it over a standard Delta sink?

See the study guide for the detailed answer →

87 How do DLT expectations compare to Great Expectations or other data quality frameworks?

See the study guide for the detailed answer →

88 What are the different ways to orchestrate dependent DLT pipelines?

See the study guide for the detailed answer →

89 Explain how Auto Loader handles file deduplication. What is the `RocksDB` state store?

See the study guide for the detailed answer →

90 How do you test ETL pipelines in Databricks? What frameworks do you use?

See the study guide for the detailed answer →

91 Explain the difference between a complete output mode, append mode, and update mode in streaming.

See the study guide for the detailed answer →

92 How do you monitor and alert on streaming pipeline lag in Databricks?

See the study guide for the detailed answer →

93 What is the `APPLY CHANGES INTO` syntax in DLT and when do you use it?

See the study guide for the detailed answer →
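
A sketch of `APPLY CHANGES INTO` for a CDC feed (source, key, and sequencing columns are illustrative):

```sql
CREATE OR REFRESH STREAMING TABLE dim_customer;

APPLY CHANGES INTO live.dim_customer
FROM STREAM(live.customer_cdc_feed)
KEYS (customer_id)
APPLY AS DELETE WHEN operation = 'DELETE'  -- map source delete markers
SEQUENCE BY event_ts                       -- resolve out-of-order events
COLUMNS * EXCEPT (operation, event_ts)
STORED AS SCD TYPE 2;                      -- keep full history rows
```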

94 How do you handle out-of-order events in a Medallion Architecture?

See the study guide for the detailed answer →

95 Explain incremental data loading patterns: append-only vs upsert vs full refresh.

See the study guide for the detailed answer →

96 **Oracle CDC Pipeline**: Design a CDC pipeline from Oracle to Delta Lake for Amadeus booking data, assuming log-based CDC tooling (e.g., GoldenGate) is not available in your environment. What approach do you take?

See the study guide for the detailed answer →

97 **Late-Arriving Data**: Flight booking amendments arrive 48 hours after the original booking. Design a pipeline that correctly handles these late-arriving events in the Medallion Architecture.

See the study guide for the detailed answer →

98 **SCD Type 2 at Scale**: You need to maintain SCD Type 2 on a customer dimension table with 500M rows, receiving 2M updates daily. Design the MERGE strategy for performance.

See the study guide for the detailed answer →

99 **DLT Pipeline Failure**: Your DLT pipeline fails at the Silver layer due to a data quality expectation violation at 2 AM. 50,000 records were dropped. How do you investigate, recover, and prevent recurrence?

See the study guide for the detailed answer →

100 **Auto Loader Schema Drift**: A source system renames columns from `camelCase` to `snake_case` overnight. Your Auto Loader pipeline breaks. Design a resilient schema evolution strategy.

See the study guide for the detailed answer →

101 **Multi-Source Medallion**: You have 20 source systems feeding Bronze. Some are batch (daily files), some are streaming (Kafka). Design the Medallion Architecture to unify them.

See the study guide for the detailed answer →

102 **Streaming Backpressure**: Your streaming pipeline is processing 100K events/sec but the source is producing 500K events/sec. The lag keeps growing. How do you diagnose and fix this?

See the study guide for the detailed answer →

103 **GDPR Delete Pipeline**: A GDPR deletion request arrives for a passenger. You need to delete their data across Bronze, Silver, and Gold layers in a Lakehouse. Design the process.

See the study guide for the detailed answer →

104 **Deduplication Strategy**: Your Kafka source sends duplicate booking events. Design a deduplication strategy at the Bronze and Silver layers that guarantees exactly-once processing.

See the study guide for the detailed answer →

105 **Testing & Validation**: How would you set up automated testing for a Databricks DLT pipeline? Include unit tests, integration tests, and data quality assertions.

See the study guide for the detailed answer →

106 **Hybrid Batch-Streaming**: You need near-real-time dashboards (5-minute latency) but also end-of-day reconciliation reports. Design a single pipeline architecture.

See the study guide for the detailed answer →

107 **Multi-Hop Streaming**: Design a streaming pipeline with three hops (Bronze->Silver->Gold) where each layer applies different transformations. How do you manage checkpoints and failure recovery?

See the study guide for the detailed answer →

108 **Slowly Changing Dimension with Deletes**: Your source system sends hard deletes (records simply disappear). How do you detect and handle these in an SCD Type 2 pipeline?

See the study guide for the detailed answer →

109 **Cost-Efficient Ingestion**: You ingest 10TB/day of raw JSON from ADLS Gen2. Design an ingestion pipeline that minimizes compute cost while maintaining <15 min latency.

See the study guide for the detailed answer →

110 **Pipeline Dependency Management**: You have 50 DLT pipelines with complex dependencies. Some must run sequentially, others can be parallel. How do you orchestrate this?

See the study guide for the detailed answer →

111 What is Unity Catalog in Databricks?

See the study guide for the detailed answer →

112 What is the three-level namespace in Unity Catalog (catalog.schema.table)?

See the study guide for the detailed answer →

113 What is a metastore in Unity Catalog?

See the study guide for the detailed answer →

114 What is the difference between managed and external tables in Unity Catalog?

See the study guide for the detailed answer →

115 What is a storage credential in Unity Catalog?

See the study guide for the detailed answer →

116 What is an external location in Unity Catalog?

See the study guide for the detailed answer →

117 What is data lineage in Unity Catalog?

See the study guide for the detailed answer →

118 What is Photon engine in Databricks?

See the study guide for the detailed answer →

119 What is Serverless compute in Databricks?

See the study guide for the detailed answer →

120 What is ADLS Gen2 and how does Databricks connect to it?

See the study guide for the detailed answer →

121 What is the difference between a Databricks workspace and a metastore?

See the study guide for the detailed answer →

122 What is row-level security in Unity Catalog?

See the study guide for the detailed answer →

123 What is column masking in Unity Catalog?

See the study guide for the detailed answer →

124 What is a service principal in Databricks on Azure?

See the study guide for the detailed answer →

125 What is the difference between instance profiles and storage credentials?

See the study guide for the detailed answer →

126 What are tags and labels in Unity Catalog for data classification?

See the study guide for the detailed answer →

127 What is the system tables feature in Unity Catalog?

See the study guide for the detailed answer →

128 What is Databricks SQL (DBSQL)?

See the study guide for the detailed answer →

129 What is a SQL Warehouse (Serverless vs Pro vs Classic)?

See the study guide for the detailed answer →

130 What is GDPR and what does it mean for data engineering?

See the study guide for the detailed answer →

131 Explain the Unity Catalog hierarchy: metastore -> catalog -> schema -> table/view/function. How do permissions cascade?

See the study guide for the detailed answer →

132 Compare Unity Catalog vs legacy Hive Metastore — what are the key differences and migration challenges?

See the study guide for the detailed answer →

133 How does Photon engine accelerate queries? What workloads benefit most from Photon?

See the study guide for the detailed answer →

134 Compare Serverless SQL Warehouses vs Classic SQL Warehouses — cost, startup time, scaling.

See the study guide for the detailed answer →

135 Explain how ADLS Gen2 integrates with Databricks — authentication methods (OAuth, service principals, access keys).

See the study guide for the detailed answer →

136 How do you implement GDPR "Right to be Forgotten" in a Lakehouse architecture?

See the study guide for the detailed answer →

137 Explain dynamic views in Unity Catalog for row-level and column-level security.

See the study guide for the detailed answer →
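
A sketch of a dynamic view combining column masking and row filtering (group and column names are illustrative assumptions):

```sql
CREATE OR REPLACE VIEW gold.bookings_secure AS
SELECT
  booking_id,
  -- Column-level: mask PII unless the reader belongs to an authorized group
  CASE WHEN is_account_group_member('pii_readers')
       THEN passenger_email ELSE 'REDACTED' END AS passenger_email,
  region,
  total_amount
FROM gold.bookings
-- Row-level: non-admins only see rows for their own region
WHERE is_account_group_member('admins') OR region = 'EMEA';
```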

138 How does Unity Catalog handle cross-workspace data sharing?

See the study guide for the detailed answer →

139 Compare Azure Databricks vs Azure Synapse Analytics — when would you recommend each?

See the study guide for the detailed answer →

140 Explain how audit logging works in Unity Catalog. What events are captured?

See the study guide for the detailed answer →

141 How do you implement data classification (PII tagging) using Unity Catalog?

See the study guide for the detailed answer →

142 What are the networking options for Databricks on Azure (VNet injection, Private Link, NSGs)?

See the study guide for the detailed answer →

143 Explain the difference between account-level and workspace-level identity in Databricks.

See the study guide for the detailed answer →

144 How do Unity Catalog system tables help with cost monitoring and query auditing?

See the study guide for the detailed answer →

145 Compare managed identity vs service principal vs access key for ADLS Gen2 access — pros/cons.

See the study guide for the detailed answer →

146 Explain how Databricks handles encryption at rest and in transit on Azure.

See the study guide for the detailed answer →

147 How do you design a multi-region Databricks deployment on Azure?

See the study guide for the detailed answer →

148 What is the role of Azure Key Vault in Databricks? How do you manage secrets?

See the study guide for the detailed answer →

149 Explain the difference between GRANT, DENY, and REVOKE in Unity Catalog's permission model.

See the study guide for the detailed answer →
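
A sketch of the grant lifecycle (catalog, schema, and group names are illustrative):

```sql
-- Privileges are hierarchical: USE CATALOG / USE SCHEMA are prerequisites
-- for reaching an object inside them
GRANT USE CATALOG ON CATALOG main            TO `data_analysts`;
GRANT USE SCHEMA  ON SCHEMA  main.sales      TO `data_analysts`;
GRANT SELECT      ON TABLE   main.sales.orders TO `data_analysts`;

SHOW GRANTS ON TABLE main.sales.orders;

REVOKE SELECT ON TABLE main.sales.orders FROM `data_analysts`;
```

A point worth raising in the answer: Unity Catalog's model is additive (grant/revoke), whereas explicit `DENY` comes from the legacy workspace table-ACL model.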

150 How does Unity Catalog's data lineage differ from tools like Apache Atlas or Purview?

See the study guide for the detailed answer →

151 **Unity Catalog Migration**: Your organization has 500 tables in Hive Metastore across 3 workspaces. Design a migration plan to Unity Catalog with zero downtime.

See the study guide for the detailed answer →

152 **GDPR Compliance Pipeline**: Amadeus handles passenger PII (names, passport numbers, emails) across 100 countries. Design a GDPR-compliant data architecture using Unity Catalog, column masking, and deletion pipelines.

See the study guide for the detailed answer →

153 **Multi-Team Governance**: You have Data Engineering, Data Science, and BI teams sharing a single Databricks deployment. Design the Unity Catalog structure (catalogs, schemas, permissions) for proper isolation and collaboration.

See the study guide for the detailed answer →

154 **Cost Optimization**: Your Azure Databricks bill is $150K/month. 60% is compute. Design a cost reduction strategy using Serverless, autoscaling, spot instances, and cluster policies.

See the study guide for the detailed answer →

155 **Secure External Sharing**: A partner airline needs read access to specific Gold tables but must NOT see PII columns. Design this using Unity Catalog, Delta Sharing, and dynamic views.

See the study guide for the detailed answer →

156 **Photon Decision**: Your team is deciding whether to enable Photon on all clusters. Some workloads are Python UDF-heavy, others are SQL-heavy. How do you evaluate and decide?

See the study guide for the detailed answer →

157 **Network Security**: Your security team requires all Databricks traffic to stay within the Azure virtual network and never traverse the public internet. Design the network architecture.

See the study guide for the detailed answer →

158 **Disaster Recovery**: Design a DR strategy for Databricks on Azure. RTO = 4 hours, RPO = 1 hour. Consider metastore, Delta tables, notebooks, and cluster configurations.

See the study guide for the detailed answer →

159 **Audit & Compliance**: The compliance team needs a report showing who accessed PII data in the last 90 days, what queries they ran, and what data they exported. Design this using system tables.

See the study guide for the detailed answer →

160 **ADLS Gen2 Organization**: You have 50 data products across 5 business domains. Design the ADLS Gen2 storage layout (containers, folders) and the corresponding Unity Catalog structure.

See the study guide for the detailed answer →

161 **Data Mesh on Databricks**: Leadership wants to adopt a Data Mesh approach. How would you structure Unity Catalog catalogs, schemas, and ownership to enable domain-oriented data products?

See the study guide for the detailed answer →

162 **Cross-Cloud Access**: A team in AWS needs to read data from your Azure Databricks Lakehouse. Design the architecture using Delta Sharing.

See the study guide for the detailed answer →

163 **PII Detection & Tagging**: You inherit 2,000 tables with no documentation. Design an automated PII detection and tagging pipeline using Unity Catalog tags and Databricks notebooks.

See the study guide for the detailed answer →

164 **Serverless Migration**: Your team runs 200 interactive clusters daily. The CFO wants to move to Serverless. What is your evaluation and migration plan?

See the study guide for the detailed answer →

165 **Regulatory Audit**: A regulator asks you to prove data lineage from source (Oracle) to final report (Power BI). How do you use Unity Catalog lineage to demonstrate this end-to-end?

See the study guide for the detailed answer →

166 What is Databricks Workflows (formerly Jobs)?

See the study guide for the detailed answer →

167 What is the difference between a Task and a Job in Databricks Workflows?

See the study guide for the detailed answer →

168 What are Databricks Asset Bundles (DABs)?

See the study guide for the detailed answer →

169 What is the Databricks CLI?

See the study guide for the detailed answer →

170 What is a job cluster vs an all-purpose (interactive) cluster?

See the study guide for the detailed answer →

171 What is cluster autoscaling and how does it work?

See the study guide for the detailed answer →

172 What is a cluster policy in Databricks?

See the study guide for the detailed answer →

173 What are spot instances and how do they reduce cost?

See the study guide for the detailed answer →

174 What is the Databricks REST API used for?

See the study guide for the detailed answer →

175 What is a Databricks repo (Git integration)?

See the study guide for the detailed answer →

176 What are Databricks Notebooks vs IDE-based development?

See the study guide for the detailed answer →

177 What is the difference between a wheel file and a notebook task in a Workflow?

See the study guide for the detailed answer →

178 What is the Ganglia UI / Spark UI used for in debugging?

See the study guide for the detailed answer →

179 What is a driver log vs an executor log?

See the study guide for the detailed answer →

180 What is the Databricks DBU (Databricks Unit) and how is pricing calculated?

See the study guide for the detailed answer →

181 What are init scripts and when would you use them?

See the study guide for the detailed answer →

182 What is a multi-task workflow (DAG) in Databricks?

See the study guide for the detailed answer →

183 What is the `dbutils` library and what are its key modules?

See the study guide for the detailed answer →

184 What are widgets in Databricks notebooks?

See the study guide for the detailed answer →

185 What is the difference between `%run` and `dbutils.notebook.run()`?

See the study guide for the detailed answer →

186 Explain Databricks Asset Bundles (DABs) — how do they enable CI/CD for Databricks projects?

See the study guide for the detailed answer →

187 Compare DABs vs Terraform for Databricks infrastructure management — when to use each?

See the study guide for the detailed answer →

188 How do you implement a CI/CD pipeline for Databricks using Azure DevOps?

See the study guide for the detailed answer →

189 Explain how Databricks Workflows handles task dependencies, retries, and conditional execution.

See the study guide for the detailed answer →

190 Compare job clusters vs all-purpose clusters — cost, startup time, use cases.

See the study guide for the detailed answer →

191 How do you debug an OOM (Out of Memory) error in a Databricks Spark job?

See the study guide for the detailed answer →

192 Explain how to read and interpret the Spark UI: stages, tasks, shuffle read/write, spill.

See the study guide for the detailed answer →

193 How do you implement blue-green or canary deployments for Databricks ETL pipelines?

See the study guide for the detailed answer →

194 Explain cluster pool strategy — how do pools reduce cluster startup time and cost?

See the study guide for the detailed answer →

195 How do you manage secrets and environment-specific configurations across dev/staging/prod?

See the study guide for the detailed answer →

196 Compare Databricks Repos (Git integration) vs external CI/CD tools for version control.

See the study guide for the detailed answer →

197 How do you implement data pipeline monitoring and alerting in Databricks?

See the study guide for the detailed answer →

198 Explain the cost implications of spot instances vs on-demand for different workload types.

See the study guide for the detailed answer →

199 How do you diagnose data skew in a Spark job using the Spark UI?

See the study guide for the detailed answer →

200 What is the recommended project structure for a Databricks DABs project?

See the study guide for the detailed answer →

201 How do you implement parameterized jobs with dynamic values in Workflows?

See the study guide for the detailed answer →

202 Explain the difference between task values (`dbutils.jobs.taskValues`) and widget parameters.

See the study guide for the detailed answer →

203 How do you handle failing tasks in a DAG — retry policies, timeout, conditional logic?

See the study guide for the detailed answer →

204 Compare Serverless jobs vs provisioned clusters for job execution — cost breakeven analysis.

See the study guide for the detailed answer →

205 How do you implement logging and observability for production Databricks pipelines?

See the study guide for the detailed answer →

206 **CI/CD Pipeline Design**: Design an end-to-end CI/CD pipeline for a Databricks project. Include Git branching, testing, deployment to dev/staging/prod, and rollback strategy. Use DABs + Azure DevOps.

See the study guide for the detailed answer →

207 **Production Incident**: Your nightly ETL job has been failing intermittently for 3 nights with `SparkException: Job aborted due to stage failure`. Walk through your debugging process step by step.

See the study guide for the detailed answer →

208 **Cost Reduction**: Your team's Databricks spend increased 3x in 3 months. You need to reduce it by 40% without impacting SLAs. What do you analyze and what changes do you make?

See the study guide for the detailed answer →

209 **Cluster Strategy**: You have 50 data engineers, 20 data scientists, and 10 BI analysts. Design the cluster strategy: interactive clusters, job clusters, pools, and policies.

See the study guide for the detailed answer →

210 **Job Orchestration**: You have 30 ETL jobs. 10 run hourly, 15 run daily, 5 run weekly. Some have dependencies. Design the orchestration using Databricks Workflows.

See the study guide for the detailed answer →

211 **OOM Debugging**: A Spark job processing a 5TB dataset fails with OOM after running for 3 hours. You have 30 minutes to fix it before the business deadline. Walk through your approach.

See the study guide for the detailed answer →

212 **Migration from Airflow**: Your team currently uses Apache Airflow for orchestration. Management wants to migrate to Databricks Workflows. Design the migration plan and address the gaps.

See the study guide for the detailed answer →

213 **Multi-Environment Deployment**: Design a deployment strategy where the same code deploys to dev (small data, small clusters), staging (prod-like), and prod (full scale) using DABs.

See the study guide for the detailed answer →

214 **Data Pipeline SLA**: Your Gold table must be refreshed by 6 AM every day. The pipeline takes 2-4 hours depending on data volume. Design the reliability strategy: monitoring, alerting, retry, fallback.

See the study guide for the detailed answer →

215 **Runaway Costs**: A data scientist launched an interactive cluster with 100 nodes and forgot to terminate it. It ran for 72 hours. How do you prevent this from happening again?

See the study guide for the detailed answer →

216 **Spark Debugging**: A join between two large tables is taking 6 hours instead of the expected 30 minutes. The Spark UI shows massive shuffle spill to disk. Diagnose and fix.

See the study guide for the detailed answer →

217 **Notebook to Production**: A data scientist built a prototype in a notebook. You need to productionize it. Describe the steps: refactoring, testing, CI/CD, monitoring.

See the study guide for the detailed answer →

218 **DABs Project Setup**: You are starting a new project with 3 DLT pipelines, 10 Workflows, and shared libraries. Design the DABs project structure, bundle configuration, and deployment targets.

See the study guide for the detailed answer →

219 **Rollback Strategy**: A production deployment introduced a bug that corrupted the Silver layer. Design the rollback process: code rollback, data recovery, and communication plan.

See the study guide for the detailed answer →

220 **Monitoring Dashboard**: Design a production monitoring dashboard for 50 Databricks pipelines. What metrics do you track? What alerting thresholds do you set? What tools do you use?

See the study guide for the detailed answer →

221 **What is the Lakehouse architecture and how does it differ from a Data Lake and a Data Warehouse?** (L1)

See the study guide for the detailed answer →

222 **Explain the Delta Lake transaction log and how it provides ACID guarantees.** (L2)

See the study guide for the detailed answer →

223 **Design a Medallion Architecture for [specific domain]. Walk through Bronze, Silver, Gold.** (L3)

See the study guide for the detailed answer →

224 **How does MERGE INTO work in Delta Lake? What are its performance pitfalls?** (L2)

See the study guide for the detailed answer →

225 **Implement SCD Type 2 using MERGE INTO.** (L3)

See the study guide for the detailed answer →
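A hedged sketch of the two-step pattern an answer to the SCD Type 2 question might walk through — `dim_customer`, `updates`, and the tracked `address` column are hypothetical names, not from any source:

```sql
-- Step 1: close out the current row for keys whose tracked attributes changed.
MERGE INTO dim_customer AS t
USING updates AS s
  ON t.customer_id = s.customer_id AND t.is_current = true
WHEN MATCHED AND t.address <> s.address THEN
  UPDATE SET t.is_current = false, t.end_date = current_date();

-- Step 2: insert a new current version for changed or brand-new keys.
-- Changed keys no longer have a current row after step 1, so the anti-join admits them.
INSERT INTO dim_customer
SELECT s.customer_id,
       s.address,
       current_date() AS start_date,
       CAST(NULL AS DATE) AS end_date,
       true AS is_current
FROM updates s
LEFT JOIN dim_customer t
  ON t.customer_id = s.customer_id AND t.is_current = true
WHERE t.customer_id IS NULL;
```

A common interview follow-up is how to collapse both steps into a single MERGE by unioning the updates with themselves under a synthetic merge key.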

226 **What is Unity Catalog and how does it improve governance over Hive Metastore?** (L2)

See the study guide for the detailed answer →

227 **How do you handle CDC from a legacy database (Oracle/SQL Server) into Delta Lake?** (L3)

See the study guide for the detailed answer →

228 **Compare Auto Loader vs COPY INTO — when do you use each?** (L2)

See the study guide for the detailed answer →
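For the Auto Loader vs COPY INTO comparison, a minimal COPY INTO sketch helps anchor the "idempotent batch ingestion" side of the answer — table name and path here are hypothetical:

```sql
-- COPY INTO is idempotent: already-loaded files are skipped on re-run.
COPY INTO bronze.events
FROM '/mnt/raw/events'
FILEFORMAT = JSON
FORMAT_OPTIONS ('inferSchema' = 'true')
COPY_OPTIONS ('mergeSchema' = 'true');
```

Auto Loader covers the other side: a `cloudFiles` streaming source with file notification or incremental listing, better suited to high file volumes and continuous arrival.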

229 **How do you optimize a slow-running Spark job? Walk through your debugging steps.** (L3)

See the study guide for the detailed answer →

230 **What is Z-ORDER and when would you use it? How does Liquid Clustering improve on it?** (L2)

See the study guide for the detailed answer →

231 **Design a GDPR-compliant data deletion pipeline in a Lakehouse.** (L3)

See the study guide for the detailed answer →
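The GDPR question usually hinges on the fact that a Delta DELETE is logical until old files are vacuumed. A minimal sketch of that two-step, with hypothetical table and key names:

```sql
-- Logical delete: rewrites (or deletion-vectors) the affected files in the current version.
DELETE FROM silver.customers WHERE customer_id = 'subject-123';

-- The deleted data still exists in older file versions (Time Travel can read it)
-- until VACUUM physically removes files past the retention window.
VACUUM silver.customers RETAIN 168 HOURS;
```

A strong answer also covers propagating the deletion downstream (Gold tables, caches, exports) and shortening the retention window only after weighing the Time Travel trade-off.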

232 **How do you implement CI/CD for Databricks?** (L2)

See the study guide for the detailed answer →

233 **What is Photon and when should you enable it?** (L1)

See the study guide for the detailed answer →

234 **Explain the small file problem and how to solve it in Delta Lake.** (L2)

See the study guide for the detailed answer →
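For the small file problem, the expected remedies are manual compaction and write-time auto-compaction. A minimal sketch, with a hypothetical table name:

```sql
-- One-off compaction of existing small files into larger ones.
OPTIMIZE silver.events;

-- Or have Delta compact as data is written, avoiding the buildup in the first place.
ALTER TABLE silver.events
SET TBLPROPERTIES (
  'delta.autoOptimize.optimizeWrite' = 'true',
  'delta.autoOptimize.autoCompact'   = 'true'
);
```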

235 **How do you handle schema evolution in a streaming pipeline?** (L2)

See the study guide for the detailed answer →

236 **Your pipeline is failing intermittently in production. Walk through your debugging process.** (L3)

See the study guide for the detailed answer →

237 **How do you manage costs in Databricks? What strategies have you used?** (L2)

See the study guide for the detailed answer →

238 **What is Delta Live Tables and how does it compare to manual Structured Streaming?** (L2)

See the study guide for the detailed answer →

239 **Design a real-time analytics pipeline on Databricks.** (L3)

See the study guide for the detailed answer →

240 **How do you handle data quality in a Lakehouse architecture?** (L2)

See the study guide for the detailed answer →

241 What is Lakeflow Connect and how does it simplify ingestion? (L1)

See the study guide for the detailed answer →

242 Explain Databricks Apps — what are they and when would you use them? (L1)

See the study guide for the detailed answer →

243 What is Genie (natural language to SQL) and how does it fit into the Databricks ecosystem? (L1)

See the study guide for the detailed answer →

244 How do you build and deploy an LLM-powered application using Databricks? (L3)

See the study guide for the detailed answer →

245 What is the Databricks Marketplace and how do you publish/consume data products? (L2)

See the study guide for the detailed answer →

246 How does Mosaic AI integrate with the Lakehouse for ML/AI workflows? (L2)

See the study guide for the detailed answer →

247 What is UniForm in Delta Lake and why does it matter for interoperability? (L2)

See the study guide for the detailed answer →

248 How do you use Databricks system tables for cost monitoring and optimization? (L3)

See the study guide for the detailed answer →
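The system-tables question typically expects a query against `system.billing.usage`. A minimal sketch — the column names follow the documented schema, but verify them against your workspace, and joining to a price table for dollar amounts is left out here:

```sql
-- DBU consumption by SKU over the last 30 days.
SELECT usage_date,
       sku_name,
       SUM(usage_quantity) AS dbus
FROM system.billing.usage
WHERE usage_date >= date_sub(current_date(), 30)
GROUP BY usage_date, sku_name
ORDER BY dbus DESC;
```

Grouping instead by tags or job IDs in the usage metadata is the usual next step for attributing cost to teams and pipelines.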

249 Explain Serverless compute for jobs — how does it differ from provisioned clusters? (L2)

See the study guide for the detailed answer →

250 What is Predictive Optimization and how does it automate OPTIMIZE/VACUUM/ANALYZE? (L2)

See the study guide for the detailed answer →