Delta Lake — Complete Interview Guide


📋 Overview
Priority: HIGHEST — Delta Lake is 30-40% of any Databricks interview
Goal: After this page, you should NEVER struggle with a Delta Lake question again
Approach: Every topic starts with WHY → WHAT → HOW → interview trap to avoid

Memory Map

🧠 DELTA LAKE MASTERY → TACOVS
──────────────────────────────
T = Transaction Log (the brain of Delta)
A = ACID + Architecture (why Delta exists)
C = Commands (MERGE, OPTIMIZE, VACUUM, Z-ORDER)
O = Optimization (small files, data skipping, Liquid Clustering)
V = Versioning (Time Travel, Clones, Recovery)
S = Special Features (CDF, Deletion Vectors, UniForm, Sharing)

SECTION 0: DELTA LAKE vs DELTA TABLE — The #1 Confusion

💡 Interview Tip
Why this section exists: Most people mix up "Delta Lake" and "Delta Table." Interviewers LOVE testing this. Clear this confusion FIRST, everything else becomes easy.

The Simple Answer

Delta Lake = The TECHNOLOGY (the engine, the framework, the system)
Delta Table = ONE TABLE created using that technology
It's like:
MySQL → Delta Lake (the system)
one table → one Delta Table (a table inside that system)

Real-World Analogy

Think of a LIBRARY SYSTEM:
📚 Delta Lake = The entire library management system
The rules (ACID transactions)
The catalogue system (transaction log)
The checkout/return process (reads/writes)
The version history (time travel)
📖 Delta Table = ONE book that the library manages
It follows the library's rules
It's tracked in the catalogue
It has checkout history
It can be restored to any past version
You don't say "I built a MySQL" — you say "I built a table IN MySQL"
You don't say "I built a Delta Lake" — you say "I built a Delta table USING Delta Lake"

Technical Difference — What Each Actually IS

📐 Architecture Diagram
┌─────────────────────────────────────────────────────────────┐
│                      DELTA LAKE (Technology)                 │
│                                                             │
│  ┌─────────────────┐  ┌─────────────────┐                  │
│  │  Delta Table:    │  │  Delta Table:    │                  │
│  │  orders          │  │  customers       │                  │
│  │                  │  │                  │                  │
│  │  📁 _delta_log/  │  │  📁 _delta_log/  │  ← each table   │
│  │  📄 data.parquet │  │  📄 data.parquet │    has its OWN   │
│  │                  │  │                  │    transaction   │
│  └─────────────────┘  └─────────────────┘    log            │
│                                                             │
│  Provides: ACID, Schema Enforcement, Time Travel,           │
│            MERGE, OPTIMIZE, VACUUM, Data Skipping           │
└─────────────────────────────────────────────────────────────┘

Side-by-Side Comparison

| Aspect | Delta Lake | Delta Table |
|---|---|---|
| What is it? | Open-source storage layer/framework | A single table stored in Delta format |
| Created by | Databricks (open-sourced in 2019) | You, using CREATE TABLE ... USING DELTA |
| How many? | ONE per environment (it's the technology) | MANY: you create hundreds of Delta tables |
| Contains | The protocol, the rules, the engine | Parquet data files + _delta_log/ folder |
| Analogy | The operating system (Windows) | One file on that operating system |
| Lives where? | Runs inside the Spark/Databricks runtime | Stored on S3 / ADLS / GCS / DBFS |
| Version | Delta Lake 3.x (the protocol version) | The table's own version history (0, 1, 2, 3...) |

How You Create a Delta Table (using Delta Lake)

python
# Method 1: SQL
spark.sql("""
    CREATE TABLE orders (
        order_id INT,
        customer STRING,
        amount DOUBLE,
        order_date DATE
    ) USING DELTA                          -- ← tells Spark: "use Delta Lake format"
    LOCATION 's3://my-bucket/orders/'
""")

# Method 2: DataFrame write
df.write.format("delta").save("s3://my-bucket/orders/")
#                 ↑
#          this "delta" = use Delta Lake technology

# Method 3: In Databricks (default since DBR 8.0+)
spark.sql("CREATE TABLE orders (...)")
#   ↑ In Databricks, DELTA is the DEFAULT format
#     You don't even need to say USING DELTA!

# What gets created on disk:
# s3://my-bucket/orders/
# ├── _delta_log/                    ← Delta Lake adds this (transaction log)
# │   └── 00000000000000000000.json  ← First commit: "table created with schema..."
# ├── part-00000.snappy.parquet      ← Actual data (still Parquet!)
# └── part-00001.snappy.parquet      ← More data

The Key Insight — Delta Table = Parquet + Transaction Log

🗂️Regular Parquet file:
📄 data.parquet ← Just a file. No history. No ACID. No schema check.
Delta Table:
📁 my_table/
_delta_log/ ← THIS is what makes it "Delta"
00000.json ← Commit 0: created table
00001.json ← Commit 1: inserted 1000 rows
00002.json ← Commit 2: updated 50 rows
00003.json ← Commit 3: deleted 10 rows
part-00000.snappy.parquet ← Same Parquet format as before!
part-00001.snappy.parquet
part-00002.snappy.parquet
So Delta Table = Parquet files + _delta_log folder
Delta Lake = The technology that READS and WRITES that _delta_log
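To make the "Parquet + transaction log" idea concrete, here is a toy pure-Python sketch of how a reader could derive the current file set by replaying add/remove actions in commit order. This is a simplified model, not Delta's actual implementation (real commits are JSON files with more action types), and all names here are illustrative.

```python
# Toy model of replaying _delta_log to find the table's current data files.
def replay_log(commits):
    """Replay add/remove actions in commit order; return the live file set."""
    live = set()
    for actions in commits:                 # commits: list of action lists
        for kind, path in actions:
            if kind == "add":
                live.add(path)
            elif kind == "remove":          # logical delete: file stays on disk
                live.discard(path)
    return live

# Commit 0 creates two files; commit 1 (an UPDATE) swaps one of them.
log = [
    [("add", "part-00000.parquet"), ("add", "part-00001.parquet")],
    [("remove", "part-00000.parquet"), ("add", "part-00002.parquet")],
]
print(sorted(replay_log(log)))
# → ['part-00001.parquet', 'part-00002.parquet']
```

Note that the removed file is only dropped from the *live set*; the physical Parquet file still exists until VACUUM, which is exactly what makes time travel possible.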

What Delta Lake GIVES to a Delta Table

| Without Delta Lake (plain Parquet) | With Delta Lake (Delta Table) |
|---|---|
| ❌ No transactions | ✅ ACID transactions |
| ❌ Overwrite = data gone forever | ✅ Time Travel (undo any change) |
| ❌ Two writers = corrupt data | ✅ Optimistic concurrency control |
| ❌ Schema? What schema? | ✅ Schema enforcement + evolution |
| ❌ UPDATE one row = rewrite entire file | ✅ MERGE handles insert/update/delete |
| ❌ 10,000 small files = slow queries | ✅ OPTIMIZE compacts files |
| ❌ Scan everything for every query | ✅ Data skipping (min/max stats) |
| ❌ Delete data = rewrite files manually | ✅ DELETE command + VACUUM cleanup |

Interview Answers — What to Say

Q: "What is Delta Lake?"

"Delta Lake is an open-source storage layer that sits on top of cloud storage like S3 or ADLS. It adds ACID transactions, schema enforcement, time travel, and scalable metadata handling to data lakes. Under the hood, data is still stored as Parquet files — Delta Lake adds a transaction log that tracks every change."

Q: "What is a Delta Table?"

"A Delta table is a specific table stored in Delta format. It's a collection of Parquet data files plus a _delta_log folder that contains the transaction history. Every Delta table has its own independent transaction log."

Q: "What's the difference between Delta Lake and Delta Table?"

"Delta Lake is the technology — the protocol and engine. A Delta table is one table that uses that technology. It's like MySQL vs a table in MySQL. Delta Lake provides the ACID guarantees, and each Delta table is an individual dataset that benefits from those guarantees."

What NOT to Say:

❌ "Delta Lake and Delta Table are the same thing"
→ No. Lake = technology, Table = one table using that technology
❌ "Delta Lake is a database"
→ No. It's a STORAGE LAYER. It doesn't have a query engine.
Spark/Databricks is the engine. Delta Lake is the format.
❌ "Delta Table is different from Parquet"
→ Partially wrong. Delta Table IS Parquet + transaction log.
The data files are still .parquet format.
❌ "Delta Lake stores data in its own format"
→ No. Data is standard Parquet. Only the _delta_log is Delta-specific.

Quick Memory Trick

Delta LAKE = the L is for LAYER (it's a storage LAYER / technology)
Delta TABLE = the T is for TABLE (it's one specific TABLE)
LAKE → large system, like a real lake (holds everything)
TABLE → small unit, like a table inside a restaurant (one specific thing)

SECTION 1: WHY DELTA LAKE EXISTS

Q01 — What is Delta Lake? Why do we need it?

Question: Your interviewer asks: "Why can't we just use Parquet files on a data lake? Why do we need Delta Lake?"

The Problem (Plain Parquet):

🗂️data_lake/
orders_2026_01.parquet ← File 1
orders_2026_02.parquet ← File 2
orders_2026_03.parquet ← File 3
PROBLEM 1: Two pipelines write to orders_2026_03.parquet at the same time
→ Data gets CORRUPTED. No transaction support.
PROBLEM 2: You accidentally overwrite orders_2026_02.parquet
→ Gone forever. No undo, no history, no time travel.
PROBLEM 3: A pipeline writes half the data and crashes
→ Half-written file stays. Readers see partial/garbage data.
PROBLEM 4: Source system adds a new column "discount_pct"
→ Old files don't have it. Schema mismatch everywhere.
PROBLEM 5: You want to UPDATE one row in a 2 GB Parquet file
→ You must rewrite the ENTIRE 2 GB file. Parquet is immutable.

The Solution (Delta Lake): Delta Lake adds a transaction log (_delta_log/) on top of Parquet files. This log tracks every change — which files were added, removed, or modified. It turns a chaotic collection of files into a proper table with ACID transactions, schema enforcement, and version history.

🗂️delta_table/
_delta_log/ ← THE BRAIN — tracks every change
00000000000000000000.json ← Commit 0: table created
00000000000000000001.json ← Commit 1: 1000 rows inserted
00000000000000000002.json ← Commit 2: 50 rows updated
00000000000000000010.checkpoint.parquet ← Checkpoint (summary)
_last_checkpoint ← Points to latest checkpoint
part-00000-abc123.snappy.parquet ← Actual data file 1
part-00001-def456.snappy.parquet ← Actual data file 2
part-00002-ghi789.snappy.parquet ← Actual data file 3

One-liner for interviews: "Delta Lake is an open-source storage layer that adds ACID transactions, scalable metadata handling, and time travel to existing data lakes built on Parquet."

Q02 — What are the key features of Delta Lake? (The Top 8)

Question: "List the main features of Delta Lake and explain why each matters."

| # | Feature | What It Does | Why It Matters |
|---|---|---|---|
| 1 | ACID Transactions | Every write is atomic — all or nothing | No more corrupted tables from failed writes |
| 2 | Time Travel | Query any historical version of the table | Undo mistakes, audit changes, debug issues |
| 3 | Schema Enforcement | Rejects writes that don't match the table schema | Prevents bad data from entering your table |
| 4 | Schema Evolution | Automatically adds new columns when needed | Handles evolving source systems gracefully |
| 5 | MERGE (Upsert) | INSERT + UPDATE + DELETE in one atomic operation | The #1 operation for data engineering pipelines |
| 6 | Data Skipping | Skips files that can't contain query results | Makes queries 10-100x faster on large tables |
| 7 | Unified Batch + Streaming | Same table for both batch and streaming writes | No separate streaming tables needed |
| 8 | Open Format | Data stored as Parquet (open standard) | No vendor lock-in — read with Spark, Trino, Presto, etc. |

Interview Tip: Don't just list features — connect them to PROBLEMS they solve. "We needed MERGE because our booking system sends daily change feeds that contain inserts, updates, and cancellations — all in one file."

What NOT to Say: "Delta Lake is a database." — No, it's a STORAGE LAYER on top of cloud storage. The data still lives as Parquet files on ADLS/S3/GCS.

Q03 — How does Delta Lake ensure ACID transactions?

Question: "Walk me through how each ACID property is implemented in Delta Lake."

| Property | What It Means | How Delta Does It |
|---|---|---|
| Atomicity | All changes in a transaction succeed, or none do | Each commit writes ONE JSON file atomically using put-if-absent (cloud storage rename). If it fails, nothing changes. |
| Consistency | Data always follows schema rules | Schema enforcement rejects writes with wrong column types/names |
| Isolation | Concurrent readers/writers don't interfere | Snapshot isolation — readers see the table as it was when they started reading. Writers use optimistic concurrency (see Q06). |
| Durability | Committed data survives crashes | Data is stored as Parquet on cloud storage (S3/ADLS/GCS) — inherently durable |

Follow-up they'll ask: "What happens if a write fails midway?"

🧠 Memory Map
Step 1: Spark writes new Parquet data files to storage ← files exist but NOT committed
Step 2: Spark tries to write the commit JSON to _delta_log ← THIS is the atomic step
If Step 2 FAILS:
→ The Parquet files from Step 1 are "orphaned" (exist but not referenced)
→ No commit JSON means Delta doesn't know about them
→ Next VACUUM will clean them up
→ Table state is unchanged — ZERO corruption
If Step 2 SUCCEEDS:
→ Transaction is committed
→ Readers will see the new data

Interview Tip: "The key insight is that the commit JSON file is the SINGLE point of atomicity. Data files can be partially written — it doesn't matter until the commit JSON makes them visible."

SECTION 2: TRANSACTION LOG DEEP DIVE

Q04 — What is the Delta Transaction Log (_delta_log)? What's inside each commit?

Question: "Explain the Delta transaction log in detail. What exactly is stored in each commit file?"

Each JSON commit file contains these actions:

json
// Example: 00000000000000000005.json (Commit #5)
{
  "commitInfo": {
    "timestamp": 1711670400000,
    "operation": "MERGE",
    "operationParameters": {"predicate": "t.id = s.id"},
    "operationMetrics": {
      "numTargetRowsInserted": "1500",
      "numTargetRowsUpdated": "300",
      "numTargetRowsDeleted": "50"
    }
  },
  "add": {
    "path": "part-00000-new-file.snappy.parquet",
    "size": 1073741824,
    "partitionValues": {"date": "2026-03-29"},
    "stats": "{\"numRecords\":50000,\"minValues\":{\"id\":1},\"maxValues\":{\"id\":50000}}"
  },
  "remove": {
    "path": "part-00000-old-file.snappy.parquet",
    "deletionTimestamp": 1711670400000,
    "dataChange": true
  }
}
| Action | What It Records | Example |
|---|---|---|
| add | New Parquet file was added to the table | After INSERT, MERGE, or OPTIMIZE |
| remove | File was logically deleted (still physically exists!) | After UPDATE, DELETE, or OPTIMIZE |
| metaData | Schema change, table properties changed | After ALTER TABLE or schema evolution |
| protocol | Reader/writer version requirements | After enabling new features (e.g., Deletion Vectors) |
| commitInfo | Who, when, what operation, and metrics | Every commit has this |

Critical detail: remove is LOGICAL, not physical

🧠 Memory Map
remove: "part-00000-old-file.snappy.parquet"
→ File is marked as "no longer part of current table state"
→ File STILL EXISTS on disk (for time travel!)
→ File is PHYSICALLY deleted only when VACUUM runs
→ Default: VACUUM deletes files older than 7 days

Interview Tip: "The transaction log is append-only — you never modify existing commit files, you only add new ones. This is what makes it safe for concurrent readers."

Q05 — What are checkpoint files? Why are they critical?

Question: "Without checkpoints, what happens when a Delta table has 100,000 commits?"

Problem without checkpoints:

🧠 Memory Map
To read the current table state at version 100,000:
→ Read 00000000000000000000.json (commit 0)
→ Read 00000000000000000001.json (commit 1)
→ Read 00000000000000000002.json (commit 2)
→ ... replay ALL 100,000 JSON files
→ VERY SLOW — just to know which Parquet files to read!

Solution — Checkpoints:

🧠 Memory Map
Every 10 commits (default), Delta creates a checkpoint file:
→ 00000000000000000010.checkpoint.parquet ← State at commit 10
→ 00000000000000000020.checkpoint.parquet ← State at commit 20
→ ...
→ 00000000000000099990.checkpoint.parquet ← State at commit 99,990
To read at version 100,000:
→ Read checkpoint at 99,990 (ONE Parquet file — fast!)
→ Replay only 10 JSON files (99,991 to 100,000)
→ Done! Instead of 100,000 files, we read ~11 files

Key details:

  • Checkpoint interval: delta.checkpointInterval (default: 10)
  • _last_checkpoint file points to the latest checkpoint
  • Checkpoints are in Parquet format (columnar, fast to read)
  • Multi-part checkpoints exist for very large tables (splits into multiple files)
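The read-cost saving can be put into a tiny cost model: the reader opens one checkpoint plus only the JSON commits written after it. This is a toy calculation with an illustrative function name, assuming the newest available checkpoint covers the last multiple of the interval strictly before the target version (the checkpoint for the target version itself may not be written yet, as in the 99,990 example above).

```python
# Toy cost model: how many log files a reader opens to reconstruct version N.
def files_to_read(version, interval=10):
    """1 checkpoint + the JSON commits after it, or all commits if none exists."""
    last_cp = ((version - 1) // interval) * interval if version > 0 else 0
    if last_cp == 0:
        return version + 1                 # no checkpoint yet: replay commits 0..version
    return 1 + (version - last_cp)         # checkpoint file + remaining JSON commits

print(files_to_read(100_000))  # → 11  (checkpoint at 99,990 + 10 JSON commits)
print(files_to_read(5))        # → 6   (no checkpoint yet: commits 0..5)
```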

Q06 — How does Optimistic Concurrency Control work?

Question: "Two pipelines write to the same Delta table at the same time. What happens?"

🗂️Pipeline A reads table at version 5 (fare update batch)
Pipeline B reads table at version 5 (booking cancellation batch)
Pipeline A finishes first:
→ Writes commit 6 to _delta_log → SUCCEEDS (first to commit)
Pipeline B tries to commit:
→ Attempts to write commit 6 → FAILS (file already exists!)
→ Delta automatically:
1. Reads commit 6 to see what Pipeline A changed
2. Checks for LOGICAL conflict:
Pipeline A touched files X, Y
Pipeline B touched files M, N
X,Y ∩ M,N = empty → NO CONFLICT
Pipeline B retries as commit 7 → SUCCEEDS ✓
→ But if both touched the SAME files:
CONFLICT → throws ConcurrentModificationException ✗
Automatic retry (up to 3 times by default, configurable)
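The logical-conflict check in the walkthrough above boils down to a set intersection: the retrying writer compares the files it touched with the files changed by every commit that landed after its snapshot. A minimal sketch (illustrative names; real Delta also compares predicates and isolation levels):

```python
# Sketch of the logical-conflict check behind optimistic concurrency.
def can_retry(my_touched, committed_since_snapshot):
    """No file overlap with any winning commit → safe to retry at the next version."""
    for other_touched in committed_since_snapshot:
        if my_touched & other_touched:       # same file modified by both writers
            return False                     # → ConcurrentModificationException
    return True

pipeline_a = {"file_X", "file_Y"}            # won commit 6
pipeline_b = {"file_M", "file_N"}
print(can_retry(pipeline_b, [pipeline_a]))   # → True  (disjoint: retry as commit 7)
print(can_retry({"file_X"}, [pipeline_a]))   # → False (true conflict, exception)
```

This is why partition-aligned writes rarely conflict: pipelines that touch different partitions touch disjoint file sets.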

Two isolation levels:

| Level | Default? | Behavior | Use When |
|---|---|---|---|
| WriteSerializable | Yes | Concurrent writes to different files succeed | Normal ETL workloads |
| Serializable | No | Even reads during writes can conflict | Strict audit/compliance tables |

How to AVOID conflicts in practice:

  1. Partition writes by date/region so different pipelines touch different files
  2. Add partition column to MERGE condition: ON t.id = s.id AND t.date = s.date
  3. Use Liquid Clustering to minimize file overlap

Interview Tip: "The word 'optimistic' is key — Delta assumes no conflict, lets both writers proceed, and only checks at commit time. This is efficient because most concurrent writes DON'T conflict."

SECTION 3: MERGE — THE MOST IMPORTANT COMMAND

Q07 — Write a MERGE from memory (upsert with all clauses)

Question: "Write a MERGE that handles inserts, updates, and deletes in one command."

This is the #1 coding question in Databricks interviews. Memorize it.

sql
-- MERGE = Compare source (new data) with target (existing table), then act
MERGE INTO orders_fact AS t                -- t = target (existing table)
USING orders_staging AS s                  -- s = source (new data)
ON t.order_id = s.order_id                -- Match condition

-- Case 1: Record exists AND is cancelled → delete it
WHEN MATCHED AND s.status = 'CANCELLED' THEN
    DELETE

-- Case 2: Record exists AND has changed → update it
WHEN MATCHED AND s.updated_at > t.updated_at THEN
    UPDATE SET
        t.status = s.status,
        t.amount = s.amount,
        t.updated_at = s.updated_at

-- Case 3: Record doesn't exist in target → insert it
WHEN NOT MATCHED THEN
    INSERT (order_id, customer_id, status, amount, created_at, updated_at)
    VALUES (s.order_id, s.customer_id, s.status, s.amount, s.created_at, s.updated_at)

-- Case 4: Record in target but NOT in source → clean up (Databricks extension)
WHEN NOT MATCHED BY SOURCE AND t.status = 'PENDING' THEN
    DELETE

PySpark equivalent:

python
from delta.tables import DeltaTable

target = DeltaTable.forName(spark, "orders_fact")

target.alias("t").merge(
    source_df.alias("s"),
    "t.order_id = s.order_id"
).whenMatchedDelete(
    condition="s.status = 'CANCELLED'"
).whenMatchedUpdate(
    condition="s.updated_at > t.updated_at",
    set={"status": "s.status", "amount": "s.amount", "updated_at": "s.updated_at"}
).whenNotMatchedInsertAll(
).execute()

Q08 — How do you handle duplicate keys in source during MERGE?

Question: "Your staging table has duplicate order_ids. MERGE throws an error. How do you fix it?"

Why it fails: MERGE requires each target row to match AT MOST one source row. Duplicate keys = ambiguous → error.

The fix — deduplicate source with ROW_NUMBER:

sql
WITH deduped_source AS (
    SELECT *,
        ROW_NUMBER() OVER (
            PARTITION BY order_id           -- Group duplicates together
            ORDER BY updated_at DESC        -- Keep the LATEST record
        ) AS rn
    FROM orders_staging
)
MERGE INTO orders_fact AS t
USING (SELECT * FROM deduped_source WHERE rn = 1) AS s    -- Only latest per key
ON t.order_id = s.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *

Why duplicates happen in real life:

  • Kafka retry delivered the same event twice
  • Source system sent the same file twice
  • Backfill job reprocessed already-loaded data
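The ROW_NUMBER pattern above ("keep the latest record per key") can be sketched in plain Python, which is handy for reasoning about it in an interview. The function name and sample rows are illustrative.

```python
# Keep only the latest record per order_id, like ROW_NUMBER(...) ... WHERE rn = 1.
def latest_per_key(rows, key="order_id", ts="updated_at"):
    best = {}
    for row in rows:
        k = row[key]
        if k not in best or row[ts] > best[k][ts]:
            best[k] = row                    # later timestamp wins
    return list(best.values())

staging = [
    {"order_id": 1, "status": "PENDING",   "updated_at": "2026-03-01"},
    {"order_id": 1, "status": "CONFIRMED", "updated_at": "2026-03-02"},  # duplicate key
    {"order_id": 2, "status": "PENDING",   "updated_at": "2026-03-01"},
]
deduped = latest_per_key(staging)
print(sorted((r["order_id"], r["status"]) for r in deduped))
# → [(1, 'CONFIRMED'), (2, 'PENDING')]
```

After dedup, each target row can match at most one source row, which is exactly the precondition MERGE enforces.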

Q09 — How to make MERGE faster? (6 techniques)

Question: "Our MERGE takes 3 hours on a 500 GB table. How do you optimize it?"

Why MERGE is slow: It scans the ENTIRE target to find matching records. 500 GB table + 10K source rows = reads all 500 GB just to match 10K rows.

1. Add partition column to match condition:

sql
-- BAD: Scans ALL data across ALL dates
MERGE INTO orders t USING staging s
ON t.order_id = s.order_id                              -- Scans 500 GB!

-- GOOD: Only scans matching date partitions
MERGE INTO orders t USING staging s
ON t.order_id = s.order_id AND t.order_date = s.order_date  -- Scans ~2 GB!

2. Z-ORDER or Liquid Cluster on merge key:

sql
OPTIMIZE orders ZORDER BY (order_id);
-- Data skipping now works for MERGE — skips irrelevant files

3. Filter unchanged rows from source:

sql
MERGE INTO orders t
USING (
    SELECT s.* FROM staging s
    LEFT JOIN orders t ON s.order_id = t.order_id
    WHERE t.order_id IS NULL OR s.hash_value != t.hash_value
) AS s
ON t.order_id = s.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *

4. Broadcast small source:

python
from pyspark.sql.functions import broadcast

target.alias("t").merge(
    broadcast(source_df).alias("s"),     # Source < 100 MB → broadcast to all workers
    "t.order_id = s.order_id"
)

5. Compact target first:

sql
OPTIMIZE orders;  -- 50,000 small files → 500 large files → fewer tasks

6. Use Photon runtime:

  • Select "Photon" when creating cluster — 3-5x faster for MERGE
  • No code changes needed

Q10 — MERGE with Schema Evolution

Question: "Source data has a new column that doesn't exist in the target. How do you handle it?"

sql
-- Enable auto schema merge
SET spark.databricks.delta.schema.autoMerge.enabled = true;

-- MERGE as normal — new columns from source auto-added to target
MERGE INTO customers t
USING staging s
ON t.customer_id = s.customer_id
WHEN MATCHED THEN UPDATE SET *       -- * includes new columns
WHEN NOT MATCHED THEN INSERT *       -- New column auto-added to target schema

When to use: Source systems add new fields over time (very common). When NOT to use: Strict regulatory tables where schema changes need approval.
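At the schema level, auto-merge amounts to a union of columns: the target keeps its existing columns and types, and source-only columns are appended. A toy sketch under that assumption (illustrative names; real Delta also validates type compatibility and nested structs):

```python
# Sketch of schema auto-merge: target schema ∪ source schema, target types kept.
def merge_schemas(target, source):
    """Schemas as ordered (name, type) lists; append source-only columns."""
    merged = list(target)
    known = {name for name, _ in target}
    for name, dtype in source:
        if name not in known:
            merged.append((name, dtype))     # schema evolution: new column added
    return merged

target = [("customer_id", "int"), ("name", "string")]
source = [("customer_id", "int"), ("name", "string"), ("loyalty_tier", "string")]
print(merge_schemas(target, source))
# → [('customer_id', 'int'), ('name', 'string'), ('loyalty_tier', 'string')]
```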

SECTION 4: OPTIMIZE, VACUUM, AND THE SMALL FILE PROBLEM

Q11 — What is OPTIMIZE? When is it good? When is it bad?

Question: "Explain OPTIMIZE. What are its trade-offs?"

What it does: Combines many small files into fewer large files (~1 GB each).

sql
-- Basic OPTIMIZE
OPTIMIZE orders;
-- Example: 10,000 files × 1 MB each → 10 files × 1 GB each

-- OPTIMIZE specific partition
OPTIMIZE orders WHERE order_date >= '2026-03-01';

-- OPTIMIZE with Z-ORDER (organize data within files)
OPTIMIZE orders ZORDER BY (customer_id, order_date);

GOOD things about OPTIMIZE:

| Benefit | Why |
|---|---|
| Faster reads | Fewer files = fewer I/O operations = faster queries |
| Better data skipping | Sorted data has tighter min/max ranges per file |
| Faster MERGE | Fewer files = fewer tasks for Spark |
| Non-destructive | Old files still exist (for time travel) until VACUUM |

BAD things about OPTIMIZE:

| Downside | Why |
|---|---|
| Rewrites ALL files | Even unchanged files get rewritten (Z-ORDER does a full rewrite) |
| Expensive for large tables | 1 TB table = reads 1 TB + writes 1 TB = 2 TB of I/O |
| Doubles storage temporarily | Old files + new files exist until VACUUM |
| Not incremental (with Z-ORDER) | OPTIMIZE without Z-ORDER is incremental in newer versions |
| Blocks concurrent writes briefly | Takes a commit lock during the commit phase |

Best practices:

WHEN to OPTIMIZE:
✅ After large batch writes (ETL job just loaded 10M rows)
✅ Before heavy read workloads (BI dashboard refresh coming)
✅ When MERGE is slow (too many small files in target)
✅ Daily maintenance window (schedule nightly)
WHEN NOT to OPTIMIZE:
❌ After every micro-batch in streaming (use autoCompact instead)
❌ On tables with very few files already
❌ During peak read hours (it competes for resources)

Interview Tip: "OPTIMIZE is a WRITE operation that benefits READS. Run it during low-traffic windows. For streaming tables, use autoOptimize table property instead of manual OPTIMIZE."
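Conceptually, the compaction step is a bin-packing problem: group many small files into roughly 1 GB outputs. A greedy pure-Python sketch of that idea (a toy model only; real OPTIMIZE also respects partitions, Z-ORDER, and runs as distributed Spark jobs, and the names here are my own):

```python
# Greedy sketch of OPTIMIZE's bin-packing: small files → ~1 GB output files.
TARGET_MB = 1024  # roughly Delta's default 1 GB target file size

def plan_compaction(file_sizes_mb, target=TARGET_MB):
    """First-fit-decreasing: return groups of input files per output file."""
    bins, current, current_size = [], [], 0
    for size in sorted(file_sizes_mb, reverse=True):
        if current_size + size > target and current:
            bins.append(current)             # close this output file
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        bins.append(current)
    return bins

small_files = [1] * 2048                     # 2,048 files of 1 MB each
print(len(plan_compaction(small_files)))     # → 2  (two ~1 GB output files)
```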

Q12 — What is VACUUM? What are the risks?

Question: "What does VACUUM do? Can I run VACUUM RETAIN 0 HOURS? What happens?"

What VACUUM does:

🗂️BEFORE VACUUM:
orders/
_delta_log/
part-00000-NEW.parquet ← Current (referenced by latest commit)
part-00001-NEW.parquet ← Current
part-00000-OLD.parquet ← Old (was replaced by OPTIMIZE 10 days ago)
part-00001-OLD.parquet ← Old (was replaced by UPDATE 8 days ago)
AFTER VACUUM (7-day retention):
orders/
_delta_log/
part-00000-NEW.parquet ← Kept (current)
part-00001-NEW.parquet ← Kept (current)
part-00000-OLD.parquet ← DELETED (older than 7 days, unreferenced)
part-00001-OLD.parquet ← DELETED (older than 7 days, unreferenced)
sql
-- Default: delete files older than 7 days
VACUUM orders;

-- Custom retention
VACUUM orders RETAIN 168 HOURS;    -- 168 hours = 7 days

-- DANGEROUS — delete ALL old files immediately
SET spark.databricks.delta.retentionDurationCheck.enabled = false;
VACUUM orders RETAIN 0 HOURS;

Risks of VACUUM RETAIN 0 HOURS:

| Risk | Explanation |
|---|---|
| Breaks ALL time travel | Can't query any previous version — old files are gone |
| Active readers may fail | A running query that started before VACUUM may reference deleted files → FileNotFoundException |
| No recovery | Deleted files are gone permanently — no undo |
| Concurrent write failures | Writers that started with old versions lose their reference files |

Rule: NEVER use RETAIN 0 HOURS in production. Default 7 days is a safety net.

OPTIMIZE vs VACUUM — they work as a pair:

Step 1: OPTIMIZE → creates NEW large files, marks OLD small files as "removed"
Step 2: VACUUM → physically DELETES the OLD files marked as "removed"
Without VACUUM after OPTIMIZE: Storage keeps growing (old + new files both exist)
Without OPTIMIZE before VACUUM: Nothing to clean up (no old files to remove)
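The VACUUM eligibility rule can be sketched as a filter: a file is physically deleted only if it is (a) no longer referenced by the current table version and (b) its removal is older than the retention window. A toy model with illustrative names (real VACUUM reads this information from the transaction log):

```python
# Sketch of VACUUM candidate selection under a retention window.
def vacuum_candidates(files, now_hours, retain_hours=168):   # 168 h = 7 days
    """files: dict of path → (referenced, removed_at_hours or None)."""
    return sorted(
        path for path, (referenced, removed_at) in files.items()
        if not referenced                        # not part of current version
        and removed_at is not None               # has a logical remove action
        and now_hours - removed_at > retain_hours  # outside retention window
    )

files = {
    "part-new-0.parquet": (True, None),          # current: always kept
    "part-old-0.parquet": (False, 0),            # removed 240 h ago: deletable
    "part-old-1.parquet": (False, 200),          # removed 40 h ago: still kept
}
print(vacuum_candidates(files, now_hours=240))   # → ['part-old-0.parquet']
```

Shrinking `retain_hours` to 0 makes every unreferenced file a candidate at once, which is exactly why RETAIN 0 HOURS destroys time travel and can break in-flight readers.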

Q13 — The Small File Problem — Root Causes and Solutions

Question: "Your Delta table has 50,000 small files (each <1 MB). Queries take 45 minutes. How do you fix it?"

Why small files are bad:

🧠 Memory Map
50,000 files × 1 MB each = 50 GB of data
Each file requires:
→ 1 Spark task to read
→ 1 metadata operation on cloud storage
→ Scheduler overhead
50,000 tasks vs 50 tasks (with 1 GB files) = ~1,000x more tasks to schedule — queries run orders of magnitude slower

Root causes of small files:

| Cause | Example |
|---|---|
| Streaming micro-batches | Processing every 10 seconds → 6 files/minute → 8,640 files/day |
| Over-partitioning | PARTITIONED BY (date, hour, region, category) → millions of tiny partitions |
| Frequent small appends | 100-row inserts every minute |
| Too many shuffle partitions | spark.sql.shuffle.partitions = 200 writes 200 files per write |

Immediate fix:

sql
-- Step 1: Compact files now
OPTIMIZE slow_table;
-- 50,000 × 1 MB → ~50 × 1 GB

-- Step 2: Clean up old files
VACUUM slow_table;

-- Step 3: Refresh statistics
ANALYZE TABLE slow_table COMPUTE STATISTICS;

Prevent future small files:

sql
-- Option A: Auto-optimization table properties
ALTER TABLE my_table SET TBLPROPERTIES (
    'delta.autoOptimize.optimizeWrite' = 'true',     -- Coalesce during write
    'delta.autoOptimize.autoCompact' = 'true'          -- Mini-OPTIMIZE after each write
);

-- Option B: For streaming, increase the trigger interval (PySpark):
--   .trigger(processingTime="5 minutes")   instead of "10 seconds"

-- Option C: For new tables — use Liquid Clustering
CREATE TABLE orders (...) CLUSTER BY (order_date, customer_id);
-- Handles compaction automatically

-- Option D: Reduce shuffle partitions (PySpark):
--   spark.conf.set("spark.sql.shuffle.partitions", "auto")   -- adaptive (AQE)

SECTION 5: DATA ORGANIZATION — PARTITIONING vs Z-ORDER vs LIQUID CLUSTERING

Q14 — Partitioning — When is it good? When is it bad?

Question: "When should you partition a Delta table? When should you NOT?"

sql
-- Creating a partitioned table
CREATE TABLE bookings (
    booking_id LONG,
    passenger_name STRING,
    booking_date DATE,
    amount DECIMAL(10,2)
) PARTITIONED BY (booking_date);

GOOD — Use partitioning when:

  • Column has low cardinality (few unique values): date, country, region
  • Each partition has at least 1 GB of data
  • Most queries filter on the partition column
  • Example: PARTITIONED BY (year) where each year has 50 GB+ of data

BAD — Don't partition when:

  • Column has high cardinality (many unique values): user_id, order_id
  • Partitions are too small (< 1 GB each) → small file problem
  • You filter on multiple columns → partitioning only helps ONE column
  • Example: PARTITIONED BY (customer_id) with 10 million customers = disaster

The 1 GB rule:

Total data ÷ Number of unique partition values = Partition size
Example: 500 GB table, partitioned by date, 365 days/year
500 GB ÷ 365 = ~1.4 GB per partition ✅ Good!
Example: 500 GB table, partitioned by customer_id, 1 million customers
500 GB ÷ 1,000,000 = 0.5 MB per partition ❌ Terrible!
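The 1 GB rule above is just a division, so it is easy to encode as a quick sanity check you could run before choosing a partition column. The function name and thresholds are illustrative.

```python
# Quick check of the 1 GB partitioning rule: total size ÷ distinct values.
def partition_verdict(total_gb, distinct_values, min_gb=1.0):
    """Return (average partition size in GB, verdict string)."""
    avg = total_gb / distinct_values
    return avg, "OK" if avg >= min_gb else "too small — expect small files"

print(partition_verdict(500, 365))        # date column: ~1.37 GB each → OK
print(partition_verdict(500, 1_000_000))  # customer_id: 0.0005 GB each → too small
```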

Q15 — Z-ORDER — How does it work internally?

Question: "What is Z-ORDER? How is it different from sorting? How many columns can you Z-ORDER on?"

What Z-ORDER does: Regular sorting (ORDER BY) sorts by one column, then another. Z-ORDER uses a space-filling curve to interleave multiple columns, so data is co-located on ALL specified columns simultaneously.

sql
-- Z-ORDER is always used WITH OPTIMIZE
OPTIMIZE orders ZORDER BY (customer_id, order_date);

How it helps — data skipping becomes effective:

📋 Overview
WITHOUT Z-ORDER:
File 1: customer_id [1-1000000], order_date [2025-01-01 to 2026-12-31]
File 2: customer_id [1-1000000], order_date [2025-01-01 to 2026-12-31]
→ Every file has ALL customer_ids and ALL dates
→ Query WHERE customer_id = 42 must read ALL files (no skipping!)
WITH Z-ORDER on (customer_id):
File 1: customer_id [1-10000]
File 2: customer_id [10001-20000]
→ Query WHERE customer_id = 42 reads only File 1 (skips the rest!)
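The "space-filling curve" behind Z-ORDER can be demonstrated with bit interleaving: points that are close on BOTH columns get nearby Z-values, so sorting by Z-value co-locates them in the same files. This is a minimal sketch of the idea only; real Z-ordering also range-partitions the column values first, and `z_value` is an illustrative name.

```python
# Minimal sketch of a Z-order (Morton) curve via bit interleaving.
def z_value(x, y, bits=16):
    """Interleave the bits of x and y into a single sort key."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # even bit positions ← x
        z |= ((y >> i) & 1) << (2 * i + 1)   # odd bit positions  ← y
    return z

# Sorting points by z_value keeps neighbours in both dimensions together:
points = [(7, 7), (0, 1), (1, 1), (0, 0), (1, 0)]
print(sorted(points, key=lambda p: z_value(*p)))
# → [(0, 0), (1, 0), (0, 1), (1, 1), (7, 7)]
```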

Limitations of Z-ORDER:

LimitationExplanation
Max 4 columns effectiveMore columns = less effective (diminishing returns)
Full rewrite every timeOPTIMIZE ZORDER rewrites ALL files — expensive on large tables
Not incrementalCan't Z-ORDER just the new data — must redo everything
Manual operationMust run OPTIMIZE manually or schedule it
Can't change columns easilyChanging Z-ORDER columns requires full rewrite

Q16 — Liquid Clustering — The Modern Replacement

Question: "What is Liquid Clustering? Why does Databricks say it replaces both partitioning AND Z-ORDER?"

sql
-- Create table with Liquid Clustering
CREATE TABLE orders (
    order_id LONG,
    customer_id LONG,
    order_date DATE,
    amount DECIMAL(10,2)
) CLUSTER BY (order_date, customer_id);

-- Change clustering columns anytime — NO full rewrite!
ALTER TABLE orders CLUSTER BY (region, order_date);

-- OPTIMIZE is now INCREMENTAL — only processes new/changed data
OPTIMIZE orders;

Why Liquid Clustering is better:

| Aspect | Partitioning | Z-Ordering | Liquid Clustering |
|---|---|---|---|
| How it works | Separate folders per value | Sorts data within files | Auto-organizes data incrementally |
| Good for | Low cardinality only | High cardinality | Any cardinality |
| Applied when | On write (rigid) | Manual OPTIMIZE | Automatically on writes |
| Incremental | N/A | No (full rewrite) | Yes (only new/changed data) |
| Change columns | Requires full table rewrite | Must re-OPTIMIZE everything | Just ALTER TABLE |
| Small file handling | Can cause small files | Doesn't help | Prevents small files |
| Replaces others? | No | No | Yes (replaces BOTH) |

When to use what (2026 recommendation):

🧠 Memory Map
NEW TABLES → always use Liquid Clustering
EXISTING TABLES → migrate to Liquid Clustering when possible
LEGACY → keep partitioning if it's working fine; don't fix what's not broken

Interview Tip: "For new tables, I always use CLUSTER BY instead of PARTITIONED BY. It's incremental, handles any cardinality, and I can change the clustering columns without rewriting the table."

Q17 — Data Skipping — How Delta Knows Which Files to Skip

Question: "How does Delta Lake skip files during queries? What are file-level statistics?"

sql
SELECT * FROM orders WHERE order_date = '2026-03-15' AND customer_id = 42

Behind the scenes:

📋 Overview
Delta reads statistics from the transaction log (NOT the actual files):
File A: min(order_date) = 2026-01-01, max = 2026-01-31, min(customer_id) = 1, max = 10000
→ Can order_date = '2026-03-15' be here? NO → SKIP ✅
File B: min(order_date) = 2026-03-01, max = 2026-03-31, min(customer_id) = 1, max = 100
→ Can order_date = '2026-03-15' be here? YES
→ Can customer_id = 42 be here? YES → READ this file
File C: min(order_date) = 2026-03-01, max = 2026-03-31, min(customer_id) = 5000, max = 10000
→ Can order_date = '2026-03-15' be here? YES
→ Can customer_id = 42 be here? NO (42 < 5000) → SKIP ✅
Result: Read 1 file instead of 3 → 3x faster!

Key details:

  • Stats stored for the first 32 columns by default
  • Config: delta.dataSkippingNumIndexedCols (default 32)
  • Move frequently filtered columns to the first 32 positions in your schema
  • Works BEST when data is sorted/clustered (Z-ORDER or Liquid Clustering)
  • Stats include: min, max, null count, row count per file

Why Z-ORDER/Clustering improves data skipping:

UNSORTED data: File min=1, max=1000000 (wide range → can't skip anything)
SORTED data: File min=1, max=10000 (narrow range → skip most files!)
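The pruning rule is simple enough to sketch in a few lines of Python. This is a toy model of the skip/read decision only (the file names and statistics are invented for illustration), not Delta's actual reader code:

```python
# Toy model of min/max data skipping. A file can be skipped when the
# predicate value falls outside the file's [min, max] range for that column.

def can_skip(stats: dict, column: str, value) -> bool:
    """Return True if no row in this file can match `column = value`."""
    lo, hi = stats[column]
    return value < lo or value > hi

# Per-file stats, as Delta stores them in each "add" action of the log
# (invented values mirroring the File A/B/C example above)
files = {
    "file_A.parquet": {"order_date": ("2026-01-01", "2026-01-31"), "customer_id": (1, 10000)},
    "file_B.parquet": {"order_date": ("2026-03-01", "2026-03-31"), "customer_id": (1, 100)},
    "file_C.parquet": {"order_date": ("2026-03-01", "2026-03-31"), "customer_id": (5000, 10000)},
}

# WHERE order_date = '2026-03-15' AND customer_id = 42
to_read = [
    name for name, s in files.items()
    if not can_skip(s, "order_date", "2026-03-15")
    and not can_skip(s, "customer_id", 42)
]
print(to_read)  # → ['file_B.parquet']
```

Note that ISO date strings compare correctly as plain strings, which is why the same rule works for both columns here.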

SECTION 6: TIME TRAVEL, CLONES, AND RECOVERY

Q18 — Time Travel — Query any version of your table

Question: "How does time travel work? Show me all the ways to use it."

sql
-- METHOD 1: Query by version number
SELECT * FROM orders VERSION AS OF 5;
SELECT * FROM orders@v5;                   -- Shorthand

-- METHOD 2: Query by timestamp
SELECT * FROM orders TIMESTAMP AS OF '2026-03-15 10:30:00';

-- See full history of changes
DESCRIBE HISTORY orders;
-- Shows: version, timestamp, operation, user, metrics

-- RESTORE: Roll back the entire table to a previous version
RESTORE TABLE orders TO VERSION AS OF 5;
-- Creates a NEW version (non-destructive) — you can undo the restore too!

-- Compare two versions
SELECT * FROM orders@v10 EXCEPT ALL SELECT * FROM orders@v5;
-- Shows rows that exist in v10 but not v5

Limits:

  • Data retention: 7 days default (VACUUM deletes files older than this)
  • Log retention: 30 days default (DESCRIBE HISTORY works for 30 days)
  • After VACUUM, time travel is broken for vacuumed versions

Interview Tip: "Time travel works because Delta keeps old Parquet files around. VACUUM is what deletes them. So time travel range = VACUUM retention period."

Q19 — Scenario: Accidental DELETE — How to recover?

Question: "Someone ran DELETE FROM customers WHERE region = 'APAC' by mistake. 2 million rows deleted. How do you recover?"

sql
-- Step 1: Find the version BEFORE the accidental delete
DESCRIBE HISTORY customers;
-- Look for the DELETE operation — say it was at version 42
-- We want version 41 (just before the delete)

-- Option A: FULL RESTORE (simple but blunt)
RESTORE TABLE customers TO VERSION AS OF 41;
-- Table looks exactly like version 41
-- BUT: any legitimate changes between 42 and now are also lost!

-- Option B: SELECTIVE RESTORE (surgical — better answer!)
MERGE INTO customers AS target
USING customers VERSION AS OF 41 AS source
ON target.customer_id = source.customer_id
WHEN NOT MATCHED THEN INSERT *;
-- Only inserts rows that were deleted (exist in v41 but not current)
-- Preserves any legitimate changes that happened after the delete

Interview Tip: "Option B is the better answer — it shows you understand MERGE + time travel together, and it's surgical (doesn't overwrite changes made after the accidental delete)."
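Option B's MERGE boils down to a set difference on the key column: re-insert exactly the rows whose key exists in version 41 but is missing now. A toy Python model (all data invented) of what gets restored:

```python
# Toy model of the selective restore: keys in v41 but not in current get
# re-inserted; everything that changed legitimately after the delete survives.

v41     = {1: "Alice", 2: "Bob", 3: "Chen", 4: "Dana"}   # snapshot before the DELETE
current = {1: "Alice", 2: "Robert", 5: "Eve"}            # 3 and 4 were wrongly deleted;
                                                         # 2 was legitimately updated; 5 added

# WHEN NOT MATCHED THEN INSERT *  →  insert rows present in v41, absent in current
restored = dict(current)
for key, row in v41.items():
    if key not in current:
        restored[key] = row

print(sorted(restored.items()))
# → [(1, 'Alice'), (2, 'Robert'), (3, 'Chen'), (4, 'Dana'), (5, 'Eve')]
```

The deleted rows come back while Robert's update and Eve's insert are untouched, which is exactly why Option B beats a full RESTORE.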

Q20 — Deep Clone vs Shallow Clone

Question: "What are clones? When would you use deep vs shallow?"

| Aspect | Deep Clone | Shallow Clone |
|---|---|---|
| Copies data? | YES (full independent copy) | NO (references source files) |
| Independent? | Fully independent | Depends on source files |
| Speed | Slow (copies all data) | Fast (copies only metadata) |
| Use case | Production backups, migration | Testing, experimentation |
| VACUUM safe? | Yes (owns its data) | NO (vacuuming source breaks clone) |
| Storage cost | 2x (full copy) | Minimal (metadata only) |
sql
-- Deep clone: Full independent copy
CREATE TABLE orders_backup DEEP CLONE orders;
-- Subsequent runs are INCREMENTAL (only copies changes since last clone)

-- Shallow clone: Fast reference copy
CREATE TABLE orders_test SHALLOW CLONE orders;
-- Perfect for testing — make changes without affecting production
-- WARNING: If source is VACUUMed, shallow clone may break!
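The VACUUM warning is easiest to see with a toy model: a shallow clone holds references to the source's data files, while a deep clone owns its own copies. File names and contents below are invented:

```python
# Toy model of clone semantics: shallow clone = references, deep clone = copies.

source_files = {"part-001.parquet": [1, 2, 3], "part-002.parquet": [4, 5]}

deep_clone    = dict(source_files)          # deep clone: independent copy of the data
shallow_clone = list(source_files.keys())   # shallow clone: file references only

def read_shallow(clone_refs, storage):
    """Reading a shallow clone resolves its references against source storage."""
    return [row for f in clone_refs for row in storage[f]]

print(read_shallow(shallow_clone, source_files))  # → [1, 2, 3, 4, 5]

# Source is rewritten, then VACUUMed → the old file is physically deleted
del source_files["part-001.parquet"]

try:
    read_shallow(shallow_clone, source_files)
except KeyError as e:                        # like FileNotFoundException in Spark
    print("shallow clone broken, missing:", e)

# The deep clone still reads fine — it owns its data
print(sorted(v for rows in deep_clone.values() for v in rows))  # → [1, 2, 3, 4, 5]
```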

SECTION 7: SPECIAL FEATURES

Q21

Question: "How do you track what rows changed in a Delta table? How does downstream consume only the changes?"

sql
-- Step 1: Enable Change Data Feed on the table
ALTER TABLE orders SET TBLPROPERTIES ('delta.enableChangeDataFeed' = 'true');

-- Step 2: Read only the changes between two versions
SELECT * FROM table_changes('orders', 5, 10);
-- Returns ALL changes between version 5 and 10 with metadata:

What CDF returns:

+----------+--------+---------+------------------+-----------------+
| order_id | amount | status  | _change_type     | _commit_version |
+----------+--------+---------+------------------+-----------------+
| 1001     | 99.99  | SHIPPED | update_postimage | 7               |
| 1001     | 89.99  | PENDING | update_preimage  | 7               |
| 1002     | 149.00 | NEW     | insert           | 8               |
| 1003     | 25.00  | CANCEL  | delete           | 9               |
+----------+--------+---------+------------------+-----------------+

| Change Type | Meaning |
|---|---|
| insert | New row was added |
| update_preimage | Row BEFORE the update (old values) |
| update_postimage | Row AFTER the update (new values) |
| delete | Row was deleted |

Why CDF matters for data engineering:

WITHOUT CDF:
Silver → Gold: Read ALL 10 million Silver rows, compare, find changes
Time: 45 minutes

WITH CDF:
Silver → Gold: Read only the 5,000 rows that changed since the last run
Time: 30 seconds

PySpark to read changes:

python — editable
# Read changes since version 5
changes_df = spark.read.format("delta") \
    .option("readChangeFeed", "true") \
    .option("startingVersion", 5) \
    .table("orders")

# Read changes since a timestamp
changes_df = spark.read.format("delta") \
    .option("readChangeFeed", "true") \
    .option("startingTimestamp", "2026-03-15") \
    .table("orders")

Interview Tip: "CDF is essential for incremental ETL in medallion architecture. Instead of reprocessing all Silver data to update Gold, we read only the changes. This reduces pipeline runtime from hours to minutes."
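The incremental Gold update that CDF enables can be sketched as a fold over the change rows. This is a pure-Python toy (the `_change_type` values match what CDF emits; the amounts and the running total are invented):

```python
# Toy incremental aggregate maintenance: instead of recomputing total revenue
# from all Silver rows, apply only the CDF rows to the running Gold total.

changes = [
    {"amount": 89.99,  "_change_type": "update_preimage"},   # old value leaves the sum
    {"amount": 99.99,  "_change_type": "update_postimage"},  # new value enters the sum
    {"amount": 149.00, "_change_type": "insert"},
    {"amount": 25.00,  "_change_type": "delete"},
]

def apply_changes(total: float, rows) -> float:
    for r in rows:
        if r["_change_type"] in ("insert", "update_postimage"):
            total += r["amount"]
        elif r["_change_type"] in ("delete", "update_preimage"):
            total -= r["amount"]
    return total

gold_total = 1000.00                              # aggregate before this batch
gold_total = apply_changes(gold_total, changes)
print(round(gold_total, 2))  # → 1134.0
```

This is why update_preimage rows exist at all: without the old value, an aggregate cannot be corrected incrementally.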

Q22 — Deletion Vectors — Faster DELETEs and UPDATEs

Question: "How do Deletion Vectors improve UPDATE/DELETE performance?"

Without Deletion Vectors:

📋 Overview
UPDATE orders SET status = 'SHIPPED' WHERE order_id = 42;
Step 1: Find the 1 GB file containing order_id = 42
Step 2: Read ALL rows from that file (1 million rows)
Step 3: Change 1 row
Step 4: Write a NEW 1 GB file with the change
Step 5: Mark old file as "removed" in transaction log
Total I/O: Read 1 GB + Write 1 GB = 2 GB of I/O for ONE row change!

With Deletion Vectors:

📋 Overview
UPDATE orders SET status = 'SHIPPED' WHERE order_id = 42;
Step 1: Find the file containing order_id = 42
Step 2: Write a tiny Deletion Vector file (~bytes) saying "row #47 in file X is deleted"
Step 3: Write a tiny new file with just the updated row
Total I/O: ~KB instead of 2 GB!
The original file is untouched — cleanup happens during OPTIMIZE later
sql
-- Enable deletion vectors
ALTER TABLE orders SET TBLPROPERTIES (
    'delta.enableDeletionVectors' = 'true'
);

Trade-offs:

| Aspect | Without DV | With DV |
|---|---|---|
| Write speed (UPDATE/DELETE) | Slow (full file rewrite) | Fast (tiny marker file) |
| Read speed | Normal | Slightly slower (must check DV files) |
| Storage | Clean | Accumulates DV files until OPTIMIZE |
| Best for | Small tables | Large tables with frequent updates |

Interview Tip: "Deletion Vectors change the write pattern from copy-on-write to a more efficient mark-and-sweep approach. The actual file rewrite is deferred to OPTIMIZE, which runs during off-peak hours."
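The read path with Deletion Vectors can be sketched in a few lines. This toy model (invented rows; real DVs are compressed bitmaps keyed by file) shows why the data file stays untouched and why readers pay a small filtering cost:

```python
# Toy model of the deletion-vector read path: the data file is immutable;
# a tiny side structure marks row positions as deleted, and readers filter.

data_file = ["row0", "row1", "row2", "row3", "row4"]   # "Parquet file", never modified
deletion_vector = set()                                 # persisted separately, tiny

# DELETE/UPDATE records only the victim's position — no large file rewrite
deletion_vector.add(2)          # "row #2 in this file is deleted"

def read_file(rows, dv):
    """Readers must consult the DV — this is the slight read-side overhead."""
    return [r for i, r in enumerate(rows) if i not in dv]

print(read_file(data_file, deletion_vector))  # → ['row0', 'row1', 'row3', 'row4']

# OPTIMIZE later rewrites the file without the dead rows and drops the DV
data_file = read_file(data_file, deletion_vector)
deletion_vector = set()
```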

Q23 — Delta Sharing — Share data without copying

Question: "How can you share Delta tables with external organizations without copying data?"

How it works:

📐 Architecture Diagram
Provider (your org)                     Recipient (partner org)
┌─────────────────┐                    ┌─────────────────┐
│ Delta Table on  │                    │ Any client:     │
│ your storage    │───── REST API ────→│ - Spark         │
│ (S3/ADLS/GCS)   │   (open protocol) │ - pandas        │
└─────────────────┘                    │ - Power BI      │
       ↑                               │ - Databricks    │
  Data stays HERE                      └─────────────────┘
  (never copied)                        Reads directly from
                                        your storage
python — editable
# Recipient reads shared data — one line!
df = spark.read.format("deltaSharing").load("profile.json#share.schema.table")

Key benefits:

  • No data copying — recipient reads directly from your storage
  • Open protocol — not locked to Databricks (works with any client)
  • Provider controls access — revoke anytime, audit all reads
  • Live data — recipient always sees the latest version

Q24 — UniForm — One table, all formats

Question: "What is UniForm? Why does it matter?"

Problem: Different tools expect different table formats:

  • Databricks → Delta Lake
  • Snowflake → Iceberg
  • Trino/Presto → Hudi or Iceberg
  • BigQuery → Iceberg

UniForm solution: Write once as Delta, automatically generates Iceberg and Hudi metadata.

sql
-- Enable UniForm on a table
ALTER TABLE orders SET TBLPROPERTIES (
    'delta.universalFormat.enabledFormats' = 'iceberg'
);
-- Now this Delta table can be read as an Iceberg table by Snowflake, Trino, etc.
-- Data is NOT duplicated — only metadata is generated in Iceberg format

Why it matters:

  • One copy of data, readable by ALL engines
  • No ETL to copy data between formats
  • Eliminates "format wars" (Delta vs Iceberg vs Hudi)

Q25 — Schema Evolution — All the options

Question: "How does Delta handle schema changes? What options are available?"

Option 1: Add new columns (mergeSchema):

python — editable
df.write.format("delta") \
    .mode("append") \
    .option("mergeSchema", "true") \
    .saveAsTable("orders")
# New columns in df are auto-added to the table schema
# Existing columns are preserved

Option 2: Replace entire schema (overwriteSchema):

python — editable
df.write.format("delta") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .saveAsTable("orders")
# Completely replaces the schema — destructive!

Option 3: Rename/drop columns (requires column mapping):

sql
-- Enable column mapping first
ALTER TABLE orders SET TBLPROPERTIES ('delta.columnMapping.mode' = 'name');

-- Now you can rename and drop
ALTER TABLE orders RENAME COLUMN old_name TO new_name;
ALTER TABLE orders DROP COLUMN unused_column;

Option 4: Type widening (Delta 4.0+):

sql
-- Change column type without rewriting data
ALTER TABLE orders SET TBLPROPERTIES ('delta.enableTypeWidening' = 'true');
ALTER TABLE orders ALTER COLUMN amount TYPE DECIMAL(20,2);
-- INT → BIGINT, FLOAT → DOUBLE, etc. — NO full rewrite needed!

Q26 — Important Delta Table Properties (Know These!)

Question: "What are the most important Delta table properties?"

sql
ALTER TABLE orders SET TBLPROPERTIES (

    -- AUTO OPTIMIZATION: Fix small files automatically
    'delta.autoOptimize.optimizeWrite' = 'true',
    -- Coalesces small output files during write

    'delta.autoOptimize.autoCompact' = 'true',
    -- Runs mini-OPTIMIZE after each write

    -- CHANGE DATA FEED: Track row-level changes for incremental ETL
    'delta.enableChangeDataFeed' = 'true',

    -- DELETION VECTORS: Faster UPDATE/DELETE
    'delta.enableDeletionVectors' = 'true',

    -- COLUMN MAPPING: Enable rename/drop columns
    'delta.columnMapping.mode' = 'name',

    -- RETENTION: How long to keep old data
    'delta.logRetentionDuration' = 'interval 30 days',
    'delta.deletedFileRetentionDuration' = 'interval 7 days',

    -- TYPE WIDENING: Allow type changes without rewrite
    'delta.enableTypeWidening' = 'true',

    -- UNIFORM: Auto-generate Iceberg metadata
    'delta.universalFormat.enabledFormats' = 'iceberg'
);

SECTION 8: DELTA LAKE vs COMPETITORS

Q27 — Delta Lake vs Apache Iceberg vs Apache Hudi

Question: "How does Delta Lake compare to Iceberg and Hudi? Why would you choose one over another?"

| Aspect | Delta Lake | Apache Iceberg | Apache Hudi |
|---|---|---|---|
| Created by | Databricks | Netflix → Apache | Uber → Apache |
| Best with | Databricks, Spark | Snowflake, Trino, Spark, Flink | Spark, Flink |
| ACID | Yes | Yes | Yes |
| Time Travel | Yes | Yes | Yes (limited) |
| MERGE performance | Excellent (Photon) | Good | Good (MoR tables) |
| Streaming | Excellent | Good | Excellent |
| Schema Evolution | Good | Excellent (best) | Good |
| Partition Evolution | Via Liquid Clustering | Native (hidden partitioning) | Manual |
| Multi-engine support | UniForm bridges gap | Best (truly engine-agnostic) | Good |
| Community/Ecosystem | Large (Databricks-led) | Fastest growing | Smaller |
| Vendor independence | Databricks-optimized | Most vendor-neutral | Less common |

When to choose each:

🧠 Memory Map
Delta Lake → You're on Databricks (obvious choice: best integration)
Iceberg → Multi-engine environment (Snowflake + Spark + Trino)
Hudi → Heavy streaming upsert workloads (Uber-style CDC)

Interview Tip: "Don't bash other formats. Say: 'Delta is the natural choice on Databricks because of deep integration with Photon, Unity Catalog, and Predictive Optimization. But UniForm lets us expose tables as Iceberg for teams using Snowflake.'"

Q28 — Managed vs External Tables

Question: "What is the difference? When do you use each?"

| Aspect | Managed Table | External Table |
|---|---|---|
| Data location | Databricks-managed (auto) | YOUR storage path (you specify) |
| DROP TABLE | Deletes metadata AND data | Deletes metadata ONLY; data survives |
| Predictive Optimization | Yes (automatic) | No |
| Best for | Most tables (default) | Shared data, legacy migration |
sql
-- Managed (recommended for new tables)
CREATE TABLE catalog.schema.orders (order_id LONG, amount DECIMAL);
-- Data stored in Databricks-managed location
-- DROP TABLE deletes everything

-- External (for data shared with other systems)
CREATE TABLE catalog.schema.legacy_orders (order_id LONG, amount DECIMAL)
LOCATION 'abfss://container@storage.dfs.core.windows.net/legacy/orders/';
-- Data stays at YOUR path even if table is dropped

SECTION 9: PRODUCTION SCENARIOS & GOTCHAS

Q29 — Predictive Optimization — Auto-maintenance

Question: "What is Predictive Optimization? How does it work?"

Predictive Optimization is Databricks' auto-maintenance system. It watches your tables and automatically runs OPTIMIZE, VACUUM, and ANALYZE TABLE at the right time.

Key points:

  • Enabled by default on all new Unity Catalog managed tables
  • Learns access patterns → optimizes during low-traffic windows
  • Automatically runs: OPTIMIZE (compaction), VACUUM (cleanup), ANALYZE TABLE (refresh stats)
  • No configuration needed — just use managed tables
  • Not available on external tables

Interview Tip: "With Predictive Optimization, I no longer need to schedule nightly OPTIMIZE/VACUUM jobs. The platform handles it. This is one reason I prefer managed tables over external tables."

Q30 — Scenario: Design a Delta Lake pipeline for a real-time order system

Question: "Design a medallion architecture for an e-commerce order system using Delta Lake."

📐 Architecture Diagram
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   BRONZE    │     │   SILVER    │     │    GOLD     │
│  Raw data   │────→│  Cleaned    │────→│ Aggregated  │
│  as-is      │     │  validated  │     │ business    │
└─────────────┘     └─────────────┘     └─────────────┘

BRONZE LAYER:
  - Auto Loader ingests raw JSON/CSV from cloud storage
  - CLUSTER BY (ingest_date)
  - Schema: raw_payload STRING, source STRING, ingest_timestamp TIMESTAMP
  - enableChangeDataFeed = true (for Silver to read changes)

SILVER LAYER:
  - Reads BRONZE changes via CDF (incremental!)
  - MERGE to deduplicate and validate
  - CLUSTER BY (order_date, customer_id)
  - Schema enforcement + data quality checks (expectations)
  - enableChangeDataFeed = true (for Gold to read changes)

GOLD LAYER:
  - Reads SILVER changes via CDF
  - Aggregations: daily revenue, customer metrics, product performance
  - Optimized for BI queries (pre-computed)
  - CLUSTER BY (report_date)

Key decisions in this design:

  • Liquid Clustering everywhere (not partitioning)
  • CDF enabled at each layer for incremental processing
  • Managed tables for Predictive Optimization
  • Auto Loader for exactly-once ingestion
  • MERGE for deduplication at Silver layer

Q31 — Common production gotchas with Delta Lake

Question: "What are the most common issues you've seen with Delta Lake in production?"

| Gotcha | Symptom | Fix |
|---|---|---|
| Small file problem | Queries slow, too many Spark tasks | OPTIMIZE + autoOptimize properties |
| VACUUM too aggressive | Time travel broken, readers fail with FileNotFoundException | Keep default 7-day retention, never RETAIN 0 |
| MERGE without partition filter | MERGE takes hours on large tables | Add partition column to ON clause |
| Over-partitioning | Thousands of tiny files per partition | Switch to Liquid Clustering |
| Schema mismatch | Write fails with schema error | Use mergeSchema or overwriteSchema as appropriate |
| Concurrent write conflicts | ConcurrentModificationException | Partition writes so pipelines touch different files |
| Stale statistics | Data skipping not working, queries slow | ANALYZE TABLE or enable Predictive Optimization |
| Z-ORDER on too many columns | No improvement in query speed | Max 4 columns, choose most filtered ones |
| Not using Photon | MERGE and queries 3-5x slower than needed | Switch cluster to Photon runtime |
| Streaming + OPTIMIZE conflict | OPTIMIZE blocks streaming writes | Use autoCompact instead of manual OPTIMIZE |

SECTION 10: NEW FEATURES (2025-2026)

Q32 — What's new in Delta Lake 4.x?

Question: "What are the latest features? Mention 2-3 to show you're up to date."

| Feature | Version | What It Does |
|---|---|---|
| Variant Data Type | 4.0 | Store semi-structured JSON without a schema; query nested fields directly |
| Type Widening | 4.0 | Change column type (INT → BIGINT) without rewriting data |
| Coordinated Commits | 4.0 | Multiple engines can safely write to the same Delta table |
| UniForm | 4.0 | Auto-generate Iceberg/Hudi metadata for cross-engine reads |
| Conflict-Free DV | 4.1 | Enable Deletion Vectors without blocking concurrent writers |
| Atomic CTAS | 4.1 | CREATE TABLE AS SELECT is fully atomic (no partial tables on failure) |
| Lakebase | 2026 | Serverless PostgreSQL-compatible database inside Databricks |

Interview power move: "I'd use Liquid Clustering with Deletion Vectors enabled and CDF turned on — that gives me automatic compaction, fast updates, and incremental ETL. For cross-team sharing with Snowflake users, I'd enable UniForm so they read the same table as Iceberg."

QUICK REVISION CHECKLIST

Can you answer each in 2-3 minutes?

  • Why do we need Delta Lake over plain Parquet? (Q01)
  • List 8 key features of Delta Lake (Q02)
  • How does Delta ensure ACID? What happens if a write fails midway? (Q03)
  • What's inside a transaction log commit? (Q04)
  • What are checkpoints and why are they critical? (Q05)
  • How does optimistic concurrency work? (Q06)
  • Write a MERGE from memory with all 4 clauses (Q07)
  • How to handle duplicate keys in MERGE? (Q08)
  • List 6 ways to make MERGE faster (Q09)
  • What is OPTIMIZE? Good and bad? (Q11)
  • What is VACUUM? Risks of RETAIN 0? (Q12)
  • How to fix the small file problem? (Q13)
  • Partitioning vs Z-ORDER vs Liquid Clustering — when to use each? (Q14-Q16)
  • How does data skipping work? (Q17)
  • How does Time Travel work? Recover from accidental delete? (Q18-Q19)
  • What is CDF? How does it help incremental ETL? (Q21)
  • What are Deletion Vectors? Trade-offs? (Q22)
  • Delta vs Iceberg vs Hudi — when to choose each? (Q27)
  • Design a medallion architecture with Delta (Q30)
  • Name 3 new features in Delta 4.x (Q32)

APPENDIX: LEARN BY WATCHING + DOING

(This section replaces YouTube videos — read + copy-paste + see the output)

LAB 1 — Watch the Transaction Log Grow in Real Time

Goal: See EXACTLY what Delta writes on every operation. No magic. Where: Databricks Community Edition (free) OR any Databricks workspace Time: 10 minutes

Setup (copy-paste this first)

python — editable
# ─── SETUP ────────────────────────────────────────────────
# Create a fresh Delta table in a clean location
path = "/tmp/delta_lab/orders"

# Clean up anything from previous runs
dbutils.fs.rm(path, recurse=True)   # ← wipes old data (safe, /tmp only)

# Create initial data
from pyspark.sql import Row
data = [
    Row(order_id=1, customer="Alice", amount=100.0),
    Row(order_id=2, customer="Bob",   amount=200.0),
    Row(order_id=3, customer="Carol", amount=300.0),
]
df = spark.createDataFrame(data)

# Write as Delta — this creates the table AND the _delta_log
df.write.format("delta").save(path)

print("✅ Delta table created at:", path)

Step 1 — Look at what Delta just created

python — editable
# List files in the folder
files = dbutils.fs.ls(path)
for f in files:
    print(f.name)

# EXPECTED OUTPUT:
# _delta_log/                          ← the transaction log folder
# part-00000-abc123.snappy.parquet     ← actual data (Parquet)
# part-00001-def456.snappy.parquet

# Now peek inside _delta_log
log_files = dbutils.fs.ls(f"{path}/_delta_log")
for f in log_files:
    print(f.name)

# EXPECTED OUTPUT:
# 00000000000000000000.json    ← ONE commit file. This is commit #0.

Step 2 — Read commit #0 (see the "birth certificate" of the table)

python — editable
# Read the first commit JSON
commit_0 = spark.read.json(f"{path}/_delta_log/00000000000000000000.json")
commit_0.show(truncate=False)

# EXPECTED OUTPUT (5 rows, one per action):
# ┌──────────┬────────────────────────────────────────┐
# │ action   │ details                                 │
# ├──────────┼────────────────────────────────────────┤
# │ protocol │ minReaderVersion=1, minWriterVersion=2 │  ← compat info
# │ metaData │ schema={order_id INT, customer STR,..} │  ← table schema
# │ add      │ path=part-00000-abc.parquet, size=921  │  ← data file 1
# │ add      │ path=part-00001-def.parquet, size=874  │  ← data file 2
# │ commitInfo│ operation=WRITE, timestamp=...         │  ← who did what
# └──────────┴────────────────────────────────────────┘

Step 3 — Do an UPDATE. Watch a NEW commit appear.

python — editable
from delta.tables import DeltaTable
dt = DeltaTable.forPath(spark, path)

# Update Alice's amount from 100 → 150
dt.update(
    condition = "customer = 'Alice'",
    set       = {"amount": "150.0"}
)

# Now check _delta_log AGAIN
log_files = dbutils.fs.ls(f"{path}/_delta_log")
for f in log_files:
    print(f.name)

# EXPECTED OUTPUT:
# 00000000000000000000.json    ← original commit (unchanged)
# 00000000000000000001.json    ← NEW commit! ✨ This is the UPDATE

Step 4 — Read commit #1. See what Delta did behind the scenes.

python — editable
commit_1 = spark.read.json(f"{path}/_delta_log/00000000000000000001.json")
commit_1.show(truncate=False)

# EXPECTED OUTPUT:
# ┌───────────┬───────────────────────────────────────────┐
# │ action    │ details                                    │
# ├───────────┼───────────────────────────────────────────┤
# │ remove    │ path=part-00000-abc.parquet (OLD file)    │  ← old file marked removed
# │ add       │ path=part-00002-xyz.parquet (NEW file)    │  ← new file with updated row
# │ commitInfo│ operation=UPDATE, numUpdatedRows=1        │
# └───────────┴───────────────────────────────────────────┘

# 🧠 KEY INSIGHT:
#    Delta NEVER modifies a Parquet file in place.
#    It writes a NEW file with the updated row,
#    and marks the OLD file as "removed" in the log.
#    The old file still EXISTS on disk until VACUUM runs!

Step 5 — Check the history (Delta's "audit log")

python — editable
dt.history().select("version", "timestamp", "operation", "operationMetrics").show(truncate=False)

# EXPECTED OUTPUT:
# ┌───────┬──────────────────┬─────────┬────────────────────────┐
# │version│ timestamp         │ operation│ operationMetrics        │
# ├───────┼──────────────────┼─────────┼────────────────────────┤
# │  1    │ 2026-04-05 10:31 │ UPDATE  │ {numUpdatedRows: 1}    │
# │  0    │ 2026-04-05 10:30 │ WRITE   │ {numOutputRows: 3}     │
# └───────┴──────────────────┴─────────┴────────────────────────┘
#
# Every commit = one row in DESCRIBE HISTORY. That's your audit trail.

Step 6 — TIME TRAVEL (go back to BEFORE the update)

python — editable
# Read version 0 (the original, before Alice got updated)
df_before = spark.read.format("delta").option("versionAsOf", 0).load(path)
df_before.show()

# EXPECTED OUTPUT (Alice is still 100.0):
# +--------+--------+------+
# |order_id|customer|amount|
# +--------+--------+------+
# |       1|   Alice| 100.0|   ← original value!
# |       2|     Bob| 200.0|
# |       3|   Carol| 300.0|
# +--------+--------+------+

# Now read current version (version 1)
df_now = spark.read.format("delta").load(path)
df_now.show()

# EXPECTED OUTPUT:
# +--------+--------+------+
# |order_id|customer|amount|
# +--------+--------+------+
# |       1|   Alice| 150.0|   ← updated value
# |       2|     Bob| 200.0|
# |       3|   Carol| 300.0|
# +--------+--------+------+
#
# 🎯 Same table. Two different versions. This is Time Travel.

Now you've SEEN the transaction log grow. No video needed.

LAB 2 — Watch MERGE Work (The #1 Delta Operation)

Goal: See insert + update + delete happen in ONE atomic operation.

python — editable
# ─── SETUP target table ───────────────────────────
target_path = "/tmp/delta_lab/customers"
dbutils.fs.rm(target_path, recurse=True)

target_data = [
    Row(id=1, name="Alice",   status="active"),
    Row(id=2, name="Bob",     status="active"),
    Row(id=3, name="Carol",   status="active"),
]
spark.createDataFrame(target_data).write.format("delta").save(target_path)

print("📋 TARGET (before MERGE):")
spark.read.format("delta").load(target_path).show()
# +---+-----+------+
# | id| name|status|
# +---+-----+------+
# |  1|Alice|active|
# |  2|  Bob|active|
# |  3|Carol|active|
# +---+-----+------+

# ─── Build source (daily change feed) ──────────────
source_data = [
    Row(id=2, name="Bob",     status="inactive"),   # UPDATE  (Bob deactivated)
    Row(id=3, name="Carol",   status="DELETE_ME"),  # DELETE  (special marker)
    Row(id=4, name="Dave",    status="active"),     # INSERT  (new customer)
]
source_df = spark.createDataFrame(source_data)

print("📥 SOURCE (daily change feed):")
source_df.show()
# +---+-----+---------+
# | id| name|   status|
# +---+-----+---------+
# |  2|  Bob| inactive|   ← will UPDATE existing row
# |  3|Carol|DELETE_ME|   ← will DELETE existing row
# |  4| Dave|   active|   ← will INSERT new row
# +---+-----+---------+
python — editable
# ─── THE MERGE ─────────────────────────────────────
from delta.tables import DeltaTable
dt = DeltaTable.forPath(spark, target_path)

dt.alias("t").merge(
    source_df.alias("s"),
    "t.id = s.id"                                   # ← match on id
).whenMatchedDelete(
    condition = "s.status = 'DELETE_ME'"            # ← if marker, delete
).whenMatchedUpdate(
    set = {"name": "s.name", "status": "s.status"}  # ← else update
).whenNotMatchedInsert(
    values = {"id": "s.id", "name": "s.name", "status": "s.status"}  # ← new rows: insert
).execute()

print("📋 TARGET (after MERGE):")
spark.read.format("delta").load(target_path).orderBy("id").show()
# +---+-----+--------+
# | id| name|  status|
# +---+-----+--------+
# |  1|Alice|  active|   ← unchanged
# |  2|  Bob|inactive|   ← UPDATED ✅
# |  4| Dave|  active|   ← INSERTED ✅
# +---+-----+--------+
#                         ← Carol (id=3) DELETED ✅

# ─── Verify: one commit for the whole thing ────────
dt.history().select("version", "operation", "operationMetrics").show(truncate=False)
# +-------+-----------+-------------------------+
# |version| operation | operationMetrics        |
# +-------+-----------+-------------------------+
# |   1   | MERGE     | numTargetRowsInserted=1 |
# |       |           | numTargetRowsUpdated=1  |
# |       |           | numTargetRowsDeleted=1  |
# |   0   | WRITE     | numOutputRows=3         |
# +-------+-----------+-------------------------+
#
# 🎯 ONE atomic commit handled INSERT + UPDATE + DELETE together.
#    That's why MERGE is the most important Delta operation.

LAB 3 — Watch OPTIMIZE Fix the Small File Problem

python — editable
# ─── Create a table with MANY small files (simulating streaming) ─────
small_files_path = "/tmp/delta_lab/small_files"
dbutils.fs.rm(small_files_path, recurse=True)

# Write 10 times, one row each time → 10 tiny files
for i in range(10):
    spark.createDataFrame([Row(id=i, val=f"row_{i}")]) \
         .write.format("delta").mode("append").save(small_files_path)

# Count Parquet files
parquet_files = [f.name for f in dbutils.fs.ls(small_files_path) if f.name.endswith(".parquet")]
print(f"🔴 Parquet files BEFORE OPTIMIZE: {len(parquet_files)}")
# EXPECTED: 🔴 Parquet files BEFORE OPTIMIZE: 10
#
# 🧠 Each file is tiny (few hundred bytes) → disaster for query performance
python — editable
# ─── Run OPTIMIZE ───────────────────────────────────
spark.sql(f"OPTIMIZE delta.`{small_files_path}`")

parquet_files = [f.name for f in dbutils.fs.ls(small_files_path) if f.name.endswith(".parquet")]
print(f"🟢 Parquet files AFTER OPTIMIZE: {len(parquet_files)}")
# EXPECTED: 🟢 Parquet files AFTER OPTIMIZE: 11
#
# Wait — 11? Why not 1?
#   → OPTIMIZE created 1 NEW compacted file (with all 10 rows)
#   → The 10 OLD files are marked "removed" in _delta_log but still on disk
#   → VACUUM will clean them up later
#
# Verify: check the log
dt = DeltaTable.forPath(spark, small_files_path)
dt.history().select("version", "operation", "operationMetrics").show(truncate=False)
# Latest row: operation=OPTIMIZE, numFilesAdded=1, numFilesRemoved=10 ✅

VISUAL ANIMATION 1 — Optimistic Concurrency Conflict

What happens when TWO jobs try to update the same table at the same time?

📐 Architecture Diagram
TIME  →  T0              T1               T2               T3
         │               │                │                │
JOB A    ├─ reads v5 ────┼─ writes data ──┼─ commits v6 ─✅ │
         │               │                │                │
JOB B    ├─ reads v5 ────┼─ writes data ──┼────────────────┼─ tries to commit v6 ❌
         │               │                │                │    CONFLICT!
         │               │                │                │    Job A already wrote v6
         │               │                │                │
         │               │                │                ├─ Delta checks:
         │               │                │                │    "Did Job A touch the
         │               │                │                │     same files as me?"
         │               │                │                │
         │               │                │                ├─ IF NO overlap:
         │               │                │                │    retry commit as v7 ✅
         │               │                │                │
         │               │                │                └─ IF YES overlap:
         │               │                │                     throw ConcurrentModification ❌
         │               │                │                     → user must retry the WHOLE job

KEY INSIGHT:
  • Delta does NOT lock the table while writing (unlike traditional DBs)
  • Both jobs write data files in PARALLEL — no slowdown
  • Conflict is detected ONLY at commit time (when writing the JSON)
  • Only ONE commit can "win" version 6 → the other must retry or fail
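The "only one commit can win a version" rule can be modeled with an atomic create-if-absent. This sketch uses a local file with O_EXCL as a stand-in; real Delta relies on an equivalent atomic put-if-absent primitive on the object store (or a commit coordinator), and the job names and payloads here are invented:

```python
import os
import tempfile

# Toy model of commit arbitration: each writer tries to create
# _delta_log/<version>.json atomically; exactly one can win a given version.

log_dir = tempfile.mkdtemp()

def try_commit(version: int, payload: str) -> bool:
    path = os.path.join(log_dir, f"{version:020d}.json")
    try:
        # O_EXCL makes creation atomic: fails if the file already exists
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False            # another writer already committed this version
    with os.fdopen(fd, "w") as f:
        f.write(payload)
    return True

job_a = try_commit(6, '{"op": "UPDATE"}')       # Job A commits v6 first
job_b = try_commit(6, '{"op": "MERGE"}')        # Job B loses the race for v6...
job_b_retry = try_commit(7, '{"op": "MERGE"}')  # ...checks for file overlap, retries as v7

print(job_a, job_b, job_b_retry)  # → True False True
```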

VISUAL ANIMATION 2 — Time Travel Mechanics

📐 Architecture Diagram
DELTA TABLE /customers/ — Timeline of Operations:

v0: CREATE          v1: INSERT 100      v2: UPDATE 20      v3: DELETE 5
    │                   │                   │                   │
    ▼                   ▼                   ▼                   ▼
┌─────────┐         ┌─────────┐         ┌─────────┐         ┌─────────┐
│ empty   │         │ file-A  │         │ file-A  │ removed │ file-A  │ removed
│         │    →    │         │    →    │ file-B  │         │ file-B  │
│         │         │         │         │ file-C  │ added   │ file-C  │
│         │         │         │         │         │         │ file-D  │ added
└─────────┘         └─────────┘         └─────────┘         └─────────┘
 _delta_log          _delta_log          _delta_log          _delta_log
 /0.json             /0.json             /0.json             /0.json
                     /1.json             /1.json             /1.json
                                         /2.json             /2.json
                                                             /3.json

TIME TRAVEL QUERIES:

  SELECT * ... VERSION AS OF 0  →  reads NO files        (empty table)
  SELECT * ... VERSION AS OF 1  →  reads file-A          (100 rows)
  SELECT * ... VERSION AS OF 2  →  reads file-B, file-C  (80 old + 20 updated)
  SELECT * ... VERSION AS OF 3  →  reads file-B, file-C, file-D  (75 rows after delete)

🧠 KEY INSIGHT:
   Delta doesn't "roll back" anything for Time Travel.
   It just reads the OLD files that are STILL on disk
   (because VACUUM hasn't removed them yet).
   Time Travel = reading old files according to an old commit.
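The "replay the log up to version N" mechanic can be sketched in a few lines of plain Python. A toy model under assumptions: the log is a dict of commits, each commit a list of (op, path) actions mirroring the timeline diagram above — not the real JSON action format:

```python
def files_at_version(log, version):
    """Replay commits 0..version: 'add' brings a file into the live set, 'remove' drops it."""
    live = set()
    for v in range(version + 1):
        for op, path in log[v]:
            if op == "add":
                live.add(path)
            else:  # "remove"
                live.discard(path)
    return live

# Hypothetical log mirroring the timeline diagram above
log = {
    0: [],                                                          # v0: CREATE (empty)
    1: [("add", "file-A")],                                         # v1: INSERT 100
    2: [("remove", "file-A"), ("add", "file-B"), ("add", "file-C")],  # v2: UPDATE 20
    3: [("add", "file-D")],                                         # v3: per the diagram
}
assert files_at_version(log, 1) == {"file-A"}
assert files_at_version(log, 3) == {"file-B", "file-C", "file-D"}
```

Nothing is ever rewound: `VERSION AS OF 1` simply stops the replay at commit 1, and the query reads whatever files were live at that point — provided VACUUM hasn't deleted them.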

VISUAL ANIMATION 3 — Z-ORDER Rearranging Data

📋 Overview
BEFORE Z-ORDER on (customer_id):
File 1: [cust_5, cust_102, cust_57, cust_893, cust_12, cust_501] ← random!
File 2: [cust_88, cust_3, cust_421, cust_77, cust_999, cust_15] ← random!
File 3: [cust_66, cust_250, cust_8, cust_180, cust_33, cust_700] ← random!
Query: WHERE customer_id = 102
→ Spark must scan ALL 3 files (any could contain 102) 😢
AFTER Z-ORDER on (customer_id):
File 1: [cust_1, cust_3, cust_5, cust_8, cust_12, cust_15] ← sorted range 1-15
File 2: [cust_33, cust_57, cust_66, cust_77, cust_88, cust_102] ← sorted range 33-102
File 3: [cust_180, cust_250, cust_421, cust_501, cust_700, cust_893] ← sorted range 180-893
Query: WHERE customer_id = 102
→ Delta checks min/max stats in _delta_log:
File 1: min=1, max=15 → 102 NOT here, skip ⏭️
File 2: min=33, max=102 → 102 COULD be here, read ✅
File 3: min=180, max=893 → 102 NOT here, skip ⏭️
→ Only 1 file read instead of 3! 3x faster 🚀
🧠 KEY INSIGHT:
Z-ORDER doesn't change HOW data is stored (still Parquet),
it changes WHICH rows end up in WHICH file.
Data skipping uses min/max stats to avoid reading files that can't match.
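The data-skipping check itself is tiny. A sketch, assuming per-file min/max stats like the ones in the example above (the `files_to_read` name and stats dict are illustrative, not a Delta API):

```python
def files_to_read(file_stats, value):
    """Keep only files whose [min, max] range could contain the value; skip the rest."""
    return [name for name, (lo, hi) in file_stats.items() if lo <= value <= hi]

# min/max per file, as recorded in _delta_log after Z-ORDER
stats = {"File 1": (1, 15), "File 2": (33, 102), "File 3": (180, 893)}
assert files_to_read(stats, 102) == ["File 2"]   # 1 of 3 files scanned
```

This is also why Z-ORDER matters: without clustering, every file's min/max spans nearly the full key range, so the check prunes nothing.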

GOTCHAS — The Weird Errors You'll Hit (and How to Fix Them)

Gotcha 1: VACUUM RETAIN 0 HOURS — "Invalid retention" Error

Error: IllegalArgumentException: requirement failed: Are you sure you would
like to vacuum files with such a low retention period?
If you're not sure, you can set:
spark.databricks.delta.retentionDurationCheck.enabled = false

Why it happens: Delta's default minimum retention is 7 days. VACUUM RETAIN 0 HOURS would delete files that RUNNING QUERIES might still be reading → data corruption.

Fix:

python — editable
# DANGER: only disable if you KNOW no concurrent reads are running
spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", "false")
spark.sql(f"VACUUM delta.`{path}` RETAIN 0 HOURS")
# Then TURN IT BACK ON:
spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", "true")

Interview trap: If they ask "how would you immediately VACUUM a Delta table?" and you say "RETAIN 0 HOURS," they'll follow up with "What's the risk?" → answer: concurrent readers can get FileNotFoundException.

Gotcha 2: Schema Mismatch — "A schema mismatch detected when writing"

Error: AnalysisException: A schema mismatch detected when writing to the Delta table.
To enable schema migration using DataFrameWriter or DataStreamWriter,
please set: .option("mergeSchema", "true")

Why it happens: Your source DataFrame has a column the target table doesn't have (or different type).

Fix:

python — editable
# Option 1: allow new columns to be added automatically
(df.write.format("delta")
   .option("mergeSchema", "true")   # ← THIS solves it
   .mode("append")
   .save(path))

# Option 2 (for MERGE operations):
spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")

Interview trap: "How is this different from schema ENFORCEMENT?" → Enforcement REJECTS bad writes by default. mergeSchema=true opts in to schema EVOLUTION.

Gotcha 3: ConcurrentAppendException

Error: ConcurrentAppendException: Files were added to partition [date=2026-04-05]
by a concurrent update. Please try the operation again.

Why it happens: Two jobs MERGED into the same partition at the same time. Both added files → conflict at commit time.

Fix:

python — editable
# Add a filter in your MERGE condition to restrict partitions
dt.alias("t").merge(
    source.alias("s"),
    "t.date = '2026-04-05' AND t.id = s.id"   # ← partition filter narrows conflict scope
).whenMatchedUpdateAll().execute()

# Or: retry with exponential backoff
import time
for attempt in range(3):
    try:
        # your merge
        break
    except Exception as e:
        if "ConcurrentAppend" in str(e) and attempt < 2:
            time.sleep(2 ** attempt)   # back off, then retry
        else:
            raise   # unrelated error, or out of retries — don't swallow it

Gotcha 4: MERGE with Duplicate Keys in Source

Error: UnsupportedOperationException: Cannot perform Merge as multiple source
rows matched and attempted to modify the same target row.

Why it happens: Your source has 2+ rows with the same join key → Delta can't decide which one wins.

Fix:

python — editable
from pyspark.sql.window import Window
from pyspark.sql.functions import row_number, desc

# Deduplicate source BEFORE merging
w = Window.partitionBy("id").orderBy(desc("updated_at"))
source_deduped = source_df.withColumn("rn", row_number().over(w)) \
                          .filter("rn = 1") \
                          .drop("rn")
# Now source has ONE row per id — merge will succeed

Interview trap: This is THE most common MERGE production bug. If you know this fix, you sound senior.

Gotcha 5: Small Files Problem in Streaming

Symptom: Streaming writes create thousands of tiny files (few KB each)
→ queries become slow, $$$ spent on listing files

Fix:

python — editable
# Enable auto-compaction + optimized writes on the table
spark.sql(f"""
  ALTER TABLE delta.`{path}`
  SET TBLPROPERTIES (
    'delta.autoOptimize.optimizeWrite' = 'true',
    'delta.autoOptimize.autoCompact'   = 'true'
  )
""")

# Also: run scheduled OPTIMIZE daily
spark.sql(f"OPTIMIZE delta.`{path}`")

Gotcha 6: "Delta Table not found" when path looks correct

Error: Path does not exist: /mnt/data/my_table
(or: _delta_log does not exist)

Why it happens: Either:

  • You wrote as Parquet but are trying to read as Delta
  • You deleted _delta_log/ manually (DON'T)
  • Path typo

Fix:

python — editable
# Always verify before reading
print(dbutils.fs.ls(path))                      # check path exists
print(dbutils.fs.ls(f"{path}/_delta_log"))      # check _delta_log exists

# Convert plain Parquet to Delta if needed:
from delta.tables import DeltaTable
DeltaTable.convertToDelta(spark, f"parquet.`{path}`")

MOCK INTERVIEW — Live Walkthrough

⚠️ Common Trap
Read this like watching an interview. See the bad answer → good answer → follow-up trap.

Scenario: Senior Data Engineer Round (30 min Delta deep-dive)

INTERVIEWER: "We have a 2 TB Delta table being updated every hour by a MERGE job. Lately, queries are getting slower and slower. Walk me through how you'd debug this."

❌ BAD ANSWER:

"I would run OPTIMIZE to fix it."

(Too short. No reasoning. No investigation. Red flag.)

✅ GOOD ANSWER:

"Three things come to mind, and I'd investigate in order. First, I'd run DESCRIBE HISTORY and DESCRIBE DETAIL to check the file count and size distribution — if we have thousands of small files from hourly MERGEs, that's the small-file problem and OPTIMIZE would help. Second, I'd check if any column has high-cardinality filters in queries — if yes, we might need Z-ORDER on that column for data skipping. Third, I'd check the MERGE condition itself — if it's not partition-pruning, every MERGE is rewriting files across the whole table. Let me describe how I'd verify each..."

FOLLOW-UP TRAP: (interviewer nods) "Good. Now — OPTIMIZE is running but taking 4 hours and blocking the pipeline. What do you do?"

❌ BAD ANSWER: "Run it less often."

✅ GOOD ANSWER:

"I'd do three things. First, run OPTIMIZE with a partition filter so it only compacts recent partitions — OPTIMIZE table WHERE date >= current_date() - 7. Second, enable auto-compaction at the table level so small files get compacted incrementally during writes, reducing how much OPTIMIZE has to do. Third, if we're on a recent Databricks runtime, I'd consider switching from partitioning + Z-ORDER to Liquid Clustering, which handles incremental clustering automatically without blocking writes."

INTERVIEWER: "A user accidentally ran DELETE FROM customers WHERE status = 'active' — deleted 4 million rows. It's been 2 hours. Can you recover?"

❌ BAD ANSWER: "No, the data is gone."

❌ BAD ANSWER 2: "Yes, just restore from backup." (This is Delta — there's no separate backup needed)

✅ GOOD ANSWER:

"Yes, absolutely — Delta's time travel makes this easy. Two hours ago is well within default retention. I'd run DESCRIBE HISTORY customers to find the version BEFORE the DELETE, then either: option A — RESTORE TABLE customers TO VERSION AS OF <version> to restore the whole table, or option B — if we want to preserve changes AFTER the bad DELETE, do INSERT INTO customers SELECT * FROM customers VERSION AS OF <version> WHERE status = 'active' to re-insert just the deleted rows. Option B is safer in production because it doesn't rewind other legitimate changes."

FOLLOW-UP TRAP: "What if it had been 10 days ago, not 2 hours?"

✅ GOOD ANSWER:

"Then it depends on retention settings. Default is 7 days for data file retention, 30 days for log retention. If VACUUM ran with default retention, the old data files would already be deleted from storage — even though the transaction log would still reference them. Time Travel would fail with FileNotFoundException. The prevention is setting delta.deletedFileRetentionDuration to something higher like 30 days for critical tables, or taking scheduled shallow clones as point-in-time snapshots."
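That failure mode — the log still references a file that VACUUM has physically deleted — can be modeled in plain Python. A sketch with made-up names (`vacuum`, `read_version`, a dict standing in for object storage), not the real implementation:

```python
def vacuum(storage, removed_at, retention_seconds, now):
    """Physically delete files whose 'remove' is older than the retention window."""
    for path, t in list(removed_at.items()):
        if now - t > retention_seconds:
            del storage[path]       # gone from object storage for good
            del removed_at[path]

def read_version(storage, live_files):
    """Time travel read: the log still lists the files, but storage may not have them."""
    for path in live_files:
        if path not in storage:
            raise FileNotFoundError(path)
    return [storage[p] for p in live_files]

storage = {"file-A": "old rows", "file-B": "new rows"}
removed_at = {"file-A": 0}          # file-A was marked 'removed' by a commit at t=0
vacuum(storage, removed_at, retention_seconds=7 * 86400, now=10 * 86400)  # 10 days later
try:
    read_version(storage, ["file-A"])   # time travel to the pre-delete version
    assert False
except FileNotFoundError:
    pass  # the log references file-A, but VACUUM already deleted it
```

The current version (`file-B`) still reads fine — only time travel past the retention window breaks.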

INTERVIEWER: "Explain what happens on disk when you run UPDATE customers SET email = LOWER(email)."

❌ BAD ANSWER: "It updates the rows."

✅ GOOD ANSWER:

📝 Note
"Delta files are immutable, so UPDATE works in three steps. First, Delta identifies which Parquet files contain rows matching the filter — in this case, ALL files since there's no WHERE clause. Second, for each affected file, it reads the file, applies LOWER to email for every row, and writes a NEW Parquet file. Third, it writes ONE commit JSON to _delta_log that has remove entries for all the old files and add entries for all the new files. The old files stay on disk until VACUUM runs, which is what enables time travel back to the pre-update version. Note: with Deletion Vectors enabled, UPDATE can be faster because it writes a small vector file marking which rows changed, avoiding the full file rewrite."
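The three-step copy-on-write described in that answer can be sketched as a toy model — dicts standing in for Parquet files and the commit log, `update_table` and the `.v2` suffix being illustrative inventions, not Delta internals:

```python
def update_table(storage, log, transform):
    """Copy-on-write UPDATE: rewrite every live file, then log remove+add in ONE commit."""
    live = set()
    for commit in log:                       # step 0: replay to find current live files
        for op, path in commit:
            live.add(path) if op == "add" else live.discard(path)
    commit = []
    for path in sorted(live):
        new_path = path + ".v2"              # hypothetical name for the rewritten file
        storage[new_path] = [transform(row) for row in storage[path]]  # old file untouched
        commit += [("remove", path), ("add", new_path)]
    log.append(commit)                       # one atomic commit describes the whole UPDATE

storage = {"part-0": ["Alice@X.com"], "part-1": ["Bob@Y.com"]}
log = [[("add", "part-0"), ("add", "part-1")]]
update_table(storage, log, str.lower)
assert storage["part-0.v2"] == ["alice@x.com"]
assert storage["part-0"] == ["Alice@X.com"]   # old file still on disk → time travel works
```

The key property the model shows: no file is ever mutated, so a reader that started before the UPDATE keeps a consistent view, and the old version stays queryable until VACUUM.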

INTERVIEWER: "We have Job A doing hourly MERGE into table X, and Job B doing hourly OPTIMIZE on table X. They started conflicting. Why, and how do you fix it?"

❌ BAD ANSWER: "Run them at different times."

✅ GOOD ANSWER:

"This is an optimistic concurrency conflict. OPTIMIZE marks a set of files as 'removed' and adds one compacted file. If MERGE reads those same files and also tries to add/remove files in an overlapping set, Delta throws ConcurrentAppendException at commit time. Two fixes: First, OPTIMIZE is idempotent and retryable — wrap it with retry logic so a conflict just triggers a retry after the MERGE finishes. Second, partition the table so OPTIMIZE targets OLD partitions and MERGE targets the CURRENT partition — e.g., OPTIMIZE WHERE date < current_date(). That way they work on disjoint file sets and never conflict."

FINAL READINESS CHECK

If you can do all of these, you're video-free:

  • Run LAB 1 end-to-end without looking at the answers
  • Run LAB 2 and explain each MERGE clause from memory
  • Run LAB 3 and explain why you see 11 files after OPTIMIZE (not 1)
  • Draw the Optimistic Concurrency animation on a whiteboard
  • Draw the Z-ORDER animation on a whiteboard
  • Name all 6 Gotchas and how to fix each
  • Answer all 4 mock interview questions using the "good answer" framework

If yes → you don't need a single video. You're ready.