🧱
Databricks
ETL Patterns — Quick Recall
🧱
🧱
Databricks · Section 6 of 18

ETL Patterns — Quick Recall

🔒

This section is locked

Unlock every deep-dive, lab, mock interview, and memory map across all 10 topics.

View Plans — from ₹299/month

Already have a plan? Sign in

ETL Patterns — Quick Recall

🗺️ Memory Map
How to use this file:
  • ⚡ = Must remember (95% chance of being asked)
  • 🔑 = Key concept (core understanding needed)
  • ⚠️ = Common trap (interviewers love to test this)
  • 🧠 = Memory Map (mnemonic/acronym — memorize this!)
  • 📝 = One-liner (flash-card style — cover answer, test yourself)
Reading strategy: Read Memory Maps FIRST → then Direct Questions → then Mid-Level.

🧠 MASTER MEMORY MAP — Day 2

🧠 ETL PIPELINE DESIGN = "MACS-L"
ETL PIPELINE DESIGN"MACS-L"
MMedallion Architecture (Bronze → Silver → Gold)
AAuto Loader (streaming file ingestion)
CCDC (Change Data Capture — Oracle → Kafka → Delta)
SSCD Type 2 (Slowly Changing Dimensions — history tracking)
LLakeflow Declarative Pipelines (formerly DLT)
MEDALLION"BSG" (like the TV show Battlestar Galactica)
BBronze (raw, as-is, append-only)
SSilver (clean, deduplicated, validated)
GGold (aggregated, business-ready)
AUTO LOADER"DNS"
DDirectory listing mode (small folders, simple)
NNotification mode (huge folders, uses Azure Event Grid)
SSchema evolution (auto-detects new columns → rescue)

SECTION 1: MEDALLION ARCHITECTURE

🧠 Memory Map: Medallion Layers

🧠 BRONZE = Raw ingredients from market (dirty, uncut)
Think: COOKING ANALOGY
BRONZERaw ingredients from market (dirty, uncut)
→ Append only, never update
→ Add metadata: _ingested_at, _source_file, _batch_id
→ Engineers read this for debugging
SILVERWashed, cut, measured ingredients
→ Deduplicated (MERGE/upsert)
→ Schema validated, nulls handled
→ PII tagged (Unity Catalog)
→ Engineers + Scientists read this
GOLDFinal dish served to customer
→ Aggregated (daily revenue, route stats)
→ Star schema for BI tools
→ Power BI, Tableau read this
KEY DESIGN DECISIONS per layer:
Bronze: partition by ingestion_date (when data arrived)
Silver: cluster by busine