❄️
Snowflake
Day 2: Pipelines & Performance — Quick Recall Guide
❄️
❄️
Snowflake · Section 4 of 8

Day 2: Pipelines & Performance — Quick Recall Guide

🔒

This section is locked

Unlock every deep-dive, lab, mock interview, and memory map across all 10 topics.

View Plans — from ₹299/month

Already have a plan? Sign in

Day 2: Pipelines & Performance — Quick Recall Guide

Must remember🔑Key concept⚠️Common trap🧠Memory Map📝One-liner

🧠 MASTER MEMORY MAP — Day 2

🧠 DATA LOADING = "CSP" (Cheap→Smart→Push)
DATA LOADING"CSP" (Cheap → Smart → Push)
CCOPY INTO (batch, cheapest, manual trigger)
SSnowpipe (auto-ingest, ~1 min latency, event-driven)
PSnowpipe Streaming (push via API, sub-second, Kafka)
CDC INSIDE SNOWFLAKE"Streams + Tasks" OR "Dynamic Tables"
Streams = track what CHANGED in a table (INSERT/UPDATE/DELETE)
Tasks = scheduled SQL (run stream processor every N min)
Dynamic Tables = newer, declarative, replaces most Streams+Tasks
STREAMS"A camera watching your table"
3 metadata columns:
METADATA$ACTION = 'INSERT' or 'DELETE'
METADATA$ISUPDATE = TRUE if row is part of an UPDATE
METADATA$ROW_ID = unique row identifier
⚠️UPDATE = DELETE old + INSERT new (two rows in stream!)
TASKS"Cron jobs inside Snowflake"
Root task: has SCHEDULE
Child tasks: have AFTER parent_task
Only RESUME root task — children auto-follow
Serverless tasks: no WAREHOUSE needed (10% cheaper)
DYNAMIC TABLES"Declarative ELT — you say WHAT, Snowflake does HOW"
CREATE DYNAMIC TABLE ... TARGET_LAG = '5 minutes' AS SELECT ...
Min lag = 1 minute
INCREMENTAL refresh by default (only processes changes)
⚠️Cannot do SCD Type 2 (use Streams+Tasks for that)
PERFORMANCE"CQMWS"
CClustering key (organize micro-partitions by filter columns)
QQuery Profile (find bottleneck: pruning? memory? compute?)
MMaterialized Views (pre-compute heavy aggregations)
WWarehouse sizing (scale UP for slow, scale OUT for concurrency)
SSearch Optimization (point lookups: WHERE id = 'X')

SECTION 1: DATA LOADING

⚡ MUST KNOW DIRECT QUESTIONS

Q1What is COPY INTO?

Snowflake's bulk loading command — reads files from a stage and loads into a table in parallel. Much faster than INSERT