❄️
Snowflake
Day 2: Data Loading, Pipelines & Performance
❄️
Snowflake · Section 4 of 7


Pro Tip
Time: 5-6 hours | Priority: HIGH — Pipelines & performance are 30-35% of interviews
Context: Real-time booking ingestion, CDC from Oracle, BI dashboard performance
Approach: Simple explanation → analogy → technical details → code → interview tip

SECTION 1: DATA LOADING (1 hour)

Q1: What is COPY INTO? How is it different from INSERT?

Simple Explanation: COPY INTO is Snowflake's bulk loading command — it reads files from a stage (ADLS, S3, internal) and loads them into a table. It is MUCH faster than INSERT for large volumes because it runs in parallel across the warehouse.

Why do we need it? INSERT processes rows one by one — fine for 100 rows, terrible for 100 million rows. COPY INTO parallelizes across the warehouse and processes files in parallel batches.

Real-world analogy: INSERT = one person typing data manually. COPY INTO = 100 workers simultaneously moving pallets from a loading dock into the warehouse.

sql
-- Basic COPY INTO from external stage (Azure ADLS)
COPY INTO raw_bookings
FROM @azure_booking_stage/2026/03/25/        -- Read from this ADLS path
FILE_FORMAT = (
    TYPE = 'PARQUET'                          -- File type
)
ON_ERROR = 'CONTINUE';                        -- Skip bad files, continue loading
-- Options: ABORT_STATEMENT (default), CONTINUE, SKIP_FILE

-- COPY INTO with transformation (load + clean in one step)
COPY INTO silver_bookings (booking_id, booking_date, amount, airline_code)
FROM (
    SELECT
        $1:booking_id::VARCHAR,               -- Cast from VARIANT
        $1:booking_date::DATE,
        $1:fare.amount::NUMBER(10,2),
        UPPER($1:airline_code::VARCHAR)        -- Transform during load
    FROM @azure_booking_stage/bookings.json
)
FILE_FORMAT = (TYPE = 'JSON');

-- COPY INTO with error handling — see what failed
COPY INTO raw_bookings FROM @stage/
ON_ERROR = 'CONTINUE';

-- After loading, check what errors occurred:
SELECT * FROM TABLE(VALIDATE(raw_bookings, JOB_ID => '_last'));
-- Shows exactly which files + rows failed and why

-- COPY INTO is idempotent (safe to re-run!)
-- Snowflake tracks which files were already loaded (load history table)
-- Re-running the same COPY INTO skips already-loaded files automatically
-- To force reload: COPY INTO ... FORCE = TRUE;

Key COPY INTO options:

| Option | Values | Use Case |
| --- | --- | --- |
| ON_ERROR | ABORT_STATEMENT / CONTINUE / SKIP_FILE | Handle bad rows |
| PURGE | TRUE / FALSE | Auto-delete files from stage after loading |
| FORCE | TRUE / FALSE | Re-load already-loaded files |
| PATTERN | Regex, e.g. '.*\.csv' | Load only matching file names |
| FILES | List, e.g. ('f1.csv','f2.csv') | Load specific files |
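COPY INTO's idempotency (files already recorded in load history are skipped unless FORCE = TRUE) can be sketched in plain Python. This is an illustrative model only, not the Snowflake API; the function and file names are made up:

```python
# Sketch of COPY INTO's idempotent load-history behavior (illustrative model,
# not Snowflake's API): already-loaded files are skipped unless force=True.
def copy_into(load_history: set, staged_files: list, force: bool = False) -> list:
    """Return the files that would actually be loaded on this run."""
    loaded = []
    for f in staged_files:
        if force or f not in load_history:
            loaded.append(f)
            load_history.add(f)   # Snowflake records this in load metadata
    return loaded

history = set()
copy_into(history, ["a.parquet", "b.parquet"])   # both files load
copy_into(history, ["a.parquet", "c.parquet"])   # only c.parquet loads
copy_into(history, ["a.parquet"], force=True)    # a.parquet re-loads
```

This is why re-running the same COPY INTO after a partial failure is safe: only files missing from the load history are picked up.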

Q2: What is Snowpipe? How does it differ from COPY INTO?

Simple Explanation: COPY INTO = you manually trigger the load (or schedule with a Task). Snowpipe = load happens AUTOMATICALLY when new files arrive. Snowpipe monitors your stage and runs COPY INTO for you the moment a new file appears.

Real-world analogy: COPY INTO = you call a taxi (trigger manually). Snowpipe = Uber pool running continuously — jumps in automatically whenever you're ready (file arrives).

FILE ARRIVES IN ADLS/S3
        ↓
SNOWPIPE NOTIFICATION (via Azure Event Grid / AWS SQS)
        ↓ (seconds later)
SNOWPIPE COPIES FILE INTO TABLE
        ↓
TABLE IS UPDATED
Total latency: ~30 seconds to 1 minute

vs COPY INTO (batch):
CRON JOB RUNS EVERY 5 MINUTES
        ↓ "are there new files?"
COPY INTO runs for all new files
Total latency: up to 5 minutes
sql
-- Create a pipe that auto-loads from a stage
CREATE PIPE booking_pipe
    AUTO_INGEST = TRUE          -- Enabled for cloud event notifications
    AS
    COPY INTO raw_bookings      -- What to run when files arrive
    FROM @azure_booking_stage
    FILE_FORMAT = (TYPE = 'PARQUET');

-- Check pipe status
SELECT SYSTEM$PIPE_STATUS('booking_pipe');
-- Shows: pendingFileCount, lastIngestedTimestamp, status

-- See load history (what files loaded, errors)
SELECT * FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
    TABLE_NAME => 'raw_bookings',
    START_TIME => DATEADD('hour', -24, CURRENT_TIMESTAMP)
));

Snowpipe vs COPY INTO vs Snowpipe Streaming:

COPY INTO
✓ Batch: you trigger manually or via Task
✓ Cheapest option (uses your warehouse)
✓ Best for scheduled batch ETL (hourly, daily)
✗ Latency: as long as your schedule (5min, 1hr, daily)
SNOWPIPE (auto-ingest)
✓ Event-driven: loads within ~1 min of file arrival
✓ Serverless (no warehouse needed)
✓ Best for near-real-time ingestion from landing zones
✓ Idempotent: tracks loaded files
✗ Higher cost than COPY INTO for same volume
✗ Minimum latency ~30 seconds
SNOWPIPE STREAMING (API-based)
✓ Push rows directly via SDK (Python/Java)
✓ Sub-second to seconds latency
✓ Best for: Kafka Connect → Snowflake, real-time IoT, Flink
✗ Most expensive option (billed per ingested data volume)
✗ Small files accumulate → needs periodic compaction
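The trade-offs above can be condensed into a toy decision helper. The function name and latency thresholds are my own illustration, not anything Snowflake defines:

```python
# Toy decision helper encoding the ingestion trade-offs above.
# The thresholds and function name are illustrative, not a Snowflake API.
def choose_ingestion(latency_sla_seconds: float, scheduled_batch: bool) -> str:
    if scheduled_batch and latency_sla_seconds >= 300:
        return "COPY INTO (batch, cheapest)"
    if latency_sla_seconds >= 60:
        return "Snowpipe (event-driven, serverless)"
    return "Snowpipe Streaming (row-level API, sub-second)"

choose_ingestion(86400, True)   # daily batch -> COPY INTO
choose_ingestion(60, False)     # near-real-time -> Snowpipe
choose_ingestion(1, False)      # real-time -> Snowpipe Streaming
```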

SECTION 2: STREAMS & TASKS (CDC in Snowflake)

Q3: What Are Snowflake Streams? How Do They Enable CDC?

Simple Explanation: A Stream is Snowflake's native CDC (Change Data Capture) mechanism. It tracks every INSERT, UPDATE, and DELETE on a table — without modifying the original table or storing a separate copy. When you query the stream, you see "what changed since I last consumed it?" Once you consume the changes inside a DML transaction, the stream automatically advances its offset — the consumed rows disappear from the stream view.

Real-world analogy: Think of a hotel check-in/check-out log. The front desk doesn't copy the entire guest database every hour — it just records new arrivals (INSERT), room changes (UPDATE), and departures (DELETE). The housekeeping team reads the log, services the rooms, and tears off the page. Tomorrow there's a fresh page with only new changes.

Why do we need it? Without streams, you'd have to either (a) reload the entire table every time (expensive, slow), or (b) build custom CDC logic using timestamps or audit columns (fragile, error-prone). Streams give you a zero-maintenance, guaranteed-consistent way to see exactly what changed — perfect for incremental ETL pipelines.

How Streams Track Changes (The Offset Model):

📐 Architecture Diagram
TABLE: raw_bookings
┌────────────────────────────────────────────┐
│  Row 1  │  Row 2  │  Row 3  │  Row 4      │
│ (old)   │ (old)   │ (new!)  │ (new!)      │
└────────────────────────────────────────────┘
                       ▲
                       │ Stream offset points here
                       │ "I've seen everything before Row 3"
                       │
STREAM VIEW: shows Row 3 and Row 4 only (the delta)

After you consume Row 3 & 4 in a DML (INSERT/MERGE):
  → Offset advances to Row 4
  → Stream view is now empty (until new changes arrive)

Streams use Snowflake's internal versioning (Time Travel metadata) to track the offset. No data is copied — the stream is essentially a pointer into the table's change history.

Stream Metadata Columns:

Every stream row includes three hidden metadata columns:

| Column | Type | Meaning |
| --- | --- | --- |
| METADATA$ACTION | VARCHAR | 'INSERT' or 'DELETE' |
| METADATA$ISUPDATE | BOOLEAN | TRUE if this row is part of an UPDATE |
| METADATA$ROW_ID | VARCHAR | Unique internal row identifier |

An UPDATE appears as TWO rows in the stream: a DELETE of the old value + an INSERT of the new value, both with METADATA$ISUPDATE = TRUE.
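This DELETE + INSERT representation of an UPDATE can be simulated with a small diff function. It is an illustrative model of the stream's output, not Snowflake internals; table snapshots here are plain dicts keyed by primary key:

```python
# Simulate how a standard stream represents changes (illustrative model, not
# Snowflake internals): an UPDATE appears as a DELETE + INSERT pair, both
# flagged with METADATA$ISUPDATE = True.
def stream_delta(before: dict, after: dict) -> list:
    """Diff two table snapshots {pk: value} into (pk, ACTION, IS_UPDATE) rows."""
    rows = []
    for pk in after:
        if pk not in before:
            rows.append((pk, "INSERT", False))        # brand-new row
        elif before[pk] != after[pk]:
            rows.append((pk, "DELETE", True))         # old value of updated row
            rows.append((pk, "INSERT", True))         # new value of updated row
    for pk in before:
        if pk not in after:
            rows.append((pk, "DELETE", False))        # deleted row
    return rows

delta = stream_delta({"BK1": 100, "BK2": 200}, {"BK1": 150, "BK3": 300})
# BK1 updated -> DELETE+INSERT pair; BK3 inserted; BK2 deleted
```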

Stream Types:

| Type | Tracks | Use Case |
| --- | --- | --- |
| Standard (default) | INSERT + UPDATE + DELETE | Full CDC — SCD Type 2, MERGE pipelines |
| Append-only | INSERT only | Log/event tables where rows are never updated |
| Insert-only | INSERT only (external tables) | Tracking new files arriving in external stages |
sql
-- ========================================
-- CREATING STREAMS
-- ========================================

-- Standard stream (tracks all DML changes)
CREATE STREAM booking_changes_stream
    ON TABLE raw_bookings
    SHOW_INITIAL_ROWS = FALSE;   -- Only track NEW changes from this point
    -- SHOW_INITIAL_ROWS = TRUE → backfill: include all existing rows as INSERTs

-- Append-only stream (lighter weight — only new inserts)
CREATE STREAM new_bookings_stream
    ON TABLE raw_bookings
    APPEND_ONLY = TRUE;
-- Use for event/log tables where rows are never updated or deleted

-- Insert-only stream on an external table
CREATE STREAM new_files_stream
    ON EXTERNAL TABLE ext_booking_files
    INSERT_ONLY = TRUE;
-- Fires when new files appear in the external stage

-- ========================================
-- QUERYING A STREAM (peek at changes)
-- ========================================

SELECT *,
    METADATA$ACTION,         -- 'INSERT' or 'DELETE'
    METADATA$ISUPDATE,       -- TRUE if part of an UPDATE
    METADATA$ROW_ID          -- Internal row identifier
FROM booking_changes_stream;
-- This is a READ — it does NOT consume the stream
-- Stream is only consumed when changes are used in a DML statement

-- ========================================
-- CONSUMING A STREAM (the MERGE pattern — #1 interview pattern!)
-- ========================================

-- The Stream + MERGE pattern: upsert changes into a target table
MERGE INTO silver_bookings AS target
USING (
    SELECT booking_id, booking_date, amount, airline_code,
           METADATA$ACTION AS action,
           METADATA$ISUPDATE AS is_update
    FROM booking_changes_stream
) AS src
ON target.booking_id = src.booking_id
-- Handle inserts (new rows)
WHEN NOT MATCHED AND src.action = 'INSERT'
    THEN INSERT (booking_id, booking_date, amount, airline_code)
         VALUES (src.booking_id, src.booking_date, src.amount, src.airline_code)
-- Handle updates (new value of updated rows)
WHEN MATCHED AND src.action = 'INSERT' AND src.is_update = TRUE
    THEN UPDATE SET
        target.amount = src.amount,
        target.booking_date = src.booking_date,
        target.airline_code = src.airline_code
-- Handle deletes
WHEN MATCHED AND src.action = 'DELETE' AND src.is_update = FALSE
    THEN DELETE;

-- After MERGE commits → stream advances → consumed rows disappear
-- Next query to stream shows only changes AFTER this MERGE

-- ========================================
-- STALE STREAMS — how to avoid them
-- ========================================

-- A stream becomes STALE if you don't consume it within the table's
-- DATA_RETENTION_TIME_IN_DAYS (Time Travel period, default 1 day).
-- Once stale, the stream is BROKEN — you must recreate it.

-- Check if a stream is stale:
SHOW STREAMS LIKE 'booking_changes_stream';
-- Look at "stale_after" column — if past, the stream is stale

-- Prevention: consume streams regularly OR extend retention:
ALTER TABLE raw_bookings SET DATA_RETENTION_TIME_IN_DAYS = 14;
-- Now stream has 14 days before going stale
-- Trade-off: longer retention = more storage cost (Time Travel storage)

Interview Tip: The Stream + MERGE pattern is the single most asked Snowflake pipeline question. Know it cold. Also explain that streams are "transactional consumers" — they only advance when the consuming DML commits. If the MERGE fails, the stream stays where it was (no data loss). Mention stale streams and retention as a follow-up — it shows production experience.
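The staleness rule above reduces to simple date arithmetic. A minimal sketch, assuming the last-consumed timestamp is tracked externally (Snowflake surfaces this as the stale_after column, not via this hypothetical function):

```python
# Sketch of the stream-staleness rule (illustrative, not a Snowflake API):
# a stream goes stale if left unconsumed longer than the table's
# DATA_RETENTION_TIME_IN_DAYS (Time Travel retention).
from datetime import datetime, timedelta

def is_stale(last_consumed: datetime, retention_days: int, now: datetime) -> bool:
    return now - last_consumed > timedelta(days=retention_days)

now = datetime(2026, 3, 25)
is_stale(datetime(2026, 3, 23), 1, now)    # True: 2 days unconsumed, 1-day retention
is_stale(datetime(2026, 3, 23), 14, now)   # False: within 14-day retention
```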

Q4: What Are Snowflake Tasks? How Do You Schedule SQL?

Simple Explanation: A Task is Snowflake's built-in scheduler — it runs a SQL statement or stored procedure on a defined schedule (cron or interval). Tasks can be chained into a DAG (Directed Acyclic Graph) where child tasks run only after their parent succeeds. Combined with Streams, Tasks form the classic Snowflake incremental pipeline.

Real-world analogy: Think of a factory assembly line with a foreman. The foreman (Task scheduler) blows a whistle every 5 minutes. Station 1 (bronze) starts work. When Station 1 finishes, Station 2 (silver) automatically begins. Station 3 (gold) waits for Station 2. If the foreman sees no raw materials on the conveyor belt (SYSTEM$STREAM_HAS_DATA() = FALSE), he skips that cycle — no wasted work.

Why do we need it? Without Tasks, you'd need an external scheduler (Airflow, cron, Azure Data Factory) just to run SQL on a timer. Tasks keep everything inside Snowflake — no external dependencies, no credentials to manage, no network calls. For simple-to-medium pipelines, Tasks eliminate an entire orchestration layer.

sql
-- ========================================
-- CREATING TASKS
-- ========================================

-- Simple task: run SQL every 5 minutes using a warehouse
CREATE TASK load_bronze_task
    WAREHOUSE = 'ETL_WH'               -- Uses this warehouse's credits
    SCHEDULE = '5 MINUTE'              -- Interval schedule
AS
    COPY INTO raw_bookings
    FROM @azure_booking_stage;

-- Cron-based schedule
CREATE TASK nightly_aggregate_task
    WAREHOUSE = 'ETL_WH'
    SCHEDULE = 'USING CRON 0 2 * * * UTC'   -- 2:00 AM UTC daily
    -- Cron: minute hour day month weekday timezone
AS
    CALL refresh_daily_aggregates();

-- !! CRITICAL: Tasks are SUSPENDED by default !!
-- You MUST resume them or they will never run
ALTER TASK load_bronze_task RESUME;
ALTER TASK load_bronze_task SUSPEND;     -- Pause it

-- ========================================
-- SERVERLESS TASKS vs WAREHOUSE-BASED TASKS
-- ========================================

-- Warehouse-based: uses a specific warehouse (you control size & cost)
CREATE TASK warehouse_task
    WAREHOUSE = 'ETL_WH'
    SCHEDULE = '10 MINUTE'
AS SELECT 1;

-- Serverless: Snowflake manages compute — no warehouse needed
CREATE TASK serverless_task
    USER_TASK_MANAGED_INITIAL_WAREHOUSE_SIZE = 'MEDIUM'
    -- Omit WAREHOUSE = ... to make it serverless
    SCHEDULE = '5 MINUTE'
AS
    INSERT INTO audit_log VALUES (CURRENT_TIMESTAMP, 'heartbeat');

-- Serverless tasks:
-- ✓ ~10% cheaper than warehouse-based for equivalent work
-- ✓ No warehouse to manage — auto-scales to workload
-- ✓ Best for lightweight, frequent tasks
-- ✗ Less control over compute resources
-- ✗ Compute size is approximate (Snowflake decides)

-- ========================================
-- TASK TREES (DAG — parent → child dependencies)
-- ========================================

-- Root task: has a SCHEDULE (the only task that needs one)
CREATE TASK bronze_task
    WAREHOUSE = 'ETL_WH'
    SCHEDULE = 'USING CRON 0 2 * * * UTC'  -- 2 AM daily
AS
    CALL load_bronze_data();

-- Child task: uses AFTER clause (no SCHEDULE — triggered by parent)
CREATE TASK silver_task
    WAREHOUSE = 'ETL_WH'
    AFTER bronze_task               -- Runs ONLY after bronze_task succeeds
AS
    CALL transform_silver_data();

-- Grandchild task: runs after silver
CREATE TASK gold_task
    WAREHOUSE = 'ETL_WH'
    AFTER silver_task
AS
    CALL aggregate_gold_metrics();

-- A child can depend on MULTIPLE parents (fan-in):
CREATE TASK final_report_task
    WAREHOUSE = 'ETL_WH'
    AFTER silver_task, gold_task     -- Waits for BOTH to finish
AS
    CALL generate_report();

-- !! Resume ONLY the root task — children auto-activate !!
-- !! But you must RESUME children BEFORE resuming the root !!
ALTER TASK gold_task RESUME;
ALTER TASK silver_task RESUME;
ALTER TASK bronze_task RESUME;      -- Root task last

-- ========================================
-- CONDITIONAL EXECUTION: SYSTEM$STREAM_HAS_DATA()
-- ========================================

-- Only run the task if the stream has new data (skip empty cycles)
CREATE TASK process_booking_changes
    WAREHOUSE = 'ETL_WH'
    SCHEDULE = '5 MINUTE'
    WHEN SYSTEM$STREAM_HAS_DATA('booking_changes_stream')
    -- If stream is empty → task is SKIPPED (no warehouse credit used!)
AS
    MERGE INTO silver_bookings AS target
    USING booking_changes_stream AS src
    ON target.booking_id = src.booking_id
    WHEN NOT MATCHED AND src.METADATA$ACTION = 'INSERT'
        THEN INSERT (booking_id, booking_date, amount, airline_code)
             VALUES (src.booking_id, src.booking_date, src.amount, src.airline_code)
    WHEN MATCHED AND src.METADATA$ACTION = 'INSERT' AND src.METADATA$ISUPDATE = TRUE
        THEN UPDATE SET target.amount = src.amount
    WHEN MATCHED AND src.METADATA$ACTION = 'DELETE' AND src.METADATA$ISUPDATE = FALSE
        THEN DELETE;

ALTER TASK process_booking_changes RESUME;

-- ========================================
-- COMPLETE STREAM + TASK PIPELINE (end-to-end example)
-- ========================================

-- Step 1: Source table exists
-- CREATE TABLE raw_bookings (booking_id VARCHAR, booking_date DATE, ...);

-- Step 2: Create stream to track changes
CREATE STREAM booking_cdc_stream ON TABLE raw_bookings;

-- Step 3: Create task that consumes stream via MERGE
CREATE TASK cdc_pipeline_task
    WAREHOUSE = 'ETL_WH'
    SCHEDULE = '5 MINUTE'
    WHEN SYSTEM$STREAM_HAS_DATA('booking_cdc_stream')
AS
    MERGE INTO silver_bookings AS t
    USING booking_cdc_stream AS s
    ON t.booking_id = s.booking_id
    WHEN MATCHED AND s.METADATA$ACTION = 'DELETE' AND s.METADATA$ISUPDATE = FALSE
        THEN DELETE
    WHEN MATCHED AND s.METADATA$ACTION = 'INSERT' AND s.METADATA$ISUPDATE = TRUE
        THEN UPDATE SET t.amount = s.amount, t.booking_date = s.booking_date
    WHEN NOT MATCHED AND s.METADATA$ACTION = 'INSERT'
        THEN INSERT (booking_id, booking_date, amount)
             VALUES (s.booking_id, s.booking_date, s.amount);

-- Step 4: Resume the task
ALTER TASK cdc_pipeline_task RESUME;

-- Now: any INSERT/UPDATE/DELETE on raw_bookings is automatically
-- captured by the stream and merged into silver_bookings every 5 minutes

-- ========================================
-- MONITORING TASKS
-- ========================================

-- View task run history (recent runs, status, errors)
SELECT name, state, scheduled_time, completed_time,
       error_code, error_message,
       DATEDIFF('second', query_start_time, completed_time) AS duration_sec
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY(
    TASK_NAME => 'CDC_PIPELINE_TASK',
    SCHEDULED_TIME_RANGE_START => DATEADD('day', -1, CURRENT_TIMESTAMP)
))
ORDER BY scheduled_time DESC;
-- state: 'SUCCEEDED', 'FAILED', 'SKIPPED' (WHEN condition was false)

-- Account-level task history (all tasks)
SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.TASK_HISTORY
WHERE scheduled_time > DATEADD('hour', -24, CURRENT_TIMESTAMP)
ORDER BY scheduled_time DESC;

Interview Tip: Always mention three things: (1) Tasks are suspended by default — forgetting ALTER TASK RESUME is the #1 production mistake. (2) Use WHEN SYSTEM$STREAM_HAS_DATA() to avoid wasting credits on empty runs. (3) In a Task DAG, resume children BEFORE the root task, and only the root task needs a SCHEDULE. If asked "Streams+Tasks vs Airflow?" — answer: Tasks for simple Snowflake-only pipelines; Airflow when you need cross-system orchestration (Snowflake + Spark + APIs).
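The "resume children before the root" rule is just reverse topological order over the task DAG. A minimal sketch with made-up task names (not a Snowflake API; it only computes the order in which you would issue ALTER TASK ... RESUME):

```python
# Compute the resume order for a task DAG: every child must be resumed
# before its parent, so the root (the only scheduled task) is enabled last.
# Task names and the deps format are illustrative.
def resume_order(deps: dict) -> list:
    """deps maps task -> list of parent tasks; returns leaves-first order."""
    order, seen = [], set()

    def visit(task):
        if task in seen:
            return
        seen.add(task)
        # Resume everything that depends on `task` (its children) first
        for child, parents in deps.items():
            if task in parents:
                visit(child)
        order.append(task)

    for t in deps:
        visit(t)
    return order

deps = {"bronze_task": [], "silver_task": ["bronze_task"], "gold_task": ["silver_task"]}
resume_order(deps)   # gold_task first, root bronze_task last
```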

Q5: What Are Dynamic Tables? How Are They Different from Streams + Tasks?

Simple Explanation: Dynamic Tables (GA 2024, recommended approach 2025-2026) are Snowflake's declarative ELT mechanism. You define WHAT the result should look like using a SELECT query, and Snowflake handles the WHEN (scheduling) and HOW (incremental refresh). You never write MERGE logic, manage stream offsets, or build Task DAGs — Snowflake does it all automatically.

Real-world analogy:

  • Streams + Tasks = writing a recipe step by step, setting timers, and telling someone exactly how to cook each ingredient
  • Dynamic Tables = showing someone a photo of the finished dish and saying "keep it looking like this, and make sure it's no more than 5 minutes stale"

Why do we need them? Streams + Tasks pipelines work but require 30+ lines of boilerplate per stage (create stream, create task, write MERGE, handle edge cases, resume tasks). For a 5-stage pipeline, that's 150+ lines of procedural code to maintain. Dynamic Tables reduce each stage to a single CREATE statement. Snowflake auto-detects dependencies, auto-schedules refreshes, and auto-computes incremental diffs.

The TARGET_LAG Parameter (freshness guarantee):

🧠 Memory Map
TARGET_LAG = '5 minutes'
→ Snowflake guarantees the dynamic table is never more than 5 min stale
→ Snowflake decides WHEN to refresh (could be every 1 min if source changes fast)
→ You don't set a schedule — you set a freshness SLA
TARGET_LAG = '1 hour'
→ Cheaper (fewer refreshes), but data can be up to 1 hour behind
TARGET_LAG = 'DOWNSTREAM'
→ Only refresh when a DOWNSTREAM dynamic table or query needs fresh data
→ Most cost-efficient for intermediate tables in a chain
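The TARGET_LAG contract boils down to a freshness check that Snowflake evaluates for you. A minimal sketch of that decision (illustrative only; Snowflake's actual scheduler is internal):

```python
# Sketch of the TARGET_LAG freshness contract (illustrative, not Snowflake
# internals): refresh whenever the table would otherwise exceed its lag SLA.
from datetime import datetime, timedelta

def needs_refresh(data_timestamp: datetime, target_lag: timedelta,
                  now: datetime) -> bool:
    return now - data_timestamp >= target_lag

now = datetime(2026, 3, 25, 12, 0)
needs_refresh(datetime(2026, 3, 25, 11, 54), timedelta(minutes=5), now)  # True, 6 min stale
needs_refresh(datetime(2026, 3, 25, 11, 58), timedelta(minutes=5), now)  # False, 2 min stale
```

The key point for interviews: you declare the lag; when and how often to refresh is Snowflake's problem.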
sql
-- ========================================
-- CREATING DYNAMIC TABLES
-- ========================================

-- Silver layer: clean + transform raw data
CREATE DYNAMIC TABLE silver_bookings
    TARGET_LAG = '5 minutes'         -- Freshness guarantee
    WAREHOUSE = 'ETL_WH'             -- Warehouse for refresh compute
AS
SELECT
    booking_id,
    booking_date,
    UPPER(airline_code)     AS airline_code,
    amount,
    CURRENT_TIMESTAMP       AS processed_at
FROM raw_bookings
WHERE amount > 0
  AND booking_id IS NOT NULL;

-- Snowflake AUTOMATICALLY:
-- 1. Detects when raw_bookings changes
-- 2. Computes ONLY the incremental diff (not full reprocess)
-- 3. Updates silver_bookings within 5 minutes of source change

-- Gold layer: aggregate (auto-chains to silver!)
CREATE DYNAMIC TABLE gold_daily_revenue
    TARGET_LAG = '1 hour'
    WAREHOUSE = 'ETL_WH'
AS
SELECT
    booking_date,
    airline_code,
    SUM(amount)  AS daily_revenue,
    COUNT(*)     AS booking_count
FROM silver_bookings    -- Reads from dynamic table above!
GROUP BY booking_date, airline_code;
-- Snowflake auto-detects the silver → gold dependency
-- No AFTER clause needed — it just works

-- Manually refresh (force update now, useful for testing)
ALTER DYNAMIC TABLE silver_bookings REFRESH;

-- Suspend / resume
ALTER DYNAMIC TABLE silver_bookings SUSPEND;
ALTER DYNAMIC TABLE silver_bookings RESUME;

-- Check status of all dynamic tables
SELECT name, schema_name, target_lag, refresh_mode,
       refresh_mode_reason, data_timestamp
FROM INFORMATION_SCHEMA.DYNAMIC_TABLES;
-- refresh_mode: 'INCREMENTAL' (optimal) or 'FULL' (fallback if query is too complex)
-- data_timestamp: when the data was last refreshed to

Comparison: Dynamic Tables vs Streams + Tasks vs Materialized Views:

| | Dynamic Tables | Streams + Tasks | Materialized Views |
| --- | --- | --- | --- |
| Style | Declarative (WHAT) | Procedural (HOW) | Declarative (WHAT) |
| Setup complexity | Single CREATE | 30+ lines (stream + task + MERGE) | Single CREATE |
| Joins supported | Full SQL | Full SQL | No joins (single table only) |
| Aggregations | Yes | Yes | Yes |
| Incremental refresh | Automatic | Manual (you write MERGE) | Automatic |
| Dependency mgmt | Auto-detected | Manual (AFTER clause) | N/A (no chaining) |
| Min freshness | 1 minute (TARGET_LAG) | Seconds (task schedule) | Seconds (auto-refresh) |
| SCD Type 2 | Not supported | Supported (stream tracks before/after) | Not supported |
| Complex logic | SQL only | SQL + stored procedures + external functions | Simple aggregations only |
| Best for | Multi-step ELT pipelines | Complex CDC, SCD Type 2, sub-minute latency | BI dashboard acceleration |

When to use which:

  • Dynamic Tables — Default choice for new ELT pipelines. Multi-stage bronze → silver → gold transformations.
  • Streams + Tasks — When you need SCD Type 2, sub-minute latency, stored procedures, or external function calls.
  • Materialized Views — Single-table aggregations for BI dashboards (no joins, no UDFs, read-heavy/write-light tables).

Interview Tip: Lead with Dynamic Tables as the modern approach, then say "but Streams + Tasks are still necessary for SCD Type 2 and sub-minute latency." Mention TARGET_LAG as the key differentiator — you set a freshness SLA, not a schedule. If asked about cost: Dynamic Tables can be slightly more expensive because Snowflake may refresh more aggressively than needed to meet the TARGET_LAG guarantee.

Q5b: Explain Materialized Views in Snowflake

Simple Explanation: A Materialized View (MV) is a pre-computed, physically stored result of a query. Unlike a regular view (which re-runs the query every time you SELECT from it), a Materialized View stores the results on disk and Snowflake automatically keeps them in sync with the base table. Queries that hit the MV read pre-computed results instead of scanning the full table — often 10-100x faster.

Real-world analogy: A regular view is like asking the librarian "how many books do we have by genre?" every time — she counts them all each time you ask. A Materialized View is like a summary card pinned to the wall that says "Fiction: 500, Science: 200, History: 150." The librarian updates the card whenever new books arrive. You just read the card — instant answer.

Why do we need it? BI dashboards often run the same expensive aggregation queries hundreds of times per day. Without MVs, each dashboard refresh scans the full fact table. With MVs, the aggregation is pre-computed and auto-refreshed — the dashboard reads the small, pre-aggregated result instead.

How auto-refresh works: Snowflake automatically detects when the base table changes and refreshes the MV in the background. You do not schedule refreshes — they happen transparently. The refresh is incremental (only processes changed micro-partitions), so it's efficient.
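Incremental maintenance is the reason MV refreshes stay cheap: the stored aggregate is updated from changed rows only, never recomputed from the full table. A toy model of that idea (illustrative, not Snowflake internals):

```python
# Toy model of incremental MV maintenance (illustrative, not Snowflake
# internals): the stored aggregate absorbs only the new base rows.
from collections import defaultdict

mv = defaultdict(lambda: {"total_revenue": 0.0, "booking_count": 0})

def apply_inserts(mv, new_rows):
    """new_rows: (booking_date, airline_code, amount) tuples from changed partitions."""
    for date, airline, amount in new_rows:
        agg = mv[(date, airline)]
        agg["total_revenue"] += amount
        agg["booking_count"] += 1

apply_inserts(mv, [("2026-03-25", "LH", 100.0), ("2026-03-25", "LH", 50.0)])
# mv[("2026-03-25", "LH")] now holds total_revenue 150.0, booking_count 2
```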

sql
-- ========================================
-- CREATING A MATERIALIZED VIEW
-- ========================================

CREATE MATERIALIZED VIEW daily_revenue_mv AS
SELECT
    booking_date,
    airline_code,
    SUM(amount)   AS total_revenue,
    COUNT(*)      AS booking_count,
    AVG(amount)   AS avg_booking_value
FROM fact_bookings
GROUP BY booking_date, airline_code;
-- Snowflake physically stores this result
-- Auto-refreshes when fact_bookings changes

-- Query the MV (same as querying a table — but pre-computed!)
SELECT * FROM daily_revenue_mv
WHERE booking_date = '2026-03-25' AND airline_code = 'LH';
-- Returns instantly — reads pre-aggregated result, not the full fact table

-- Snowflake's optimizer can AUTOMATICALLY route queries to the MV
-- Even if you query the BASE TABLE, Snowflake may use the MV if it
-- detects the query matches the MV's definition (query rewrite)

-- Drop a materialized view
DROP MATERIALIZED VIEW daily_revenue_mv;

-- Check MV refresh status
SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.MATERIALIZED_VIEW_REFRESH_HISTORY
WHERE materialized_view_name = 'DAILY_REVENUE_MV'
ORDER BY start_time DESC;

Limitations (important for interviews!):

| Limitation | Detail |
| --- | --- |
| No joins | MV can only reference a SINGLE base table |
| No UDFs | Cannot use user-defined functions |
| No subqueries | Cannot use subqueries in the MV definition |
| No window functions | ROW_NUMBER(), RANK(), etc. not allowed |
| No UNION | Cannot combine multiple queries |
| Single table only | Cannot join fact + dimension tables in the MV |
| Write-light tables only | Frequent writes = frequent (expensive) refreshes |

Cost implications:

  • Storage cost: The MV physically stores data (additional storage credits).
  • Compute cost: Auto-refresh consumes serverless compute credits (background maintenance). Snowflake charges for this automatically — you don't control when it runs.
  • Rule of thumb: MVs save money when reads >> writes. If the base table changes every minute and the MV is queried once per day, the refresh cost exceeds the query savings — don't use an MV.
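The "reads >> writes" rule of thumb is simple arithmetic: compare credits spent on refreshes against credits saved by queries hitting the MV. All numbers below are illustrative:

```python
# Back-of-envelope check for the MV cost rule of thumb (numbers illustrative):
# an MV pays off only when query savings exceed refresh cost.
def mv_worth_it(refreshes_per_day: float, credits_per_refresh: float,
                queries_per_day: float, credits_saved_per_query: float) -> bool:
    refresh_cost = refreshes_per_day * credits_per_refresh
    query_savings = queries_per_day * credits_saved_per_query
    return query_savings > refresh_cost

mv_worth_it(24, 0.05, 1000, 0.01)    # True: 10.0 credits saved vs 1.2 spent
mv_worth_it(1440, 0.05, 1, 0.01)     # False: per-minute refreshes dominate
```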

When to use Materialized Views:

USE MVs when:
✓ Single-table aggregations (SUM, COUNT, AVG by common dimensions)
✓ Read-heavy, write-light tables (fact table updated daily, queried 1000x/day)
✓ BI dashboard acceleration (same aggregation run by many analysts)
✓ You want Snowflake to auto-route queries to pre-computed results
DON'T use MVs when:
✗ You need joins (use Dynamic Tables instead)
✗ Table changes frequently (high refresh cost)
✗ Complex logic (UDFs, window functions, subqueries)
✗ Multi-step transformations (use Dynamic Tables or Streams+Tasks)

Interview Tip: The #1 interview follow-up is "MV vs Dynamic Table — when would you use each?" Answer: MVs for simple single-table aggregations on read-heavy tables (BI acceleration). Dynamic Tables for anything involving joins, multi-step transformations, or complex SQL. Also mention that MVs support automatic query rewrite — even if analysts query the base table, Snowflake can transparently use the MV if the query pattern matches. This is a feature Dynamic Tables do NOT have.

SECTION 3: SNOWPARK

Q6: What is Snowpark? When is it useful?

Simple Explanation: Snowpark is Snowflake's API for Python (also Java/Scala) that lets you write DataFrame-style code that runs INSIDE Snowflake's compute — like PySpark, but for Snowflake. Instead of extracting data to a Python environment and processing it there, Snowpark pushes the computation to Snowflake.

Real-world analogy: Normally, if you want to analyze data in Snowflake with Python, you'd pull it to your laptop (expensive network transfer), process it, then push back (another transfer). Snowpark = run Python code on Snowflake's servers directly. No data leaves Snowflake.

python
# Connect to Snowflake with Snowpark
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, sum, upper, current_timestamp

session = Session.builder.configs({
    "account":   "amadeus-prod",
    "user":      "data_engineer",
    "password":  "***",
    "warehouse": "ETL_WH",
    "database":  "ANALYTICS_DB",
    "schema":    "BOOKINGS"
}).create()

# Create a Snowpark DataFrame (lazy — nothing runs yet)
raw_df = session.table("RAW_BOOKINGS")    # Points to Snowflake table

# Transform using DataFrame API (exactly like PySpark!)
silver_df = (raw_df
    .filter(col("AMOUNT") > 0)           # Filter bad rows
    .filter(col("BOOKING_ID").isNotNull())
    .select(
        col("BOOKING_ID"),
        col("BOOKING_DATE"),
        upper(col("AIRLINE_CODE")).alias("AIRLINE_CODE"),  # Transform
        col("AMOUNT"),
        current_timestamp().alias("PROCESSED_AT")
    )
)

# Write to Snowflake table (this EXECUTES the computation — runs in Snowflake!)
silver_df.write.mode("append").save_as_table("SILVER_BOOKINGS")
# All processing happened INSIDE Snowflake — no data left Snowflake!

# Snowpark UDF: Write Python function, run in Snowflake
from snowflake.snowpark.functions import udf
import snowflake.snowpark.types as T

@udf(return_type=T.FloatType(), input_types=[T.FloatType(), T.StringType()])
def calculate_tax(amount: float, country: str) -> float:
    """Run Python code INSIDE Snowflake — no data transfer"""
    tax_rates = {"IN": 0.18, "UK": 0.20, "DE": 0.19}
    return amount * tax_rates.get(country, 0.0)

# Use the UDF in SQL or DataFrame
result = silver_df.select(
    col("AMOUNT"),
    calculate_tax(col("AMOUNT"), col("COUNTRY")).alias("TAX")
)

New 2026: Snowpark Intelligent Hybrid Execution

python
# Snowpark now AUTOMATICALLY decides:
# - Run small operations locally on your machine (faster for tiny data)
# - Push large operations to Snowflake (for big data)
# No code change needed — it detects which is optimal automatically

import modin.pandas as mpd   # Drop-in pandas replacement
df = mpd.read_snowflake("SELECT * FROM bookings WHERE date > '2026-01-01'")
# Snowpark decides: big table → run in Snowflake; result small → return locally

SECTION 4: PERFORMANCE TUNING

Q7: How do you optimize query performance in Snowflake?

The PERFORMANCE CHECKLIST ("CQMWS"):

C = Clustering (is data well clustered for this query's filters?)
Q = Query Profile (find the bottleneck — where is time spent?)
M = Materialized Views (pre-compute expensive aggregations)
W = Warehouse size (is the warehouse right-sized for this workload?)
S = Search Optimization Service (for point-lookup queries on unordered data)
Remember: "CQMWS" = Clustering, Query profile, Materialized views, Warehouse, Search
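The pruning-efficiency rule of thumb (over ~80% of partitions scanned means pruning is not helping) can be wrapped in a tiny diagnostic helper. The function name and threshold are my own illustration:

```python
# Tiny diagnostic helper for partition pruning (the 80% threshold mirrors the
# rule of thumb in this section; the function name is illustrative).
def pruning_advice(partitions_scanned: int, partitions_total: int) -> str:
    pct = partitions_scanned * 100.0 / partitions_total
    if pct > 80:
        return "poor pruning -> consider a clustering key on filter columns"
    return "pruning OK ({:.1f}% of partitions scanned)".format(pct)

pruning_advice(950, 1000)   # poor pruning, clustering key suggested
pruning_advice(30, 1000)    # pruning OK, 3.0% scanned
```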
sql
-- STEP 1: Find slow queries
SELECT query_text, total_elapsed_time/1000 as seconds,
       bytes_scanned/1024/1024/1024 as gb_scanned,
       partitions_scanned, partitions_total,
       ROUND(partitions_scanned*100.0/partitions_total, 1) as pct_scanned
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE execution_status = 'SUCCESS'
  AND warehouse_name = 'BI_WH'
ORDER BY total_elapsed_time DESC
LIMIT 20;
-- HIGH pct_scanned (>80%) = bad pruning → add clustering key
-- LOW gb_scanned but slow = compute bottleneck → scale up warehouse

-- STEP 2: Check clustering health
SELECT SYSTEM$CLUSTERING_INFORMATION('bookings', '(booking_date, airline_code)');
-- average_depth > 6 → reclustering needed
-- average_overlaps > 5 → partitions heavily mixed

-- STEP 3: Use Materialized Views for repeated heavy aggregations
CREATE MATERIALIZED VIEW daily_revenue_mv AS
SELECT booking_date, airline_code, SUM(amount) as revenue
FROM silver_bookings
GROUP BY booking_date, airline_code;
-- Snowflake pre-computes and auto-refreshes this view
-- BI queries that aggregate by date+airline now hit MV instead of raw table
-- 100x faster for common aggregation queries!

-- STEP 4: Search Optimization Service (for needle-in-haystack queries)
ALTER TABLE bookings ADD SEARCH OPTIMIZATION;
-- Optimizes: WHERE booking_id = 'BK123456' (point lookup)
-- Optimizes: WHERE passenger_email = 'user@example.com' (equality on high-cardinality col)
-- NOT for range queries (use clustering key instead)

-- STEP 5: Query Profile analysis
-- After running a query, in Snowflake UI: Query History → Click query → Query Profile
-- Shows execution tree, where time was spent
-- Key metrics to look for:
--   "Bytes scanned from cache" high = good (cache working)
--   "Partitions scanned" vs "Partitions total" → pruning efficiency
--   "Spilling to disk" = not enough memory → scale up warehouse

Materialized Views vs Dynamic Tables:

| | Materialized View | Dynamic Table |
| --- | --- | --- |
| Define with | CREATE MATERIALIZED VIEW | CREATE DYNAMIC TABLE |
| Auto-refresh | Yes (automatic) | Yes (driven by TARGET_LAG) |
| Supports joins | No (single table only) | Full SQL (all joins) |
| Supports streams | No | Yes (incremental from streams) |
| Complexity | Simple aggregations | Complex ELT |
| Best for | BI aggregations, dashboards | Multi-step ELT pipelines |

Q8: Scenario — Dashboard queries are slow for 200 BI analysts. How to fix?

DIAGNOSIS STEPS
1. Check: Are queries queuing or slow individually?
   → Check QUERY_HISTORY for queries where queued_overload_time > 0
   → If queuing → concurrency problem → scale OUT (multi-cluster)
   → If not queuing but slow → computation problem → scale UP or fix clustering
2. Check: Are result cache hits low?
   → QUERY_HISTORY: result_from_cache = FALSE for most queries
   → If analysts run different queries: result cache can't help
   → If same queries: check why cache isn't hitting (data changing too often?)
3. Check: Partition pruning ratio
   → High partitions_scanned / partitions_total → add clustering key on filter columns

SOLUTION BLUEPRINT
sql
-- Separate BI warehouse from ETL
CREATE WAREHOUSE bi_wh
    SIZE = 'MEDIUM'
    MIN_CLUSTER_COUNT = 2        -- Always at least 2 clusters during BI hours
    MAX_CLUSTER_COUNT = 8        -- Scale out to 8 during peak
    AUTO_SUSPEND = 600           -- 10 min idle before suspend (keep cache warm!)
    SCALING_POLICY = 'STANDARD';

-- Pre-aggregate with Materialized Views
CREATE MATERIALIZED VIEW booking_summary_mv AS
SELECT DATE_TRUNC('month', booking_date) AS month,
       airline_code,
       SUM(amount) AS revenue,
       COUNT(*)    AS bookings
FROM fact_bookings
GROUP BY 1, 2;
-- BI dashboards now hit the MV = near-instant

-- Cluster fact tables by common filter columns
ALTER TABLE fact_bookings CLUSTER BY (booking_date, airline_code);

-- Route BI dashboards to the BI warehouse (not the ETL warehouse)
-- Tool: point the Power BI / Tableau connection string at the bi_wh warehouse