SQL Patterns 4–6: Gaps & Islands, Sessionization, Deduplication
These are the patterns that separate average SQL writers from senior engineers. Gaps & Islands is THE hardest commonly-asked pattern. Master it.
🧠 MEMORY MAP
🧠 HARD PATTERNS = "GSD"
HARD PATTERNS"GSD"
GGaps & Islands: find consecutive sequences (date - ROW_NUMBER = constant!)
SSessionization: group events by inactivity gap (LAG + cumulative SUM)
DDeduplication: keep one row per key (ROW_NUMBER = 1)
GAPS & ISLANDS KEY TRICK:
If dates are CONSECUTIVE→(date - ROW_NUMBER) = SAME VALUE for the whole group!
When the sequence BREAKS→the (date - ROW_NUMBER) VALUE CHANGES
Group by that constant→each "island" becomes one group!
date row_num date - row_num = group_key
2024-01-01 1 2024-01-01 - 1 = 2023-12-31 ← island 1
2024-01-02 2 2024-01-02 - 2 = 2023-12-31 ← island 1 (same!)
2024-01-03 3 2024-01-03 - 3 = 2023-12-31 ← island 1 (same!)
← GAP: 01-04 missing →
2024-01-05 4 2024-01-05 - 4 = 2024-01-01 ← island 2 (different!)
2024-01-06 5 2024-01-06 - 5 = 2024-01-01 ← island 2 (same!)
PATTERN 4: GAPS & ISLANDS
What Is It?
Find consecutive sequences ("islands") of events/dates, or the gaps between them. Classic use: login streaks, consecutive active days, periods of server uptime, stocks with N consecutive price increases.
Recognize It When You See:
- "Consecutive days / weeks"
- "Continuous period of [status]"
- "Longest streak of [activity]"
- "Start and end date of each [active/inactive] period"
- "N or more consecutive [events]"
- "Periods of server uptime/downtime"
THE TEMPLATE
sql
-- TEMPLATE: Gaps & Islands
-- Find all consecutive date sequences per user/entity
WITH activity AS (
-- Step 1: Get one row per user per active date (deduplicate if needed)
SELECT DISTINCT
user_id,
activity_date
FROM events
),
numbered AS (
-- Step 2: Assign row number P