SQL Patterns 13–15: Recursive CTE, Median/Percentile, Funnel Analysis
The advanced patterns. Recursive CTE separates good candidates from great ones. Funnel analysis is asked at almost every product-driven data engineering role.
🧠 MEMORY MAP
🧠 ADVANCED PATTERNS = "RMF"
ADVANCED PATTERNS"RMF"
RRecursive CTE: anchor + recursive step (trees, hierarchies, date spines)
MMedian/Percentile: ROW_NUMBER trick OR PERCENTILE_CONT (built-in)
FFunnel: COUNT DISTINCT at each stage → conversion rate between stages
RECURSIVE CTE STRUCTURE
WITH RECURSIVE cte AS (
SELECT ... ← ANCHOR: the starting point (root / first row)
UNION ALL
SELECT ... FROM cte JOIN table ... ← RECURSIVE: add one more level
WHERE stopping_condition ← TERMINATION: when to stop
)
SELECT * FROM cte;
MEDIAN TRICK (no built-in MEDIAN)
"The median is the middle value"
→ Count total rows (N)
→ Assign ROW_NUMBER to each row ordered by value
→ Keep row where rn = (N+1)/2 (odd N)
→ Or AVG of rows where rn IN (N/2, N/2+1) (even N)
→ PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY col) is simpler if available!
FUNNEL PATTERN
Define stages: signup→verify → first_purchase
For each stage: COUNT(DISTINCT user_id) who reached that stage
Conversion: next_stage_count / this_stage_count × 100
PATTERN 13: RECURSIVE CTE
What Is It?
A CTE that calls itself repeatedly, adding one "level" per iteration. Used for: org chart traversal, category trees, date spine generation, dependency chains, graph reachability.
Recognize It When You See:
- "All employees under manager X (direct AND indirect)"
- "Full category hierarchy / subcategories recursively"
- "Generate a sequence of dates / numbers"
- "Task A depends on task B depends on task C — full chain"
- "All cities reachable from city X via connections"
THE TEMPLATE
sql
-- TEMPLATE: Recursive CTE
WITH RECURSIVE cte_name AS (
-- ANCHOR MEMBER: base case (starting point)