PySpark Interview — Question Bank
PySpark · Section 9 of 9

Structured coding questions with full PySpark solutions.

Pattern: B=Basic | M=Medium | H=Hard | S=Scenario

Add new questions at the bottom; never renumber existing ones.

MASTER TRACKING TABLE

| Q#  | Question                                  | Company       | Level | Topic                   | Solved |
|-----|-------------------------------------------|---------------|-------|-------------------------|--------|
| Q01 | Filter employees earning > 50k            | General       | B     | filter/select           |        |
| Q02 | Count orders per customer                 | General       | B     | groupBy/agg             |        |
| Q03 | Find duplicate rows by email              | General       | B     | dropDuplicates/groupBy  |        |
| Q04 | Total sales by date                       | General       | B     | groupBy/sum             |        |
| Q05 | Add a new derived column                  | General       | B     | withColumn              |        |
| Q06 | Read CSV + handle nulls                   | General       | B     | na.fill/isNull          |        |
| Q07 | Word count in text column                 | General       | B     | flatMap/reduceByKey     |        |
| Q08 | Find max salary per department            | Amazon/Google | M     | groupBy/max             |        |
| Q09 | Rank employees by salary per dept         | Amazon        | M     | window/dense_rank       |        |
| Q10 | Top 3 salaries per department             | Amazon        | M     | window/dense_rank       |        |
| Q11 | Second highest salary                     | Amazon        | M     | window/dense_rank       |        |
| Q12 | Running total of revenue                  | General       | M     | window/sum              |        |
| Q13 | 7-day rolling average                     | General       | M     | window/avg              |        |
| Q14 | Employees earning more than their manager | Google        | M     | self-join               |        |
| Q15 | Customers with no orders (NOT IN)         | General       | M     | left_anti join          |        |
| Q16 | Deduplicate: keep latest record           | General       | M     | window/row_number       |        |
| Q17 | MoM revenue change (%)                    | General       | M     | window/lag              |        |
| Q18 | Pivot: rows to columns                    | General       | M     | pivot                   |        |
| Q19 | Explode array column + count tags         | General       | M     | explode/groupBy         |        |
| Q20 | Find consecutive purchase days            | Amazon        | H     | window/lag+filter       |        |
| Q21 | Session ID assignment (30-min gap)        | General       | H     | window/lag+sum          |        |
| Q22 | Temperature rise from previous day        | Amazon/Google | H     | window/lag              |        |
Q23Longest streak of active