Hadoop Commands Reference — Interview Quick-Fire Guide
⚠️ Common Trap
Purpose: Every command interviewers test, with real examples and traps to avoid
Interviewers expect you to type these from memory
Levels: ⬜ Direct (what/define) | 🟨 Mid-level (how/why) | 🟥 Scenario (debug/fix)
Format: What it does → Command syntax → Practical example → Interview tip
SECTION 1: HDFS COMMANDS (hadoop fs / hdfs dfs)
💡 Interview Tip
Key distinction:
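hadoop fs is the generic filesystem shell and works with any filesystem Hadoop supports (HDFS, S3, local). hdfs dfs is the HDFS-specific entry point; in current Hadoop releases both run the same FsShell class, so hdfs dfs also accepts other URIs such as file:///, but it is intended for HDFS.
In interviews, use hdfs dfs; it shows you know the difference.
ls / ls -R — List files and directories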
What it does: Lists files and directories in HDFS, similar to Linux ls.
Syntax:
```bash
hdfs dfs -ls <path>
hdfs dfs -ls -R <path>   # Recursive listing (all subdirectories)
hdfs dfs -ls -h <path>   # Human-readable file sizes (KB, MB, GB)
```
Practical example:
```bash
# List all files in the bookings directory
hdfs dfs -ls /data/travelco/bookings/
# Output:
# -rw-r--r-- 3 krishna hadoop 1073741824 2026-03-25 14:30 /data/travelco/bookings/booking_2026.orc

# Recursive list to see all partitions under a Hive table
hdfs dfs -ls -R /user/hive/warehouse/bookings_db.db/flights/

# List with human-readable sizes
hdfs dfs -ls -h /data/travelco/bookings/
# Output: shows 1.0 G instead of 1073741824
```
Interview tip: The output columns are: permissions, replication factor, owner, group, size (bytes), date, time, path. The replication factor column is what catches people — they forget it's there. Files show replication (e.g., 3), directories show -.
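To make the column order stick, it can help to pull the fields apart by position. The following is a minimal sketch that parses hard-coded sample output with awk, so it runs without a cluster. The file line is the one from the practical example; the directory line and the variable names (sample_ls, parsed) are hypothetical, added only for illustration. It assumes paths contain no spaces, since awk splits on whitespace.

```bash
#!/usr/bin/env bash
# Sample `hdfs dfs -ls` output, hard-coded so the sketch runs without Hadoop.
# The file line comes from the practical example; the directory line is hypothetical.
sample_ls='-rw-r--r--   3 krishna hadoop 1073741824 2026-03-25 14:30 /data/travelco/bookings/booking_2026.orc
drwxr-xr-x   - krishna hadoop          0 2026-03-25 14:30 /data/travelco/bookings/archive'

# Columns: 1=permissions 2=replication 3=owner 4=group 5=size 6=date 7=time 8=path
parsed=$(printf '%s\n' "$sample_ls" | awk '{
  # A leading "d" in the permissions column marks a directory.
  kind = (substr($1, 1, 1) == "d") ? "dir" : "file"
  printf "%s replication=%s size=%s path=%s\n", kind, $2, $5, $8
}')

printf '%s\n' "$parsed"
```

The second field is the replication factor for the file and a dash for the directory, which is the detail the tip above says people forget.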
mkdir / mkdir -p — Create directories
What it does: Creates directories in HDFS. -p creates parent directories if they don't exist.
Syntax:
```bash
hdfs dfs -mkdir <path>
hdfs dfs -mkdir -p <path>   # Create parents (like Linux mkdir -p)
```
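Since -p is described as behaving like Linux mkdir -p, the difference can be tried locally without a cluster. The following is a rough sketch that uses the local mkdir as a stand-in for hdfs dfs -mkdir. The scratch directory and the bookings/2026 path are hypothetical. It assumes the HDFS command behaves analogously: without -p it fails when the parent is missing, and with -p it does not fail if the directory already exists.

```bash
#!/usr/bin/env bash
# Local stand-in for the HDFS behaviour: uses Linux mkdir in a scratch directory.
workdir=$(mktemp -d)

# Without -p: fails because the parent "bookings" does not exist yet.
if mkdir "$workdir/bookings/2026" 2>/dev/null; then
  plain_result="created"
else
  plain_result="failed"
fi

# With -p: creates "bookings" and "2026" in one call.
mkdir -p "$workdir/bookings/2026" && with_p_result="created"

# Running -p again on an existing directory is not an error.
mkdir -p "$workdir/bookings/2026" && repeat_result="ok"

echo "plain=$plain_result with_p=$with_p_result repeat=$repeat_result"

# Clean up the scratch directory.
rm -rf "$workdir"
```

The repeat case is why -p is the usual choice in pipeline scripts: a rerun of the job does not fail just because the directory is already there.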
Practical example:
bash
# Create a single directory (parent must exist)
hdf