🐘
Hadoop
Day 3: Performance, Security + Cloud Migration — Deep Interview Guide

🧠 MASTER MEMORY MAP — Day 3

HADOOP SECURITY = "KRK" (Kerberos → Ranger → Knox):
K = Kerberos: AUTHENTICATION (who are you?)
R = Ranger: AUTHORIZATION (what can you do?)
K = Knox: GATEWAY (how do you get in? SSL + API proxy)
PERFORMANCE TUNING LAYERS = "JYH" (JVM → YARN → Hadoop):
J = JVM tuning (heap sizes, GC, JVM reuse)
Y = YARN tuning (container sizes, scheduler config)
H = Hadoop-level (block size, replication, compression)
CLOUD MIGRATION PATTERNS = "LRR" (Lift → Replatform → Refactor):
L = Lift-and-Shift: HDFS → S3/ADLS (same MapReduce/Hive, just different storage)
R = Replatform: MapReduce → Spark (same data in cloud, better processing)
R = Refactor: Hive → Delta Lake / Snowflake (rebuild for cloud-native)
HADOOP vs SPARK = "DISK vs RAM":
Hadoop MapReduce: disk-based, fault-tolerant, Java-first (other languages only via Hadoop Streaming)
Spark: in-memory (up to 100x faster on some workloads), Python/SQL/Scala/Java, streaming + batch
When to still use Hadoop: legacy code, no budget to migrate, HBase still on-prem
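
To make the "L" in LRR concrete: in a lift-and-shift, jobs keep their directory layout and only the storage URI changes. The sketch below is a minimal illustration, assuming the data lands in S3 and is read through Hadoop's `s3a://` connector. The function name `hdfs_to_s3a`, the bucket `example-datalake`, and the sample path are hypothetical, not taken from any real cluster.

```python
from urllib.parse import urlparse


def hdfs_to_s3a(hdfs_uri: str, bucket: str) -> str:
    """Map an hdfs:// URI to the same path under an s3a:// bucket (illustrative helper)."""
    parsed = urlparse(hdfs_uri)
    if parsed.scheme != "hdfs":
        raise ValueError(f"not an HDFS URI: {hdfs_uri}")
    # Drop the NameNode host:port; keep the directory layout unchanged.
    return f"s3a://{bucket}{parsed.path}"


# Hypothetical warehouse path and bucket name, for illustration only.
print(hdfs_to_s3a("hdfs://namenode:8020/warehouse/sales/2024", "example-datalake"))
```

Because only the URI prefix changes, existing Hive table locations and MapReduce input paths can in principle be repointed without touching job logic, which is what separates this pattern from Replatform and Refactor.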

SECTION 1: HADOOP SECURITY

Layer 1: Kerberos — Authentication

KERBEROS = MIT protocol for distributed authentication
WITHOUT KERBEROS (plain Hadoop)
User: "I am root"
Hadoop: "OK, here's all the data" ← takes your word for it!
Security: ZERO
WITH KERBEROS
User: must have valid Kerberos ticket (cryptographically signed by KDC)
Hadoop: validates the KDC-issued ticket before granting access
Can't fake identity → proper enterprise security
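
The contrast above can be sketched as a toy model. This is not real Kerberos: actual tickets are encrypted, time-limited, and issued through the KDC's services. Here an HMAC signature stands in for "something only the KDC could have produced", and every name (`KDC_SECRET`, `simple_auth`, `issue_ticket`, `verify_ticket`) is hypothetical.

```python
import hashlib
import hmac

# Assumption: stands in for a secret key known to the KDC and the Hadoop service.
KDC_SECRET = b"hypothetical-kdc-key"


def simple_auth(claimed_user: str) -> bool:
    """Without Kerberos: the claimed user name is trusted as-is."""
    return True


def issue_ticket(user: str) -> tuple:
    """Toy KDC: sign the user name so it cannot be altered afterwards."""
    sig = hmac.new(KDC_SECRET, user.encode(), hashlib.sha256).hexdigest()
    return user, sig


def verify_ticket(user: str, sig: str) -> bool:
    """Toy service check: recompute the signature and compare in constant time."""
    expected = hmac.new(KDC_SECRET, user.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)


user, sig = issue_ticket("alice")
print(simple_auth("root"))        # anyone can claim to be root
print(verify_ticket(user, sig))   # genuine ticket is accepted
print(verify_ticket("root", sig)) # alice's ticket cannot be reused as root
```

The point of the sketch is the last line: without the secret key, a client cannot turn a valid ticket for one identity into a valid ticket for another.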
HOW KERBEROS WORKS (simplified)
KDC = Key Distribution Center (the central auth server)
Authentication Service (AS): verifies your identity (the password itself is never sent over the network) and issues a Ticket Granting Ticket (TGT)
Ticket Granti