A Deeper Understanding of Spark’s Internals — Aaron Davidson, 07/01/2014

TRANSCRIPT

Page 1:

A Deeper Understanding of Spark’s Internals
Aaron Davidson
07/01/2014

Page 2:

This Talk
•  Goal: Understand how Spark runs, with a focus on performance
•  Major core components:
   – Execution Model
   – The Shuffle
   – Caching


Page 4:

Why understand internals?
Goal: Find number of distinct names per "first letter"

sc.textFile("hdfs:/names")
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues(names => names.toSet.size)
  .collect()

Page 5:

Why understand internals?
Goal: Find number of distinct names per "first letter"

sc.textFile("hdfs:/names")
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues(names => names.toSet.size)
  .collect()

Input:  Andy  Pat  Ahir

Page 6:

Why understand internals?
Goal: Find number of distinct names per "first letter"

sc.textFile("hdfs:/names")
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues(names => names.toSet.size)
  .collect()

Input:        Andy  Pat  Ahir
After map():  (A, Andy)  (P, Pat)  (A, Ahir)

Page 7:

Why understand internals?
Goal: Find number of distinct names per "first letter"

sc.textFile("hdfs:/names")
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues(names => names.toSet.size)
  .collect()

Input:               Andy  Pat  Ahir
After map():         (A, Andy)  (P, Pat)  (A, Ahir)
After groupByKey():  (A, [Ahir, Andy])  (P, [Pat])


Page 9:

Why understand internals?
Goal: Find number of distinct names per "first letter"

sc.textFile("hdfs:/names")
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues(names => names.toSet.size)
  .collect()

Input:               Andy  Pat  Ahir
After map():         (A, Andy)  (P, Pat)  (A, Ahir)
After groupByKey():  (A, [Ahir, Andy])  (P, [Pat])
After toSet:         (A, Set(Ahir, Andy))  (P, Set(Pat))

Page 10:

Why understand internals?
Goal: Find number of distinct names per "first letter"

sc.textFile("hdfs:/names")
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues(names => names.toSet.size)
  .collect()

Input:               Andy  Pat  Ahir
After map():         (A, Andy)  (P, Pat)  (A, Ahir)
After groupByKey():  (A, [Ahir, Andy])  (P, [Pat])
After mapValues():   (A, 2)  (P, 1)

Page 11:

Why understand internals?
Goal: Find number of distinct names per "first letter"

sc.textFile("hdfs:/names")
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues(names => names.toSet.size)
  .collect()

Input:               Andy  Pat  Ahir
After map():         (A, Andy)  (P, Pat)  (A, Ahir)
After groupByKey():  (A, [Ahir, Andy])  (P, [Pat])
After mapValues():   (A, 2)  (P, 1)
Result:              res0 = [(A, 2), (P, 1)]
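The data flow on these slides can be checked with a local Python sketch that mimics each Spark operation on plain collections (no cluster needed); the input names are the slide's example data:

```python
# Local sketch of the Spark pipeline above, one step per operation.
from collections import defaultdict

names = ["Andy", "Pat", "Ahir"]

# map(name => (name.charAt(0), name))
pairs = [(name[0], name) for name in names]

# groupByKey(): gather all values for each key
groups = defaultdict(list)
for letter, name in pairs:
    groups[letter].append(name)

# mapValues(names => names.toSet.size) + collect()
res0 = sorted((letter, len(set(ns))) for letter, ns in groups.items())
print(res0)  # [('A', 2), ('P', 1)]
```

This matches the slide's `res0 = [(A, 2), (P, 1)]`, though of course the real pipeline runs each step distributed across partitions.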

Page 12:

Spark Execution Model
1. Create a DAG of RDDs to represent the computation
2. Create a logical execution plan for the DAG
3. Schedule and execute individual tasks

Page 13:

Step 1: Create RDDs

sc.textFile("hdfs:/names")
map(name => (name.charAt(0), name))
groupByKey()
mapValues(names => names.toSet.size)
collect()

Page 14:

Step 1: Create RDDs

HadoopRDD → map() → groupBy() → mapValues() → collect()

Page 15:

Step 2: Create execution plan
•  Pipeline as much as possible
•  Split into "stages" based on the need to reorganize data

Stage 1:  HadoopRDD → map()
(not yet staged: groupBy() → mapValues() → collect())

Page 16:

Step 2: Create execution plan
•  Pipeline as much as possible
•  Split into "stages" based on the need to reorganize data

Stage 1:  HadoopRDD → map()
Stage 2:  groupBy() → mapValues() → collect()

Input:              Andy  Pat  Ahir
After map():        (A, Andy)  (P, Pat)  (A, Ahir)
After groupBy():    (A, [Ahir, Andy])  (P, [Pat])
After mapValues():  (A, 2)  (P, 1)
Result:             res0 = [(A, 2), (P, 1)]
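The stage-splitting rule on this slide can be sketched as a toy planner: pipeline narrow transformations together and cut a new stage at each wide (shuffle) dependency. The `WIDE` set below is an assumption for illustration, not Spark's actual API:

```python
# Toy model of Step 2: a stage ends wherever data must be reorganized.
# Which operations count as "wide" is hard-coded here for illustration.
WIDE = {"groupBy", "reduceByKey", "repartition", "distinct"}

def split_into_stages(ops):
    stages, current = [], []
    for op in ops:
        if op in WIDE and current:
            stages.append(current)   # shuffle boundary: close the stage
            current = []
        current.append(op)
    if current:
        stages.append(current)
    return stages

plan = split_into_stages(["HadoopRDD", "map", "groupBy", "mapValues", "collect"])
print(plan)  # [['HadoopRDD', 'map'], ['groupBy', 'mapValues', 'collect']]
```

This reproduces the slide's plan: Stage 1 pipelines HadoopRDD with map(), and Stage 2 begins at the groupBy() that requires a shuffle.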

Page 17:

Step 3: Schedule tasks
•  Split each stage into tasks
•  A task is data + computation
•  Execute all tasks within a stage before moving on

Page 18:

Step 3: Schedule tasks
A task pairs computation with data:

Task 0:  HadoopRDD → map()  on  hdfs:/names/0.gz
Task 1:  HadoopRDD → map()  on  hdfs:/names/1.gz
Task 2:  HadoopRDD → map()  on  hdfs:/names/2.gz
Task 3:  HadoopRDD → map()  on  hdfs:/names/3.gz
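The locality preference illustrated on the following slides can be sketched as a tiny assignment loop: give each task a node that holds a replica of its input block when one is free, otherwise any free node. The node names and block placement below are made up for illustration; Spark's real scheduler is considerably more involved (delay scheduling, locality levels, etc.):

```python
# Toy sketch of locality-aware task placement.
def assign(tasks, replicas, free_nodes):
    """tasks: list of input blocks; replicas: block -> set of nodes holding it."""
    placement = {}
    free = set(free_nodes)
    for block in tasks:
        local = replicas[block] & free
        node = min(local) if local else min(free)  # prefer a node with local data
        placement[block] = node
        free.remove(node)
    return placement

replicas = {
    "/names/0.gz": {"node1"},
    "/names/1.gz": {"node2"},
    "/names/2.gz": {"node2", "node3"},
}
placement = assign(["/names/0.gz", "/names/1.gz", "/names/2.gz"],
                   replicas, ["node1", "node2", "node3"])
print(placement)
```

Every task lands on a node that already holds its block, so the map() work reads from local disk instead of pulling data over the network.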

Pages 19–28:

Step 3: Schedule tasks

[Animation over several slides: three worker machines, each co-located
with HDFS blocks (one holding /names/0.gz and /names/3.gz, one holding
/names/1.gz and /names/2.gz, one holding /names/2.gz and /names/3.gz).
As time advances, HadoopRDD → map() tasks for /names/0.gz, /names/1.gz,
and /names/2.gz launch on machines that hold those blocks locally; the
remaining task for /names/3.gz runs once a machine with a replica
frees up.]

Page 29:

The Shuffle

Stage 1:  HadoopRDD → map()
Stage 2:  groupBy() → mapValues() → collect()

Page 30:

The Shuffle
•  Redistributes data among partitions
•  Hashes keys into buckets
•  Optimizations:
   – Avoided when possible, if data is already properly partitioned
   – Partial aggregation reduces data movement
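The "hash keys into buckets" step can be sketched locally. This mirrors the behavior of Spark's default hash partitioner, with Python's `hash()` standing in for the JVM `hashCode` (so the concrete bucket numbers differ from Spark's):

```python
# Sketch: the shuffle routes each record by hashing its key into one
# of numPartitions buckets, so all records sharing a key meet in the
# same downstream partition.
def bucket(key, num_partitions):
    return hash(key) % num_partitions

records = [("A", "Andy"), ("P", "Pat"), ("A", "Ahir")]
by_partition = {}
for key, value in records:
    by_partition.setdefault(bucket(key, 2), []).append((key, value))

print(by_partition)
```

The invariant that matters is that both ("A", …) records land in the same bucket; that is what lets groupBy() see all values for a key in one place.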

Page 31:

The Shuffle
•  Pull-based, not push-based
•  Write intermediate files to disk
   (Stage 1 writes shuffle files to disk; Stage 2 pulls them)

Page 32:

Execution of a groupBy()
•  Build a hash map within each partition
•  Note: can spill across keys, but a single key-value pair must fit in memory

A => [Arsalan, Aaron, Andrew, Andrew, Andy, Ahir, Ali, …],
E => [Erin, Earl, Ed, …]
…
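A minimal sketch of what each partition does during a groupBy(), using a plain dict in place of Spark's internal map:

```python
# Each partition accumulates key -> list-of-values in a hash map.
# Spark can spill whole entries to disk across keys, but the value
# list for any single key must fit in memory at once, which is why
# one enormous group can blow up an executor.
from collections import defaultdict

def group_partition(records):
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)   # the full list for this key stays in memory
    return dict(groups)

part = group_partition([("A", "Arsalan"), ("A", "Aaron"), ("E", "Erin"), ("A", "Andy")])
print(part)  # {'A': ['Arsalan', 'Aaron', 'Andy'], 'E': ['Erin']}
```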

Page 33:

Done!

Stage 1:  HadoopRDD → map()
Stage 2:  groupBy() → mapValues() → collect()

Page 34:

What went wrong?
•  Too few partitions to get good concurrency
•  Large per-key groupBy()
•  Shipped all data across the cluster

Page 35:

Common issue checklist
1. Ensure enough partitions for concurrency
2. Minimize memory consumption (esp. during sorting and for large keys in groupBys)
3. Minimize the amount of data shuffled
4. Know the standard library

1 & 2 are about tuning the number of partitions!

Page 36:

Importance of Partition Tuning
•  Main issue: too few partitions
   – Less concurrency
   – More susceptible to data skew
   – Increased memory pressure for groupBy, reduceByKey, sortByKey, etc.
•  Secondary issue: too many partitions
•  Need a "reasonable number" of partitions
   – Commonly between 100 and 10,000 partitions
   – Lower bound: at least ~2x the number of cores in the cluster
   – Upper bound: ensure tasks take at least 100 ms
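The bounds above can be turned into a rough heuristic. This is an illustration, not a Spark API; the 128 MiB-per-task target below is an assumed figure chosen so that tasks stay comfortably above the ~100 ms floor:

```python
# Rough partition-count heuristic from the slide's bounds (illustrative):
# at least ~2x the cluster's cores, and enough partitions that each
# task handles a meaningful chunk of data.
def suggest_partitions(total_cores, total_bytes,
                       target_bytes_per_task=128 * 1024 * 1024):
    lower = 2 * total_cores                          # concurrency floor
    by_size = max(1, total_bytes // target_bytes_per_task)
    return max(lower, by_size)

print(suggest_partitions(total_cores=50, total_bytes=64 * 2**30))  # 512
```

For a 50-core cluster processing 64 GiB, the size bound dominates (512 partitions); for tiny inputs the 2x-cores floor (100) wins.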

Page 37:

Memory Problems
•  Symptoms:
   – Inexplicably bad performance
   – Inexplicable executor/machine failures (can also indicate too many shuffle files)
•  Diagnosis:
   – Set spark.executor.extraJavaOptions to include:
      •  -XX:+PrintGCDetails
      •  -XX:+HeapDumpOnOutOfMemoryError
   – Check dmesg for oom-killer logs
•  Resolution:
   – Increase spark.executor.memory
   – Increase the number of partitions
   – Re-evaluate program structure (!)

Page 38:

Fixing our mistakes

sc.textFile("hdfs:/names")
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues { names => names.toSet.size }
  .collect()

1. Ensure enough partitions for concurrency
2. Minimize memory consumption (esp. of large groupBys and sorting)
3. Minimize data shuffle
4. Know the standard library

Page 39:

Fixing our mistakes

sc.textFile("hdfs:/names")
  .repartition(6)
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues { names => names.toSet.size }
  .collect()

1. Ensure enough partitions for concurrency
2. Minimize memory consumption (esp. of large groupBys and sorting)
3. Minimize data shuffle
4. Know the standard library

Page 40:

Fixing our mistakes

sc.textFile("hdfs:/names")
  .repartition(6)
  .distinct()
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues { names => names.toSet.size }
  .collect()

1. Ensure enough partitions for concurrency
2. Minimize memory consumption (esp. of large groupBys and sorting)
3. Minimize data shuffle
4. Know the standard library

Page 41:

Fixing our mistakes

sc.textFile("hdfs:/names")
  .repartition(6)
  .distinct()
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues { names => names.size }
  .collect()

1. Ensure enough partitions for concurrency
2. Minimize memory consumption (esp. of large groupBys and sorting)
3. Minimize data shuffle
4. Know the standard library

Page 42:

Fixing our mistakes

sc.textFile("hdfs:/names")
  .distinct(numPartitions = 6)
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues { names => names.size }
  .collect()

1. Ensure enough partitions for concurrency
2. Minimize memory consumption (esp. of large groupBys and sorting)
3. Minimize data shuffle
4. Know the standard library

Page 43:

Fixing our mistakes

sc.textFile("hdfs:/names")
  .distinct(numPartitions = 6)
  .map(name => (name.charAt(0), 1))
  .reduceByKey(_ + _)
  .collect()

1. Ensure enough partitions for concurrency
2. Minimize memory consumption (esp. of large groupBys and sorting)
3. Minimize data shuffle
4. Know the standard library
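A quick local Python check (no Spark) that the final pipeline computes the same answer as the original: counting 1s per first letter after de-duplication equals the size of each per-letter name set:

```python
# Compare the two formulations on a small input with a duplicate name.
names = ["Andy", "Pat", "Ahir", "Andy"]   # note the duplicate "Andy"

# Original approach: ship every name, group per letter, count the set.
original = {}
for n in names:
    original.setdefault(n[0], set()).add(n)
original = {k: len(v) for k, v in original.items()}

# Fixed approach: distinct first, then map to (letter, 1) and sum
# per key; only small (letter, count) pairs ever need to move.
fixed = {}
for n in set(names):
    fixed[n[0]] = fixed.get(n[0], 0) + 1

print(original == fixed, original)  # True {'A': 2, 'P': 1}
```

In Spark the second form is far cheaper because reduceByKey performs partial aggregation on the map side, so the shuffle carries per-letter counts instead of every name.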

Page 44:

Fixing our mistakes

sc.textFile("hdfs:/names")
  .distinct(numPartitions = 6)
  .map(name => (name.charAt(0), 1))
  .reduceByKey(_ + _)
  .collect()

Original:

sc.textFile("hdfs:/names")
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues { names => names.toSet.size }
  .collect()

Page 45:

Questions?