Page 1
Online Aggregation for Large MapReduce Jobs
Niketan Pansare, Vinayak Borkar, Chris Jermaine, Tyson Condie
VLDB 2011
IDS Fall Seminar, 2011. 11. 11.
Presented by Yang Byoung Ju
Page 2
Online Aggregation (OLA)
▶ select avg(stock_price) from nasdaq_db where company = 'xyz';
▶ Conventional DB:
▶ With OLA Extension: [0, 2000] with 95% probability
After 1 second
Page 3
Online Aggregation (OLA)
▶ select avg(stock_price) from nasdaq_db where company = 'xyz';
▶ Conventional DB:
▶ With OLA Extension: [900, 1100] with 95% probability
After 2 minutes
Page 4
Online Aggregation (OLA)
▶ select avg(stock_price) from nasdaq_db where company = 'xyz';
▶ Conventional DB:
▶ With OLA Extension: [995, 1005] with 95% probability
After 10 minutes
Page 5
Online Aggregation (OLA)
▶ select avg(stock_price) from nasdaq_db where company = 'xyz';
▶ Conventional DB: 1000
▶ With OLA Extension: 1000
After 2 hours
Page 6
Online Aggregation (OLA)
▶ User gets estimates of an aggregate query
▶ At all times during query processing, the database system gives the user a statistically valid estimate of the final answer
(Ex. Output range estimate: [990, 1010] with 95% probability)
▶ Advantages
- Can get a reasonable answer very quickly (depending on the application)
- Can save time and computing resources
▶ Disadvantages
- Implementation requires changes to the database kernel
- In a self-managed system, decreased resource cost may not benefit the user directly
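The idea of a running estimate that tightens over time can be sketched in a few lines. This is a minimal illustration only, assuming tuples arrive in random order and using a normal-approximation 95% interval; the function name `running_estimates`, the synthetic prices, and the reporting interval are all hypothetical and not taken from the slides.

```python
import math
import random


def running_estimates(values, z=1.96, report_every=1000):
    """Yield (n, mean, low, high) as more randomly ordered tuples are scanned."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's method)
    for v in values:
        n += 1
        delta = v - mean
        mean += delta / n
        m2 += delta * (v - mean)
        if n % report_every == 0 and n > 1:
            # Standard error of the mean from the sample variance.
            std_err = math.sqrt(m2 / (n - 1) / n)
            yield n, mean, mean - z * std_err, mean + z * std_err


random.seed(0)
# Hypothetical stock prices centered on 1000, already in random order.
prices = [random.gauss(1000, 200) for _ in range(100_000)]
estimates = list(running_estimates(prices, report_every=10_000))
for n, mean, low, high in estimates:
    print(f"after {n:>6} tuples: {mean:7.1f} in [{low:7.1f}, {high:7.1f}]")
```

The interval shrinks roughly with the square root of the number of tuples seen, which is why a usable answer can appear long before the query finishes.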
Page 7
Why ‘Online Aggregation’?
▶ OLA was proposed in 1997, but its commercial impact has been limited or even non-existent, for two reasons
- OLA requires extensive changes to the database kernel
- Saving resources has never been compelling
▶ Why OLA now?
- People are implementing all sorts of new databases these days
- Given the current move into the cloud, as a query runs, dollars flow from the end-user’s pocket to the cloud
Page 8
OLA in a distributed environment
▶ Classic OLA
- The set of data (tuples) processed at any point in the computation is a random subset of the data in the system
- Easy to estimate the final answer using statistical methods
▶ OLA for Large-scale
- The basic unit of data that is processed is a block (Ex. 64MB)
- A lot of variation in the time taken to process each block
- This variation in processing time is tremendously important if it is correlated with the aggregate value of the block
Page 9
OLA in a distributed environment
▶ OLA for Large-scale (Cont.)
- Blocks with a lot of data may have greater aggregate values and take longer to process
- So, the set of blocks completed at any particular point is more likely to have small values, leading to biased estimates -> “Inspection Paradox”
- This paper solves the ‘inspection paradox’ problem, consequently making OLA possible in a distributed environment
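A small simulation can make the bias concrete. The sketch below rests on loudly stated assumptions: each block's processing time equals its aggregate value, values are exponentially distributed, and workers pull blocks from a shared queue. The function name and all parameters are hypothetical.

```python
import heapq
import random


def mean_of_completed(values, workers, snapshot_time):
    """Mean aggregate value of the blocks finished by snapshot_time."""
    free_at = [0.0] * workers  # time at which each worker becomes free
    heapq.heapify(free_at)
    done = []
    for x in values:
        start = heapq.heappop(free_at)  # next free worker takes the block
        if start >= snapshot_time:
            break  # blocks scheduled from here on cannot finish in time
        finish = start + x  # assumption: processing time equals the value
        heapq.heappush(free_at, finish)
        if finish <= snapshot_time:
            done.append(x)
    return sum(done) / len(done)


random.seed(7)
values = [random.expovariate(1 / 10) for _ in range(5000)]  # true mean near 10
true_mean = sum(values) / len(values)
biased_mean = mean_of_completed(values, workers=100, snapshot_time=30.0)
print(f"true mean {true_mean:.2f}, mean of completed blocks {biased_mean:.2f}")
```

Under these assumptions the blocks still running at the snapshot are disproportionately the large ones, so averaging only the completed blocks underestimates the true mean.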
Page 10
Inspection Paradox
▶ In a renewal process, if we wait some predetermined time t and then observe how large the renewal interval containing t is, we should expect it to be typically larger than a renewal interval of average size.
Page 11
Inspection Paradox
▶ Explanation #1
- If we randomly shoot arrows at the target below, more arrows will land on the larger target
Page 12
Inspection Paradox
▶ Explanation #2
- Buses arrive with an average interval of 10 minutes.
- How long do you wait if you get to the bus stop at a random time? 5 minutes?
- Yes, if a bus arrives exactly every 10 minutes
- What if the arrival intervals are not uniform? Ex. 5min, 15min, 5min, 15min (average 10min)
- You land in a 5-minute gap with probability 1/4 and in a 15-minute gap with probability 3/4
- Waiting time: 1/4 X 2.5 min + 3/4 X 7.5 min = 6.25 min
(Figure: bus arrival timelines. Uniform: 10 min, 20 min, 30 min, 40 min. Non-uniform: 5 min, 20 min, 25 min, 40 min)
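The waiting-time arithmetic above can be checked exactly. This is a short sketch assuming the rider arrives at a uniformly random time within a repeating schedule; the function name `expected_wait` is introduced here for illustration.

```python
from fractions import Fraction


def expected_wait(intervals):
    """Expected wait for a rider who arrives at a uniformly random time."""
    total = sum(intervals)
    # P(land in a gap) = gap / total; mean wait inside that gap = gap / 2.
    return sum(Fraction(gap, total) * Fraction(gap, 2) for gap in intervals)


uniform_wait = expected_wait([10, 10, 10, 10])
uneven_wait = expected_wait([5, 15, 5, 15])
print(float(uniform_wait))  # 5.0
print(float(uneven_wait))   # 6.25
```

Both schedules have the same 10-minute average interval, yet the uneven one makes the rider wait longer because long gaps are more likely to be hit.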
Page 13
Inspection Paradox
▶ Explanation #2 (Cont.)
- Waiting time: the area of the triangle is the waiting time
- The waiting times differ even though the average interval is the same
- In the latter case, if the inspector sits at the bus stop all day and averages the intervals of all buses, he gets 10 minutes
- But if the inspector gets to the bus stop at a particular point in time and estimates the average interval as twice his expected waiting time (6.25 min), he gets 12.5 minutes
- “Inspection Paradox”
(Figure: waiting-time triangles for the two schedules, labeled 10 min, 20 min and 5 min, 20 min)
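The inspector's 12.5-minute figure can also be reproduced by simulation. The sketch below assumes an alternating 5 and 15 minute schedule and an inspector who shows up at a uniformly random instant and records the gap he lands in; names such as `observed_interval` are hypothetical.

```python
import bisect
import itertools
import random


def observed_interval(arrival_times, t):
    """Length of the gap between consecutive buses that contains time t."""
    i = bisect.bisect_right(arrival_times, t)
    i = min(i, len(arrival_times) - 1)  # guard the very last instant
    return arrival_times[i] - arrival_times[i - 1]


random.seed(1)
gaps = [5, 15] * 500                                # alternating gaps, mean 10
arrivals = [0] + list(itertools.accumulate(gaps))   # bus arrival timestamps
horizon = arrivals[-1]

true_mean = sum(gaps) / len(gaps)
samples = [observed_interval(arrivals, random.random() * horizon)
           for _ in range(100_000)]
inspector_mean = sum(samples) / len(samples)
print(true_mean, round(inspector_mean, 2))
```

The all-day average stays at 10 minutes, while the average gap seen at a random instant comes out near 12.5 minutes, because three quarters of the timeline lies inside 15-minute gaps.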
Page 14
Inspection Paradox
▶ If someone samples randomly spaced data at a particular point in time, he is more likely to land in a larger interval, and consequently gets a biased (wrong) estimate
▶ Explanation #3
- On a machine in the distributed system, block processing time differs depending on the block's data, even if every block has the same size
- If we take a snapshot at a particular point to get an estimate, it is likely to fall while a larger block is being processed.
- This means that we only get information from the smaller blocks, which contain less information, while the larger block cannot be included in the estimate.
(Figure: Blocks 1 to 4 on a timeline, labeled completed, processing, and waiting, with the snapshot point marked)
Page 15
Inspection Paradox
▶ Let’s make the ‘inspection paradox’ go away
- Take 3 parameters of each block for estimation
  - x : aggregate value of the block
  - tsch : time the block waits to be scheduled
  - tproc : processing time of the block
- tsch and tproc allow us to make predictions about the x values that we have not seen.
- For example, if a particular block has been processing for 125 seconds (not completed yet) and took 5 seconds to be scheduled, we can correctly view its x as a random sample from the distribution
f( x | tsch = 5, tproc >= 125)
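To illustrate what conditioning on tsch and a lower bound on tproc means, here is a crude empirical stand-in. It is not the paper's Bayesian model: it simply averages x over a hypothetical set of reference blocks that match the condition. The record type, the tolerance `sch_tol`, and the sample data are assumptions made for this sketch (the fields `t_sch` and `t_proc` correspond to tsch and tproc).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class BlockRecord:
    x: float       # aggregate value of the block
    t_sch: float   # seconds waited before being scheduled
    t_proc: float  # seconds spent processing


def conditional_mean_x(records, t_sch, min_t_proc, sch_tol=1.0):
    """Empirical mean of x given t_sch close to t_sch and t_proc >= min_t_proc."""
    matches = [r.x for r in records
               if abs(r.t_sch - t_sch) <= sch_tol and r.t_proc >= min_t_proc]
    if not matches:
        raise ValueError("no reference block satisfies the condition")
    return mean(matches)


# Hypothetical reference blocks in which x grows with processing time.
reference = [
    BlockRecord(x=40.0, t_sch=5.0, t_proc=60.0),
    BlockRecord(x=90.0, t_sch=5.0, t_proc=130.0),
    BlockRecord(x=110.0, t_sch=5.5, t_proc=160.0),
    BlockRecord(x=55.0, t_sch=4.5, t_proc=80.0),
    BlockRecord(x=120.0, t_sch=20.0, t_proc=170.0),
]
unconditional = mean(r.x for r in reference)
conditioned = conditional_mean_x(reference, t_sch=5.0, min_t_proc=125.0)
print(unconditional, conditioned)
```

Knowing that a block has already run for 125 seconds shifts the expectation for its x upward, which is the information a snapshot of completed blocks alone would throw away.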
Page 16
Implementation
▶ Implemented OLA mode in Hyracks
▶ Hyracks
- Open source project that supports Map and Reduce operations
- Relational operations such as selection, projection, and join
- Architecture is similar to Hadoop
▶ Modification of Hyracks
- Logical block queue that makes the block order statistically random
- Estimator in the reduce task during the shuffle phase
  - Completed map tasks are gathered in the shuffle phase
  - The estimator receives the aggregate value (x) and metadata (tsch and tproc)
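The two modifications can be pictured with a toy model. This sketch does not use or mimic any Hyracks API; `MapReport`, `Estimator`, and `randomized_queue` are hypothetical names, and the report values are made up.

```python
import random
from dataclasses import dataclass, field


@dataclass
class MapReport:
    block_id: int
    x: float       # aggregate value computed by the map task
    t_sch: float   # scheduling wait, in seconds
    t_proc: float  # processing time, in seconds


@dataclass
class Estimator:
    """Stand-in for the estimator that is fed during the shuffle phase."""
    reports: list = field(default_factory=list)

    def receive(self, report):
        self.reports.append(report)

    def observed_sum(self):
        return sum(r.x for r in self.reports)


def randomized_queue(block_ids, seed=None):
    """Return the block ids in a uniformly random processing order."""
    order = list(block_ids)
    random.Random(seed).shuffle(order)
    return order


queue = randomized_queue(range(8), seed=42)
estimator = Estimator()
for block_id in queue[:3]:  # pretend the first three map tasks have finished
    estimator.receive(
        MapReport(block_id, x=10.0 * (block_id + 1), t_sch=1.0, t_proc=2.0))
print(queue, len(estimator.reports), estimator.observed_sum())
```

Shuffling the queue is what lets the processed blocks be treated as a random sample, while the per-block timing metadata is kept so that an estimator can correct for blocks that are still running.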
Page 17
Estimation
▶ Bayesian approach is applied for estimation
- Z is randomly sampled from blocks
- Z produces observed data, X, and hidden data, Y
- Θ includes any data that is unobserved
- The process shown on the slide is repeated to get an estimate
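The slide's own procedure is not reproduced in this transcript. As a loosely related illustration of how repeated sampling yields a posterior distribution for a final aggregate, the sketch below uses a Bayesian bootstrap, a swapped-in technique that is not the paper's model. It assumes the observed block values are a plain random sample and ignores tsch and tproc entirely; all names and numbers are hypothetical.

```python
import random


def posterior_totals(observed, n_blocks, draws=2000, seed=0):
    """Bayesian-bootstrap draws of the final SUM over all n_blocks blocks."""
    rng = random.Random(seed)
    totals = []
    for _ in range(draws):
        # Dirichlet(1, ..., 1) weights via normalized Gamma(1, 1) variates.
        g = [rng.gammavariate(1.0, 1.0) for _ in observed]
        s = sum(g)
        weighted_mean = sum(w * x for w, x in zip(g, observed)) / s
        totals.append(n_blocks * weighted_mean)
    return sorted(totals)


# Hypothetical aggregate values from eight completed blocks.
observed = [12.0, 9.0, 15.0, 11.0, 8.0, 14.0, 10.0, 13.0]
totals = posterior_totals(observed, n_blocks=100)
low = totals[int(0.025 * len(totals))]
high = totals[int(0.975 * len(totals)) - 1]
print(f"95% credible interval for the total: [{low:.0f}, {high:.0f}]")
```

Each draw is one plausible value of the final answer, and the spread of the draws gives the kind of probability interval that OLA reports to the user.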
Page 18
Experiments
▶ 6 months of Wikipedia page traffic data
- Counting the # of pages per language
- 220GB, 3960 blocks
- On 11 nodes (1 master, 10 slaves)
- 80 mappers and 10 reducers
- Took 46 minutes to run to completion
▶ Experimented on 3 different versions
- w/ random block order, w/ correlation (inspection paradox)
- w/o random block order, w/ correlation (inspection paradox)
- w/ random block order, w/o correlation (inspection paradox)
Page 19
Experiments
(a) Posterior query result distribution for the number of English-language pages at various times, using both randomized and arbitrary block ordering (actual result: black vertical line)
(b) Posterior query result distribution for the number of English-language pages at various times, taking into account and ignoring the correlation between aggregate value and processing time
Page 20
Conclusion
▶ The authors proposed a system model that is appropriate for OLA over MapReduce in a large-scale, distributed environment
▶ The model accounts for biases that can arise when estimating aggregates in a cluster environment
(deals with the ‘inspection paradox’)
▶ This model allows us to export “early returns” of query aggregates that are statistically robust