![Page 1: UPC Compiler Support for Trace-Level Speculative Multithreaded Architectures Antonio González λ,ф Carlos Molina ψ Jordi Tubella ф INTERACT-9, San Francisco](https://reader035.vdocuments.pub/reader035/viewer/2022081519/56649c755503460f949299aa/html5/thumbnails/1.jpg)
UPC
Compiler Support for Trace-Level Speculative
Multithreaded Architectures
Antonio González λ,ф
Carlos Molina ψ
Jordi Tubella ф
INTERACT-9, San Francisco (USA) - February 13, 2005
λ Intel Barcelona Research Center
Intel Labs - UPC
Barcelona, Spain
ф Dept. Arquitectura de Computadors
Universitat Politècnica de Catalunya
Barcelona, Spain
{antonio,jordit}@ac.upc.edu
ψ Dept. Enginyeria Informàtica
Universitat Rovira i Virgili
Tarragona, Spain
Trace Level Speculation
Avoids serialization caused by data dependences
Skips multiple instructions in a row
Predicts values based on the past
Introduces penalties due to misspeculations
Two variants: Trace Level Speculation with Live Input Test and Trace Level Speculation with Live Output Test
Trace Level Speculation with Live Output Test
[Figure: timeline of the ST and NST threads connected through a buffer. Labels: Live Output Update & Trace Speculation; Trace Miss Speculation Detection & Recovery Actions; Instruction Execution; Not Executed; Live Output Validation.]
TSMA Block Diagram
[Figure: block diagram. Components: I Cache, Fetch Engine, Decode & Rename, Functional Units, Branch Predictor, Trace Speculation Engine, NST and ST Reorder Buffers, NST and ST Ld/St Queues, NST and ST I Windows, Look Ahead Buffer, Verification Engine, L1NSDC, L2NSDC, L1SDC, Data Cache, NST and ST Arch. Register Files.]
Motivation
Two orthogonal issues
- microarchitecture support for trace speculation
- control and data speculation techniques
  - prediction of initial and final points
  - prediction of live output values
TSMA
- does not introduce significant misspeculation penalties
- does not impose constraints to build or predict traces
This work focuses on developing effective trace selection schemes for TSMA, based on static analysis that uses profiling data
Outline
Trace Selection
Graph Construction
Graph Analysis
Performance Evaluation
Conclusions
Graph Construction
Test input set of the analyzed benchmarks
An abstract data structure is built based on
- control flow graph
- data dependence graph
- predictability of values
Each node represents a static instruction
- type of instruction, number of dynamic executions
- pointers and frequencies to succeeding instructions
- pointers and frequencies to preceding instructions
- predictability of live output values and dead values
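The node contents listed above could be held in a structure like the following. This is only a minimal sketch: the class name `InstrNode`, every field name, and the helper functions are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class InstrNode:
    """One node per static instruction (all names here are hypothetical)."""
    pc: int
    kind: str                       # type of instruction, e.g. "alu" or "branch"
    executions: int = 0             # number of dynamic executions
    successors: Dict[int, int] = field(default_factory=dict)    # pc -> count
    predecessors: Dict[int, int] = field(default_factory=dict)  # pc -> count
    live_out_predictability: float = 0.0   # fraction in [0, 1]


def record_edge(graph: Dict[int, InstrNode], src: int, dst: int) -> None:
    """Count one dynamic control transfer from src to dst seen while profiling."""
    graph[src].successors[dst] = graph[src].successors.get(dst, 0) + 1
    graph[dst].predecessors[src] = graph[dst].predecessors.get(src, 0) + 1


def successor_frequency(node: InstrNode, dst: int) -> float:
    """Fraction of this node's recorded successors that were dst."""
    total = sum(node.successors.values())
    return node.successors.get(dst, 0) / total if total else 0.0
```

Keeping raw counts on the edges and deriving frequencies on demand lets the same graph answer both "how often" and "what fraction" questions during the later analysis.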
Graph Analysis
Two important issues
- initial and final point of a trace
  - maximize trace length & minimize control flow misspeculations
- predictability of live output values
  - prediction accuracy and utilization degree
Three basic heuristics
- Procedure Trace Heuristic
- Loop Trace Heuristic
- Instruction Chaining Trace Heuristic
Procedure Trace Heuristic
Procedure calls are relatively frequent
Computations that follow a subroutine are fairly independent of the subroutine, except for return values and some memory locations
The end of a trace is quite easy to predict
[Figure: example control flow graph with instructions I1 to I14, a call, a return, and branches with taken (T) and not-taken (NT) targets.]
1. The call instruction is marked as the initial point of the trace (I3 in the example).
2. The return address is marked as the final point of the trace (I11 in the example).
3. N instructions after the final point of the trace are checked. Only significant paths are considered (I11, I12, I13, and I14 in the example).
4. For each instruction in a significant path, it is checked whether any of its operands is produced by an instruction of the procedure.
5. In this case, the utilization degree of the value produced and the predictability of the producer instruction are evaluated.
6. If they do not reach a certain threshold, the trace is discarded.
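Steps 3 to 6 of the procedure trace heuristic can be sketched as a filter over the instructions that follow the return address. The slides do not say how utilization degree and predictability are combined into one number, so the sketch takes that as a caller-supplied `score` function; `Instr`, `ProducerStats`, and `keep_procedure_trace` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Instr:
    pc: int
    operand_producers: List[int]   # pcs of the instructions producing its operands


@dataclass
class ProducerStats:
    predictability: float   # fraction in [0, 1]
    utilization: float      # fraction in [0, 1]


def keep_procedure_trace(
    procedure_pcs: Set[int],
    following: List[Instr],                   # the N instructions on significant paths
    stats: Dict[int, ProducerStats],
    score: Callable[[ProducerStats], float],  # assumed: how the two metrics combine
    threshold: float,
) -> bool:
    """Return False as soon as a consumed procedure value scores below threshold."""
    for instr in following:
        for producer in instr.operand_producers:
            if producer in procedure_pcs and score(stats[producer]) < threshold:
                return False
    return True
```

Operands produced outside the procedure are ignored, since only values flowing out of the skipped trace have to be predicted.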
Loop Trace Heuristic
Loops are a traditional source of parallelization and speculation
We consider the whole execution of a loop as a trace
The objective is to detect loops whose live-output values after their whole execution are predictable
[Figure: example control flow graph with instructions I1 to I8, a backward branch, and branches with taken (T) and not-taken (NT) targets.]
1. The backward branch target is marked as the initial point of the trace (I2 in the example).
2. The fall-through instruction of the same backward branch is marked as the final point of the trace (I8 in the example).
3. N instructions after the final point of the trace are checked, with the same behaviour as in the procedure trace heuristic.
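Steps 1 and 2 of the loop trace heuristic amount to reading the two end points off a backward branch. A minimal sketch follows, assuming fixed 4-byte instructions (as on the Alpha ISA used in the evaluation); `Branch` and `loop_trace_points` are hypothetical names.

```python
from typing import NamedTuple, Optional, Tuple

INSTR_BYTES = 4  # assumption: fixed-size instructions


class Branch(NamedTuple):
    pc: int
    target: int


def loop_trace_points(branch: Branch) -> Optional[Tuple[int, int]]:
    """Return (initial_pc, final_pc) for a backward branch, else None."""
    if branch.target > branch.pc:
        return None  # forward branch: it does not close a loop
    initial_pc = branch.target            # backward branch target
    final_pc = branch.pc + INSTR_BYTES    # fall-through instruction
    return initial_pc, final_pc
```

For example, `loop_trace_points(Branch(pc=0x120, target=0x100))` gives the pair `(0x100, 0x124)`.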
Instruction Chaining (Ichaining) Trace Heuristic
Goal: to identify large sequences of dynamic instructions besides procedures and loops
A trace is identified by
- initial point
- final point
- behaviour of conditional branches within the trace
[Figure: example control flow graph with instructions I1 to I12 and three conditional branches with taken (T) and not-taken (NT) targets.]
1. Taken and not-taken targets of all conditional branches are considered as initial points of a trace (I2, I3, I7, I8, I9, and I10 in the example).
2. Given an initial point, a trace is extended by adding successive instructions (in the example, starting at I3 and continuing with I5).
3. Every time a conditional branch is found, the trace is split into two.
[Figure: the same control flow graph, with the example trace growing through I3, I5, I7, I11, and I12.]
4. The final point is reached if the new instruction already belongs to the trace, the trace reaches a maximum size, or the new instruction is an indirect jump (I12 in the example).
5. Live-output values are determined and their predictability is checked for every trace candidate (the highest of prediction accuracy and utilization degree).
6. A trace is considered predictable if the product of the percentages of all live-output values is above a certain threshold.
7. If not, the final instruction is removed and the process starts again (until the trace reaches a minimum size).
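Steps 5 to 7 can be sketched as a loop that trims a candidate trace from the end until its live outputs are jointly predictable. The default values 16 and 25% are the ones given on the Profiling Analysis Parameters slide; the function names and the `live_outputs` callback (which returns one score in [0, 1] per live-output value of a candidate) are hypothetical.

```python
from math import prod
from typing import Callable, List, Optional, Sequence

MIN_TRACE_SIZE = 16   # minimum size of trace
LO_THRESHOLD = 0.25   # threshold to consider a set of live outputs predictable


def live_output_score(accuracy: float, utilization: float) -> float:
    """Step 5: the highest of prediction accuracy and utilization degree."""
    return max(accuracy, utilization)


def trim_until_predictable(
    trace: Sequence[int],
    live_outputs: Callable[[Sequence[int]], List[float]],
    min_size: int = MIN_TRACE_SIZE,
    threshold: float = LO_THRESHOLD,
) -> Optional[List[int]]:
    """Steps 6 and 7: drop final instructions until the product of the
    live-output scores exceeds threshold; None if the trace gets too short."""
    candidate = list(trace)
    while len(candidate) >= min_size:
        if prod(live_outputs(candidate)) > threshold:
            return candidate
        candidate.pop()   # remove the final instruction and try again
    return None
```

Multiplying the scores treats the live outputs as independent predictions, so long traces with many live outputs need each individual value to be highly predictable.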
Trace Speculation Engine
Traces are communicated to the hardware at program loading time by filling a special hardware structure (trace table)
Each entry of the trace table contains
- initial PC
- final PC
- branch history
- live-output values information
- frequency counter
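A software model of the trace table might look as follows. The 128 entries and 4-way associativity come from the Simulation Parameters slide; the field types, indexing by initial PC, and the evict-least-frequent replacement policy are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

ENTRIES = 128
WAYS = 4
SETS = ENTRIES // WAYS   # 32 sets


@dataclass
class TraceEntry:
    initial_pc: int
    final_pc: int
    branch_history: int            # assumed: one bit per conditional branch
    live_outputs: Dict[str, int]   # assumed: destination -> predicted value
    frequency: int = 0


class TraceTable:
    def __init__(self) -> None:
        self.sets: List[List[TraceEntry]] = [[] for _ in range(SETS)]

    @staticmethod
    def set_index(pc: int) -> int:
        # Assumed indexing: drop the two alignment bits of a 4-byte instruction.
        return (pc >> 2) % SETS

    def insert(self, entry: TraceEntry) -> None:
        ways = self.sets[self.set_index(entry.initial_pc)]
        if len(ways) == WAYS:
            # Assumed policy: evict the least frequently used entry.
            ways.remove(min(ways, key=lambda e: e.frequency))
        ways.append(entry)

    def lookup(self, pc: int) -> List[TraceEntry]:
        """All traces starting at pc; several may share the same initial PC."""
        return [e for e in self.sets[self.set_index(pc)] if e.initial_pc == pc]
```

`lookup` returns a list because Ichaining traces that start at the same PC differ only in their branch history.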
Experimental Framework
Simulator: Alpha version of the SimpleScalar Toolset
Benchmarks: Spec2000, ref input
Maximum optimization level: DEC C & F77 compilers with -non_shared -O5
Statistics: collected for 250 million instructions, after skipping an initial part of 500 million
Simulation Parameters
Base microarchitecture
- out-of-order machine, 4 instructions per cycle
- I cache: 16KB, D cache: 16KB, L2 shared: 256KB
- bimodal predictor
TSMA additional structures
- each thread: I window, reorder buffer, register file
- speculative data cache: 1KB
- verification engine: up to 8 instructions per cycle
- trace table: 128 entries, 4-way set associative
- look ahead buffer: 128 entries
Profiling Analysis Parameters
Value Predictors: Stride & Context
Minimum size of trace: 16
Maximum size of trace: 1024
Maximum number of live-outputs: 32
Threshold to consider a set of LO predictable: 25%
Significant path (minimum frequency): 10%
Type of Speculated Instructions
[Figure: stacked bar chart, 0 % to 100 %, per benchmark (Ammp, Apsi, Crafty, Eon, Equake, Gcc, Mcf, Mesa, Mgrid, Sixtrack, Vortex, Vpr, A_Mean). Series: Loop Heuristic, Procedure Heuristic, Ichaining Heuristic.]
The share of speculated instructions that comes from procedure and loop traces is relatively low
But their sizes are significantly larger than those of Ichaining traces
Some statistics
- procedure trace size: 97.3
- loop trace size: 215.8
- Ichaining trace size: 36.4
- average size of speculated traces: 65.7
- average number of live output values: 16.4
- branches within a trace (Ichaining): 5.3
- traces with same initial PC (Ichaining): 1.57
Type of Speculations
[Figure: stacked bar chart, 0 % to 100 %, per benchmark (Ammp, Apsi, Crafty, Eon, Equake, Gcc, Mcf, Mesa, Mgrid, Sixtrack, Vortex, Vpr, A_Mean). Series: Spec KO, Path KO; Spec KO, Path OK; Spec OK, Path KO; Spec OK, Path OK.]
Correct speculations: up to 70%
- 65% for correctly predicted paths
- 7% for incorrectly predicted paths (positive misprediction)
Incorrect speculations: close to 30%
- 20% for correctly predicted paths
- 8% for incorrectly predicted paths
This confirms that the mechanism proposed to predict paths and final points provides significant accuracy
Speedup
[Figure: bar chart of speedup per benchmark (Ammp, Apsi, Crafty, Eon, Equake, Gcc, Mcf, Mesa, Mgrid, Sixtrack, Vortex, Vpr, A_Mean), y-axis from 1.00 to 1.45.]
Average speedup close to 38%
In spite of close to 30% of the speculations being incorrect
Type of Cycles of ST
[Figure: stacked bar chart, 0 % to 100 %, per benchmark (Ammp, Apsi, Crafty, Eon, Equake, Gcc, Mcf, Mesa, Mgrid, Sixtrack, Vortex, Vpr, A_Mean). Series: ST can not speculate; ST can speculate.]
25% of the time ST can speculate but does not find a trace to be speculated
- performance could be improved with further analysis
75% of the time ST can not speculate because NST is executing and verifying a speculated trace
- speculation may be performed only when NST catches up with ST
Type of Cycles of NST
[Figure: stacked bar chart, 0 % to 100 %, per benchmark (Ammp, Apsi, Crafty, Eon, Equake, Gcc, Mcf, Mesa, Mgrid, Sixtrack, Vortex, Vpr, A_Mean). Series: NST is verifying instructions; NST is executing instructions.]
65% of the time NST is executing traces speculated by ST
- more speculated instructions imply more time executing instructions
35% of the time NST is verifying instructions from the look ahead buffer
- verifying instructions is faster than executing them
Useless Cycles of ST
[Figure: bar chart, 0 % to 100 %, per benchmark (Ammp, Apsi, Crafty, Eon, Equake, Gcc, Mcf, Mesa, Mgrid, Sixtrack, Vortex, Vpr, A_Mean).]
Up to 20% of the time ST is executing instructions beyond the misspeculation point
- ST is wasting up to 20% of the time executing instructions that will be discarded
The ideal scenario would be one where this percentage is negligible
Branch Behaviour Distribution
[Figure: chart with x-axis values 50, 60, 70, 80, 90 and y-axis from 0 % to 100 %.]
The instruction chaining heuristic does not provide many traces with the same initial point
- despite the significant number of branches within a trace (5.3 on average)
The study concludes that the majority of branches almost always take the same direction
- close to 80% of the branches take the same direction more than 90% of the time
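A statistic of this kind could be derived from per-branch taken and not-taken counts gathered while profiling. The sketch below is an assumption about how such a number might be computed, not the authors' method: it counts static branches without weighting them by execution frequency, and `bias` and `share_biased` are hypothetical names.

```python
from typing import Dict, Tuple


def bias(taken: int, not_taken: int) -> float:
    """Fraction of executions that went in the branch's dominant direction."""
    return max(taken, not_taken) / (taken + not_taken)


def share_biased(counts: Dict[int, Tuple[int, int]], min_bias: float = 0.9) -> float:
    """Share of executed branches whose bias exceeds min_bias.

    counts maps a branch pc to its (taken, not_taken) profile counts.
    """
    executed = [c for c in counts.values() if sum(c) > 0]
    if not executed:
        return 0.0
    biased = sum(1 for t, nt in executed if bias(t, nt) > min_bias)
    return biased / len(executed)
```

With `min_bias=0.9`, a result near 0.8 would correspond to the observation reported on the slide.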
Conclusions
Profile guided analysis to support TSMA
- identifies large and highly predictable traces
- reducing hardware complexity
Three basic heuristics are proposed
- procedure trace heuristic
- loop trace heuristic
- instruction chaining heuristic
Results show a speedup of 38% with a misprediction rate of 30%
Future work
- aggressive trace level predictors
- generalization to multiple threads
Questions & Answers