GPU-accelerated Evaluation Platform for High Fidelity Networking Modeling
11 December 2007
Alex Donkers
Joost Schutte
Contents
Summary of the paper
Evaluation
Questions
Summary of the paper
Using commercial graphics cards to speed up the execution of network simulation models.
Network simulators:
• high-fidelity performance evaluation requires more detailed models
• more detailed models mean higher computation cost, motivating a speed-up technique
GPU = graphics processing unit
The gap in computational power between GPU and CPU is widening.
Computational power of GPU and CPU
(courtesy of Ian Buck, Stanford Univ.)
GPU superior because:
• stream processing model
• spatial parallelism
Necessities for GPU usage:
• identification of data parallelism in network simulations
• software abstraction
Goal: design an evaluation platform architecture for efficient utilisation of the computational processors of GPUs and CPU, memory, I/O and other resources available in commodity desktops.
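As an aside, the stream-processing model can be sketched in a few lines. NumPy stands in for the GPU here, and the kernel function is a made-up illustration, not taken from the paper:

```python
import numpy as np

# Stream-processing model: one arithmetic kernel is applied independently
# to every element of a large input stream. Because elements never
# interact, the work maps directly onto the GPU's spatially parallel units.
def kernel(x):
    # an illustrative arithmetic-intensive per-element computation
    return np.sin(x) * np.sqrt(np.abs(x)) + 1.0

stream = np.linspace(0.0, 10.0, 1_000_000)

# Sequential (CPU-style) execution: one element at a time.
seq = np.array([kernel(v) for v in stream[:4]])

# Data-parallel (GPU-style) execution: the whole stream in one call.
par = kernel(stream)
```

Both paths give identical results precisely because the elements are independent, which is what lets the computation be distributed across many GPU fragment processors.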
Commodity desktops can be equipped with multiple GPUs; with nVidia SLI technology, several GPUs fit in a single system.
Suitability for different types of computation:
• CPU = high performance on a single thread of execution
• GPU = many more arithmetic units; extremely high data-parallel and instruction-parallel execution
Evaluating high-fidelity network models involves:
• task-parallel computation on multiple CPUs
• data-parallel computation on GPUs
Features necessary for GPU acceleration:
• highly data-parallel
• arithmetic-intensive
The power of GPUs is shown by implementing two cases from a network environment on both CPU and GPU. Compared are the speed and accuracy of the simulation results.
Two cases:
• Fluid-flow-based TCP model = predicts the traffic dynamics at active queue management routers.
• Adaptive antenna model = calculates the weights of the beamformer in the direction minimizing mean squared error.
Fluid-flow-based TCP model
• TCP flows and active queue management routers are modelled with stochastic differential equations
• The stochastic differential equations are transformed into ordinary differential equations (ODEs) for CPU use
• The CPU-based implementation uses an ODE solver, ODE45, provided in Matlab
• The GPU implementation maps all data structures in the CPU to on-board memory in the GPU
Fluid-flow-based TCP model
• The time-varying state of the routers requires periodic recomputation by the ODE solvers
• The execution speed of the model is significantly affected by the execution speed of the ODE solvers
• Implementing the ODE solver on the GPU can significantly increase the size of the network that can be evaluated
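To make the ODE formulation concrete, here is a minimal sketch of a fluid-flow TCP/AQM model solved with a fixed-step RK4 integrator vectorized over all flows. The delay terms, the RED controller, and every constant below are simplified illustrations, not the paper's actual model; the point is that one vectorized solver advances every flow at once, which is exactly the data parallelism the GPU exploits:

```python
import numpy as np

# Simplified fluid-flow TCP/AQM model: the state vector holds N
# congestion windows W plus one shared queue length q. All parameters
# (link capacity C, marking threshold q_ref, propagation delay a)
# are illustrative.
def derivs(y, C=1000.0, q_ref=100.0, a=0.05):
    W, q = y[:-1], y[-1]
    R = a + q / C                          # RTT = propagation + queueing delay
    p = min(q / (2.0 * q_ref), 1.0)        # toy marking/drop probability
    dW = 1.0 / R - 0.5 * (W / R) * W * p   # AIMD window dynamics
    dq = W.sum() / R - C                   # arrival rate minus link capacity
    return np.concatenate([dW, [dq]])

# Classical 4th-order Runge-Kutta step over the whole state vector.
def rk4_step(y, h):
    k1 = derivs(y)
    k2 = derivs(y + 0.5 * h * k1)
    k3 = derivs(y + 0.5 * h * k2)
    k4 = derivs(y + h * k3)
    y = y + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    y[-1] = max(y[-1], 0.0)                # queue length cannot go negative
    return y

n_flows = 256
y = np.concatenate([np.ones(n_flows), [0.0]])
for _ in range(2000):
    y = rk4_step(y, 1e-3)
```

Every flow's window obeys the same update rule on different data, so on a GPU each window would be handled by its own fragment processor, with only the queue reduction requiring a gather step.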
Adaptive antenna model
• Recursively updates the weights of the beamformers in the direction minimizing mean squared error (MSE)
• The recursive least squares (RLS) algorithm is used
• The data layout and operations on arrays of complex numbers are implemented in the GPU
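A minimal sketch of the RLS weight recursion the slide describes, using complex arrays as the GPU implementation must. The array size, forgetting factor, and noiseless reference signal are illustrative assumptions, not the paper's setup:

```python
import numpy as np

# Recursive least squares (RLS) beamformer update: the weights w are
# moved in the direction minimizing the mean squared error between the
# array output w^H x and a desired reference signal d.
rng = np.random.default_rng(0)
M = 8                                   # antenna elements (illustrative)
lam = 0.99                              # forgetting factor
w = np.zeros(M, dtype=complex)
P = np.eye(M, dtype=complex) * 100.0    # inverse correlation estimate

w_true = rng.standard_normal(M) + 1j * rng.standard_normal(M)
for _ in range(500):
    x = rng.standard_normal(M) + 1j * rng.standard_normal(M)  # array snapshot
    d = np.vdot(w_true, x)              # noiseless desired output w_true^H x
    # standard RLS recursion
    Px = P @ x
    k = Px / (lam + np.vdot(x, Px))     # gain vector k = Px / (lam + x^H P x)
    e = d - np.vdot(w, x)               # a-priori estimation error
    w = w + k * np.conj(e)              # weight update
    P = (P - np.outer(k, np.conj(x)) @ P) / lam
```

Every quantity here is a dense complex vector or matrix operation, which is why the paper's main engineering task is laying out complex-number arrays in GPU texture memory.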
Evaluation
Strong points
Weak points
Simulation models
Conclusion & Future work
Strong Points
• Highly data-parallel
• Arithmetic-intensive
Weak Points
• Processes consisting largely of sequential operations
• Processes requiring bit-wise operations
Solution: use a DSP platform
Real-time simulation
Evaluation of simulation models
Hardware platform: Dell Dimension desktop
• Intel (dual-core) 3 GHz Pentium 4 CPU
• 1 GB DDR2 memory
• nVidia GeForce 7900GTX with 512 MB texture memory
Vertex & fragment programs:
• programmed with OpenGL and GLSL
Simulation models
Differences between GPU- and CPU-based simulation for the fluid-flow-based TCP model:
• Difference in prediction of traffic dynamics
• Difference in execution time: the GPU outperforms the CPU with 256 flows & 256 queues or more, because of the larger number of iterations in the GPU-based ODE solver
Normalized ODE solver evaluation time
Simulation models
Adaptive antenna model
• The GPU-based simulation runs faster than the CPU-based one when the antenna array size exceeds 256
• The execution time of the GPU-based implementation decreases linearly with the number of sub-carriers, due to parallel processing
Simulation execution times
Conclusions & Future work
• GPUs can achieve a speedup of 10x without loss of accuracy
• High-fidelity network simulations can be accelerated by parallel use of CPU & GPU units
• Future work: integrate GPU-implemented modules into an existing simulation-based network evaluation platform
Questions?