jaguar: a next-generation low-power x86-64 core
DESCRIPTION
Jaguar: A Next-Generation Low-Power x86-64 Core. University of Tehran, School of Electrical and Computer Engineering. Provided by: Ali Teymouri. Based on the article “Jaguar: A Next-Generation Low-Power x86-64 Core”. Course: Custom Implementation of DSP Systems.
TRANSCRIPT
1
Jaguar: A Next-Generation Low-Power x86-64 Core
Provided by: Ali Teymouri
Based on the article “Jaguar: A Next-Generation Low-Power x86-64 Core”
Course: Custom Implementation of DSP Systems
University of Tehran
School of Electrical and Computer Engineering
2
Outline
– Introduction
– Motivation
– Comparing the two cores
– Architecture
– Improvements
– Conclusion
3
List of AMD microprocessors [5]
1 AMD-originated architectures
1.1 Am2900 series (1975)
1.2 29000 (29K) (1987–95)
2 Non-x86 architecture processors
2.1 2nd source (1974)
2.2 2nd source (1982)
3 x86 architecture processors
3.1 2nd source (1979–91)
3.2 Am X86 series (1991–95)
3.3 K5 architecture (1995)
3.4 K6 architecture (1997–2001)
3.5 K7 architecture (1999–2005)
3.6 K8 core architecture
3.7 K10 core architecture
3.8 Bulldozer module architecture
3.9 Bobcat core architecture
4
Bobcat
– Core power gating and a microarchitecture optimized for low power
– Designed for mobile and tablet devices to address specific customer demands
– 4.5–18 W power range
Bobcat low-power core [2]
5
Jaguar core
Bobcat low-power core [4]
Jaguar core [4]
6
Jaguar CU (compute unit)
– First AMD 28nm quad-core x86-64
– A build unit that can be deployed into a wide variety of SoCs for different applications
– Spans a wide array of applications, from sub-5W to 25W
Jaguar CU [4]
7
Motivation for Jaguar
[1]
Build an SoC to fit a range of markets:
– Tablet, hybrids
– Value notebook
– Ultrathin notebook
– Value desktop
8
Comparing the two cores
[1]
9
Architecture
– Improved IPC, frequency and power over Bobcat (BT)
– Estimated typical IPC improvement over “Bobcat”: >15%*
– Redesigned load-store unit
– 4x32B instruction cache loop buffer for power
– Improved instruction cache prefetcher for IPC
– Added L2 prefetcher
– Added hardware integer divider
– Improved C6 and CC6 entry/exit latencies
– Clock gating of >92% of flops in typical applications
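As a rough illustration of what the IPC figure above means for performance, the sketch below uses the standard relation that performance scales as IPC times clock frequency. The 15% IPC gain is the lower bound quoted on the slide; the frequency ratios are hypothetical placeholders and are not taken from the source.

```python
def relative_performance(ipc_gain: float, freq_ratio: float) -> float:
    """Speedup over the baseline core, assuming performance ~ IPC x frequency."""
    return (1.0 + ipc_gain) * freq_ratio


# 15% IPC gain (lower bound from the slide) at an unchanged clock frequency.
same_clock = relative_performance(0.15, 1.0)

# Hypothetical: the same IPC gain combined with a 10% higher clock.
faster_clock = relative_performance(0.15, 1.10)

print(f"same clock: {same_clock:.3f}x, faster clock: {faster_clock:.3f}x")
```

Under this simple model the IPC and frequency gains multiply, which is why the slide lists them together.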
10
Architecture
– The Jaguar (JG) core is optimized at two main frequency targets, low and high voltage, giving the core a dynamic range for application in several markets
– 3-Vt solution: HVT/RVT/LVT
– Longer lengths for each Vt
– BT had a 10-metal stack; JG uses an 11-metal stack
[1]
11
High Speed Flop
– Custom-built flip-flops [4] maximize performance over traditional master-slave flops
– Larger flops consume more dynamic power
– To minimize the power and area impact, they are inserted only in critical paths
[1]
– Custom flops account for < 8%
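The idea of inserting the faster flops only on critical paths can be sketched as a simple selection by timing slack. This is an illustration, not the design flow used for Jaguar: the `Flop` record, the slack values, and the `pick_high_speed_flops` helper are all hypothetical, and the sketch assumes the < 8% figure is a share of the total flop count, which the slide does not state.

```python
from dataclasses import dataclass


@dataclass
class Flop:
    name: str
    slack_ps: float  # slack of the worst path through this flop (made-up data)


def pick_high_speed_flops(flops, max_fraction=0.08):
    """Return names of flops to swap for the larger, faster variant.

    Only flops with negative slack are candidates. They are taken worst
    slack first, and at most max_fraction of all flops are selected.
    """
    budget = int(len(flops) * max_fraction)
    critical = sorted(
        (f for f in flops if f.slack_ps < 0),
        key=lambda f: f.slack_ps,
    )
    return [f.name for f in critical[:budget]]


# 60 hypothetical flops whose slack falls from +50 ps to -68 ps.
flops = [Flop(f"ff{i}", slack_ps=50.0 - 2.0 * i) for i in range(60)]
chosen = pick_high_speed_flops(flops)
print(chosen)
```

Capping the selection keeps the dynamic-power and area cost of the larger flops bounded even when many paths fail timing.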
12
CU Level Clock Distribution
– Matched clock delay to all endpoints to minimize latency
– Extensive clock gating
– Each unit’s clock is independently gated to reduce dynamic power
[1]
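To see why gating clocks reduces dynamic power, the sketch below applies the usual switching-power relation P = C · V² · f to the clock pins of the flops. It is a first-order model only. The flop count, clock-pin capacitance, supply voltage, and frequency are hypothetical; the 92% gated fraction is the figure quoted on the first Architecture slide.

```python
def clock_pin_power(n_flops, gated_fraction, c_clk_farads, vdd, freq_hz):
    """Dynamic power at the flop clock pins, P = N_active * C * V^2 * f.

    Gated flops see no clock edges, so only the ungated share contributes.
    """
    active = n_flops * (1.0 - gated_fraction)
    return active * c_clk_farads * vdd ** 2 * freq_hz


# Hypothetical design: 100k flops, 1 fF clock-pin load, 1.0 V, 1 GHz.
ungated = clock_pin_power(100_000, 0.0, 1e-15, 1.0, 1e9)
gated = clock_pin_power(100_000, 0.92, 1e-15, 1.0, 1e9)
saving = 1.0 - gated / ungated

print(f"ungated: {ungated:.3f} W, gated: {gated:.3f} W, saving: {saving:.0%}")
```

In this model the saving equals the gated fraction; a real design also spends power in the clock tree and the gating cells themselves, which the sketch ignores.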
13
Power Gating
– Integrated power gating
– Headers have 4 independent enables to Longer lengths for each Vt
– Diagram showing highlighted headers within the JG core
– Area overhead is ~3%
[1]
[1]
14
Conclusion
– “Jaguar” is the first AMD 28nm bulk CPU
– Quad core with shared L2
– Supports a wide range of applications
– Low power, with a focus on high density and smaller chip area
– Improved IPC, frequency and power over BT
– Worthy successor to the “Bobcat” x86-64 core
15
References
[1] T. Singh, J. Bell, S. Southard, “Jaguar: A Next-Generation Low-Power x86-64 Core,” in 2013 IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 17–21, 2013, section 3.
[2] D. Foley, P. Bansal, D. Cherepacha, R. Wasmuth, A. Gunasekar, S. Gutta, A. Naini, “A Low-Power Integrated x86-64 and Graphics Processor for Mobile Computing Devices,” IEEE Journal of Solid-State Circuits, vol. 47, no. 1, January 2012.
[3] www.hitechreview.com
[4] www.semiaccurate.com
[5] www.wikipedia.org