JVM virtual method invoking optimization based on CAM table

"Decide our own destiny; innovation creates the future" www.loongson.cn

Songsong Cai
Institute of Computing Technology, Chinese Academy of Sciences
28/7/2011



Outline

Introduction

Monomorphic Inline Caching in HotSpot

Hardware Design of CAM Used in Virtual Method Call

SW/HW Co-design Virtual Method Invoking Mechanism

Experimental Results and Analysis

Conclusions

References


Methods in Java Programming Language (1)

Class (static) method
- A class method does not require an instance.
- Class methods use static binding.
- When the JVM calls a class method, it selects the target from the declared type of the object reference, which is usually known at compile time.


Methods in Java Programming Language (2)

Instance (virtual) method
- An instance method needs an instance.
- Instance methods use dynamic binding: the virtual machine selects the target from the actual class of the object, which is known only at run time.
- The type information becomes available only when the JVM reaches the call site.
- Dynamic resolution is generally translated into an indirect jump, which often stalls the pipeline.
- Instance methods account for a large share of calls: in SPECjvm98, a virtual method call occurs every 12-40 bytecodes.
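The difference between the two binding rules can be seen in a few lines of Java. This is a minimal sketch, not taken from the slides: the class names `BindingDemo`, `Animal` and `Dog` and the helper methods are made up for illustration.

```java
public class BindingDemo {
    static class Animal {
        // Class method: the target is chosen from the declared type.
        static String kind() { return "animal"; }
        // Instance method: the target is chosen from the receiver's class at run time.
        String sound() { return "generic sound"; }
    }

    static class Dog extends Animal {
        // Hides Animal.kind(); static methods are never overridden.
        static String kind() { return "dog"; }
        @Override
        String sound() { return "woof"; }
    }

    // Static binding: the parameter is declared as Animal, so Animal.kind() runs.
    static String kindOf(Animal a) { return a.kind(); }

    // Dynamic binding: the class of the object behind 'a' decides which sound() runs.
    static String soundOf(Animal a) { return a.sound(); }

    public static void main(String[] args) {
        Animal a = new Dog();
        System.out.println(kindOf(a));   // animal
        System.out.println(soundOf(a));  // woof
    }
}
```

Calling a static method through an instance, as `kindOf` does, is legal but unusual Java; it is used here only to make the compile-time choice visible.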


Types of method invocation in Java

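- invokestatic: invokes a static (class) method. The constant pool entry contains a symbolic reference to the target method; the JVM pops the arguments and executes the target method.
- invokevirtual: invokes a virtual (instance) method. The JVM must pop the object reference and the arguments before executing the target method.
- invokespecial: invokes instance methods that need special handling, such as constructors and superclass methods called through super; the target does not depend on the run-time class of the object.
- invokeinterface: invokes a method declared in an interface; as with invokevirtual, the target depends on the run-time class of the object.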
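As an illustration, the Java source below contains one call of each kind; the comments name the bytecode instruction that javac typically emits for that call. The class and method names are hypothetical and serve only as a sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class InvokeKinds {
    static int twice(int x) { return 2 * x; }

    static class Base {
        int id() { return 1; }
    }

    static class Derived extends Base {
        @Override
        int id() { return super.id() + 1; }      // super call: invokespecial
    }

    static int demo() {
        int a = twice(3);                        // invokestatic
        Base b = new Derived();                  // constructor <init>: invokespecial
        int c = b.id();                          // invokevirtual, resolved to Derived.id()
        List<Integer> list = new ArrayList<>();
        list.add(a);                             // receiver declared as interface List: invokeinterface
        return a + c + list.size();              // 6 + 2 + 1
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Running `javap -c InvokeKinds` on the compiled class is one way to inspect the emitted instructions.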


The percentage of virtual method invocation

[Figure: bar chart, vertical axis from 0% to 100%, with two series: all method invocation and virtual method invocation.]

The percentage of virtual method invocation in the SPECjvm2008 benchmark


Related Work : Inline Caching

Origin
- The receiver type at a given call site does not change frequently.
- Exploiting this locality, the type can be cached at the call site.

Kinds
- Monomorphic inline caching
  - Stores the target method and the corresponding type value inline at the call site.
  - On each virtual method call, it compares the type values and jumps to the cached target method instead of searching among the many possible targets of the virtual method.
- Polymorphic inline caching
  - Several types and their target methods are recorded at the same call site.
  - The type of the current call is compared with the stored types in turn until a match is found; the program then jumps to the corresponding target method.
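The two kinds of cache can be modelled in software to see how they behave on the same sequence of receivers. The sketch below is an illustration under stated assumptions, not HotSpot's implementation: it only counts hits and misses instead of jumping to machine code, and the class names and the polymorphic capacity of 4 are invented for the example.

```java
public class InlineCacheSketch {
    /** Monomorphic: a single cached receiver type per call site. */
    static class MonomorphicSite {
        private Class<?> cachedType;   // type value stored at the call site
        int hits, misses;

        void call(Object receiver) {
            if (receiver.getClass() == cachedType) {
                hits++;                            // type check passes: use the cached target
            } else {
                misses++;                          // full resolution would run here
                cachedType = receiver.getClass();  // re-cache the newly seen type
            }
        }
    }

    /** Polymorphic: several cached types, compared in turn. */
    static class PolymorphicSite {
        private final Class<?>[] cachedTypes;
        private int used;
        int hits, misses;

        PolymorphicSite(int capacity) { cachedTypes = new Class<?>[capacity]; }

        void call(Object receiver) {
            Class<?> type = receiver.getClass();
            for (int i = 0; i < used; i++) {
                if (cachedTypes[i] == type) { hits++; return; }
            }
            misses++;
            if (used < cachedTypes.length) cachedTypes[used++] = type;
        }
    }

    public static void main(String[] args) {
        Object[] receivers = {"a", "b", "c", 1, "d"};
        MonomorphicSite mono = new MonomorphicSite();
        PolymorphicSite poly = new PolymorphicSite(4);
        for (Object r : receivers) { mono.call(r); poly.call(r); }
        System.out.println("mono hits=" + mono.hits + " misses=" + mono.misses); // 2 and 3
        System.out.println("poly hits=" + poly.hits + " misses=" + poly.misses); // 3 and 2
    }
}
```

A single Integer receiver in the middle of the String receivers costs the monomorphic site two misses (one to switch away, one to switch back), while the polymorphic site pays only once.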


Shortcomings of Inline Caching

- Monomorphic inline caching: it cannot handle a call site at which methods of several different types are called frequently.
- Polymorphic inline caching: it solves the problem above, but its complex implementation incurs additional cost.


Solution: Software and Hardware co-design

Hardware (CAM table)
- We design and implement a CAM (content-addressable memory) table to index and look up the target of a virtual method call.
- The CAM table is implemented in hardware and managed by software.

Software (efficient algorithm)
- With the CAM table, virtual method invocation is optimized so that the target method can be resolved easily.
- The program can jump to the target method directly rather than resolve the virtual method dynamically at run time.


Thesis Contributions

System architecture
- We present a Java Virtual Machine system with high-performance virtual method invocation. The JVM is simple but efficient.

CAM lookup table
- We design and implement a hardware CAM lookup table that helps resolve virtual methods, so the target method can be resolved easily.

Efficient algorithm for virtual method invocation
- Using the CAM table, we present a virtual method invoking algorithm based on software and hardware co-design. The algorithm achieves relatively high performance on virtual method invocation.

Monomorphic Inline Caching in HotSpot


The Virtual Method Invocation in HotSpot

HotSpot
- The core of the open-source project OpenJDK 6
- Standard and stable
- Virtual method invocation uses optimized monomorphic inline caching


The State Transition of Monomorphic Inline Caching

[Figure: state diagram with three states: uninitialized, monomorphic and polymorphic. Transition labels: "dynamic resolution succeeds the first time", "the type comparison misses at the call site", "the type comparison misses many times". Further label: "number of virtual method calls".]


A bad case

    var values = [1, "a", 2, "b", 3, "c", 4, "d"];
    for (var i in values) {
        document.write(values[i].toString());
    }

Here the receiver type alternates on every iteration, so the call site always invokes a target of a different type than the cached one, and the performance loss can be very large.

Although such an extreme case is rare, the receiver type at a virtual call site changes quite commonly, so the overhead of virtual method calls causes a serious performance loss.
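The same pattern can be written in Java, the language the rest of the talk targets. The sketch below is a hypothetical analogue of the loop above: it performs the virtual `toString()` call and counts how many calls see a receiver class different from the previous one, which is the situation a single-entry cache cannot absorb. The class and method names are invented.

```java
public class BadCaseDemo {
    /** Counts calls whose receiver class differs from that of the previous call. */
    static int typeChanges(Object[] values) {
        int changes = 0;
        Class<?> previous = null;               // nothing cached before the first call
        StringBuilder out = new StringBuilder();
        for (Object v : values) {
            out.append(v.toString());           // the virtual call site
            if (v.getClass() != previous) changes++;
            previous = v.getClass();
        }
        return changes;
    }

    public static void main(String[] args) {
        Object[] values = {1, "a", 2, "b", 3, "c", 4, "d"};
        System.out.println(typeChanges(values) + " of " + values.length
                + " calls see a new receiver type");
    }
}
```

For the alternating Integer and String array, every one of the eight calls sees a type change.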

Hardware Design of CAM Used in Virtual Method Call


The Structure of CAM Table

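The CAM table has 64 entries, indexed 0 to 63. Each entry has three fields:

Field       Width
ASID        8
CAM_value   40
RAM_value   64

The table is searched with the PC of the current method call instruction XOR the type of the method.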


Operating Instructions of CAM Table

Instructions
- CAMPI: look up the CAM table by index
- CAMPV: look up the CAM table by value
- CAMWI: write the CAM table by index

Usage
- Every CAM entry can be written with the instruction CAMWI, and the RAM value can be read with the instruction CAMRI.
- The instructions CAMPI and CAMPV are used to look up the CAM.
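To make the table structure and the instructions concrete, here is a small software model. It is a sketch under several assumptions that the slides do not confirm: the field widths are taken to be bits, a valid bit is added, CAMPV is assumed to return the matching index (or -1), and RAM_value is treated as an opaque 64-bit payload. CAMPI is left out because the slides do not say what it returns. The Java method names simply mirror the instruction mnemonics.

```java
public class CamTableModel {
    static final int ENTRIES = 64;
    static final long CAM_MASK = (1L << 40) - 1;   // assumption: CAM_value is 40 bits wide

    private final int[] asid = new int[ENTRIES];
    private final long[] camValue = new long[ENTRIES];
    private final long[] ramValue = new long[ENTRIES];
    private final boolean[] valid = new boolean[ENTRIES];  // assumption: not shown on the slide

    /** Search key from the slide: PC of the call instruction XOR type of the method. */
    static long key(long callSitePc, long typeId) {
        return (callSitePc ^ typeId) & CAM_MASK;
    }

    /** Models CAMWI: write one entry, selected by index. */
    void camwi(int index, int asidValue, long cam, long ram) {
        asid[index] = asidValue & 0xFF;            // assumption: ASID is 8 bits wide
        camValue[index] = cam & CAM_MASK;
        ramValue[index] = ram;
        valid[index] = true;
    }

    /** Models CAMPV: search by value; returns the matching index or -1. */
    int campv(int asidValue, long cam) {
        // Hardware would compare all entries in parallel; software scans them.
        for (int i = 0; i < ENTRIES; i++) {
            if (valid[i] && asid[i] == (asidValue & 0xFF)
                    && camValue[i] == (cam & CAM_MASK)) {
                return i;
            }
        }
        return -1;
    }

    /** Models CAMRI: read the RAM value stored at an index. */
    long camri(int index) { return ramValue[index]; }

    public static void main(String[] args) {
        CamTableModel cam = new CamTableModel();
        long k = key(0x1000L, 0x0F0FL);
        cam.camwi(5, 1, k, 0xCAFEL);
        int hit = cam.campv(1, k);
        System.out.println("index=" + hit + " ram=0x" + Long.toHexString(cam.camri(hit)));
    }
}
```

Including the ASID in the match keeps entries written by one address space from hitting in another, which is the usual purpose of such a field.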


Evaluation of CAM Entry Number

[Figure: bar chart comparing CAM tables with 16, 32, 64 and 128 entries; vertical axis from 0.00% to 100.00%.]

SW/HW Co-design Virtual Method Invoking Mechanism


Flow Diagram of the Virtual Method Invoking Mechanism

1. Form the key (PC of the call site XOR type of the method) and look up the CAM.
2. CAM hit: jump to the target method and execute.
3. CAM miss: perform the type comparison of the inline cache mechanism in software, as in HotSpot.
   - Hit: (1) jump to the target method and execute; (2) fill the CAM table.
   - Miss: use basic dynamic method invocation. When resolution succeeds: (1) jump to the target method and execute; (2) refill the inline cache at the call site.
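The flow can be sketched as a small software model of one call site. This is an illustration only: a `HashMap` stands in for the 64-entry hardware CAM (so there is no eviction), type ids are assigned sequentially rather than taken from real class pointers, and the model reports which path a call takes instead of jumping to a target. All names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class CoDesignDispatch {
    enum Path { CAM_HIT, INLINE_CACHE_HIT, DYNAMIC_RESOLUTION }

    private final long callSitePc;
    private final Map<Long, Class<?>> cam = new HashMap<>();       // stand-in for the CAM table
    private final Map<Class<?>, Long> typeIds = new HashMap<>();   // hypothetical type ids
    private Class<?> inlineCachedType;                             // inline cache of this site

    CoDesignDispatch(long callSitePc) { this.callSitePc = callSitePc; }

    private long typeId(Class<?> type) {
        Long id = typeIds.get(type);
        if (id == null) {
            id = (long) typeIds.size() + 1;
            typeIds.put(type, id);
        }
        return id;
    }

    Path invoke(Object receiver) {
        Class<?> type = receiver.getClass();
        long key = callSitePc ^ typeId(type);      // PC of the call site XOR type
        if (cam.containsKey(key)) {
            return Path.CAM_HIT;                   // jump to the target and execute
        }
        if (type == inlineCachedType) {
            cam.put(key, type);                    // execute, then fill the CAM table
            return Path.INLINE_CACHE_HIT;
        }
        inlineCachedType = type;                   // resolve, execute, refill the inline cache
        return Path.DYNAMIC_RESOLUTION;
    }

    public static void main(String[] args) {
        CoDesignDispatch site = new CoDesignDispatch(0x4000L);
        Object[] receivers = {"a", "b", "c", 1, 1, "d", 2};
        for (Object r : receivers) {
            System.out.println(r + " -> " + site.invoke(r));
        }
    }
}
```

Once both types have been seen twice, further calls hit in the CAM even when the receiver type keeps switching, which is the case the monomorphic inline cache alone cannot handle.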


Comparison between These Three Dynamic Resolutions

- X.foo() with CAM: CAM hit, then foo(){...}
- X.foo() with inline cache: CAM miss, check typeof(x) != cached(x)?, then foo(){...}
- X.foo() with the basic dynamic method resolution: dynamic method invocation

Experimental Results and Analysis


Evaluation Platform

Software
- HotSpot

Hardware
- Loongson-3 processor
- 4-core high-performance general-purpose processor
- The CAM table is implemented in the processor.
- Adding the CAM table increases the total processor area by less than 5‰ and the power consumption by less than 1‰, so the cost is negligible.


Virtual_Test Evaluation (1)

    public class InlineCache {
        // Six slots hold three receiver types, two adjacent slots per type.
        static Animal[] animal = new Animal[8];
        static {
            animal[0] = new Animal();
            animal[1] = animal[0];
            animal[2] = new Dog();
            animal[3] = animal[2];
            animal[4] = new Cat();
            animal[5] = animal[4];
        }

        public static void main(String[] argv) {
            int i = 0;
            while (i++ < 1000000) {
                run(i % 6);
            }
        }

        // The virtual call site under test.
        public static void run(int i) {
            int ret;
            ret = animal[i].run();
        }
    }

    class Animal {
        public Animal() { }

        public int run() {
            int a = 12;
            // The println follows the return on the slide and is unreachable, so it is commented out.
            // System.out.println("animal is running.");
            return a;
        }
    }

    class Dog extends Animal {
        public Dog() { super(); }

        public int run() {
            int b = 0;
            int c = 23;
            int d = b + c;
            // System.out.println("dog is running.");
            return d;
        }
    }

    class Cat extends Animal {
        public Cat() { super(); }

        public int run() {
            int a = 345;
            int b = a * 12;
            // System.out.println("cat is running.");
            return b;
        }
    }


Virtual_Test Evaluation (2)

Virtual test                     Original   Optimized
Inline cache hit rate            13.3%      76.4%
Run time of program (seconds)    36.098     30.250
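As a consistency check, the two run times in the table give the relative improvement directly. The symbols t_original and t_optimized are notation introduced here for the two measured run times; the result matches the 16.2% figure quoted in the Conclusions.

```latex
\[
\frac{t_{\text{original}} - t_{\text{optimized}}}{t_{\text{original}}}
  = \frac{36.098 - 30.250}{36.098}
  = \frac{5.848}{36.098}
  \approx 0.162 = 16.2\%
\]
```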


SPECjvm98 Evaluation

[Figure: bar chart of SPECjvm98 results, vertical axis from 0.00% to 160.00%, with four series: slowest before optimization, fastest before optimization, slowest after optimization, fastest after optimization.]



Conclusions

Problem
- The performance loss caused by dynamic method resolution at virtual method calls has long been an important reason for the poor performance of Java.

Solution
- Design and implement a hardware CAM lookup table.
- Present a virtual method call mechanism based on hardware and software co-design.

Performance improvement
- When multiple target types occur frequently at the same call site, the hit rate rises from 13.3% to 76.4% and the performance of the program improves by 16.2%.
- The performance of SPECjvm98 improves by 6.4% on average.


No. 10 Kexueyuan South Road, Zhongguancun, Haidian District, Beijing 100190, China

Thanks!