Exposing Behavioral Differences in Cross-Language API Mapping Relations

Hao Zhong (Institute of Software, CAS, China), Suresh Thummalapenta (IBM Research, India), Tao Xie (NC State University, USA)


Post on 19-Jan-2016



Motivation

Many programming languages have been introduced over the decades.

Business requirements force companies to release applications in multiple languages.
  E.g., Lucene and WordNet have both Java and C# variants.

Three major reasons for developing variants in multiple languages:
  For API libraries, to attract a large number of programmers
  For stand-alone applications, to acquire specific features of the underlying languages
  For mobile applications, to support multiple platforms

Trends in Developing Variants

Develop in one language and translate to other languages
  Example applications: Lucene.Net and Db4o
  Advantage: significant reduction of effort

Many translation tools already exist
  E.g., Java2CSharp, Net2Java
  Key idea: replace APIs of one language with their corresponding APIs in another language via API mapping relations

What Are API Mapping Relations?

Associate APIs of one language with APIs of the other language

Help translate code from one language to the other language

Problem

Mapped APIs can have behavioral differences.
  Differences among outputs or exceptions being thrown
  Such differences lead to defects in translated code

An example from the Lucene project: the Substring API
  Java: 2nd parameter represents the end index
  C#: 2nd parameter represents the number of characters
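The Substring mismatch can be reproduced in Java alone. The sketch below is illustrative and not taken from Lucene: `csharpStyleSubstring` is a hypothetical helper that emulates the C# convention (start index plus character count), so a naive one-to-one mapping can be compared with a mapping that converts the end index into a length.

```java
final class SubstringMappingDemo {
    // Hypothetical helper: emulates the C# convention, where the second
    // argument is a character count rather than an end index.
    static String csharpStyleSubstring(String s, int start, int length) {
        return s.substring(start, start + length);
    }

    // A naive one-to-one mapping passes the Java arguments through unchanged.
    static String naiveTranslation(String s, int begin, int end) {
        return csharpStyleSubstring(s, begin, end);
    }

    // A behavior-preserving mapping converts the end index into a length.
    static String correctedTranslation(String s, int begin, int end) {
        return csharpStyleSubstring(s, begin, end - begin);
    }

    public static void main(String[] args) {
        String text = "behavioral";
        System.out.println(text.substring(2, 5));              // Java original
        System.out.println(naiveTranslation(text, 2, 5));      // differs from the original
        System.out.println(correctedTranslation(text, 2, 5));  // matches the original
    }
}
```

The naive mapping still compiles and runs, which is why this kind of difference tends to surface only when outputs are compared.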

Goals of Our Study

Are such behavioral differences pervasive?

What types of behavioral differences are there?

What types of differences are more common than others?

Are these differences easy to resolve?

Challenges

Mapping relations are not available explicitly and take a long time to write manually.

Extraction from tools: translation tools use different formats for specifying API mapping relations.

Extraction from translated code: applications under translation may not cover the APIs of interest.

Extraction from translated code: translated code typically has compilation errors, so it is not feasible for testing.

Major Contributions

A tool chain, called TeMAPI, that detects behavioral differences among API mapping relations

Empirical results showing that
  Behavioral differences are pervasive
  8 findings on exposed behavioral differences and implications for API-library implementers and users

Behavioral differences indicating defects in translation tools; 4 defects were confirmed by developers

Outline

Motivation
Study Setup
Empirical Results
Conclusion

Study Setup

Subject libraries

Includes two major steps

Step 1: Extract Mapping Relations

Create a wrapper for each API method in one language

Apply translation tools on the wrapper

Extract the mapping relation from the original and translated wrappers
  Ignore a mapping relation if the translated wrapper does not compile
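The slides do not show what a wrapper looks like, so the following is only a rough sketch of the idea, not TeMAPI's implementation. It uses Java reflection to emit the source of a one-method wrapper class for a public static API method. The class name `WrapperGenerator`, the `WrapperN` naming scheme, and the `call` method name are all assumptions made for illustration. Instance methods and checked exceptions are not handled.

```java
import java.lang.reflect.Method;

final class WrapperGenerator {
    // Emits the source text of a hypothetical wrapper class that forwards
    // its parameters to one public static API method.
    static String wrapperSource(Method m, int index) {
        StringBuilder params = new StringBuilder();
        StringBuilder args = new StringBuilder();
        Class<?>[] types = m.getParameterTypes();
        for (int i = 0; i < types.length; i++) {
            if (i > 0) {
                params.append(", ");
                args.append(", ");
            }
            params.append(types[i].getCanonicalName()).append(" p").append(i);
            args.append("p").append(i);
        }
        String owner = m.getDeclaringClass().getCanonicalName();
        String ret = m.getReturnType().getCanonicalName();
        String call = owner + "." + m.getName() + "(" + args + ")";
        // void methods are called as statements; others return the result.
        String body = ret.equals("void") ? call + ";" : "return " + call + ";";
        return "public class Wrapper" + index + " {\n"
             + "    public static " + ret + " call(" + params + ") {\n"
             + "        " + body + "\n"
             + "    }\n"
             + "}\n";
    }

    public static void main(String[] args) throws Exception {
        Method parseInt = Integer.class.getMethod("parseInt", String.class, int.class);
        System.out.print(wrapperSource(parseInt, 0));
    }
}
```

Because each wrapper calls exactly one API method, comparing the original wrapper with its translated counterpart isolates the mapping chosen for that method.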

Step 2: Generate Test Cases

[Figure: workflow diagram. The translation tool is applied to the original wrapper to obtain the translated wrapper. A test is generated on the original wrapper; the translation tool is applied to the original test case to obtain the translated test case, which is executed on the translated wrapper.]

Two existing state-of-the-art test generation tools
  Pex: a dynamic-symbolic-execution-based test generation tool
  Randoop: a feedback-guided random test generation tool
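The comparison at the heart of this step can be sketched as a small differential check. The code below is a simplified stand-in under stated assumptions. Both sides run in Java, the "translated" side is a hand-written lambda that emulates an API returning 0 for a null input, and `OutcomeComparison`, `outcomeOf`, and `differs` are hypothetical names. A real cross-language setup would also need a way to relate exception types across languages, which is omitted here.

```java
import java.util.Objects;
import java.util.function.Supplier;

final class OutcomeComparison {
    // Records what a call did: the returned value, or the exception class name.
    static String outcomeOf(Supplier<?> call) {
        try {
            return "value:" + Objects.toString(call.get());
        } catch (RuntimeException e) {
            return "exception:" + e.getClass().getSimpleName();
        }
    }

    // A behavioral difference is reported when the two outcomes are not identical.
    static boolean differs(Supplier<?> original, Supplier<?> translated) {
        return !outcomeOf(original).equals(outcomeOf(translated));
    }

    public static void main(String[] args) {
        Supplier<Integer> original = () -> Integer.parseInt(null, 10);
        // Stand-in for a translated wrapper whose API maps null to 0.
        Supplier<Integer> translated = () -> 0;
        System.out.println(outcomeOf(original));
        System.out.println(outcomeOf(translated));
        System.out.println(differs(original, translated));
    }
}
```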


Research Questions

We address the following research questions:
  RQ1: Are behavioral differences pervasive in cross-language API mapping relations?
  RQ2: What are the characteristics of behavioral differences concerning inputs and outputs?
  RQ3: What are the characteristics of behavioral differences concerning method sequences?

RQ1: Pervasiveness

Column E-Tests: number of exception-causing test cases
Column A-Tests: number of assertion-failing test cases

About 50% of the generated test cases fail: behavioral differences are pervasive in API mapping relations between Java and C#.

RQ2: Findings and Implications

Finding 1. 36.8% - handling of null inputs.
  java.lang.Integer.parseInt(null, 10) -> NumberFormatException
  System.Convert.ToInt32(null, 10) -> 0

Implication
  API-library implementers should clearly define behaviors of null inputs.
  Programmers should handle null inputs carefully.
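One way to act on this implication is to make the null case explicit in the calling code, so the result no longer depends on which behavior the underlying API happens to have. The sketch below shows the Java side only. `parseIntOrDefault` is a hypothetical helper, and choosing a default value over an exception is an assumption, not a recommendation from the slides.

```java
final class NullInputGuard {
    // Hypothetical helper: decides the null case up front instead of
    // relying on the behavior of the underlying parsing API.
    static int parseIntOrDefault(String s, int radix, int defaultValue) {
        if (s == null) {
            return defaultValue;
        }
        return Integer.parseInt(s, radix);
    }

    public static void main(String[] args) {
        try {
            Integer.parseInt(null, 10);
        } catch (NumberFormatException e) {
            // Java rejects a null string here, as stated in Finding 1.
            System.out.println("rejected: " + e.getClass().getSimpleName());
        }
        System.out.println(parseIntOrDefault(null, 10, 0));
        System.out.println(parseIntOrDefault("42", 10, 0));
    }
}
```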

RQ2: Findings and Implications

Finding 2. 22.3% - returned string values.
  ToString vs toString
  GetName vs getName

Implication
  A method in Java and a method in C# typically return different string values even if they have the same functionality.
    Programmers should be cautious while using these values.
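One familiar instance, not taken from the slides, is the text form of a boolean: Java's `Boolean.toString(true)` yields `true`, while C#'s `true.ToString()` is documented to yield `True`. The sketch below hard-codes the C# text as a string literal (an assumption, since no C# code runs here) and compares the two by meaning rather than by spelling. `sameBoolean` is a hypothetical helper.

```java
final class ReturnedStringDemo {
    // Compares two textual booleans by their parsed value, not their spelling.
    static boolean sameBoolean(String a, String b) {
        return Boolean.parseBoolean(a) == Boolean.parseBoolean(b);
    }

    public static void main(String[] args) {
        String javaText = Boolean.toString(true);
        // Assumed text of the C# counterpart, written here as a literal.
        String csharpText = "True";
        System.out.println(javaText.equals(csharpText));       // textual comparison
        System.out.println(sameBoolean(javaText, csharpText)); // comparison by meaning
    }
}
```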

RQ2: Findings and Implications

Finding 3. 11.5% - input domains.
  java.lang.Boolean.parseBoolean("test") -> false
  System.Boolean.Parse("test") -> FormatException

Implication
  Programmers should be cautious while dealing with methods with odd input values.
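The difference in input domains can be illustrated with a stricter parser on the Java side. The sketch below is an assumption-laden illustration: `parseStrict` is a hypothetical helper that rejects anything other than "true" or "false" (ignoring case), roughly in the spirit of the stricter behavior listed for C#. It does not reproduce `System.Boolean.Parse` exactly.

```java
final class StrictBooleanParser {
    // Hypothetical strict parser: rejects inputs outside {"true", "false"}.
    static boolean parseStrict(String s) {
        if ("true".equalsIgnoreCase(s)) {
            return true;
        }
        if ("false".equalsIgnoreCase(s)) {
            return false;
        }
        throw new IllegalArgumentException("Not a boolean: " + s);
    }

    public static void main(String[] args) {
        // The lenient Java API maps any unrecognized text to false.
        System.out.println(Boolean.parseBoolean("test"));
        try {
            parseStrict("test");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```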

RQ2: Findings and Implications

Finding 4. 10.7% - implementations.
  java.lang.Character.isJavaIdentifierPart("\0") -> true
  ILOG.J2CsMapping.Util.Character.IsCSharpIdentifierPart("\0") -> false

Implication
  Some differences reflect the different natures of different languages, and some others indicate defects in translation tools.
    Programmers should learn the natures of different programming languages to figure out such differences, e.g., different definitions of paths and files.
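The Java result can be checked directly. The sketch below passes the NUL character as a `char` literal, since the Java method takes a `char`, and prints the related `Character` predicates alongside it. NUL counts as an identifier part in Java because it is an "ignorable" identifier character. `describe` is a hypothetical helper, and the C# side is not reproduced.

```java
final class IdentifierPartDemo {
    // Summarizes how the Java Character predicates classify one character.
    static String describe(char c) {
        return "isJavaIdentifierPart=" + Character.isJavaIdentifierPart(c)
             + ", isIdentifierIgnorable=" + Character.isIdentifierIgnorable(c)
             + ", isLetterOrDigit=" + Character.isLetterOrDigit(c);
    }

    public static void main(String[] args) {
        System.out.println(describe('\0')); // the NUL character from Finding 4
        System.out.println(describe('a'));  // an ordinary letter, for contrast
    }
}
```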

RQ2: Findings and Implications

Finding 5. 7.9% - handling of exceptions.
  java.lang.StringBuffer.insert(int, char) -> ArrayIndexOutOfBoundsException
  System.Text.StringBuilder.Insert(int, char) -> ArgumentOutOfRangeException
  IndexOutOfRangeException

Implication
  API-library implementers may design different exception-handling mechanisms.
  If programmers do not notice these differences, they may introduce dead or defective code.
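A handler written for one concrete exception type becomes dead code if the API throws a different one. On the Java side, one defensive option is to catch the common supertype `IndexOutOfBoundsException`, because the concrete subclass thrown by `StringBuffer.insert` for a bad offset can vary between JDK versions. The sketch below illustrates that choice. `insertOutcome` is a hypothetical helper, and the C# side is not reproduced.

```java
final class InsertExceptionDemo {
    // Inserts 'x' at the given offset of "abc", or reports that the
    // offset was rejected.
    static String insertOutcome(int offset) {
        StringBuffer sb = new StringBuffer("abc");
        try {
            sb.insert(offset, 'x');
            return sb.toString();
        } catch (IndexOutOfBoundsException e) {
            // Catching the supertype avoids depending on the concrete
            // subclass a particular JDK version throws.
            return "rejected";
        }
    }

    public static void main(String[] args) {
        System.out.println(insertOutcome(1));
        System.out.println(insertOutcome(10));
    }
}
```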

RQ2: Findings and Implications

Finding 6. 2.9% - constants.
  java.lang.Double.MAX_VALUE -> 1.7976931348623157E+308
  System.Double.MaxValue -> 1.79769313486232E+308

Implication
  API-library implementers may store different values in constants, even if two constants have the same name.
  Programmers should be careful when using constants.
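The two renderings listed in Finding 6 are not interchangeable as text. The sketch below parses both in Java and checks whether each round-trips to `Double.MAX_VALUE`. The shorter literal is copied from the slide. Whether the two constants differ in stored value or only in printed form is not stated in the slides, so the code makes no claim about that.

```java
final class ConstantRoundTrip {
    // True if parsing the text yields exactly Double.MAX_VALUE.
    static boolean roundTripsToMax(String text) {
        return Double.parseDouble(text) == Double.MAX_VALUE;
    }

    public static void main(String[] args) {
        String javaText = Double.toString(Double.MAX_VALUE);
        String shorterText = "1.79769313486232E+308"; // rendering quoted for C#
        System.out.println(javaText);
        System.out.println(roundTripsToMax(javaText));
        System.out.println(roundTripsToMax(shorterText));
    }
}
```

Comparing or persisting such constants through their string form is therefore fragile. Passing the numeric value itself avoids the problem.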

RQ3: Findings and Implications

Finding 7. Different inheritance hierarchies that can lead to compilation errors.

Java:
  StringBufferInputStream var4 = ...;
  InputStreamReader var10 = new InputStreamReader((InputStream)var4, var8);
  StringBufferInputStream is a subclass of InputStream

C# (translated):
  StringReader var4 = ...;
  StreamReader var10 = new StreamReader((Stream)var4, var8);
  StringReader is NOT a subclass of Stream

Implication
  When programmers translate code (e.g., cast statements), they should be aware of such differences.
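Java has a similar split between byte streams and character readers, so the hierarchy question can be checked on the Java side with reflection. The sketch below only illustrates the kind of check involved. `castCanSucceed` is a hypothetical helper, and the C# types `StringReader` and `Stream` are not modeled here.

```java
import java.io.InputStream;
import java.io.Reader;
import java.io.StringBufferInputStream;
import java.io.StringReader;

@SuppressWarnings("deprecation") // StringBufferInputStream is deprecated but still present
final class HierarchyCheck {
    // True if a value of type `source` can be cast to `target` via subtyping.
    static boolean castCanSucceed(Class<?> target, Class<?> source) {
        return target.isAssignableFrom(source);
    }

    public static void main(String[] args) {
        // The cast used in the Java snippet of Finding 7.
        System.out.println(castCanSucceed(InputStream.class, StringBufferInputStream.class));
        // A Java reader type is not a byte stream either.
        System.out.println(castCanSucceed(InputStream.class, StringReader.class));
        System.out.println(castCanSucceed(Reader.class, StringReader.class));
    }
}
```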

RQ3: Findings and Implications

Finding 8. 3.4% - method sequences.

Java:
  DateFormatSymbols var0 = new DateFormatSymbols();
  String[] var16 = new String[]...;
  var0.setShortMonths(var16);

C# (translated):
  DateTimeFormatInfo var0 = System.Globalization.DateTimeFormatInfo.CurrentInfo;
  String[] var16 = new String[]...;
  var0.AbbreviatedMonthNames = var16;
  -> InvalidOperationException

Implication
  Legal method sequences can become illegal after translation, due to various factors such as constraints in the target programming language and field accessibility.
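For reference, the Java sequence from Finding 8 is legal and can be run as a complete program. The sketch below fills in the elided array with made-up month names (an assumption; the slide leaves the contents out) and wraps the sequence in a hypothetical helper, `customizeShortMonths`. The C# side, which the slide reports as ending in `InvalidOperationException`, is not reproduced.

```java
import java.text.DateFormatSymbols;
import java.util.Arrays;

final class ShortMonthsSequence {
    // Runs the create / set / get sequence on a fresh DateFormatSymbols.
    static String[] customizeShortMonths(String[] names) {
        DateFormatSymbols symbols = new DateFormatSymbols();
        symbols.setShortMonths(names);
        return symbols.getShortMonths();
    }

    public static void main(String[] args) {
        // Illustrative values only; the slide does not show the array contents.
        String[] names = {"M1", "M2", "M3"};
        System.out.println(Arrays.toString(customizeShortMonths(names)));
    }
}
```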

Conclusion

Tool chain + empirical study of exposing behavioral differences of API mapping relations

Behavioral differences are pervasive and dangerous

8 findings with valuable implications for API-library implementers and users + 4 defects confirmed

[Figure: workflow diagram. The translation tool is applied to the original wrapper to obtain the translated wrapper. A test is generated on the original wrapper; the translation tool is applied to the original test case to obtain the translated test case, which is executed on the translated wrapper.]


Thank You

Acknowledgment: NSF of China No. 61100071, NSF of China No. 61228203, NSF grants CCF-0845272, CCF-0915400, CNF-0958235, CNS-1160603, and an NSA Science of Security Lablet Grant
