Page 1: Program Analysis via Graph Reachability Thomas Reps University of Wisconsin PLDI 00 Tutorial, Vancouver, B.C., June 18, 2000 reps

Program Analysis via Graph Reachability

Thomas Reps

University of Wisconsin

PLDI 00 Tutorial, Vancouver, B.C., June 18, 2000

http://www.cs.wisc.edu/~reps/

Page 2:

PLDI 00 Registration Form

• PLDI 00: …………………….. $ ____

• Tutorial (morning): …………… $ ____

• Tutorial (afternoon): ………….. $ ____

• Tutorial (evening): ……………. $ – 0 –

Page 3:

Applications

• Program optimization
• Program understanding and software reengineering
• Security
  – information flow
• Verification
  – model checking
  – security of crypto-based protocols for distributed systems

Page 4:

[Figure: timeline with year marks 1987, 1993, 1994, 1995, 1996, 1997, 1998 and topic labels: Slicing & Applications; Dataflow Analysis; Demand Algorithms; Set Constraints; Structure-Transmitted Dependences; CFL Reachability]

Page 5:

. . . As Well As . . .

• Flow-insensitive points-to analysis
• Complexity results
  – Linear . . . cubic . . . undecidable variants
  – PTIME-completeness
• Model checking of recursive hierarchical finite-state machines
  – “infinite”-state systems
  – linear-time and cubic-time algorithms

Page 6:

. . . And Also

• Analysis of attribute grammars
• Security of crypto-based protocols for distributed systems [Dolev, Even, & Karp 83]
• Formal-language problems
  – CFL-recognition (given G and $\omega$, is $\omega \in L(G)$?)
  – 2DPDA- and 2NPDA-simulation
    • Given M and $\omega$, is $\omega \in L(M)$?
• String-matching problems

Page 7:

Unifying Conceptual Model for Dataflow-Analysis Literature

• Linear-time gen-kill [Hecht 76], [Kou 77]
• Path-constrained DFA [Holley & Rosen 81]
• Linear-time GMOD [Cooper & Kennedy 88]
• Flow-sensitive MOD [Callahan 88]
• Linear-time interprocedural gen-kill [Knoop & Steffen 93]
• Linear-time bidirectional gen-kill [Dhamdhere 94]
• Relationship to interprocedural DFA [Sharir & Pnueli 81], [Knoop & Steffen 92]

Page 8:

Collaborators

• Susan Horwitz

• Mooly Sagiv

• Genevieve Rosay

• David Melski

• David Binkley

• Michael Benedikt

• Patrice Godefroid

Page 9:

Themes

• Harnessing CFL-reachability

• Relationship to other analysis paradigms

• Exhaustive alg. Demand alg.

• Understanding complexity
  – Linear . . . cubic . . . undecidable

• Beyond CFL-reachability

Page 10:

Backward Slice

int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

Backward slice with respect to “printf("%d\n",i)”

Page 12:

Slice Extraction

int main() {
    int i = 1;
    while (i < 11) {
        i = i + 1;
    }
    printf("%d\n", i);
}

Backward slice with respect to “printf("%d\n",i)”

Page 13:

Forward Slice

int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

Forward slice with respect to “sum = 0”

Page 15:

What Are Slices Useful For?

• Understanding Programs
  – What is affected by what?
• Restructuring Programs
  – Isolation of separate “computational threads”
• Program Specialization and Reuse
  – Slices = specialized programs
  – Only reuse needed slices
• Program Differencing
  – Compare slices to identify changes
• Testing
  – What new test cases would improve coverage?
  – What regression tests must be rerun after a change?

Page 16:

Line-Character-Count Program

void line_char_count(FILE *f) {
    int lines = 0;
    int chars;
    BOOL eof_flag = FALSE;
    int n;
    extern void scan_line(FILE *f, BOOL *bptr, int *iptr);
    scan_line(f, &eof_flag, &n);
    chars = n;
    while (eof_flag == FALSE) {
        lines = lines + 1;
        scan_line(f, &eof_flag, &n);
        chars = chars + n;
    }
    printf("lines = %d\n", lines);
    printf("chars = %d\n", chars);
}

Page 17:

Character-Count Program

void char_count(FILE *f) {
    int lines = 0;
    int chars;
    BOOL eof_flag = FALSE;
    int n;
    extern void scan_line(FILE *f, BOOL *bptr, int *iptr);
    scan_line(f, &eof_flag, &n);
    chars = n;
    while (eof_flag == FALSE) {
        lines = lines + 1;
        scan_line(f, &eof_flag, &n);
        chars = chars + n;
    }
    printf("lines = %d\n", lines);
    printf("chars = %d\n", chars);
}

Page 19:

Line-Count Program

void line_count(FILE *f) {
    int lines = 0;
    int chars;
    BOOL eof_flag = FALSE;
    int n;
    extern void scan_line2(FILE *f, BOOL *bptr, int *iptr);
    scan_line2(f, &eof_flag, &n);
    chars = n;
    while (eof_flag == FALSE) {
        lines = lines + 1;
        scan_line2(f, &eof_flag, &n);
        chars = chars + n;
    }
    printf("lines = %d\n", lines);
    printf("chars = %d\n", chars);
}

Page 20:

Specialization Via Slicing

wc -lc

wc -c wc -l

void line_count(FILE *f);

Not partial evaluation!

Page 21:

Control Flow Graph

int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

[Figure: control-flow graph with nodes Enter; sum = 0; i = 1; while(i < 11); sum = sum + i; i = i + 1; printf(sum); printf(i). The out-edges of the while node are labeled T and F.]

Page 22:

Flow Dependence Graph

int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

Flow dependence p → q: the value of the variable assigned at p may be used at q.

[Figure: flow dependence edges among the nodes Enter; sum = 0; i = 1; while(i < 11); sum = sum + i; i = i + 1; printf(sum); printf(i).]

Page 23:

Control Dependence Graph

int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

Control dependence p → q, labeled T: q is reached from p if condition p is true (T), not otherwise. Similar for false (F).

[Figure: control dependence edges, each labeled T, among the nodes Enter; sum = 0; i = 1; while(i < 11); sum = sum + i; i = i + 1; printf(sum); printf(i).]

Page 24:

Program Dependence Graph (PDG)

int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

[Figure: PDG with nodes Enter; sum = 0; i = 1; while(i < 11); sum = sum + i; i = i + 1; printf(sum); printf(i). Legend: control dependence edges (labeled T) and flow dependence edges.]

Page 25:

Program Dependence Graph (PDG)

int main() {
    int i = 1;
    int sum = 0;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

Opposite Order

Same PDG

[Figure: the PDG, with nodes Enter; sum = 0; i = 1; while(i < 11); sum = sum + i; i = i + 1; printf(sum); printf(i).]

Page 26:

Backward Slice

int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

[Figure sequence, pages 26 to 29, titled Backward Slice, Backward Slice (2), (3), (4): the PDG with nodes Enter; sum = 0; i = 1; while(i < 11); sum = sum + i; i = i + 1; printf(sum); printf(i), and control dependence edges labeled T.]

Page 30:

Slice Extraction

int main() {
    int i = 1;
    while (i < 11) {
        i = i + 1;
    }
    printf("%d\n", i);
}

[Figure: PDG of the extracted slice, with nodes Enter; i = 1; while(i < 11); i = i + 1; printf(i), and control dependence edges labeled T.]
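The slicing slides above treat a backward slice as the set of PDG nodes from which the slicing criterion can be reached. The following is a minimal sketch of that idea, not the tool's implementation. The node numbering, the `Pdg` type, and the edge list in `build_sum_pdg` are assumptions made for illustration; the edges were written down by hand from the sum program.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_NODES 32

/* dep[p][q] is true when q is control- or flow-dependent on p. */
typedef struct {
    int n;
    bool dep[MAX_NODES][MAX_NODES];
} Pdg;

void pdg_init(Pdg *g, int n) {
    assert(n > 0 && n <= MAX_NODES);
    memset(g, 0, sizeof *g);
    g->n = n;
}

/* Marks in_slice[v] for every node v from which `criterion` is reachable. */
void backward_slice(const Pdg *g, int criterion, bool in_slice[MAX_NODES]) {
    int stack[MAX_NODES];   /* each node is pushed at most once */
    int top = 0;
    memset(in_slice, 0, MAX_NODES * sizeof(bool));
    in_slice[criterion] = true;
    stack[top++] = criterion;
    while (top > 0) {
        int q = stack[--top];
        for (int p = 0; p < g->n; p++) {
            if (g->dep[p][q] && !in_slice[p]) {
                in_slice[p] = true;
                stack[top++] = p;
            }
        }
    }
}

/* Hypothetical numbering of the statements of the sum program. */
enum { ENTER, SUM0, I1, WHILE, SUM_ADD, I_INC, PRINT_SUM, PRINT_I, N_NODES };

void build_sum_pdg(Pdg *g) {
    static const int edges[][2] = {
        /* control dependences */
        {ENTER, SUM0}, {ENTER, I1}, {ENTER, WHILE},
        {ENTER, PRINT_SUM}, {ENTER, PRINT_I},
        {WHILE, SUM_ADD}, {WHILE, I_INC}, {WHILE, WHILE},
        /* flow dependences */
        {SUM0, SUM_ADD}, {SUM0, PRINT_SUM},
        {SUM_ADD, SUM_ADD}, {SUM_ADD, PRINT_SUM},
        {I1, WHILE}, {I1, SUM_ADD}, {I1, I_INC}, {I1, PRINT_I},
        {I_INC, WHILE}, {I_INC, SUM_ADD}, {I_INC, I_INC}, {I_INC, PRINT_I},
    };
    pdg_init(g, N_NODES);
    for (size_t k = 0; k < sizeof edges / sizeof edges[0]; k++)
        g->dep[edges[k][0]][edges[k][1]] = true;
}
```

Calling `backward_slice` with `PRINT_I` as the criterion marks Enter, i = 1, the while test, i = i + 1, and printf(i), which are the statements kept on the Slice Extraction slide.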

Page 31:

CodeSurfer

Page 33:

Browsing a Dependence Graph

Pretend this is your favorite browser

What does clicking on a link do? You get a new page

Or you move to an internal tag

Page 36:

Interprocedural Slice

int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum,i);
        i = add(i,1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}

Backward slice with respect to “printf("%d\n",i)”

Page 38:

Interprocedural Slice

int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum,i);
        i = add(i,1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}

Superfluous components included by Weiser’s slicing algorithm [TSE 84].
Left out by algorithm of Horwitz, Reps, & Binkley [PLDI 88; TOPLAS 90].

Page 39:

System Dependence Graph (SDG)

Enter main

Call p Call p

Enter p

Page 40:

SDG for the Sum Program

[Figure: system dependence graph.
Enter main: sum = 0; i = 1; while(i < 11); printf(sum); printf(i).
First Call add: xin = sum; yin = i; sum = xout.
Second Call add: xin = i; yin = 1; i = xout.
Enter add: x = xin; y = yin; x = x + y; xout = x.]

Page 41:

Interprocedural Backward Slice

[Figure sequence, pages 41 to 46, titled Interprocedural Backward Slice through Interprocedural Backward Slice (6): SDG with nodes Enter main, two Call p sites, and Enter p. On page 46 the edges between the call sites and p carry the labels ( ) [ ].]

Page 47:

Matched-Parenthesis Path

)(

)[

Page 48:

Interprocedural Backward Slice (6), (7)

[Figure sequence, pages 48 and 49: SDG with nodes Enter main, two Call p sites, and Enter p.]

Page 50:

Slice Extraction

[Figure: extracted SDG with nodes Enter main, one Call p site, and Enter p.]

Page 51:

Slice of the Sum Program

[Figure: system dependence graph of the slice.
Enter main: i = 1; while(i < 11); printf(i).
Call add: xin = i; yin = 1; i = xout.
Enter add: x = xin; y = yin; x = x + y; xout = x.]

Page 52:

CFL-Reachability [Yannakakis 90]

• G: Graph (N nodes, E edges)

• L: A context-free language

• L-path from s to t iff $s \xrightarrow{\alpha}{}^{*} t,\ \alpha \in L$ (the edge labels along some path from s to t spell a word $\alpha$ of L)

• Running time: $O(N^3)$

Page 53:

Interprocedural Slicing via CFL-Reachability

• Graph: System dependence graph

• L: L(matched) [roughly]

• Node m is in the slice w.r.t. n iff there is an L(matched)-path from m to n

Page 54:

Asymptotic Running Time [Reps, Horwitz, Sagiv, & Rosay 94]

• CFL-reachability
  – System dependence graph: N nodes, E edges
  – Running time: $O(N^3)$

• System dependence graph: special structure
  – Running time: $O(E + \mathit{CallSites} \times \mathit{MaxParams}^3)$

Page 55:

CFL-Reachability

$\mathit{matched} \to \epsilon \mid e \mid [\ \mathit{matched}\ ] \mid (\ \mathit{matched}\ ) \mid \mathit{matched}\ \mathit{matched}$

[Figure: two panels, each a graph from s to t. Panel "CFL-Reachability": edges labeled e, (, ), [, ]. Panel "Ordinary Graph Reachability": the same question with labels ignored.]
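To make the "matched" language concrete, here is a small sketch that checks whether the label string of a single path is in L(matched). It assumes the grammar on this slide includes the empty string, and it encodes labels as the characters `e ( ) [ ]`; the function name `is_matched` and the depth limit are illustrative choices, not part of the tutorial.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 256

/* Returns true iff the label string is balanced with respect to the two
   parenthesis kinds; e-labels may appear anywhere. */
bool is_matched(const char *labels) {
    char stack[MAX_DEPTH];
    size_t top = 0;
    assert(labels != NULL);
    for (const char *p = labels; *p != '\0'; p++) {
        switch (*p) {
        case 'e':
            break;
        case '(':
        case '[':
            if (top == MAX_DEPTH) return false;  /* too deep for this sketch */
            stack[top++] = *p;
            break;
        case ')':
            if (top == 0 || stack[--top] != '(') return false;
            break;
        case ']':
            if (top == 0 || stack[--top] != '[') return false;
            break;
        default:
            return false;                        /* unknown label */
        }
    }
    return top == 0;
}
```

This only classifies one given path. CFL-reachability asks whether any path between two nodes has such a label string, which is why a stack per path does not suffice and a dynamic-programming formulation is used instead.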

Page 56:

CFL-Reachability via Dynamic Programming

Grammar: $A \to B\ C$

Graph: whenever a B-edge is followed by a C-edge, add an A-edge from the source of the B-edge to the target of the C-edge.
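The edge-adding rule on this slide can be sketched as a fixpoint computation. The version below is deliberately naive: it rescans all productions until nothing changes, rather than using the worklist organization that gives the cubic bound quoted earlier. The types, the symbol encoding, and the example graph in `demo_matched_path` are assumptions for illustration; the grammar is written with at most two symbols per right-hand side, so ( matched ) is split through a helper nonterminal T.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_N 16     /* maximum number of graph nodes */
#define MAX_SYMS 8   /* maximum number of grammar symbols */

/* Production lhs -> rhs1 rhs2.  rhs2 < 0 means lhs -> rhs1;
   rhs1 < 0 means lhs -> epsilon. */
typedef struct { int lhs, rhs1, rhs2; } Production;

typedef struct {
    int n;
    bool edge[MAX_SYMS][MAX_N][MAX_N];  /* edge[s][x][y]: s-labeled edge x -> y */
} LabeledGraph;

void graph_init(LabeledGraph *g, int n) {
    assert(n > 0 && n <= MAX_N);
    memset(g, 0, sizeof *g);
    g->n = n;
}

/* Adds the edge and reports whether it was new. */
bool add_edge(LabeledGraph *g, int sym, int x, int y) {
    if (g->edge[sym][x][y]) return false;
    g->edge[sym][x][y] = true;
    return true;
}

/* Adds nonterminal-labeled edges until no production yields a new one. */
void cfl_closure(LabeledGraph *g, const Production *prods, int nprods) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (int k = 0; k < nprods; k++) {
            const Production *p = &prods[k];
            for (int x = 0; x < g->n; x++) {
                if (p->rhs1 < 0) {                       /* A -> epsilon */
                    if (add_edge(g, p->lhs, x, x)) changed = true;
                    continue;
                }
                for (int y = 0; y < g->n; y++) {
                    if (!g->edge[p->rhs1][x][y]) continue;
                    if (p->rhs2 < 0) {                   /* A -> B */
                        if (add_edge(g, p->lhs, x, y)) changed = true;
                        continue;
                    }
                    for (int z = 0; z < g->n; z++)       /* A -> B C */
                        if (g->edge[p->rhs2][y][z] && add_edge(g, p->lhs, x, z))
                            changed = true;
                }
            }
        }
    }
}

/* Hypothetical example: one parenthesis kind plus e-edges. */
enum { SYM_M, SYM_T, SYM_LP, SYM_RP, SYM_E };

bool demo_matched_path(int from, int to) {
    static const Production prods[] = {
        {SYM_M, -1, -1},          /* M -> epsilon */
        {SYM_M, SYM_E, -1},       /* M -> e       */
        {SYM_M, SYM_M, SYM_M},    /* M -> M M     */
        {SYM_T, SYM_LP, SYM_M},   /* T -> ( M     */
        {SYM_M, SYM_T, SYM_RP},   /* M -> T )     */
    };
    LabeledGraph g;
    graph_init(&g, 6);
    add_edge(&g, SYM_LP, 0, 1);   /* 0 -(-> 1 -e-> 2 -)-> 3 */
    add_edge(&g, SYM_E, 1, 2);
    add_edge(&g, SYM_RP, 2, 3);
    add_edge(&g, SYM_LP, 0, 4);   /* 0 -(-> 4 -e-> 5, never closed */
    add_edge(&g, SYM_E, 4, 5);
    cfl_closure(&g, prods, 5);
    return g.edge[SYM_M][from][to];
}
```

In the example graph, node 3 is M-reachable from node 0 because the path reads ( e ), while node 5 is reachable only along the unbalanced string ( e and so gets no M-edge from node 0.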

Page 57:

Degenerate Case: CFL-Recognition

Is “(a + b) * c” $\in L(\mathit{exp})$?

$\mathit{exp} \to \mathit{id} \mid \mathit{exp} + \mathit{exp} \mid \mathit{exp} * \mathit{exp} \mid (\ \mathit{exp}\ )$

[Figure: chain graph from s to t whose edges are labeled, in order, ( a + b ) * c.]

Page 58:

Degenerate Case: CFL-Recognition

Is “a + b) * c +” $\in L(\mathit{exp})$?

$\mathit{exp} \to \mathit{id} \mid \mathit{exp} + \mathit{exp} \mid \mathit{exp} * \mathit{exp} \mid (\ \mathit{exp}\ )$

[Figure: chain graph from s to t whose edges are labeled, in order, a + b ) * c +.]

Page 59:

CYK: Context-Free Recognition

$\omega$ = “( [ ] ) [ ]”

Is $\omega \in L(M)$?

$M \to M\ M \mid (\ M\ ) \mid [\ M\ ] \mid (\ ) \mid [\ ]$

Page 60:

CYK: Context-Free Recognition

Original grammar:

$M \to M\ M \mid (\ M\ ) \mid [\ M\ ] \mid (\ ) \mid [\ ]$

Normalized grammar (at most two symbols per right-hand side):

$M \to M\ M \mid \mathit{LPM}\ ) \mid \mathit{LBM}\ ] \mid (\ ) \mid [\ ]$
$\mathit{LPM} \to (\ M$
$\mathit{LBM} \to [\ M$

Page 61:

Is “( [ ] ) [ ]” $\in L(M)$?

[Figure, pages 61 and 62: CYK table for the input ( [ ] ) [ ], with axes "start" and "length". The length-1 row holds the terminals { ( }, { [ }, { ] }, { ) }, { [ }, { ] }. Page 61 shows entries {M}, {M}, {M}, and {LPM}, with the productions $M \to [\ ]$ and $\mathit{LPM} \to (\ M$ called out. Page 62 adds one more {M} entry and marks the remaining entry "M?", to be derived by $M \to M\ M$.]

Page 63:

CYK: Graphs vs. Tables

Is “( [ ] ) [ ]” $\in L(M)$?

$M \to M\ M \mid \mathit{LPM}\ ) \mid \mathit{LBM}\ ] \mid (\ ) \mid [\ ]$
$\mathit{LPM} \to (\ M$
$\mathit{LBM} \to [\ M$

[Figure: chain graph from s to t with edges labeled ( [ ] ) [ ], plus derived edges labeled M, M, LPM, M, M.]
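The table-filling view of CYK for this grammar can be sketched as follows. The code hard-wires the normalized grammar from the slides; the function name `cyk_recognize`, the symbol numbering, and the length limit are assumptions for illustration. Entry `t[A][i][j]` plays the role of an A-labeled edge from position i to position j in the chain-graph view.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LEN 32   /* maximum number of input tokens */

enum { LP, RP, LB, RB, M, LPM, LBM, NSYMS };

/* Each rule is lhs -> rhs1 rhs2, from the normalized grammar:
   M -> M M | LPM ) | LBM ] | ( ) | [ ];  LPM -> ( M;  LBM -> [ M */
static const int RULES[][3] = {
    {M, M, M}, {M, LPM, RP}, {M, LBM, RB}, {M, LP, RP}, {M, LB, RB},
    {LPM, LP, M}, {LBM, LB, M},
};
enum { NRULES = sizeof RULES / sizeof RULES[0] };

/* Returns true iff the input (blanks ignored) is in L(M).
   Uses a static table, so it is not reentrant. */
bool cyk_recognize(const char *input) {
    static bool t[NSYMS][MAX_LEN + 1][MAX_LEN + 1];
    int n = 0;
    assert(input != NULL);
    memset(t, 0, sizeof t);

    /* Length-1 entries: one terminal per token. */
    for (const char *p = input; *p != '\0'; p++) {
        int sym;
        if (*p == ' ') continue;
        else if (*p == '(') sym = LP;
        else if (*p == ')') sym = RP;
        else if (*p == '[') sym = LB;
        else if (*p == ']') sym = RB;
        else return false;               /* not in the alphabet */
        if (n == MAX_LEN) return false;  /* too long for this sketch */
        t[sym][n][n + 1] = true;
        n++;
    }

    /* Longer spans: combine two adjacent shorter spans. */
    for (int len = 2; len <= n; len++)
        for (int i = 0; i + len <= n; i++) {
            int j = i + len;
            for (int k = i + 1; k < j; k++)
                for (int r = 0; r < NRULES; r++)
                    if (t[RULES[r][1]][i][k] && t[RULES[r][2]][k][j])
                        t[RULES[r][0]][i][j] = true;
        }
    return n > 0 && t[M][0][n];
}
```

Because every right-hand side has exactly two symbols, processing spans in order of increasing length is enough; no extra iteration to a fixpoint is needed, which is the sense in which recognition is a degenerate case of CFL-reachability on a chain.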

Page 65:

Dynamic Transitive Closure ?!

• Aiken et al.
  – Set-constraint solvers
  – Points-to analysis
• Henglein et al.
  – type inference
• But a CFL captures a non-transitive reachability relation [Valiant 75]

Page 66:

S T

Program Chopping

Given source S and target T, what program points transmit effects from S to T?

Intersect forward slice from S with backward slice from T, right?

Page 67:

Non-Transitivity and Slicing

int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum,i);
        i = add(i,1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}

Forward slice with respect to “sum = 0”

Page 69:

Non-Transitivity and Slicing

int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum,i);
        i = add(i,1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}

Backward slice with respect to “printf("%d\n",i)”

Page 71:

Non-Transitivity and Slicing

int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum,i);
        i = add(i,1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}

Forward slice with respect to “sum = 0”

Backward slice with respect to “printf("%d\n",i)”

Page 72:

Non-Transitivity and Slicing

int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum,i);
        i = add(i,1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}

Chop with respect to “sum = 0” and “printf("%d\n",i)”

Page 73:

Non-Transitivity and Slicing

[Figure: system dependence graph of the sum program.
Enter main: sum = 0; i = 1; while(i < 11); printf(sum); printf(i).
First Call add: xin = sum; yin = i; sum = xout.
Second Call add: xin = i; yin = 1; i = xout.
Enter add: x = xin; y = yin; x = x + y; xout = x.
Two edges are labeled "(" and "]", a mismatched pair.]

Page 74:

Program Chopping

Given source S and target T, what program points transmit effects from S to T?

“Precise interprocedural chopping” [Reps & Rosay FSE 95]

Page 75:

CF-Recognition vs. CFL-Reachability

• CF-Recognition
  – Chain graphs
  – General grammar: sub-cubic time [Valiant 75]
  – LL(1), LR(1): linear time
• CFL-Reachability
  – General graphs: $O(N^3)$
  – LL(1): $O(N^3)$
  – LR(1): $O(N^3)$
  – Certain kinds of graphs: $O(N+E)$
  – Regular languages: $O(N+E)$

[Figure annotations: Gen/kill IDFA; GMOD IDFA]

Page 76:

Regular-Language Reachability [Yannakakis 90]

• G: Graph (N nodes, E edges)

• L: A regular language

• L-path from s to t iff $s \xrightarrow{\alpha}{}^{*} t,\ \alpha \in L$

• Running time: $O(N+E)$ vs. $O(N^3)$

• Ordinary reachability (= transitive closure)
  – Label each edge with e
  – L is $e^*$
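One standard way to realize regular-language reachability, sketched here as an assumption rather than as the method of the cited paper, is to search the product of the graph with a DFA for L. The `Edge` and `Dfa` types, the callback-style transition function, and the example graph are all hypothetical. For brevity the sketch scans a flat edge array; an adjacency-list version would visit each product edge once.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_N 16   /* maximum number of graph nodes */
#define MAX_Q 4    /* maximum number of DFA states */

typedef struct { int from, to; char label; } Edge;

typedef struct {
    int nstates;
    int start;
    bool accepting[MAX_Q];
    int (*delta)(int state, char label);  /* -1 when there is no transition */
} Dfa;

/* Breadth-first search over pairs (graph node, DFA state). */
bool regular_reachable(int n, const Edge *edges, int nedges,
                       const Dfa *dfa, int s, int t) {
    bool seen[MAX_N][MAX_Q];
    int queue[MAX_N * MAX_Q][2];   /* each pair is enqueued at most once */
    int head = 0, tail = 0;
    assert(n > 0 && n <= MAX_N && dfa->nstates <= MAX_Q);
    memset(seen, 0, sizeof seen);
    seen[s][dfa->start] = true;
    queue[tail][0] = s;
    queue[tail][1] = dfa->start;
    tail++;
    while (head < tail) {
        int v = queue[head][0], q = queue[head][1];
        head++;
        if (v == t && dfa->accepting[q]) return true;
        for (int k = 0; k < nedges; k++) {
            if (edges[k].from != v) continue;
            int q2 = dfa->delta(q, edges[k].label);
            if (q2 < 0 || seen[edges[k].to][q2]) continue;
            seen[edges[k].to][q2] = true;
            queue[tail][0] = edges[k].to;
            queue[tail][1] = q2;
            tail++;
        }
    }
    return false;
}

/* DFA for e*: ordinary reachability when every edge is labeled e. */
int delta_e_star(int state, char label) {
    return (state == 0 && label == 'e') ? 0 : -1;
}

/* Hypothetical graph: 0 -e-> 1 -e-> 2 -x-> 3. */
bool demo_plain_reachable(int s, int t) {
    static const Edge edges[] = { {0, 1, 'e'}, {1, 2, 'e'}, {2, 3, 'x'} };
    Dfa dfa = { 1, 0, { true }, delta_e_star };
    return regular_reachable(4, edges, 3, &dfa, s, t);
}
```

With L = e*, the search degenerates to plain graph search over e-labeled edges, which is the "ordinary reachability" special case listed on the slide.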

Page 77:

Security of Crypto-Based Protocols for Distributed Systems

• “Ping-pong” protocols

  (1) X → Y : $\mathit{Encrypt}_Y(M\ X)$

  (2) Y → X : $\mathit{Encrypt}_X(M)$

• [Dolev & Yao 83]
  – $O(N^8)$ algorithm

• [Dolev, Even, & Karp 83]
  – Less well known than [Dolev & Yao 83]
  – $O(N^3)$ algorithm

Page 78:

[Dolev, Even, & Karp 83]

$\mathit{Id} \to \mathit{Encrypt}_X\ \mathit{Id}\ \mathit{Decrypt}_X$

$\mathit{Id} \to \mathit{Decrypt}_X\ \mathit{Id}\ \mathit{Encrypt}_X$

$\mathit{Id} \to$ . . .

[Figure: graph between the nodes "Message" and "Saboteur", with edges labeled $E_Y$, $E_Y$, $A_X$, $A_Z$; one path is annotated "Id ?".]

Page 79:

Themes

• Harnessing CFL-reachability

• Relationship to other analysis paradigms

• Exhaustive alg. Demand alg.

• Understanding complexity
  – Linear . . . cubic . . . undecidable

• Beyond CFL-reachability

Page 80:

Relationship to Other Analysis Paradigms

• Dataflow analysis

–reachability versus equation solving

• Deduction

• Set constraints

Page 81:

Dataflow Analysis

• Goal: For each point in the program, determine a superset of the “facts” that could possibly hold during execution

• Examples
  – Constant propagation
  – Reaching definitions
  – Live variables
  – Possibly uninitialized variables

Page 82:

Useful For . . .

• Optimizing compilers

• Parallelizing compilers

• Tools that detect possible logical errors

• Tools that show the effects of a proposed modification

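Page 83:

Possibly Uninitialized Variables

[Figure: control-flow graph with nodes Start; x = 3; if . . .; y = x; y = w; w = 8; printf(y). Each node carries a dataflow function, and each edge is annotated with a set of possibly uninitialized variables.]

Dataflow functions:

Start: $\lambda V.\ \{w, x, y\}$
x = 3: $\lambda V.\ V - \{x\}$
w = 8: $\lambda V.\ V - \{w\}$
y = x: $\lambda V.\ \textbf{if}\ x \in V\ \textbf{then}\ V \cup \{y\}\ \textbf{else}\ V - \{y\}$
y = w: $\lambda V.\ \textbf{if}\ w \in V\ \textbf{then}\ V \cup \{y\}\ \textbf{else}\ V - \{y\}$
Identity (appears twice): $\lambda V.\ V$

Edge annotations appearing in the figure: {w,x,y}, {w,y}, {w,y}, {w,y}, {w}, {w,y}, {}, {w,y}, {}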

Page 84:

Precise Intraprocedural Analysis

[Figure: a path p from start to n whose edges carry the functions $f_1, f_2, \ldots, f_{k-1}, f_k$; C is the value at start.]

$pf_p = f_k \circ f_{k-1} \circ \cdots \circ f_2 \circ f_1$

$\mathrm{MOP}[n] = \bigsqcup_{p \in \mathrm{PathsTo}[n]} pf_p(C)$
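As a small worked instance of a path function, the sketch below encodes the possibly-uninitialized-variables transfer functions as operations on a bitset and composes them along a path. The two paths in `mop_at_printf_y` are assumptions: the edge structure of the example graph is not recoverable from the transcript, so they only illustrate how values from different paths are combined (here by set union).

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned VarSet;   /* bitset over {w, x, y} */
#define VAR_W 1u
#define VAR_X 2u
#define VAR_Y 4u

typedef VarSet (*Transfer)(VarSet);

/* Transfer functions for the statements in the example. */
VarSet f_start(VarSet v)  { (void)v; return VAR_W | VAR_X | VAR_Y; }
VarSet f_x_eq_3(VarSet v) { return v & ~VAR_X; }
VarSet f_w_eq_8(VarSet v) { return v & ~VAR_W; }
VarSet f_id(VarSet v)     { return v; }
VarSet f_y_eq_x(VarSet v) { return (v & VAR_X) ? (v | VAR_Y) : (v & ~VAR_Y); }
VarSet f_y_eq_w(VarSet v) { return (v & VAR_W) ? (v | VAR_Y) : (v & ~VAR_Y); }

/* pf_p(C) = f_k(... f_2(f_1(C))): apply the functions in path order. */
VarSet path_function(const Transfer *path, int k, VarSet c) {
    assert(path != NULL && k >= 0);
    for (int i = 0; i < k; i++)
        c = path[i](c);
    return c;
}

/* Combines two assumed paths that end just before printf(y). */
VarSet mop_at_printf_y(void) {
    const Transfer via_y_eq_x[] = { f_start, f_x_eq_3, f_id, f_y_eq_x };
    const Transfer via_y_eq_w[] = { f_start, f_x_eq_3, f_id, f_y_eq_w };
    return path_function(via_y_eq_x, 4, 0) | path_function(via_y_eq_w, 4, 0);
}
```

Along the first assumed path y is assigned from the initialized x, leaving only w possibly uninitialized; along the second, y inherits the uninitialized status of w. The union reports both w and y.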

Page 85:

[Figure: control-flow graphs of two procedures.
main: start main; x = 3; p(x,y); return from p; printf(y); exit main.
p: start p(a,b); if . . .; b = a; p(a,b); return from p; printf(b); exit p.
Call and return edges carry the labels ( ) ] (.]

Page 86:

Precise Interprocedural Analysis

[Figure: a path p from start to n that passes through a procedure q (nodes call q, start q, exit q, ret). The edges carry the functions $f_1, f_2, f_3, f_4, f_5, \ldots, f_{k-3}, f_{k-2}, f_{k-1}, f_k$; the call and return edges are labeled ( and ). C is the value at start.]

$\mathrm{MOMP}[n] = \bigsqcup_{p \in \mathrm{MatchedPathsTo}[n]} pf_p(C)$

[Sharir & Pnueli 81]

Page 87:

Representing Dataflow Functions

Identity Function: $f = \lambda V.\ V$, so $f(\{a, b\}) = \{a, b\}$

Constant Function: $f = \lambda V.\ \{b\}$, so $f(\{a, b\}) = \{b\}$

[Figure: each function drawn as a graph over the facts a, b, c.]

Page 88:

Representing Dataflow Functions

“Gen/Kill” Function: $f = \lambda V.\ (V - \{b\}) \cup \{c\}$, so $f(\{a, b\}) = \{a, c\}$

Non-“Gen/Kill” Function: $f = \lambda V.\ \textbf{if}\ a \in V\ \textbf{then}\ V \cup \{b\}\ \textbf{else}\ V - \{b\}$, so $f(\{a, b\}) = \{a, b\}$

[Figure: each function drawn as a graph over the facts a, b, c.]

Page 89:

[Figure: exploded graph for the two procedures, with one column per variable (x, y in main; a, b in p).
main: start main; x = 3; p(x,y); return from p; printf(y); exit main.
p: start p(a,b); if . . .; b = a; p(a,b); return from p; printf(b); exit p.]

Page 90:

Composing Dataflow Functions

$f_1 = \lambda V.\ \textbf{if}\ a \in V\ \textbf{then}\ V \cup \{b\}\ \textbf{else}\ V - \{b\}$

$f_2 = \lambda V.\ \textbf{if}\ b \in V\ \textbf{then}\ \{c\}\ \textbf{else}\ \emptyset$

$f_2 \circ f_1(\{a, c\}) = \{c\}$

[Figure: the graphs of $f_1$ and $f_2$ over the facts a, b, c, shown separately and then stacked to form the composition.]
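The graph representation of dataflow functions can be sketched as a boolean relation over the facts plus one extra source node, written `LAMBDA` below, that stands for "holds regardless of the input". Function composition then becomes relational composition. The `Rel` type, the bitset encoding, and the helper names are assumptions for illustration, and the encoding is only valid for functions that distribute over union.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { LAMBDA, FACT_A, FACT_B, FACT_C, NFACTS };

typedef struct { bool r[NFACTS][NFACTS]; } Rel;
typedef unsigned FactSet;   /* bit i set <=> fact i is in the set (bit 0 unused) */

/* The function lambda V. {} : only the LAMBDA -> LAMBDA edge. */
Rel rel_new(void) {
    Rel f;
    memset(&f, 0, sizeof f);
    f.r[LAMBDA][LAMBDA] = true;
    return f;
}

/* f(in): facts reachable from LAMBDA or from a fact in `in`. */
FactSet rel_apply(const Rel *f, FactSet in) {
    FactSet out = 0;
    assert((in & 1u) == 0);
    for (int i = 0; i < NFACTS; i++) {
        bool present = (i == LAMBDA) || ((in >> i) & 1u);
        if (!present) continue;
        for (int j = 1; j < NFACTS; j++)
            if (f->r[i][j]) out |= 1u << j;
    }
    return out;
}

/* Relation for "second after first": i -> j iff i -> k in first, k -> j in second. */
Rel rel_compose(const Rel *first, const Rel *second) {
    Rel h;
    memset(&h, 0, sizeof h);
    for (int i = 0; i < NFACTS; i++)
        for (int k = 0; k < NFACTS; k++) {
            if (!first->r[i][k]) continue;
            for (int j = 0; j < NFACTS; j++)
                if (second->r[k][j]) h.r[i][j] = true;
        }
    return h;
}

/* f1 = if a in V then V + {b} else V - {b};  f2 = if b in V then {c} else {}. */
FactSet demo_compose(FactSet in) {
    Rel f1 = rel_new(), f2 = rel_new(), h;
    f1.r[FACT_A][FACT_A] = true;
    f1.r[FACT_A][FACT_B] = true;
    f1.r[FACT_C][FACT_C] = true;
    f2.r[FACT_B][FACT_C] = true;
    h = rel_compose(&f1, &f2);
    return rel_apply(&h, in);
}
```

Applied to {a, c}, `demo_compose` follows the two-step path a to b to c and yields {c}, the value stated on the slide. A gen/kill function such as λV.(V − {b}) ∪ {c} would be encoded with edges LAMBDA to c, a to a, and c to c.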

Page 91:

[Figure: exploded graph for the two procedures, with one column per variable (x, y in main; a, b in p).
main: start main; x = 3; p(x,y); return from p; printf(y); exit main.
p: start p(a,b); if . . .; b = a; p(a,b); return from p; printf(b); exit p.]

Might y be uninitialized at printf(y)? YES! (path with labels “( )”)

Might b be uninitialized at printf(b)? NO! (path with labels “( ]”)

Page 92:

$\mathit{matched} \to \mathit{matched}\ \mathit{matched} \mid (_i\ \mathit{matched}\ )_i \ \ (1 \le i \le \mathit{CallSites}) \mid \mathit{edge} \mid \epsilon$

[Figure: a path drawn as a profile of the call stack over time; open parentheses push, close parentheses pop. The region below the starting stack level is marked "Off Limits!".]

Page 93:

$\mathit{unbalLeft} \to \mathit{matched}\ \mathit{unbalLeft} \mid (_i\ \mathit{unbalLeft} \ \ (1 \le i \le \mathit{CallSites}) \mid \epsilon$

[Figure: a path drawn as a profile of the call stack over time, ending with unmatched open parentheses still on the stack. The region below the starting stack level is marked "Off Limits!".]

Page 94:

Interprocedural Dataflow Analysis via CFL-Reachability

• Graph: Exploded control-flow graph

• L: L(unbalLeft)

• Fact d holds at n iff there is an L(unbalLeft)-path from $\langle \mathit{start}_{\mathit{main}}, \Lambda \rangle$ to $\langle n, d \rangle$

Page 95:

Asymptotic Running Time [Reps, Horwitz, & Sagiv 95]

• CFL-reachability
  – Exploded control-flow graph: ND nodes
  – Running time: $O(N^3 D^3)$

• Exploded control-flow graph: special structure
  – Running time: $O(E D^3)$
  – Typically: $E \approx N$, hence $O(E D^3) \approx O(N D^3)$
  – “Gen/kill” problems: $O(E D)$

Page 96:

Why Bother? “We’re only interested in million-line programs”

• Know thy enemy!
  – “Any” algorithm must do these operations
  – Avoid pitfalls (e.g., claiming $O(N^2)$ algorithm)
• The essence of “context sensitivity”
• Special cases
  – “Gen/kill” problems: $O(E D)$
• Compression techniques
  – Basic blocks
  – SSA form, sparse evaluation graphs
• Demand algorithms

Page 97:

Relationship to Other Analysis Paradigms

• Dataflow analysis

–reachability versus equation solving

• Deduction

• Set constraints

Page 98:

The Need for Pointer Analysis

int main() {
    int sum = 0;
    int i = 1;
    int *p = &sum;
    int *q = &i;
    int (*f)(int,int) = add;
    while (*q < 11) {
        *p = (*f)(*p,*q);
        *q = (*f)(*q,1);
    }
    printf("%d\n", *p);
    printf("%d\n", *q);
}

int add(int x, int y) {
    return x + y;
}

Page 100:

The Need for Pointer Analysis

#include <stdio.h>

int add(int x, int y);  /* prototype: add is defined after main */

int main() {
    int sum = 0;
    int i = 1;
    int *p = &sum;
    int *q = &i;
    int (*f)(int,int) = add;
    while (i < 11) {
        sum = add(sum,i);  /* dereferences and the indirect call resolved */
        i = add(i,1);
    }
    printf("%d\n",sum);
    printf("%d\n",i);
}

int add(int x, int y) {
    return x + y;
}

Page 101:

Flow-Sensitive Points-To Analysis

p = &q;

p = q;

p = *q;

*p = q;

[Figure: for each of the four statement forms, the points-to graph before and after the statement, over the variables p, q, r1, r2, s1, s2, and s3.]

Page 102:

[Figure: two panels, “Flow-Sensitive” and “Flow-Insensitive”, each over statements 1 through 5; the graph includes the nodes start main and exit main.]

Page 103:

Flow-Insensitive Points-To Analysis [Andersen 94, Shapiro & Horwitz 97]

p = &q;

p = q;

p = *q;

*p = q;

[Figure: the points-to graph for each of the four statement forms, over the variables p, q, r1, r2, s1, s2, and s3.]

Page 104:

Flow-Insensitive Points-To Analysis

a = &e;
b = a;
c = &f;
*b = c;
d = *a;

[Figure: points-to graph over the variables a, b, c, d, e, and f.]

Page 105:

Flow-Insensitive Points-To Analysis

• Andersen [Thesis 94]
– Formulated using set constraints
– Cubic-time algorithm

• Shapiro & Horwitz (1995; [POPL 97])
– Re-formulated as a graph-grammar problem

• Reps (1995; [unpublished])
– Re-formulated as a Horn-clause program

• Melski (1996; see [Reps, IST98])
– Re-formulated via CFL-reachability

Page 106:

CFL-Reachability via Dynamic Programming

Grammar: $A \rightarrow B\ C$

Graph: [Figure: a B-labeled edge followed by a C-labeled edge, together with the A-labeled edge they induce.]
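The dynamic-programming idea is that whenever a B edge is followed by a C edge and the grammar contains A → B C, an A edge is added, and this is repeated until nothing changes. Below is a minimal worklist sketch of that idea, not the tutorial's own code. The function name `cfl_reach`, the edge encoding `(source, label, target)`, and the restriction to productions with one or two right-hand-side symbols (no ε-productions) are assumptions made for illustration.

```python
from collections import defaultdict, deque


def cfl_reach(edges, unary, binary):
    """Close a labeled graph under grammar productions.

    edges:  iterable of (u, label, v)
    unary:  list of (head, body) for productions head -> body
    binary: list of (head, left, right) for productions head -> left right
    """
    found = set()
    out_by_node = defaultdict(set)  # u -> {(label, v)}
    in_by_node = defaultdict(set)   # v -> {(label, u)}
    work = deque()

    def add(u, label, v):
        if (u, label, v) not in found:
            found.add((u, label, v))
            out_by_node[u].add((label, v))
            in_by_node[v].add((label, u))
            work.append((u, label, v))

    for u, label, v in edges:
        add(u, label, v)

    while work:
        u, label, v = work.popleft()
        for head, body in unary:
            if body == label:
                add(u, head, v)
        for head, left, right in binary:
            if left == label:   # this edge is the left half: look right of v
                for other, w in list(out_by_node[v]):
                    if other == right:
                        add(u, head, w)
            if right == label:  # this edge is the right half: look left of u
                for other, t in list(in_by_node[u]):
                    if other == left:
                        add(t, head, v)
    return found


graph = [("x", "B", "y"), ("y", "C", "z")]
print(sorted(cfl_reach(graph, unary=[], binary=[("A", "B", "C")])))
# [('x', 'A', 'z'), ('x', 'B', 'y'), ('y', 'C', 'z')]
```

Grammars with longer right-hand sides would first be normalized so that every production has at most two symbols on the right.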

Page 107:

CFL-Reachability = Chain Programs

Grammar: $A \rightarrow B\ C$

Chain rule: a(X,Z) :- b(X,Y), c(Y,Z).

Graph: [Figure: a B edge from x to y, a C edge from y to z, and the A edge from x to z.]
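Read as a relational query, the chain rule a(X,Z) :- b(X,Y), c(Y,Z) is a join of b and c on the shared variable Y, which is the same as following a B edge and then a C edge in the graph. A minimal sketch, with relations encoded as Python sets of pairs (an encoding assumed here for illustration; `join_chain` is a hypothetical helper name):

```python
def join_chain(b, c):
    """Evaluate a(X,Z) :- b(X,Y), c(Y,Z) over sets of pairs."""
    return {(x, z) for (x, y) in b for (y2, z) in c if y == y2}


b = {("x", "y")}  # B edge from x to y
c = {("y", "z")}  # C edge from y to z
print(join_chain(b, c))  # {('x', 'z')}
```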

Page 108:

Base Facts for Points-To Analysis

p = &q;     assignAddr(p,q).
p = q;      assign(p,q).
p = *q;     assignStar(p,q).
*p = q;     starAssign(p,q).

Page 109:

Rules for Points-To Analysis (I)

p = &q;     pointsTo(P,Q) :- assignAddr(P,Q).
p = q;      pointsTo(P,R) :- assign(P,Q), pointsTo(Q,R).

[Figure: points-to graphs for the two statement forms, over p, q, r1, and r2.]

Page 110:

Rules for Points-To Analysis (II)

p = *q;     pointsTo(P,S) :- assignStar(P,Q), pointsTo(Q,R), pointsTo(R,S).
*p = q;     pointsTo(R,S) :- starAssign(P,Q), pointsTo(P,R), pointsTo(Q,S).

[Figure: points-to graphs for the two statement forms, over p, q, r1, r2, s1, s2, and s3.]
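The four pointsTo rules can be evaluated bottom-up by applying them repeatedly until no new facts appear. The sketch below does this naively for the example program `a = &e; b = a; c = &f; *b = c; d = *a;` from the earlier slide. It is an illustration only, not the cubic-time algorithm cited in the tutorial; the function name `points_to` and the set-of-pairs encoding of base facts are assumptions.

```python
def points_to(assign_addr, assign, assign_star, star_assign):
    """Naive fixpoint over the four pointsTo rules; each argument is a set of (p, q)."""
    pts = set(assign_addr)  # pointsTo(P,Q) :- assignAddr(P,Q).
    changed = True
    while changed:
        new = set()
        # pointsTo(P,R) :- assign(P,Q), pointsTo(Q,R).
        for p, q in assign:
            new |= {(p, r) for (q2, r) in pts if q2 == q}
        # pointsTo(P,S) :- assignStar(P,Q), pointsTo(Q,R), pointsTo(R,S).
        for p, q in assign_star:
            for q2, r in pts:
                if q2 == q:
                    new |= {(p, s) for (r2, s) in pts if r2 == r}
        # pointsTo(R,S) :- starAssign(P,Q), pointsTo(P,R), pointsTo(Q,S).
        for p, q in star_assign:
            for p2, r in pts:
                if p2 == p:
                    new |= {(r, s) for (q2, s) in pts if q2 == q}
        changed = not new <= pts
        pts |= new
    return pts


result = points_to(
    assign_addr={("a", "e"), ("c", "f")},  # a = &e;  c = &f;
    assign={("b", "a")},                   # b = a;
    assign_star={("d", "a")},              # d = *a;
    star_assign={("b", "c")},              # *b = c;
)
print(sorted(result))
# [('a', 'e'), ('b', 'e'), ('c', 'f'), ('d', 'f'), ('e', 'f')]
```

The fact that e points to f comes from `*b = c` (b points to e, c points to f), and d then points to f through `d = *a`.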

Page 111:

Creating a Chain Program

*p = q;

[Figure: points-to graph for *p = q, over p, q, r1, r2, s1, and s2.]

Original rule:
pointsTo(R,S) :- starAssign(P,Q), pointsTo(P,R), pointsTo(Q,S).

Reorder the body:
pointsTo(R,S) :- pointsTo(P,R), starAssign(P,Q), pointsTo(Q,S).

Replace pointsTo(P,R) with the inverse relation pointsTo⁻¹(R,P), so that the variables form a chain R, P, Q, S:
pointsTo(R,S) :- pointsTo⁻¹(R,P), starAssign(P,Q), pointsTo(Q,S).

pointsTo⁻¹(R,P) :- pointsTo(P,R).

Page 112:

Base Facts for Points-To Analysis

p = &q;     assignAddr(p,q).     assignAddr⁻¹(q,p).
p = q;      assign(p,q).         assign⁻¹(q,p).
p = *q;     assignStar(p,q).     assignStar⁻¹(q,p).
*p = q;     starAssign(p,q).     starAssign⁻¹(q,p).

Page 113:

Creating a Chain Program

pointsTo(P,Q) :- assignAddr(P,Q).
pointsTo(P,R) :- assign(P,Q), pointsTo(Q,R).
pointsTo(P,S) :- assignStar(P,Q), pointsTo(Q,R), pointsTo(R,S).
pointsTo(R,S) :- pointsTo⁻¹(R,P), starAssign(P,Q), pointsTo(Q,S).

pointsTo⁻¹(Q,P) :- assignAddr⁻¹(Q,P).
pointsTo⁻¹(R,P) :- pointsTo⁻¹(R,Q), assign⁻¹(Q,P).
pointsTo⁻¹(S,P) :- pointsTo⁻¹(S,R), pointsTo⁻¹(R,Q), assignStar⁻¹(Q,P).
pointsTo⁻¹(S,R) :- pointsTo⁻¹(S,Q), starAssign⁻¹(Q,P), pointsTo(P,R).

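Page 114:

. . . and now to CFL-Reachability

$\mathit{pointsTo} \rightarrow \mathit{assignAddr}$
$\mathit{pointsTo} \rightarrow \mathit{assign}\ \mathit{pointsTo}$
$\mathit{pointsTo} \rightarrow \mathit{assignStar}\ \mathit{pointsTo}\ \mathit{pointsTo}$
$\mathit{pointsTo} \rightarrow \mathit{pointsTo}^{-1}\ \mathit{starAssign}\ \mathit{pointsTo}$

$\mathit{pointsTo}^{-1} \rightarrow \mathit{assignAddr}^{-1}$
$\mathit{pointsTo}^{-1} \rightarrow \mathit{pointsTo}^{-1}\ \mathit{assign}^{-1}$
$\mathit{pointsTo}^{-1} \rightarrow \mathit{pointsTo}^{-1}\ \mathit{pointsTo}^{-1}\ \mathit{assignStar}^{-1}$
$\mathit{pointsTo}^{-1} \rightarrow \mathit{pointsTo}^{-1}\ \mathit{starAssign}^{-1}\ \mathit{pointsTo}$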
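Once every rule is a chain, each production says that a path spelling out its right-hand side induces an edge labeled with its left-hand side. The sketch below evaluates the eight productions by repeated relational composition over the example program `a = &e; b = a; c = &f; *b = c; d = *a;`. It is a naive illustration under assumed encodings: reversed edges are named with an `_inv` suffix, and `solve`, `compose`, and `PRODUCTIONS` are hypothetical names, not the tutorial's.

```python
def compose(r1, r2):
    """Relational composition: (x, z) whenever (x, y) in r1 and (y, z) in r2."""
    return {(x, z) for (x, y) in r1 for (y2, z) in r2 if y == y2}


PRODUCTIONS = [
    ("pointsTo", ("assignAddr",)),
    ("pointsTo", ("assign", "pointsTo")),
    ("pointsTo", ("assignStar", "pointsTo", "pointsTo")),
    ("pointsTo", ("pointsTo_inv", "starAssign", "pointsTo")),
    ("pointsTo_inv", ("assignAddr_inv",)),
    ("pointsTo_inv", ("pointsTo_inv", "assign_inv")),
    ("pointsTo_inv", ("pointsTo_inv", "pointsTo_inv", "assignStar_inv")),
    ("pointsTo_inv", ("pointsTo_inv", "starAssign_inv", "pointsTo")),
]


def solve(base, productions):
    """base maps each edge label to a set of (source, target) pairs."""
    rel = {}
    for label, pairs in base.items():
        rel[label] = set(pairs)
        rel[label + "_inv"] = {(v, u) for (u, v) in pairs}  # reversed edges
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            derived = set(rel.get(body[0], set()))
            for symbol in body[1:]:
                derived = compose(derived, rel.get(symbol, set()))
            target = rel.setdefault(head, set())
            if not derived <= target:
                target |= derived
                changed = True
    return rel


base = {
    "assignAddr": {("a", "e"), ("c", "f")},  # a = &e;  c = &f;
    "assign": {("b", "a")},                  # b = a;
    "assignStar": {("d", "a")},              # d = *a;
    "starAssign": {("b", "c")},              # *b = c;
}
rel = solve(base, PRODUCTIONS)
print(sorted(rel["pointsTo"]))
# [('a', 'e'), ('b', 'e'), ('c', 'f'), ('d', 'f'), ('e', 'f')]
```

The derived `pointsTo_inv` relation comes out as the exact mirror image of `pointsTo`, which is what the second group of four productions is designed to guarantee.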

Page 115:

Relationship to Other Analysis Paradigms

• Dataflow analysis

–reachability versus equation solving

• Deduction

• Set constraints

Page 116:

[Figure: timeline, 1987 to 1998, of research threads: Slicing & Applications, Dataflow Analysis, Demand Algorithms, Set Constraints, Structure-Transmitted Dependences, and CFL-Reachability. Structure-Transmitted Dependences and Set Constraints are called out.]

Page 117:

Structure-Transmitted Dependences [Reps 1995]

McCarthy’s equations:
car(cons(x,y)) = x
cdr(cons(x,y)) = y

w = cons(x,y);
v = car(w);

[Figure: dependence graph over v, w, x, and y, with structure edges labeled hd, tl, and hd⁻¹ and dependence edges labeled dep.]

Page 118:

Set Constraints

w = cons(x,y);     $W \supseteq \mathrm{cons}(X, Y)$
v = car(w);        $V \supseteq \mathrm{cons}_1^{-1}(W)$

McCarthy’s Equations Revisited

$\mathrm{cons}_1^{-1}(\mathrm{cons}(X, Y)) = X$, provided $I(Y) \neq \emptyset$

Semantics of Set Constraints

$I(\mathrm{cons}(V_1, V_2)) = \{\, \mathrm{cons}(v_1, v_2) \mid v_1 \in I(V_1) \text{ and } v_2 \in I(V_2) \,\}$

$I(\mathrm{cons}_1^{-1}(V)) = \{\, v_1 \mid \mathrm{cons}(v_1, v_2) \in I(V) \,\}$
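The side condition on McCarthy's equation follows directly from the semantics: if I(Y) is empty, then I(cons(X,Y)) is empty too, so projecting it cannot recover X. A minimal sketch over finite sets, with cons terms encoded as Python tuples `("cons", v1, v2)` (an encoding assumed here; `interp_cons` and `interp_cons_inv1` are hypothetical names):

```python
def interp_cons(i_v1, i_v2):
    """I(cons(V1, V2)): all cons terms built from members of I(V1) and I(V2)."""
    return {("cons", v1, v2) for v1 in i_v1 for v2 in i_v2}


def interp_cons_inv1(i_v):
    """I(cons_1^{-1}(V)): first components of the cons terms in I(V)."""
    return {
        t[1]
        for t in i_v
        if isinstance(t, tuple) and len(t) == 3 and t[0] == "cons"
    }


X = {"a", "b"}
Y = {"c"}
print(interp_cons_inv1(interp_cons(X, Y)) == X)      # True: I(Y) is non-empty
print(interp_cons_inv1(interp_cons(X, set())) == X)  # False: I(Y) is empty
```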

Page 119:

CFL-Reachability versus Set Constraints

• Lazy languages: CFL-reachability is more natural
– car(cons(X,Y)) = X

• Strict languages: Set constraints are more natural
– car(cons(X,Y)) = X, provided $I(Y) \neq \emptyset$

• But . . . SC and CFL-reachability are equivalent!
– [Melski & Reps 97]

Page 120:

Solving Set Constraints

$$\frac{W \supseteq a}{W \text{ is “inhabited”}}$$

$$\frac{W \supseteq \mathrm{cons}(X, Y) \qquad X \text{ is “inhabited”} \qquad Y \text{ is “inhabited”}}{W \text{ is “inhabited”}}$$

$$\frac{W \supseteq \mathrm{cons}(X, Y) \qquad V \supseteq \mathrm{cons}_1^{-1}(W) \qquad Y \text{ is “inhabited”}}{V \supseteq X}$$

$$\frac{W \supseteq \mathrm{cons}(X, Y) \qquad U \supseteq \mathrm{cons}_2^{-1}(W) \qquad X \text{ is “inhabited”}}{U \supseteq Y}$$
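The four rules can be applied to a fixpoint. The sketch below implements only those four rules, under assumed encodings: `atoms` is the set of variables W with a constraint W ⊇ a, `conses` holds triples (W, X, Y) for W ⊇ cons(X,Y), and `projs` holds triples (V, i, W) for V ⊇ consᵢ⁻¹(W). The function name `solve_constraints` is hypothetical, and a complete solver would also need to propagate facts along the derived inclusions, which is not shown here.

```python
def solve_constraints(atoms, conses, projs):
    """Apply the four rules until nothing changes.

    Returns (inhabited, includes), where includes holds pairs (V, X)
    standing for derived constraints V ⊇ X.
    """
    inhabited = set(atoms)  # rule 1: W ⊇ a makes W inhabited
    includes = set()
    changed = True
    while changed:
        changed = False
        for w, x, y in conses:
            # rule 2: both components inhabited makes W inhabited
            if x in inhabited and y in inhabited and w not in inhabited:
                inhabited.add(w)
                changed = True
            for v, i, w2 in projs:
                if w2 != w:
                    continue
                if i == 1 and y in inhabited:    # rule 3: V ⊇ X, provided Y inhabited
                    fact = (v, x)
                elif i == 2 and x in inhabited:  # rule 4: U ⊇ Y, provided X inhabited
                    fact = (v, y)
                else:
                    continue
                if fact not in includes:
                    includes.add(fact)
                    changed = True
    return inhabited, includes


conses = {("W", "X", "Y")}
projs = {("V", 1, "W"), ("U", 2, "W")}

print(solve_constraints({"X", "Y"}, conses, projs))
print(solve_constraints({"X"}, conses, projs))
```

In the first call both X and Y are inhabited, so W becomes inhabited and both V ⊇ X and U ⊇ Y are derived. In the second call Y is not inhabited, so V ⊇ X is withheld, which is the strictness condition at work.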

Page 121:

Simulating “Inhabited”

$W \supseteq a$

[Figure: graph encoding of the constraint at node W, with edges labeled a, dep, and inhab.]

Page 122:

Simulating “Inhabited”

$W \supseteq \mathrm{cons}(X, Y)$

[Figure: graph over W, X, and Y with edges labeled hd, tl, and inhab.]

Page 123:

Simulating “Provided $I(Y) \neq \emptyset$”

$W \supseteq \mathrm{cons}(X, Y)$, $V \supseteq \mathrm{cons}_1^{-1}(W)$

[Figure: graph over V, W, X, and Y with edges labeled hd, tl, hd⁻¹, inhab, and dep, annotated “provided I(Y)”.]

Page 124:

Themes

• Harnessing CFL-reachability

• Relationship to other analysis paradigms

• Exhaustive alg. versus demand alg.

• Understanding complexity
– Linear . . . cubic . . . undecidable

• Beyond CFL-reachability

Page 125:

Exhaustive Versus Demand Analysis

• Exhaustive analysis: All facts at all points

• Optimization: Concentrate on inner loops

• Program-understanding tools: Only some facts are of interest

Page 126:

Exhaustive Versus Demand Analysis

• Demand analysis:
– Does a given fact hold at a given point?
– Which facts hold at a given point?
– At which points does a given fact hold?

• Demand analysis via CFL-reachability
– single-source/single-target CFL-reachability
– single-source/multi-target CFL-reachability
– multi-source/single-target CFL-reachability

Page 127:

[Figure: exploded control-flow graph over the facts x, y, a, and b. Nodes of main: start main, x = 3, p(x,y), return from p, printf(y), exit main. Nodes of p: start p(a,b), if . . ., b = a, p(a,b), return from p, printf(b), exit p. Call and return edges are labeled “(” and “)”. Two demand queries are posed, “Might y be uninitialized here?” and “Might b be uninitialized here?”; one is answered “YES!” and the other “NO!”.]

“Semi-exhaustive”: All “appropriate” demands

Page 128:

Experimental Results [Horwitz, Reps, & Sagiv 1995]

• 53 C programs (200-6,700 lines)

• For a single fact of interest:
– demand always better than exhaustive

• All “appropriate” demands beats exhaustive when percentage of “yes” answers is high
– Live variables
– Truly live variables
– Constant predicates
– . . .

Page 129:

A Related Result [Sagiv, Reps, & Horwitz 1996]

• [Uses a generalized analysis technique]

• 38 C programs (300-6,000 lines)
– copy-constant propagation
– linear-constant propagation

• All “appropriate” demands always beats exhaustive
– factor of 1.14 to about 6

Page 130:

Exhaustive Versus Demand Analysis

• Demand algorithms for

– Interprocedural dataflow analysis

– Set constraints

– Points-to analysis

Page 131:

Demand Analysis and LP Queries (I)

• Flow-insensitive points-to analysis

– Does variable p point to q?
  • Issue query: ?- pointsTo(p, q).
  • Solve single-source/single-target L(pointsTo)-reachability problem

– What does variable p point to?
  • Issue query: ?- pointsTo(p, Q).
  • Solve single-source L(pointsTo)-reachability problem

– What variables point to q?
  • Issue query: ?- pointsTo(P, q).
  • Solve single-target L(pointsTo)-reachability problem
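The three query shapes differ only in which arguments are bound. The sketch below mimics them over a small, already-computed pointsTo relation, using `None` for an unbound (capitalized) logic variable. It illustrates the query modes only: it filters a precomputed relation and is not a demand algorithm, which would avoid computing the full relation in the first place. The relation contents and the name `query` are assumptions made for illustration.

```python
# Hypothetical pointsTo facts, e.g. for: a = &e; b = a; c = &f; *b = c; d = *a;
POINTS_TO = {("a", "e"), ("b", "e"), ("c", "f"), ("d", "f"), ("e", "f")}


def query(p=None, q=None, rel=POINTS_TO):
    """Answer ?- pointsTo(p, q), where None plays the role of an unbound variable."""
    return sorted(
        (x, y)
        for (x, y) in rel
        if (p is None or x == p) and (q is None or y == q)
    )


print(query("b", "e"))  # ?- pointsTo(b, e).  -> [('b', 'e')]
print(query(p="d"))     # ?- pointsTo(d, Q).  -> [('d', 'f')]
print(query(q="f"))     # ?- pointsTo(P, f).  -> [('c', 'f'), ('d', 'f'), ('e', 'f')]
```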

Page 132:

Demand Analysis and LP Queries (II)

• Flow-sensitive analysis
– Does a given fact f hold at a given point p?
  ?- dfFact(p, f).
– Which facts hold at a given point p?
  ?- dfFact(p, F).
– At which points does a given fact f hold?
  ?- dfFact(P, f).

• E.g., flow-sensitive points-to analysis
  ?- dfFact(p, pointsTo(x, Y)).
  ?- dfFact(P, pointsTo(x, y)).
  etc.

Page 133:

Themes

• Harnessing CFL-reachability

• Relationship to other analysis paradigms

• Exhaustive alg. versus demand alg.

• Understanding complexity
– Linear . . . cubic . . . undecidable

• Beyond CFL-reachability

Page 134:

Interprocedural Backward Slice

[Figure: dependence graph with the nodes Enter main, Call p (twice), and Enter p; interprocedural edges are labeled with the parentheses “(”, “)”, “[”, and “]”.]

Page 135:

[Figure: exploded control-flow graph over the facts x, y, a, and b. Nodes of main: start main, x = 3, p(x,y), return from p, printf(y), exit main. Nodes of p: start p(a,b), if . . ., b = a, p(a,b), return from p, printf(b), exit p. Call and return edges are labeled “(”, “)”, “[”, and “]”. Annotation: “y may be uninitialized here”.]

Page 136:

Structure-Transmitted Dependences [Reps 1995]

McCarthy’s equations:
car(cons(x,y)) = x
cdr(cons(x,y)) = y

w = cons(x,y);
v = car(w);

[Figure: dependence graph over v, w, x, and y, with structure edges labeled hd, tl, and hd⁻¹.]

Page 137:

Dependences + Matched Paths?

[Figure: dependence graph spanning Enter main and Enter p, with the statements w = cons(x,y) and v = car(w) and two Call p nodes, over the variables w, x, and y. Structure edges are labeled hd, tl, and hd⁻¹; interprocedural edges are labeled “(”, “)”, “[”, and “]”.]

Page 138:

Undecidable! [Reps, TOPLAS 00]

[Figure: a path whose hd / hd⁻¹ labels are interleaved with “(” / “)” labels.]

Interleaved Parentheses!

Page 139:

Themes

• Harnessing CFL-reachability

• Relationship to other analysis paradigms

• Exhaustive alg. versus demand alg.

• Understanding complexity
– Linear . . . cubic . . . undecidable

• Beyond CFL-reachability

Page 140:

CFL-Reachability via Dynamic Programming

Grammar: $A \rightarrow B\ C$

Graph: [Figure: a B-labeled edge followed by a C-labeled edge, together with the A-labeled edge they induce.]

Page 141:

Beyond CFL-Reachability: Composition of Linear Functions

[Figure: two consecutive edges labeled $\lambda x.\,3x+5$ and $\lambda x.\,2x+1$, summarized by an edge labeled $\lambda x.\,6x+11$.]

$(\lambda x.\,2x+1) \circ (\lambda x.\,3x+5) = \lambda x.\,6x+11$
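Linear functions are closed under composition, so a path summary can be kept as a single pair of coefficients instead of a grammar symbol. A minimal sketch, representing λx. a·x + b as the pair `(a, b)` (a representation assumed here for illustration; `compose_linear` is a hypothetical name):

```python
def compose_linear(f, g):
    """Return f ∘ g, where (a, b) stands for the function x -> a*x + b."""
    a1, b1 = f
    a2, b2 = g
    # f(g(x)) = a1*(a2*x + b2) + b1 = (a1*a2)*x + (a1*b2 + b1)
    return (a1 * a2, a1 * b2 + b1)


f = (2, 1)  # λx. 2x + 1
g = (3, 5)  # λx. 3x + 5
print(compose_linear(f, g))  # (6, 11), i.e. λx. 6x + 11
```

Composition is not commutative: `compose_linear(g, f)` gives `(6, 8)`, so the order in which edge functions are applied along a path matters.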

Page 142:

Beyond CFL-Reachability: Composition of Linear Functions

• Interprocedural constant propagation
– [Sagiv, Reps, & Horwitz TCS 96]

• Interprocedural path profiling
– The number of path fragments contributed by a procedure is a function
– [Melski & Reps CC 99]

Page 143:

Model-Checking of Recursive HFSMs [Benedikt, Godefroid, & Reps (in prep.)]

• Non-recursive HFSMs [Alur & Yannakakis 98]

• Ordinary FSMs
– T-reachability/circularity queries

• Recursive HFSMs
– Matched-parenthesis T-reachability/circularity

• Key observation: Linear-time algorithms for matched-parenthesis T-reachability/cyclicity
– Single-entry/multi-exit [or multi-entry/single-exit]
– Deterministic, multi-entry/multi-exit

Page 144:

T-Cyclicity in Hierarchical Kripke Structures

SN/SX               SN/MX               MN/SX     MN/MX
non-rec: O(|k|)     non-rec: O(|k|)     ?         ?
rec: O(|k|^3)       rec: ?

SN/SX               SN/MX               MN/SX     MN/MX
O(|k|)              O(|k|)              O(|k|)    O(|k|^3)

O(|k||t|) [lin rec]     O(|k|) [det]

Page 145:

Recursive HFSMs: Data Complexity

        SN/SX               SN/MX               MN/SX     MN/MX
LTL     non-rec: O(|k|)     non-rec: O(|k|)     ?         ?
        rec: P-time         rec: ?
CTL     O(|k|)              bad                 ?         bad
CTL*    O(|k|^2) [L2]       bad                 ?         bad

Page 146:

Recursive HFSMs: Data Complexity

        SN/SX     SN/MX     MN/SX     MN/MX
LTL     O(|k|)    O(|k|)    O(|k|)    O(|k|^3)
        O(|k||t|) [lin rec]     O(|k|) [det]
CTL     O(|k|)    bad       O(|k|)    bad
CTL*    O(|k|)    bad       O(|k|)    bad

Not Dual Problems!

Page 147:

CFL-Reachability: Scope of Applicability

• Static analysis
– Slicing, DFA, structure-transmitted dep., points-to analysis

• Verification
– Security of crypto-based protocols for distributed systems [Dolev, Even, & Karp 83]
– Model-checking recursive HFSMs

• Formal-language theory
– CF-, 2DPDA-, 2NPDA-recognition
– Attribute-grammar analysis

Page 148:

CFL-Reachability: Benefits

• Algorithms
– Exhaustive & demand

• Complexity
– Linear-time and cubic-time algorithms
– PTIME-completeness
– Variants that are undecidable

• Complementary to
– Equations
– Set constraints
– Types
– . . .

Page 149:

Most Significant Contributions: 1987-2000

• Asymptotically fastest algorithms
– Interprocedural slicing
– Interprocedural dataflow analysis

• Demand algorithms
– Interprocedural dataflow analysis [CC94,FSE95]
– All “appropriate” demands beats exhaustive

• Tool for slicing and browsing ANSI C
– Slices programs as large as 75,000 lines
– University research distribution
– Commercial product: CodeSurfer (GrammaTech, Inc.)

Page 150:

Most Significant Contributions: 1987-2000

• Unifying conceptual model
– [Kou 77], [Holley&Rosen 81], [Cooper&Kennedy 88], [Callahan 88], [Horwitz,Reps,&Binkley 88], . . .

• Identifies fundamental bottlenecks
– Cubic-time “barrier”
– Litmus test: quadratic-time algorithm?!
– PTIME-complete limits to parallelizability

• Existence proofs for new algorithms
– Demand algorithm for set constraints
– Demand algorithm for points-to analysis

Page 151:

References

• Papers by Reps and collaborators:
– http://www.cs.wisc.edu/~reps/

• CFL-reachability
– Yannakakis, M., Graph-theoretic methods in database theory, PODS 90.
– Reps, T., Program analysis via graph reachability, Inf. and Softw. Tech. 98.

Page 152:

References

• Slicing, chopping, etc.
– Horwitz, Reps, & Binkley, TOPLAS 90
– Reps, Horwitz, Sagiv, & Rosay, FSE 94
– Reps & Rosay, FSE 95

• Dataflow analysis
– Reps, Horwitz, & Sagiv, POPL 95
– Horwitz, Reps, & Sagiv, FSE 95, TR-1283

• Structure dependences; set constraints
– Reps, PEPM 95
– Melski & Reps, Theor. Comp. Sci. 00

Page 153:

References

• Complexity
– Undecidability: Reps, TOPLAS 00?
– PTIME-completeness: Reps, Acta Inf. 96.

• Verification
– Dolev, Even, & Karp, Inf & Control 82.
– Benedikt, Godefroid, & Reps, In prep.

• Beyond CFL-reachability
– Sagiv, Reps, Horwitz, Theor. Comp. Sci 96
– Melski & Reps, CC 99, TR-1382