Source Analysis for Security
Trent Jaeger
March 29, 2004
Example 1
sys_fcntl(int fd, ...) {
    struct file *filp;
    ...
    filp = fget(fd);
    ...
    err = security_ops->file_ops->fcntl(filp, ...);
    ...
    do_fcntl(fd, ...);
    ...
}
do_fcntl(int fd, ...) {
    ...
    err = fcntl_setlk(fd, ...);
    ...
}
fcntl_setlk(int fd, ...) {
    ...
    struct file *filp;
    ...
    filp = fget(fd); /* is this the one checked in sys_fcntl? */
    ...
}
Example 2
get_free_buffer(struct stripe_head *sh, …) {
    struct buffer_head *bh;
    unsigned long flags;

    save_flags(flags);
    cli();
    if ((bh = sh->buffer_pool) == NULL)
        return NULL;
    sh->buffer_pool = bh->b_next;
    bh->b_size = b_size;
    restore_flags(flags);
    return bh;
}
Example 3
sys_fcntl(int fd, ...) {
    err = security_ops->file_ops->fcntl(filp, ...);
    ...
    do_fcntl(fd, ..., filp);
    ...
}
do_fcntl(int fd, ..., struct file *filp) {
    ...
    case F_SETLEASE:
        fcntl_setlease(fd, filp, ...);
        ...
    case F_SETOWN:
        err = security_ops->file_ops->set_fowner(filp, ...);
        ...
        if (err) ... break;
        filp->f_owner.pid = arg;
        filp->f_owner.uid = current->uid;
        filp->f_owner.euid = current->euid;
        ...
}
Example 3 (cont'd)
fcntl_setlease(int fd, struct file *filp, ...) {
    ...
    if (my_before != NULL) {
        error = lease_modify(my_before, arg, fd, filp);
        ...
    }
    ...
}
lease_modify(..., int fd, struct file *filp, ...) {
    ...
    if (arg == F_UNLCK) {
        /* should these be f_setowner auth'd ? */
        filp->f_owner.pid = 0;
        filp->f_owner.uid = 0;
        filp->f_owner.euid = 0;
        filp->f_owner.signum = 0;
    }
    ...
}
Example 4
int notify_change(struct dentry *dentry, struct iattr *attr) {
    struct inode *inode = dentry->d_inode;
    …
    if (inode->i_op && inode->i_op->setattr) {
        error = security_inode_setattr(dentry, attr);
        if (!error)
            error = inode->i_op->setattr(dentry, attr);
        …
}
Find Software Bugs
Education
  – Difficult to know how code will be used
Testing
  – Misses many code paths, time consuming
Manual Inspection
  – Tedious and error prone
Compiler checking
  – Context independent
4GL
  – Incomplete and don't know how source code will be used
Assurance
  – Extremely costly and complex; what do we do about existing code?
Limited Source Code Analysis
Source code is the level at which security is defined
  – Problems manifest as errors in code (although design can be a problem too)
Compilers can check for various properties
  – Rules on program source
Programmers can express some properties
  – Semantic properties
  – Must specify correctly (no/few false negatives)
  – Must not be too conservative (few false positives)
  – Would like them to be robust to code changes
Source Code Analysis
Convert source code into a model
Convert property into a computation on the model
Report positive cases (violate/meet property)
Determine if cases are true or false
Resolve true cases
Refine model or property and repeat
Some Properties
Never/always do X
  – Never use floating point in kernel
Do X rather than Y
Always do X before/after Y
  – LSM mediation (Example 1)
Never do X before/after Y
In situation X, do (not) Y
  – Re-enable disabled interrupts (Example 2)
In situation X, do Y rather than Z
Program Models
Abstract Syntax Tree
Control flow
Data flow
Def-use chain
Aliases
Type constraints
…
Abstract Syntax Tree
[Figure: abstract syntax tree for Example 1]
func_decl sys_fcntl
    var_decl struct file *filp
    expr_stmt =
        var_decl filp
        call_decl fget(fd)
    expr_stmt =
        var_decl err
        cmpd_stmt security_op
    call_decl do_fcntl
func_decl do_fcntl
    expr_stmt =
        var_decl err
        call_stmt fcntl_setlk(fd)
func_decl fcntl_setlk
    var_decl struct file *filp
    expr_stmt =
        var_decl filp
        call_decl fget(fd)
    cmpd_stmt use filp
Control Flow (Interprocedural)
[Figure: interprocedural control flow over the Example 1 syntax tree (sys_fcntl, do_fcntl, fcntl_setlk)]
Control Flow (Intraprocedural)
[Figure: intraprocedural control flow over the Example 1 syntax tree (sys_fcntl, do_fcntl, fcntl_setlk)]
Data Flow
[Figure: data flow over the Example 1 syntax tree (sys_fcntl, do_fcntl, fcntl_setlk)]
Def-Use
[Figure: def-use chains over the Example 1 syntax tree (sys_fcntl, do_fcntl, fcntl_setlk)]
Property Models
Finite State Automata
  – Start Operation
  – Disable Interrupts
  – Enable Interrupts
  – End Operation
Type Constraints
  – Unchecked type
  – Checked type
  – Expect checked type
[Figure: interrupt FSA. Labels: disable, enable, End Op, double_enable, double_disable, Exit w/ disabled]
CQUAL Static Analysis
CQUAL is a type-based static analysis tool from UC Berkeley
Enables qualification of types, analogous to const
Enables verification that the type passed to a function is the type expected
Used previously for verification of format string vulnerabilities
– Wagner’s group at UC Berkeley in USENIX Security 2001
CQUAL Principles
Interprocedural control flow
  – do_fcntl calls fcntl_getlk
Def-Use data flow
  – Assignments tracked back to def where type is declared
  – Type inference
Variables have type restrictions
  – Cannot assign a variable to another of an incompatible type
  – Cannot send a variable as a parameter to a function unless its type is compatible
CQUAL Approach
[Figure: initializing functions produce an unchecked inode i; the authorizing function produces a checked inode i; controlled functions require a checked inode i; one edge in the figure is marked X]
Identify Declarations
foo() {
    struct inode * unchecked i;            /* declaration */
    ...
    security_ops->inode_ops->check(i);     /* check */
    ...
    map = i->i_mapping;                    /* local controlled op */
    ...
    map->readpage(i);                      /* function call */
}
readpage(struct inode *j) {...}            /* subroutine */
Identify Controlled Params
foo() {
    struct inode * unchecked i;            /* declaration */
    ...
    security_ops->inode_ops->check(i);     /* check */
    ...
    map = i->i_mapping;                    /* local controlled op */
    ...
    map->readpage(i);                      /* function call */
}
readpage(struct inode * checked j) {...}   /* subroutine */
Create “Checked” Variable
foo() {
    struct inode * unchecked i;            /* declaration */
    struct inode * checked i2;
    ...
    security_ops->inode_ops->check(i);     /* check */
    i2 = (struct inode * checked) i;
    ...
    map = i2->i_mapping;                   /* local controlled op */
    ...
    map->readpage(i2);                     /* function call */
}
readpage(struct inode * checked j) {...}   /* subroutine */
Verify Local Controlled Ops
foo() {
    struct inode * unchecked i;            /* declaration */
    struct inode * checked i2;
    ...
    security_ops->inode_ops->check(i);     /* check */
    i2 = (struct inode * checked) i;
    ...
    map = i2->i_mapping;                   /* local controlled op */
    ...
    map->readpage(i2);                     /* function call */
}
readpage(struct inode * checked j) {...}   /* subroutine */
Find Assignments to ‘Checked’
foo() {
    struct inode * unchecked i, * unchecked i3;    /* declaration */
    struct inode * checked i2, * checked i4;
    struct dentry * unchecked dentry;
    ...
    security_ops->inode_ops->check(i);             /* check */
    i2 = (struct inode * checked) i;
    i2 = i3;                                       /* unchecked to checked -- fail */
    i2 = i4;                                       /* checked to checked -- OK */
    i2 = (struct inode * unchecked) 0xc0000000;    /* fail */
    map->readpage(dentry->d_inode);                /* infer checked -- wrong */
    ...
}
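The pass/fail comments on this slide can be read as a two-point qualifier check. The following is a minimal sketch of that rule, not CQUAL's implementation: the names `qual`, `flow_ok`, `assign`, and `count_type_errors` are hypothetical, and the assumption is that a checked value may flow anywhere while an unchecked value may flow only into an unchecked slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical two-point qualifier lattice (an assumption of this sketch). */
enum qual { CHECKED, UNCHECKED };

/* One assignment: the qualifier the destination demands and the
 * qualifier the source carries. The text is only for reporting. */
struct assign {
    const char *text;
    enum qual dst;
    enum qual src;
};

/* A value may flow from src to dst if dst accepts anything (unchecked)
 * or src is already checked. */
static bool flow_ok(enum qual dst, enum qual src)
{
    return dst == UNCHECKED || src == CHECKED;
}

/* Count the assignments a qualifier checker would reject. */
static size_t count_type_errors(const struct assign *a, size_t n)
{
    size_t errors = 0;

    assert(a != NULL || n == 0);
    for (size_t k = 0; k < n; k++)
        if (!flow_ok(a[k].dst, a[k].src))
            errors++;
    return errors;
}
```

Feeding it the four assignments to `i2` from the slide (the cast from `i`, `i3`, `i4`, and the unchecked constant) should reject exactly the two marked "fail". The cast is modeled as a checked source because the cast itself asserts the qualifier.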
Verify Interprocedural Paths
foo() {
    struct inode * unchecked i;            /* declaration */
    struct inode * checked i2;
    ...
    security_ops->inode_ops->check(i);     /* check */
    i2 = (struct inode * checked) i;
    ...
    map = i2->i_mapping;                   /* local controlled op */
    ...
    map->readpage(i2);                     /* function call */
}
readpage(struct inode * checked j) {...}   /* subroutine */
Verify Interprocedural Paths
bar() {
    struct inode * unchecked i;            /* declaration */
    ...
    map = i->i_mapping;                    /* local controlled op */
    ...
    map->readpage(i);                      /* function call */
}
readpage(struct inode * checked j) {...}   /* subroutine */
CQUAL will detect "type error" on 'i'
Find Example 1 Error
sys_fcntl(int fd, ...) {
    struct file unchecked *filp;
    ...
    filp = fget(fd);
    ...
    err = security_ops->file_ops->fcntl(filp, ...);
    ...
    /* new checked filp2 */
    do_fcntl(fd, ...);
    ...
}
do_fcntl(int fd, ...) {
    ...
    err = fcntl_setlk(fd, ...);
    ...
}
fcntl_setlk(int fd, ...) {
    ...
    struct file unchecked *filp;
    ...
    filp = fget(fd); /* is this the one checked in sys_fcntl? */
    ...
}
Sensitivity: Flow and Context
Flow-sensitivity
  – The order of statements in a function matters
  – CQUAL is not flow-sensitive
  – Must create new ‘checked’ variable
  – Must use GCC to verify intraprocedural paths
  – Must use GCC to find reassignments after ‘checked’
Context-sensitivity
  – A function is treated differently depending on calling site
  – CQUAL is not context-sensitive
  – If two functions call the same descendant, they must have the same requirements in CQUAL
CQUAL Postscript
Flow-sensitive CQUAL
  – Initial performance was not good
Field level data flow
  – Extensions at UC Berkeley
We switched to a new tool (JaBA)
  – Interprocedural control flow
  – Intraprocedural control flow (flow-sensitive)
  – Context-sensitive
  – Variable and field-level data flow
  – Replicated analyses of Examples 1 and 3 while preventing false positives of Example 4
Meta-compilation
Compilers
  – Have program source
  – Can implement straightforward rules for source checking
  – Lack domain semantics of programs
Programmers
  – Have domain semantics of programs
  – Need a means to express these semantics such that they can be checked
Meta-compilation
Model
  – GCC abstract syntax tree
  – Compute interprocedural control flow graph
  – Compute intraprocedural control flow graph
Properties
  – Finite state automata
  – Generate extensions from specification
Computation
  – FSA state transitions are represented by patterns
  – Find syntactic patterns in code
  – Build intraprocedural paths with relevant state changes
  – For each path, compute resultant state transitions
Properties: Meta Language (metal)
{ #include "linux-includes.h" }
sm check_interrupts {
    // Variables used in patterns
    decl { unsigned } flags;
    // Patterns to specify enable/disable fns
    pat enable = { sti(); }
               | { restore_flags(flags); } ;
    pat disable = { cli(); } ;
    // States: the first state is the implicit initial state
    is_enabled: disable ==> is_disabled
              | enable ==> { err("double enable"); } ;
    is_disabled: enable ==> is_enabled
               | disable ==> { err("double disable"); }
               | $end_of_path$ ==> { err("exiting w/ intr disabled"); } ;
}
[Figure: interrupt FSA. Labels: disable, enable, End Op, double_enable, double_disable, Exit w/ disabled]
Example 2 Processing
get_free_buffer(struct stripe_head *sh, …) {
    struct buffer_head *bh;
    unsigned long flags;

    save_flags(flags);
    cli();
    if ((bh = sh->buffer_pool) == NULL)
        return NULL;
    sh->buffer_pool = bh->b_next;
    bh->b_size = b_size;
    restore_flags(flags);
    return bh;
}
[Figure: path annotations. Labels: disable; enable, end of path; end of path err]
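The two paths through get_free_buffer can be replayed by hand against the check_interrupts state machine. The sketch below does this in plain C rather than metal. The names `check_path`, `event`, and `verdict` are hypothetical, and the transition from is_disabled back to is_enabled on an enable pattern is an assumption taken from the FSA figure.

```c
#include <assert.h>
#include <stddef.h>

enum state   { IS_ENABLED, IS_DISABLED };
enum event   { EV_DISABLE, EV_ENABLE };   /* cli() / sti(), restore_flags() */
enum verdict { PATH_OK, DOUBLE_ENABLE, DOUBLE_DISABLE, EXIT_DISABLED };

/* Replay the pattern matches found along one path and report the
 * first rule violation, or EXIT_DISABLED if the path ends with
 * interrupts still off. */
static enum verdict check_path(const enum event *ev, size_t n)
{
    enum state s = IS_ENABLED;            /* implicit initial state */

    assert(ev != NULL || n == 0);
    for (size_t k = 0; k < n; k++) {
        if (ev[k] == EV_DISABLE) {
            if (s == IS_DISABLED)
                return DOUBLE_DISABLE;
            s = IS_DISABLED;
        } else {
            if (s == IS_ENABLED)
                return DOUBLE_ENABLE;
            s = IS_ENABLED;
        }
    }
    return s == IS_DISABLED ? EXIT_DISABLED : PATH_OK;
}
```

The path that reaches restore_flags() yields the events disable, enable and passes. The early `return NULL` path yields only disable and ends in the disabled state, which is the bug the example is meant to expose. save_flags() matches no pattern, so it contributes no event.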
Meta-Compilation System
Compile metal state machine (SM) with mcc
Dynamically link SM into xg++
  – Compile-time, command-line flag
The SM is “pushed down” “both paths”
  – Paths are built and checked against SM
All paths vs. one pass (flow-sensitive vs. flow-insensitive)
  – Prune paths that reach a join in the same state
  – Fixed point: loop until all possible paths are reached
Prune Paths
[Figure labels: disable, enable]
Choice of paths does not matter, so only one needs to be kept
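One way to picture the pruning rule is a depth-first walk that remembers every (block, state) pair it has already expanded. This is a minimal sketch under stated assumptions, not the xg++ algorithm: the CFG encoding (`struct block` with at most two successors), the names `explore` and `check_cfg`, and the fixed `MAX_BLOCKS` bound are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 16

enum state  { IS_ENABLED, IS_DISABLED, N_STATES };
enum effect { NONE, DISABLE, ENABLE };

struct block {
    enum effect effect;   /* pattern matched in this block, if any */
    int succ[2];          /* successor block indices, -1 if absent */
};

struct result { int visits; int errors; };

static void explore(const struct block *cfg, int b, enum state s,
                    bool seen[][N_STATES], struct result *r)
{
    if (seen[b][s])
        return;           /* same block reached in same state: prune */
    seen[b][s] = true;
    r->visits++;

    if (cfg[b].effect == DISABLE) {
        if (s == IS_DISABLED)
            r->errors++;  /* double disable */
        s = IS_DISABLED;
    } else if (cfg[b].effect == ENABLE) {
        if (s == IS_ENABLED)
            r->errors++;  /* double enable */
        s = IS_ENABLED;
    }

    if (cfg[b].succ[0] < 0 && cfg[b].succ[1] < 0) {
        if (s == IS_DISABLED)
            r->errors++;  /* end of path with interrupts disabled */
        return;
    }
    for (int k = 0; k < 2; k++)
        if (cfg[b].succ[k] >= 0)
            explore(cfg, cfg[b].succ[k], s, seen, r);
}

/* Check a CFG whose entry is block 0, starting with interrupts enabled. */
static struct result check_cfg(const struct block *cfg, int n_blocks)
{
    bool seen[MAX_BLOCKS][N_STATES] = {{false}};
    struct result r = {0, 0};

    assert(n_blocks > 0 && n_blocks <= MAX_BLOCKS);
    explore(cfg, 0, IS_ENABLED, seen, &r);
    return r;
}
```

On a diamond whose two branches leave the state untouched, the join block is expanded once rather than twice, because the second arrival finds the same (block, state) pair already recorded. Because the cache is finite, the walk also terminates on loops, which is the fixed-point behavior the previous slide mentions.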
Assertion Checking – Side Effects
{ #include "linux-includes.h" }
sm Assert flow-insensitive {
    // Match expressions
    decl { any } expr, x, y, z;
    decl { any_call } any_fcall;
    decl { any_args } args;
    // States: find asserts and detect side effects
    start: { assert(expr); } ==> { mgk_expr_recurse(expr, in_assert); } ;
    in_assert: { any_fcall(args) } ==> { err("fn call"); }
             | { x = y } ==> { err("assignment"); }
             | { z++ } ==> { err("post-increment"); }
             | { z-- } ==> { err("post-decrement"); } ;
}
xgcc Extension (PLDI 2002)
Match patterns to statements
  – Identify state transitions
Compute intraprocedural paths
  – Prune those that cannot matter (no state changes)
Combine intraprocedural paths into complete paths
  – Analysis instance based on a transition from a start state
  – Paths are generated for each instance
  – Assignments result in creating a new instance that is a copy
Checking memory management
[Figure: pointer-state FSA. Labels: unknown, allocation, null, not-null, conditional check on ptr implying null, conditional check on ptr implying not null, freed, free, overwrite, stop, dereference, end path]
Checking memory management
Intraprocedural control flow
  – Distinguish between paths with null and non-null pointers
Interprocedural control flow
  – “Global analysis” done in PLDI by combining intraprocedural paths
Data flow
  – None, pure syntactic comparison
  – Assignment does result in replication of state machine for assigned variable
Finds bugs, but does not guarantee absence
  – No tracking of assignment to a structure field
  – No aliases
False positives
  – Syntactic path-sensitivity keeps them moderate
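A per-pointer state machine of this kind might be sketched as follows. The state names come from the figure labels, but the exact transitions are assumptions of this sketch, since the figure's edges are not available: dereferencing anything other than a not-null pointer is flagged, freeing a freed pointer is flagged, and an overwrite stops tracking. The names `pstate`, `pevent`, `step`, and `count_errors` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum pstate { UNKNOWN, IS_NULL, NOT_NULL, FREED, STOP };
enum pevent { ALLOC, CHECK_NULL, CHECK_NOT_NULL, DEREF, FREE, OVERWRITE };

/* Advance one pointer's state. *error is set to 1 when the event is
 * a bug in the current state (assumed rules, see the lead-in). */
static enum pstate step(enum pstate s, enum pevent e, int *error)
{
    *error = 0;
    if (e == ALLOC)
        return UNKNOWN;             /* fresh pointer, may be NULL */
    if (s == STOP)
        return STOP;                /* untracked: ignore other events */
    switch (e) {
    case CHECK_NULL:
        return s == UNKNOWN ? IS_NULL : s;
    case CHECK_NOT_NULL:
        return s == UNKNOWN ? NOT_NULL : s;
    case OVERWRITE:
        return STOP;                /* old value is gone */
    case DEREF:
        *error = (s != NOT_NULL);   /* unknown, null, or freed */
        return s;
    case FREE:
        *error = (s == FREED);      /* double free */
        return FREED;
    default:
        return s;
    }
}

/* Replay the events seen on one path for one pointer. */
static int count_errors(const enum pevent *ev, size_t n)
{
    enum pstate s = STOP;           /* not tracked until allocated */
    int errors = 0;

    assert(ev != NULL || n == 0);
    for (size_t k = 0; k < n; k++) {
        int err;
        s = step(s, ev[k], &err);
        errors += err;
    }
    return errors;
}
```

Because the machine is keyed on one syntactic pointer, a free through an alias or a structure field would go unseen, which matches the limitations listed on this slide.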
Other Example Analyses
Example 3 (check fcntl and set_fowner)
  – If we know the required authorizations for each operation, we can define the states of these ops
  – Don’t know this (tedious to specify)
  – We use a consistency analysis (ACM TISSEC, May 2004)
Example 4 (distinguish between dentry->inode and inode)
  – Specify that { inode = dentry->inode } links inode state with dentry state
  – Note that this is not computed from first principles, so manual effort is required to ensure it is correct
xgcc Postscript
Lots of papers on finding bugs using these techniques
  – Lots of simple errors in code
Other aspects
  – Automating annotation
  – Statistical analysis
Coverity, Inc.
GCC Architecture
Compilers for C, C++, Java
Consists of a sequence of compilation steps, all of which can be hooked (3.0 and greater)
Eventually has a single representation for all languages (GIMPLE)
Then converts to Register Transfer Language (RTL), at which point all typing is lost
MOPS
Aim to provide a ‘sound’ analysis architecture
  – That is, no false negatives for their model
Program model
  – Pushdown automaton of program
Property model
  – Finite state automaton of security property
  – Temporal properties
Like xgcc, there is no real data flow analysis
Unlike xgcc, language for properties is not defined
Formal Basis
FSA M accepts a language B of security property violations
  – All operation sequences accepted by M violate the security property
PDA P accepts all feasible program traces T
  – Traces are interprocedural combinations of intraprocedural control flow paths
  – Note that traces are a control flow representation
Problem: Decide if any trace violates the security property
  – Ask whether T ∩ B = ∅
  – Represented by L(M) ∩ L(P) = ∅
  – Intersection of a PDA and an FSA can be computed efficiently
  – Note that T ⊆ L(P), so some infeasible traces are in L(P)
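The step from the computed check to the property that is actually wanted can be spelled out, taking B = L(M) as the first bullet states and T ⊆ L(P) as the last:

```latex
\begin{align*}
B &= L(M), \qquad T \subseteq L(P) \\
T \cap B &\subseteq L(P) \cap L(M) \\
L(M) \cap L(P) = \emptyset \;&\Longrightarrow\; T \cap B = \emptyset
\end{align*}
```

So an empty intersection proves that no feasible trace violates the property, which is the "no false negatives" claim. The converse does not hold: a non-empty intersection may contain only infeasible traces from L(P) \ T, which would show up as false positives.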
Example 2
[Figure: interrupt FSA. Labels: disable, enable, End Op, double_enable, double_disable, Exit w/ disabled]
get_free_buffer(struct stripe_head *sh, …) {
    struct buffer_head *bh;
    unsigned long flags;

    save_flags(flags);
    cli();
    if ((bh = sh->buffer_pool) == NULL)
        return NULL;
    sh->buffer_pool = bh->b_next;
    bh->b_size = b_size;
    restore_flags(flags);
    return bh;
}
Example 1
[Figure: mediation FSA for filp. Labels: assign, check, use, unassigned use, unmediated, zero, free]
sys_fcntl(int fd, ...) {
    struct file *filp;
    ...
    filp = fget(fd);
    ...
    err = security_ops->file_ops->fcntl(filp, ...);
    ...
    do_fcntl(fd, ...);
    ...
}
do_fcntl(int fd, ...) {
    ...
    err = fcntl_setlk(fd, ...);
    ...
}
fcntl_setlk(int fd, ...) {
    ...
    struct file *filp;
    ...
    filp = fget(fd); /* is this the one checked in sys_fcntl? */
    ...
}
MOPS Distinguishing Features
Modularity
  – Can create a hierarchy of FSAs
  – Haven’t seen this used…
Pattern variables
  – “bound to any expression that satisfies context constraints”
  – Difference from xgcc patterns?
Modeling
  – PDA and FSA are combined into a composite PDA that accepts L(M) ∩ L(P)
  – Can determine all the FSA states that an instruction can be executed in
Modeling OS for MOPS
Find all kernel variables that affect security
  – Done manually
Determine the states in the FSA for each
  – Done manually
Determine transitions between states
  – Transition in FSA
  – Automated state space explorer
  – Execute all paths and create transitions automatically
Setuid
Variable euid determines privilege
euid can be modified by several functions:
  – setuid, seteuid, setreuid, setresuid
Value of euid depends on the values of other variables on input to these system calls
  – ruid, suid
  – cap_effective, cap_permitted
  – Are found manually
Transitions indicate system calls that lead to changes in variables
Impact of Soundness
Control flow sound
  – Combination of PDA and FSA is sound
  – Context-sensitive
  – Different than xgcc?
Construction of FSA has manual steps
  – Identification of variables
  – Identification of system calls that impact variables
  – Could these be automated? Data flow…
  – FSA states are defined manually
  – Support for finding transitions automatically once we know the system calls that matter (different than xgcc?)
Construction of PDA is automated
  – Different from xgcc?
Dataflow
Find variables
  – Manually determine syntactic matches
      Definitions
Dependencies/States
  – Manually determine syntactic dependencies
      Assignments
      Parameters
      Structure members
      Operations that change state
Values associated with variables
  – Assume different for each variable; same for struct
      fget(fd) returns a different fd
      dentry->inode is same for dentry
Ignore aliases
  – Could detect in thread; not usually there
Multiple possible assignments
Classification of Analysis Tools
Specialized checkers (syntactic bugs)
  – Lint, ITS4, JTest
Annotation checkers (buffer overflows, parse errors)
  – LCLint, CQual
Automata checkers (temporal bugs, no data flow)
  – xgcc/metal, MOPS
Control and data flow for custom analyses (temporal ++)
  – PREfix (C/C++), JaBA (Java)
Predicate refinement (driver, small program, verification)
  – SLAM, MAGIC
More Analysis
Runtime Analysis
  – Consistency, buffer overflows
Policy Analysis
  – SELinux
Intrusion Detection
  – Represent feasible paths through a program