Slide C.1
SAS Mathematical Marketing
Appendix C: SAS Software
Uses of SAS
CRM, data mining, data warehousing, linear programming, forecasting, econometrics, nonlinear parameter estimation, simulation, marketing models, statistical analysis
Data Types SAS Can Deal With
Web log data, questionnaires, scanner data, panel data, relational databases
Ideal When You Are …
massaging, sorting, merging, lookups, reporting, transforming, manipulating
Slide C.2
Two Types of SAS Routines
DATA Steps
• Read and Write Data
• Create a SAS dataset
• Manipulate and Transform Data
• Open-Ended - Procedural Language
• Presence of INPUT statement creates a Loop
PROC Steps
• Analyze Data
• Canned or Preprogrammed Input and Output
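The implicit loop created by the INPUT statement can be seen in a minimal sketch like the one below. The dataset name loop_demo and the variable x are made up for illustration. The automatic variable _N_ counts the iterations of the DATA step, so the log should show one line per data record.

```sas
/* Sketch only: loop_demo and x are hypothetical names. */
data loop_demo ;
  input x ;          /* executed once for every record that follows CARDS */
  put _n_= x= ;      /* writes the iteration counter and the value to the log */
  cards ;
10
20
30
;
run ;
```

No explicit DO loop is written; the DATA step repeats its statements until the INPUT statement runs out of records.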
Slide C.3
A Simple Example
data my_study ;
input id gender $ green recycle ;
cards ;
001 m 4 2
002 m 3 1
003 f 3 2
••• ••• ••• •••
;
proc glm data=my_study ; /* PROC GLM accepts a CLASS statement; PROC REG does not */
class gender ;
model recycle = green gender ;
run ;
Slide C.4
The Sequence Depends on the Need
data step to read in scanner data;
data step to read in panel data ;
data step to merge scanner and panel records ;
data step to change the level of analysis to the household ;
proc step to create covariance matrix ;
data step to write covariance matrix in LISREL compatible format ;
Slide C.5
The INPUT Statement - Character Data
List input
$ after a variable - character var
input last_name $ first_name $ initial $ ;
Formatted input
$w. after a variable
input last_name $22. first_name $22. initial $1. ;
Column input
$ start-column - end-column
input last_name $ 1 - 22 first_name $ 23 - 44 initial $ 45 ;
Slide C.6
The INPUT Statement - Numeric Data
List input
input score_1 score_2 score_3 ;
Formatted input
w.d (field width and number of digits after an implied decimal point) after a variable
input score_1 10. score_2 10. score_3 10. ;
Column input
start-column - end-column
input score_1 1 - 10 score_2 11 - 20 score_3 21 - 30 ;
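The implied decimal point in the w.d informat can be illustrated with a small sketch. The dataset name implied and the variable price are hypothetical. With the informat 5.2, a field containing no decimal point is divided by 10**2, while a field that already contains a decimal point is read as written.

```sas
/* Sketch only: implied and price are hypothetical names. */
data implied ;
  input @1 price 5.2 ;   /* width 5, two implied decimal places */
  put price= ;           /* echo the value that was read to the log */
  cards ;
12345
12.5
;
run ;
```

The first record should be read as 123.45 and the second as 12.5.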
Slide C.7
Grouped INPUT Statements
input (var1-var3) (10. 10. 10.) ;
input (var1-var3) (3*10.) ;
input (var1-var3) (10.) ;
input (name var1-var3) ($10. 3*5.1) ;
Slide C.8
The Column Pointer in the INPUT Statement
input @3 var1 10. ;
input more @ ;
if more then input @15 x1 x2 ;
input @12 x1 5. +3 x2 ;
Slide C.9
Documenting INPUT Statements
input @4 green1 4. /* greenness scale first item */
@9 green2 4. /* greenness scale 2nd item */
@20 aware1 5. /* awareness scale first item */
@20 aware2 5. ; /* awareness scale 2nd item */
Slide C.10
The Line Pointer
input x1 x2 x3 / x4 x5 x6 ;
input x1 x2 x3 #2 x4 x5 x6 ;
input x1 x2 x3
#2 x4 x5 x6 ;
Slide C.11
The PUT Statement
put _all_ ;
put a= b= ;
put _infile_ ;
put _page_ ;
col1 = 22 ; col2 = 14 ;
put @col1 var245 @col2 var246 ;
put x1 x2 x3 @ ; input x4 ; put x4 ;
put x1 #2 x2 ;
put x1 / x2 ;
Slide C.12
Copying Raw Data
filename in 'c:\old.data' ;
filename out 'c:\new.data' ;
data _null_ ;
infile in ;
file out ;
input ;
put _infile_ ;
Slide C.13
SAS Constants
'21Dec1981'D
'Charles F. Hofacker'
492992.1223
Slide C.14
Assignment Statement
x = a + b ;
y = x / 2. ;
prob = 1 - exp(-z**2/2) ;
Slide C.15
The SAS Array Statement
array y {20} y1-y20 ;
do i = 1 to 20 ;
y{i} = 11 - y{i} ;
end ;
Slide C.16
The Sum Statement
variable + expression ;
/* behaves much like: */
retain variable 0 ;
variable = variable + expression ;
n+1 ;
cumulated + x ;
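The two sum statements above can be tried on a few records, as in the sketch below. The dataset name running and the three data values are invented for illustration. Both n and cumulated start at 0 and keep their values from one observation to the next.

```sas
/* Sketch only: running and the data values are hypothetical. */
data running ;
  input x ;
  n + 1 ;           /* running count of observations */
  cumulated + x ;   /* running total of x */
  cards ;
5
7
3
;
proc print data=running ;
run ;
```

With these three records, cumulated should take the values 5, 12 and 15, and n the values 1, 2 and 3.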
Slide C.17
IF Statement
if a >= 45 then a = 45 ;
if 0 < age < 1 then age = 1 ;
if a = 2 or b = 3 then c = 1 ;
if a = 2 and b = 3 then c = 1 ;
if major = "FIN" ;
if major = "FIN" then do ; a = 1 ; b = 2 ; end ;
Slide C.18
More IF Statement Expressions
name ne 'smith'
name ~= 'smith'
x eq 1 or x eq 2
x=1 | x=2
a <= b | a >= c
a le b or a ge c
a1 and a2 or a3
(a1 and a2) or a3
if then etc ;
Slide C.19
Concatenating Datasets Sequentially
data both ; set first second ;
first:
id x y
1  2 3
2  1 2
3  3 1
second:
id x y
4  3 2
5  2 1
6  1 1
both:
id x y
1  2 3
2  1 2
3  3 1
4  3 2
5  2 1
6  1 1
Slide C.20
Interleaving Two Datasets
proc sort data=store1 ;
by date ;
proc sort data=store2 ;
by date ;
data both ;
set store1 store2 ;
by date ;
Slide C.21
Concatenating Datasets Horizontally
data both ; merge left right ;
left:
id y1 y2
1  2  3
2  1  2
3  3  1
right:
id x1 x2
1  3  2
2  2  1
3  1  1
both:
id y1 y2 x1 x2
1  2  3  3  2
2  1  2  2  1
3  3  1  1  1
Slide C.22
Table LookUp
proc sort data=database out=sorted ; by part ;
data both ; merge table sorted ; by part ;
table:
part desc
0011 hammer
0012 nail
0013 bow
database:
id part
1  0011
2  0011
3  0013
both:
id part desc
1  0011 hammer
2  0011 hammer
3  0013 bow
Within a BY group, the last observation of the smaller data set is repeated to match the larger one.
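A runnable sketch of the lookup, using the table and database values shown on the slide, might look like the following. One point is an assumption beyond the slide: a plain match-merge would also write a row for part 0012 with a missing id, since that part appears only in the lookup table. The IN= option with the hypothetical flag in_db keeps only parts that occur in the database, which reproduces the three rows listed under both.

```sas
/* Sketch only: in_db is a hypothetical flag name. */
data table ;
  input part $ desc $ ;
  cards ;
0011 hammer
0012 nail
0013 bow
;
data database ;
  input id part $ ;
  cards ;
1 0011
2 0011
3 0013
;
proc sort data=database out=sorted ;
  by part ;
run ;
data both ;
  merge table sorted (in=in_db) ;   /* table is already in part order */
  by part ;
  if in_db ;                        /* drop table rows with no database match */
run ;
proc print data=both ;
run ;
```

The description hammer is carried forward to both database records for part 0011 because values from the data set with fewer observations in a BY group are retained.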
Slide C.23
Update
master:
part desc
0011 hammer
0012 nail
0013 bow
transaction:
part desc
0011 jackhammer
data new_master ;
update master transaction ;
by part ;
new_master:
part desc
0011 jackhammer
0012 nail
0013 bow
Slide C.24
Changing the Level of Analysis 1
Before
Subject Time Score
A 1 A1
A 2 A2
A 3 A3
B 1 B1
B 2 B2
B 3 B3
After
Subject Score1 Score2 Score3
A A1 A2 A3
B B1 B2 B3
Slide C.25
Changing the Level of Analysis 1
data after ;
keep subject score1 score2 score3 ;
retain score1 score2 ;
set before ;
if time=1 then score1 = score ;
else if time=2 then score2 = score ;
else if time=3 then do ; score3 = score ; output ; end ;
Slide C.26
Changing the Level of Analysis 2
Before
Day Score Student
1 12 A
1 11 B
1 13 C
2 14 A
2 10 B
2 9 C
After
Day Highest Student
1 13 C
2 14 A
Slide C.27
Changing the Level of Analysis 2
FIRST. and LAST. Variable Modifiers
proc sort data=log ; by day ;
data find_highest ; retain highest ; drop score ;
set log ; by day ;
if first.day then highest = . ;
if score > highest then highest = score ;
if last.day then output ;
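One detail worth checking: the step above writes whichever student appears on the last record of each day, which for day 2 in the example data is C rather than A. A possible variation, sketched below under the assumption that the student holding the highest score is wanted, retains a second variable alongside highest. The name top_student and its length are hypothetical.

```sas
/* Sketch only: top_student and its length of 8 are assumptions. */
data log ;
  input day score student $ ;
  cards ;
1 12 A
1 11 B
1 13 C
2 14 A
2 10 B
2 9 C
;
proc sort data=log ;
  by day ;
run ;
data find_highest ;
  length top_student $ 8 ;
  retain highest top_student ;
  keep day highest top_student ;
  set log ;
  by day ;
  if first.day then do ; highest = . ; top_student = ' ' ; end ;
  if score > highest then do ;      /* a missing value sorts below any number */
    highest = score ;
    top_student = student ;
  end ;
  if last.day then output ;         /* one record per day */
run ;
```

With the six records shown, this should produce day 1 with 13 and C, and day 2 with 14 and A, matching the After table.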
Slide C.28
The KEEP and DROP Statements
keep a b f h ;
drop x1-x99 ;
data a(keep = a1 a2) b(keep = b1 b2) ;
set x ;
if blah then output a ;
else output b ;
Slide C.29
Changing the Level of Analysis 3
Spreading Out an Observation
Before
Subject Score1 Score2 Score3
A A1 A2 A3
B B1 B2 B3
After
Subject Time Score
A 1 A1
A 2 A2
A 3 A3
B 1 B1
B 2 B2
B 3 B3
Slide C.30
Changing the Level of Analysis 3 – SAS Code
data spread ;
drop score1 score2 score3 ;
set tight ;
time = 1 ; score = score1 ; output ;
time = 2 ; score = score2 ; output ;
time = 3 ; score = score3 ; output ;
Slide C.31
Use of the IN= Dataset Indicator
data new ;
set old1 (in=from_old1)
old2 (in=from_old2) ;
if from_old1 then … ;
if from_old2 then … ;
Slide C.32
Proc Summary for Aggregation
proc summary data=raw_purchases ;
by household ;
class brand ;
var x1 x2 x3 x4 x5 ;
output out=household mean=overall ;
Slide C.33
Using SAS for Simulations
data monte_carlo ;
keep y1 - y4 ;
array y{4} y1 - y4 ;
array loading{4} l1 - l4 ;
array unique{4} u1 - u4 ;
l1 = 1 ; l2 = .5 ; l3 = .5 ; l4 = .5 ;
u1 = .2 ; u2 = .2 ; u3 = .2 ; u4 = .2 ;
do subject = 1 to 100 ; /* simulation loop */
eta = rannor(1921) ;
do j = 1 to 4 ;
y{j} = eta*loading{j} + unique{j}*rannor(2917) ;
end ;
output ;
end ;
proc calis data=monte_carlo ;
etc. ;
Slide C.34
External Data Sets and Windows/Vista
filename trans 'C:\Documents\june\transactions.data' ;
libname clv 'C:\Documents\customer_projects\' ;
...
data clv.june ;
infile trans ;
input id 3. purch 2. day 3. month $ ;