sas guide revf10

Upload: rednri

Post on 02-Jun-2018


  • 8/11/2019 SAS Guide Revf10


    USING SAS AT JAMES MADISON UNIVERSITY:

    A SHORT GUIDE for SAS on Windows

    Joanne M. Doyle

    (Updated 1/2001 by William C. Wood)

    (Updated Sep. 2010 by Joanne M. Doyle)

    Introduction

    SAS is a statistical software package used extensively in many statistical fields, including econometrics. JMU recently upgraded SAS to version 9.2. It can be found in most major labs. You can find a list of these labs at http://www.jmu.edu/computing/labs/ by clicking on "software title".

    Learning to use SAS involves learning the syntax of the program, that is, the rules of creating and executing a program, as well as learning how to use the software in the Windows environment. SAS can operate in either a batch mode or in an interactive mode of Windows applications. This guide will focus on batch mode and the basics of writing and executing a program of commands.

    In batch mode, SAS executes your instructions line by line from a command file. You then examine the resulting output and make any necessary changes. This approach is not as easy to use as interactive software, but it conserves computing resources to apply raw processing power to the statistical task at hand.

    The two basic steps for all SAS analyses are 1) writing the program and 2) executing the program.

    I. SAS for Windows: BASICS

    From the Start menu, go to Programs, and find SAS 9.1 (English). It may be located in the JMU Apps part of the Start menu. This will launch the program. As it comes up you will find several windows on the screen, each with a certain function.

    1) The Programming windows

    The windows that are used for SAS programming are the Program Editor, Log, and Output windows.

    a) Editor Window: allows you to write, edit and submit SAS programs. A SAS program consists of a list of commands telling SAS where to find the data that you want to analyze and what analysis you want to do on the data.

    b) Log Window: displays messages from the SAS System. This is where you will find error messages telling you that SAS ran into an error in your program and can't proceed.

    c) Output Window: displays the output of your program.


    d) Results Window: helps you navigate the information in the Output Window. Keep in mind that it contains nothing that isn't already in the Output Window; therefore, we won't be using it.

    e) Explorer Window: also a navigation tool that we can ignore for now.

    It is possible to have all of these windows, or a subset of them, open at one time. In fact, when you launch the program, you will have the LOG and the EDITOR windows open, as well as the Explorer window.

    [Screenshot: the SAS session at startup, with the Explorer, LOG and EDITOR windows open]

    Once you run some procedures, SAS will open up the OUTPUT and RESULTS windows.

    II. Overview

    You will create a program file in the Editor window. The program file contains the SAS commands to carry out statistical analyses. For example, you can give a command that calculates the mean, standard deviation and other sample statistics for a list of variables. Before you start writing code, it is suggested that you go to the FILE menu and choose SAVE. Give the program a filename and the file extension .sas. These program files are really


    2) PROC statements. (PROC stands for PROCEDURE)

    Organize your data, such as sorting and listing contents of the data set

    Analyze your data, such as estimating descriptive statistics and estimating a least squares ...


    Y X; RUN;

    In each of the three examples above, SAS considers there to be three (3) lines of code, because each semicolon ends one statement. However, the following code will not work. The semicolons that should end the first two statements are missing, so it is like a run-on sentence that makes no sense to SAS:

    PROC CORR VAR X Y RUN;

    SAS is also rather picky about the ordering of the commands. All commands that read in the DATA and create new variables must precede any of the PROC commands.
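    The two rules above can be seen together in a minimal sketch. The data set name demo and its four rows of values are made up for illustration and are not part of the housing example. Each statement ends with a semicolon, and the DATA step that reads the data comes before the PROC step that uses it.

    ```sas
    /* Hypothetical data: the name DEMO and the values are illustrative only */
    DATA demo;
       INPUT X Y;             /* two numeric variables, separated by spaces */
       DATALINES;
    1 2
    2 4
    3 5
    4 8
    ;
    PROC CORR DATA=demo;      /* statement 1 */
       VAR X Y;               /* statement 2 */
    RUN;                      /* statement 3 */
    ```

    Written this way, SAS sees three separate statements in the PROC CORR step, which is what the run-on version fails to give it.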

    Let's look at a sample program, one set up to analyze the housing price data in Table 4.1 of Ramanathan's Introductory Econometrics. The data are in a file named HOUSE.txt; a printout of this file appears below.

    [Printout of HOUSE.TXT. Caption: This is the data file named HOUSE.TXT. Notice the variable names in the first row and the data values that follow. We need to write code to tell SAS where this file is, and what is in it.]

    This text data file was created in Excel and saved as a .TXT file so that the values in each row are separated by tabs. This is important information that SAS needs to know when reading the data file.

    Here is what the SAS command file looks like:

    OPTIONS LINESIZE=78;
    DATA whatever;
       INFILE 'J:\HOUSE.TXT' DELIMITER='09'x firstobs=2;
       INPUT PRICE SQFT BED BATH;
       NEWPRICE = PRICE/1000;

    PROC REG;
       TITLE 'Housing Regression Using Square Feet';
       MODEL PRICE = SQFT;
    RUN;

    PROC REG;
       TITLE 'Housing Regression with New Price Variable';
       MODEL NEWPRICE = SQFT;
    RUN;

    What this code does:

    1) The first line tells SAS to make the output 78 columns wide, so it can easily be read on screen or printed to a printer.

    2) The second line names the data set "whatever". When you name your data set, do not use spaces: "what ever" is bad, "whatever" is good.

    3) Look at the line that starts with "INFILE". It tells SAS where to get the data. In this case it's in a file called 'HOUSE.TXT' that is located on the J:\ drive. The DELIMITER option on the same line tells SAS that the values are separated by tabs ('09'x is the hexadecimal code for the tab character). The first line of the file HOUSE.TXT contains variable names; the actual numerical values start on the second line. Therefore, we tell SAS to start reading the data values in row 2 (firstobs=2).

    4) The next line of the program starts with INPUT. With this statement, you are telling SAS which names to assign to the columns of data. SAS will not get the variable names from the first row of the datafile. Variable names should be short (eight characters or fewer) and memorable, and should not contain any spaces or punctuation.

    5) The next line generates a variable called NEWPRICE, which is equal to PRICE divided by 1000.


    Note that the construction of NEWPRICE (or any other new variable) must appear before any of the PROC commands.

    III. CREATING A SAS PROGRAM

    A. DATA SETS

    METHOD 1: INFILE and INPUT for a text data file.

    The easiest method for reading in data is to have SAS read it from a simple text file where the values in each row are separated by spaces.

    If you create your data in Excel and save it as a text file, Excel will separate the values in a row with tabs, not spaces. You must tell SAS that the values are delimited by tabs. You must also tell SAS the variable names and their proper order. The following INFILE and INPUT statements will work:

    infile 'j:\filename.txt' DELIMITER='09'x firstobs=2;
    input var1 $ var2 var3 var4;

    In this example, variable "var1" is a character variable, so we follow its variable name with a $ sign to tell SAS to expect letters, not numbers. NOTE: the values for this character variable cannot have any spaces. For example, suppose var1 is STATE, for state name. Then the value for "New Jersey" should be NewJersey. For your numerical variables, the values should contain no commas, no dollar signs, and no percentage signs; otherwise SAS won't read the data correctly. Even innocent-looking numbers must be changed: a value typed with a comma as a thousands separator, for example, has to be retyped without the comma before SAS can read it correctly. The best way to do this in Excel is to select the data, then choose Format Cells and apply the General format to all numbers that will be used by SAS.
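    If cleaning the values in Excel is not practical, SAS can also strip these characters while it reads the data. The sketch below is an illustration under stated assumptions, not part of the original housing example: the data set name, variable names, and values are hypothetical, and the values are typed inline and separated by spaces. The COMMA informat removes embedded commas, dollar signs, and percent signs from a numeric field.

    ```sas
    /* Hypothetical data: names and values are illustrative only */
    DATA clean;
       INPUT STATE $ INCOME :COMMA12. RATE :COMMA12.;
       DATALINES;
    Virginia $44,180 4.5%
    Ohio 39,250 3.9%
    ;
    PROC PRINT DATA=clean;    /* INCOME and RATE print as plain numbers */
    RUN;
    ```

    Note that the percent sign is only removed; the value is not divided by 100.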

    Here is an example of a character variable in the first column. Notice that the character values cannot have any spaces in them. The underscore was inserted to remove all spaces.

    country        EDUC        GDP         POP
    Dominican_R    122.7964    10350       7684
    Oman           397.9194    12918.86    2116
    Guatemala      194.2203    12983.2     10322
    Slovakia       535.2473    13746.29    5325
    Slovenia       719.8354    14385.53    1925
    Tunisia        854.0925    15625.74    8820
    Uruguay        392.0578    16250.27    3168
    Ecuador        519.3915    16605.59    11221
    Yemen          1213.739    22380.43    14329
    Morocco        1452.461    30350.97    26025
    U_Arab_Emir    700.3541    36665.76    2157

    infile 'j:\country_example.txt' DELIMITER='09'x firstobs=2;
    input country $ ED GDP POP;


    Question: after reading in this data set, will SAS think the variable for education is named EDUC or ED?

    Answer: its name is ED. With the INFILE, INPUT method of reading in data, SAS gets the variable names from the INPUT line. We tell SAS to ignore the first line of the datafile by using the firstobs=2 command.
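    One way to check the answer is to ask SAS which names it stored. The sketch below makes some assumptions for the sake of a self-contained example: the first three rows of the country table are typed inline with DATALINES (so they are separated by spaces and no DELIMITER= or firstobs= option is needed), the data set name countries is made up, and a LENGTH statement is added because SAS would otherwise keep only the first eight characters of a name such as Dominican_R.

    ```sas
    /* Sketch only: inline copy of the first three rows of the country table */
    DATA countries;
       LENGTH country $ 12;            /* assumption: room for long names */
       INPUT country $ ED GDP POP;     /* variable names come from this line */
       DATALINES;
    Dominican_R 122.7964 10350 7684
    Oman 397.9194 12918.86 2116
    Guatemala 194.2203 12983.2 10322
    ;
    PROC CONTENTS DATA=countries;      /* lists the variable names and types */
    RUN;
    PROC PRINT DATA=countries;         /* shows the values as SAS read them */
    RUN;
    ```

    PROC CONTENTS should list the education variable as ED, the name given on the INPUT line, not EDUC.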

    If you would rather have SAS get the variable names from the file, then use PROC IMPORT as described below.

    METHOD 2: PROC IMPORT

    You can leave your data in an Excel file (with filename extension .xls or .xlsx) and import it into SAS with the following commands, which use an Excel version of the housing data, house.xlsx.

    NOTE: Given new security procedures in JMU labs, sometimes SAS won't read in a file because of its location or for some other reason. ALWAYS USE A USB drive for your data files. If you continue to have trouble, you can save the file as a .CSV file using SAVE AS in Excel.

    OPTIONS LINESIZE=78;
    PROC IMPORT DATAFILE="J:\house.xlsx" OUT=mydat replace;
    RUN;

    PROC CONTENTS;
    RUN;

    DATA mydat2;
       SET mydat;
       NEWPRICE = PRICE/1000;

    PROC REG;
       TITLE 'Model Using New Price Variable';
       MODEL NEWPRICE = SQFT;
    RUN;

    Notice the use of double quotes ...
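    If SAS has trouble with the Excel file and you fall back on a .CSV copy, as the NOTE above suggests, the import step might look like the following sketch. The file name J:\house.csv is hypothetical; it assumes the workbook was saved from Excel with the variable names in the first row.

    ```sas
    /* Sketch only: J:\house.csv is a hypothetical file saved from Excel */
    PROC IMPORT DATAFILE="J:\house.csv" OUT=mydat DBMS=CSV REPLACE;
       GETNAMES=YES;             /* take the variable names from the first row */
    RUN;

    PROC CONTENTS DATA=mydat;    /* confirm the names and types SAS assigned */
    RUN;
    ```

    Running PROC CONTENTS right after the import is a quick way to confirm that the variable names came through as expected before any new variables are built from them.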


    Above you have seen parts of a sample SAS program. In this section you will create an entire SAS program.

    Enter the SAS program (if you are not already in SAS) by going to the Start Menu in Windows, Programs (or JMU Apps), and locating SAS 9.1 (English). You want to get into the Editor Window. When launching SAS you will get an empty Editor window named "Editor - Untitled1". If you ever lose this, you can get back to it by clicking on the Editor button on the bottom bar, or by going to the View Menu and choosing "Enhanced Editor". You can start entering your program in the editor. Once you are finished, save it by going to the File menu and choosing Save. You will be prompted for a file name and SAS will automatically give it a file extension of .sas.

    Note: There are actually two editors in SAS: one titled "Program Editor" and the other "Enhanced Editor". When you launch SAS, it automatically gives you the Enhanced Editor in a window. You can find the Program Editor from the View Menu. Basically, the Program Editor is an older version of the Enhanced Editor. The Enhanced Editor is better because it is "enhanced": it is designed to assist you in writing programs by using color codes that help you know where command lines start and stop (with a semicolon).

    OPTIONS LINESIZE=78;
    DATA HOUSE;
       INFILE 'J:\house.txt' DELIMITER='09'x firstobs=2;
       INPUT PRICE SQFT BEDRMS BATHS;
    PROC SORT;
       BY PRICE;
    RUN;
    PROC PRINT;
       TITLE 'TABLE 4.1 HOUSE PRICES';
    RUN;
    PROC MEANS;
       VAR PRICE SQFT BEDRMS BATHS;
    RUN;
    PROC REG;
       TITLE 'Housing Regression Equation';
       MODEL PRICE = SQFT;
    RUN;
    PROC REG;
       TITLE 'Multiple Regression Housing Equation';
       MODEL PRICE = SQFT BEDRMS BATHS;
    RUN;

    IV. PROGRAM EXECUTION

    So far, you haven't actually computed any statistics or regressions. You have created a program of commands in the Editor window. Now you have to execute it by submitting it to SAS. You can submit the program in a number of ways.

    In the toolbar, there is a button on the right side showing a little "person running". It is the third button from the right along the toolbar at the top of the screen. Click this button and SAS will execute the commands in your program file. (Note: you must have the editor window active for this to work. Look at the top of the window for a bright blue bar that tells you which window is active.)

    When it is done, you will have information in the LOG and the OUTPUT windows. The LOG file is important only for finding errors in your program code. The output from the PROCedures will appear in the OUTPUT window.

    V. EXAMINING THE RESULTS

    1) Check the LOG window for errors. There will be a lot of junk in this file. Remember, it has no results in it. Scroll through the window looking for ERROR statements. If you do have an error, you won't necessarily have detailed information on what errors you have made. You will have to go back to the program in the Editor window and look for errors like misspelled words and missing semicolons.

    2) Next examine your results by clicking on the OUTPUT window.

    It is always a good idea to examine output files before you print because you may have errors in your program file that prevent SAS from carrying out the appropriate commands. Scroll through the output. (If you have errors in your program, you may not even have any results in the OUTPUT window.)

    VI. RUNNING THE PROGRAM AGAIN

    You may need to run the program again because you found an error in it, or for some other reason (suppose you hit the SUBMIT icon, the little man running, over and over). Each time you submit the program, SAS adds more information to the LOG and OUTPUT files, appending it to the bottom of these files. So, your OUTPUT and LOG windows can get clogged up. Before you resubmit your program for execution, first open the LOG window, go to the EDIT menu and choose Clear All. This will completely empty this window, making it ready to receive information from a new run. Then open the OUTPUT window, go to the EDIT menu and choose Clear All. You can now go to the Editor window that contains your program and give the submit command.
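    The same clearing can also be done from inside the program. This is a sketch under the assumption that you are running SAS in its usual windowing environment on Windows: the DM statement passes commands to the SAS windows, here clearing the LOG and OUTPUT windows before the rest of the program runs.

    ```sas
    /* Clear the LOG and OUTPUT windows, in place of EDIT > Clear All */
    DM 'LOG; CLEAR; OUTPUT; CLEAR;';
    ```

    Placed as the first line of the program file, it should have the same effect as the menu steps each time you submit.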

    VII. PRINTING YOUR RESULTS

    1) COPY OUTPUT INTO WORD: Click on the OUTPUT window, copy and paste what you want to print into WORD and print from WORD. You can reduce the amount of paper you use by removing excessive carriage returns in WORD. For graphs, click on the graph in the graph window, copy and paste into WORD. You can then shrink the size of the graph window.

    2) NEW! HTML output. If you want, you can set up SAS to dump all of the output into HTML formatted tables.

    Go to the TOOLS menu, choose OPTIONS, and from there choose PREFERENCES. In this dialog box, click on the RESULTS tab and choose Create HTML. Style is your choice, but Festival works well from a size and color perspective.

    [Screenshot: the Preferences dialog box, RESULTS tab, with Create HTML selected]


    With this setup, running your program will create another window, called the Results Viewer. From here you can print out the tables as you see them. Alternatively, you can right click on one of these tables in the Results Viewer and choose EXPORT TO EXCEL to get the output dumped into Excel.
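    The menu setting has a counterpart in code. The sketch below assumes a made-up data set demo and a hypothetical output file J:\results.html. The ODS HTML statement sends the output of the procedures that follow it to an HTML file in the chosen style, until ODS HTML CLOSE is reached.

    ```sas
    /* Hypothetical data and file name, for illustration only */
    DATA demo;
       INPUT X Y;
       DATALINES;
    1 2
    2 4
    3 5
    4 8
    ;
    ODS HTML FILE='J:\results.html' STYLE=Festival;   /* start HTML output */
    PROC MEANS DATA=demo;
       VAR X Y;
    RUN;
    ODS HTML CLOSE;                                   /* finish the file */
    ```

    The resulting file can be opened in a browser or in Excel, which is another route to the formatted tables described above.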

    VIII. EXTENSIONS OF BASIC PROCEDURES

    SAS can perform many operations other than basic regression analysis. Its "extensibility" is considered one of its major virtues in commercial applications. We will be using a few extensions of basic regression procedures. Here are the most important ones, with the command lines used to invoke them:

    1. Creating Plots (PROC GPLOT):

    When a graph is created with PROC GPLOT, a new window opens up containing the graph. Any of these graphs can be copied and pasted into WORD: right click on the graph in the graph window and choose Edit, Copy. Then go into WORD and do a PASTE command.

    The following code will construct an XY scatterplot of dots, including a legend. The c= option in the SYMBOL1 statement sets the color of the dots.

    symbol1 i=none v=dot c=black;
    proc gplot;
       label totsat="avg sat (math verbal)"
             exppp="expenditures per pupil";
       plot totsat*exppp=state;
    run;

    To plot the XY values and plot the least squares ...


    2. To conduct an ordinary least squares regression forcing the constant term to zero so that the equation has no intercept:

    PROC REG;
       MODEL YVAR = XVAR / NOINT;


    PROC REG;
       MODEL YVAR = XVAR;
       OUTPUT OUT=STUFF RESIDUAL=E;
    DATA TWO;
       MERGE ONE STUFF;
       BY YEAR;
       E2 = E**2;   /* this creates a variable of squared residuals */

    Then include any statements you want using E as a variable, where E is the residual for each observation.
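    The fragment above refers to a data set ONE that is not shown. A self-contained sketch of the same steps follows, assuming a hypothetical data set ONE with made-up values for YEAR, XVAR and YVAR, already in YEAR order so that the BY statement works without a PROC SORT.

    ```sas
    /* Hypothetical data: ONE and its values are illustrative only */
    DATA one;
       INPUT YEAR XVAR YVAR;
       DATALINES;
    2001 1 2.1
    2002 2 3.9
    2003 3 6.2
    2004 4 7.8
    2005 5 10.1
    ;
    PROC REG DATA=one;
       MODEL YVAR = XVAR;
       OUTPUT OUT=stuff RESIDUAL=E;   /* STUFF holds the data plus residual E */
    RUN;
    QUIT;
    DATA two;
       MERGE one stuff;               /* both data sets are in YEAR order */
       BY YEAR;
       E2 = E**2;                     /* squared residual for each observation */
    RUN;
    PROC PRINT DATA=two;
       VAR YEAR YVAR XVAR E E2;
    RUN;
    ```

    The PROC PRINT step is one example of a statement that uses E and E2 as ordinary variables once the DATA step has created them.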

    To conduct a standard t-test on differences of means:

    (Note: This test involves looking at rents paid by minority and nonminority apartment dwellers in a given city. PROC TTEST invokes the T-test procedure. It divides the sample into classes by minority status (the CLASS MINORITY statement) and it specifies that rent is the variable of interest (VAR RENT statement).)

    OPTIONS LINESIZE=78;
    DATA RENTSET;
       INFILE 'c:\my documents\classes\ec