sasadvancedbook2

Topic: Advanced Programming Techniques II

1. Array
2. PROC DATASETS
3. PROC COMPARE
4. PROC FORMAT
5. PROC SORT
6. Create the First. and Last. Temporary Variables
7. Using SAS DATA Step View To Conserve Data Storage Space
8. How to Control Which Variables and Observations You Want to Read and Write by Using SAS
9. Creating Integrity Constraints
10. The Efficiency of SAS Programming

1. Review of ARRAY

1.1 Definition

An array is a temporary grouping of SAS variables that are arranged in a particular order and identified by an array name. The array exists only for the duration of the current DATA step. The array name is not a variable. In SAS, an array is not a data structure; it is just a convenient way of temporarily identifying a group of variables.

1.2 Basic Concepts

Array processing is a method that enables you to perform the same tasks for a series of related variables or a group
of variables.

Array reference is a method to reference the elements of an array.

A one-dimensional array is a simple grouping of variables that, when processed, results in output that can be represented in simple row format.

A multidimensional array is a more complex grouping of variables that, when processed, results in output that could have two or more dimensions, such as columns and rows.

1.3 Why use SAS Array?

- read data
- repeat an action or set of actions on each of a group of variables
- create several related variables
- write shorter programs
- restructure a SAS data set to change the unit of observation

1.4 THE SYNTAX

The general syntax for defining an array is as follows:

    ARRAY array-name {dimension} $ length elements (initial values);

- Array-name is the name we create for the array. It must be a valid SAS name, and it should not be the same as a SAS function name. In Version 7 and beyond, the array name can be up to 32 characters in length.
- Dimension indicates the number of elements in the array. The array subscript must be enclosed within braces {}, square brackets [], or parentheses (). When the array subscript is an asterisk (*), it is not necessary to know how many elements are within the array.
- $ is included on the ARRAY statement only if the array is character, that is, if the array will be referencing new character variables. The dollar sign is not necessary if the elements in the array were previously defined as character variables.
- Length defines the length of new character variables referenced by the array, or of elements in the array that were not previously assigned a length.
- Elements defines the variables that the array will reference, either existing variables or new variables. They can be listed in any order and must be all numeric or all character. Special variable lists may be used to select all variables or all variables of a given type: _numeric_, _character_, and _all_.
- Initial values can be included to give the elements of the array initial values. This also causes these variables to be retained during the DATA step (i.e. they are not reset to missing at the start of each iteration of the DATA step).
- The DIM function returns the number of elements in an array.

Note: The ARRAY statement is not an executable statement.

1.5 Temporary Arrays

When the elements are constants needed only for the duration of the DATA step, you can omit the variables from the array and use temporary array elements instead (specify _TEMPORARY_ in place of the element list).

- temporary array elements behave like variables
- temporary array elements do not have names
- temporary array elements do not appear in the output data set
- they are automatically retained

Basic examples:

Example 1: Defining Arrays

    array rain{5} janr febr marr aprr mayr;
    array days{7} d1-d7;

Example 2: Assigning Initial Numeric Values

    array test{4} t1 t2 t3 t4 (90 80 70 70);

Example 3: Defining Initial Character Values

    array test2{*} $ a1 a2 a3 ('a','b','c');

Example 4: Defining More Advanced Arrays

    array x{5,3} score1-score15;

Example 5: Defining More Advanced Arrays (lower bound/upper bound)

    array yr{0:6} yr00-yr06;

    /* Understanding array structure */
    data arry;
      array sm{*} x1-x5;
      input x1-x5;
      do i=1 to dim(sm);   /* same as: do i=1 to 5;  or  do i=1, 2, 3, 4, 5; */
        new=sm(i)+10;
        output;
      end;
      datalines;
    1 2 3 4 5
    ;
    proc print data=arry;
    run;

    /* Using character variables in an array */
    options nodate pageno=1 linesize=80 pagesize=60;
    data arry_01;
      array names{*} $ n1-n10;
      array capitals{*} $ c1-c10;
      input names{*};
      do i=1 to 10;
        capitals{i}=upcase(names{i});
      end;
      datalines;
    smithers michaels gonzalez hurth frank bleigh rounder joseph peters sam
    ;
    proc print data=arry_01;
      title 'Names Changed from Lowercase to Uppercase';
    run;

    /* Create new variables */
    data a;
      input x1-x3;
      cards;
    10 20 30
    20 40 60
    30 60 90
    ;
    data aa;
      set a;
      array xs{3} x1-x3;
      array rate{3};        /* creates rate1-rate3 */
      do i=1 to 3;
        rate{i}=xs{i};
      end;
    run;
    proc print data=aa;
    run;

    *---------------------------------------------------------*
    | Example: Using a multi-dimensional array to restructure |
    |          a data set                                     |
    *---------------------------------------------------------*;
    DATA WT_ONE;
      INPUT ID WT1 WT2 WT3 WT4 WT5 WT6;
      DATALINES;
    01 155 158 162 149 148 147
    02 110 112 114 107 108 109
    ;
    DATA WT_MANY;
      SET WT_ONE;
      ARRAY WTS{2,3} WT1-WT6;
      DO COND = 1 TO 2;
        DO TIME = 1 TO 3;
          WEIGHT = WTS{COND, TIME};
          OUTPUT;
        END;
      END;
      DROP WT1-WT6;
    RUN;

Quiz: Based on the ARRAY statement below, select the array reference for the array element q50.

    array qs{4,20} q1-q80;

    data pop;
      input male0_9 male10_19 male20_29 male30_39 male40_49
            female0_9 female10_19 female20_29 female30_39 female40_49;
      datalines;
    85 65 70 110 205 90 70 85 105 220
    75 85 80 100 225 100 80 95 125 240
    70 55 65 105 215 105 85 90 125 210
    80 65 80 120 200 85 75 100 100 250
    ;
    proc format;
      value sexpop 1='male' 2='female';
      value agegrppop 1='0-9' 2='10-19' 3='20-29' 4='30-39' 5='40-49';
    run;
    data newpop(keep=sex agegrp size_pop);
      set pop;
      array x(2,5) _numeric_;   /* the ten numeric variables read from POP */
      do sex=1 to 2;
        do agegrp=1 to 5;
          size_pop=x(sex,agegrp);
          output;
        end;
      end;
    run;

2. PROC DATASETS

2.1 Introduction

I love the DATA step. It's powerful. It's flexible. Almost everyone needs the DATA step from time to time, but some programmers like it so much that they use it for everything: creating data sets, copying data sets, moving data sets. If you are manipulating your
data, the DATA step is the perfect tool. However, if you are performing data management tasks such as copying a data set or adding formats and labels, or you want to avoid running a DATA step, PROC DATASETS is a powerful tool.

2.2 Why do we need to use PROC DATASETS?

There are two main reasons not to rely on the DATA step for data management.

- First, there is always the chance you will make a programming mistake and destroy your data set.
- Second, using a DATA step for data management is often inefficient.

Advantages of PROC DATASETS

- If all you need is to copy or change the descriptor portion of the data set, it is usually easier and more efficient to use PROC DATASETS.
- They may not even keep SAS data sets around.
- PROC DATASETS can really help with efficiency during the execution of the program, either by avoiding a DATA step or by freeing up working storage.
- There are a number of neat things that only PROC DATASETS does, such as repairing and modification.

PROC DATASETS is a data management procedure that allows you to do these tasks without completely rewriting the data set. The procedure is a utility that allows you to efficiently manage your SAS files.

    PROC DATASETS ;
      Age/Append/Audit/Change/Copy/Exclude/Select/Delete/Modify/Format/IC Create/IC Delete/IC Reactivate/Index Create/Index Delete/Informat/Label/Rename/Repair/Save

Examples

Age renames a group of related SAS files in a library.

    libname ee 'c:';
    data ee.current;
      /* @@ holds the line so that several observations are read from each data line */
      input brkdndt : date7. vehicle $3. @@;
      cards;
    2mar94 AAA 20may94 AAA 19jun94 AAA 29nov94 AAA
    4jul94 BBB 29may94 CCC 24dec94 CCC
    ;
    data ee.bkup;
      /* @@ holds the line so that several observations are read from each data line */
      input brkdndt : date7. vehicle $3. @@;
      cards;
    2mar95 AAA 20may95 AAA 19jun96 AAA 19nov94 AAA
    5jul96 BBB 30may98 CCC 14dec95 CCC
    ;
    proc datasets library=ee nowarn;
      age current bkup;          /* rename CURRENT to BKUP; the old BKUP is deleted */
      exchange bkup=current;     /* swap the names of BKUP and CURRENT (both files must exist) */
    run;
    quit;

Append adds observations from one data set to another.

    proc datasets;
      append base=current data=bkup;   /* similar to SET or PROC APPEND */
    run;
    quit;

Change changes the name of a SAS file in the input data library.

    proc datasets;
      change bkup=new;
    run;
    quit;

Copy copies some or all members of one SAS library to another. This is primarily used to move data sets from one system or version to another.

To limit copying to specific members, use either the SELECT or the EXCLUDE statement. To specify a different library to copy from, either use the LIBRARY= option of PROC DATASETS to set a default library or use the IN= option of COPY. To move a member from one library to another and then delete the
original member, use the MOVE option.

The following example moves the member CURRENT from dest1 to dest2:

    LIBNAME dest1 'SAS-data-library';   /* placeholder: path of the source library */
    LIBNAME dest2 'SAS-data-library';   /* placeholder: path of the target library */
    proc datasets;
      copy in=dest1 out=dest2 move memtype=data;
      select current;
      /* alternative: exclude aa;  (SELECT and EXCLUDE cannot be used in the same COPY step) */
    run;
    quit;

Delete gets rid of unneeded files. It helps to free up storage space by removing any data set you have finished using.

    proc datasets library=dest1;   /* work dir */
      delete aa;
    run;
    quit;

Kill deletes all SAS files in the library.

    proc datasets library=work kill;
    run;
    quit;

Save deletes all files except those listed on the SAVE statement.

Modify changes specific data set or variable attributes. It works on only one data set at a time. It allows you to change or specify formats, informats, and labels, rename variables, and create and delete indexes.

For an existing data set, the MODIFY statement is the best way to make changes because no observations are read in or written out during
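
The MODIFY behaviour described above can be illustrated with a minimal sketch. It assumes a small hypothetical data set WORK.REPAIRS; the data set name, labels, and index are illustrative and do not come from the original notes. Only descriptor information is changed, and the observations are not rewritten.

```sas
/* Hypothetical sample data in WORK; names are illustrative only */
data work.repairs;
  input brkdndt : date7. vehicle $;
  datalines;
2mar94 AAA
4jul94 BBB
29may94 CCC
;
run;

proc datasets library=work nolist;
  /* MODIFY works on one data set at a time and changes only its descriptor */
  modify repairs (label='Vehicle breakdown log');
    format brkdndt date9.;                 /* assign a display format    */
    label  brkdndt = 'Breakdown date'      /* assign variable labels     */
           vehicle = 'Vehicle code';
    index create vehicle;                  /* create a simple index      */
  run;
quit;

/* Inspect the updated attributes */
proc contents data=work.repairs;
run;
```

PROC CONTENTS is used here only to check the result; a DATA step doing the same job would have to read and rewrite every observation.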