Monday, July 11, 2011

Data Importing

we can import the 'backpain data' by,

libname mylib'E:\ast';
data mylib.backpain;
PROC IMPORT OUT= MYLIB.BACKPAIN
DATAFILE= "E:\ast\backpain.txt"
DBMS=TAB REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
proc print data=mylib.backpain;
run;

INFILE Command

INFILE command is used for importing data.
in our 'backpain' data , we can use the INFILE command as,

libname mylib'E:\ast';
data mylib.backpain;
infile'E:\ast\backpain.txt';
input a b$ c d;
run;
proc print data=mylib.backpain;
run;

Friday, May 6, 2011

Explore your library


Suppose you are working with a library named my_lib and you want to check how many datasets are under your library and which are they, to get this information use the following code given below,

proc datasets library=my_lib;
run;

Suppose you have two datasets DATA1, DATA2 under your library my_lib. If you want to check the contents in DATA2 use the following codes,

proc contents data=my_lib.DATA1;
run;

Thursday, April 14, 2011

The CLASS Variable

We can use the variable CLASS in the place of BY in our 'SAS Assignment 1' for obtaining the output in a same table.
ie;
in the place of
proc means data=mylib.Assessment sum mean ;
by x1;
run;
we can use
proc means data=mylib.Assessment sum mean ;
class x1;
run;

Tuesday, April 12, 2011

MERGE Command

When you want to match observations from one dataset with observations from another, use the MERGE statement in the DATA step.

Merging SAS data set is a simple process. First, if the data are not already sorted, use the SORT procedure to sort all data sets by the common variables. Then, in the DATA statement, name the new SAS data set to hold the results and follow with a MERGE statement listing the data sets to be combined . use a BY statement to indicate the common variables. For an example,

We have 2 datasets a and b.

First dataset -a. It contains

Name score1

Anu 34

Raju 38

Ravi 56

Anna 21

Second dataset –b it contains

Name score2

Anu 35

Raju 39

Ravi 58

Anna 25



I need a data set with name, score1 and score2 .

data mylib.dataone;

input name$ score1;

datalines;

Anu 34

Raju 38

Anna 56

Ann 21

;

run;

data mylib.datatwo;

input name$ score2;

datalines;

Anu 35

Raju 39

Anna 57

Ann 28

;

run;

proc sort data=mylib.dataone;

by name;

run;

proc sort data=mylib.datatwo;

by name;

run;

data mylib.data;

merge mylib.dataone mylib.datatwo;

by name;

run;

proc print data=mylib.data;

run;

Monday, April 4, 2011

SAS Assignment 1

Question 1
● Use the Base SAS windowing environment to write and submit a simple SAS program
● Given the data set:

Salesperson ,
Target company,
Number of visits,
Number of phone calls,
Units sold,

Brown American 3 12 28000
Johnson VRW 6 14 33000
Rivera Texam 2 6 8000
Brown Standard 0 22 0
Brown Knowles 2 19 12000
Rivera Metro 4 8 13000
Rivera Uniman 8 7 27000
Johnson Oldham 3 16 8000
Johnson Rondo 2 14 2000

1. Write a SAS program to compare the sales records of the company’s three sales people –
that is, compute the sum and mean for the number of visits, phone calls and units sold for
each salesperson.

ans;

libname mylib'E:\ast';
data mylib.Assessment;
input x1$ x2 x3 x4 x5;
datalines;
Brown American 3 12 28000
Johnson VRW 6 14 33000
Rivera Texam 2 6 8000
Brown Standard 0 22 0
Brown Knowles 2 19 12000
Rivera Metro 4 8 13000
Rivera Uniman 8 7 27000
Johnson Oldham 3 16 8000
Johnson Rondo 2 14 2000
;
run;
data mylib.Assessment;
set mylib.Assessment;
label x1='Salesperson'
x2='Target company'
x3='Number of visits'
x4='Number of phone calls'
x5='Units sold'
;
run;
proc print data=mylib.Assessment;
run;
proc sort data=mylib.Assessment;
by x1;
run;
proc means data=mylib.Assessment sum mean ;
by x1;
run;

Saturday, March 26, 2011

Correlation in SAS


To establish the linear  relationship between two variables  we can use correlation. We can use PROC CORR  in SAS to do this. The syntax is given below,

proc corr data=mylib.file1;
var variable1 variable2;
run;

This code will give the Pearson's Correlation Coefficient.  To find another correlation coefficient like Spearman’s , Kendall, etc... just type

proc corr data= mylib.file1 spearman;

If you want to find both of them in a single expression 

proc corr data= mylib.file1 spearman pearson;

Try these codes and give your feedback. Download different SAS data sets try these types of operations into it. You can visit http://www.principlesofeconometrics.com/sas.htm to download SAS data sets

Wednesday, March 23, 2011

Q-Q Plot


Now  we’re going to learn how to create Q-Q Plot. It is ver simple, just try the codes given below.

proc univariate data=library_name.file_name;
var variable_name ;
qqplot;
run;


or just try,

proc univariate data=.;
qqplot variable_name;
run;

the above code will creat a Q-Q Plot of a particular variable.
If you want Q-Q Plot of two varibles the rewrite the above command with

qqplot variable1 variable2;

Similar manner you can try with any number of variables

Sunday, March 20, 2011

Box Plot

Pls try with our titanic data

proc sort data=mylib.titanic;
by increasing fare pclass;
run;
proc boxplot data=mylib.titanic;
plot fare*pclass;
run;

Plot histogram with normal curve and inset option

proc univariate data=mylib.titanic1;
title "Histogram for variable fare";
histogram fare/cfill=ligr normal cframe=liy barwidth=8 cv=black;
inset mean std max min;
run;

Saturday, March 19, 2011

DROP variable

Continued - dropping variables
Instead of wanting to keep just a handful of variables, it is possible that we might want to get rid of just a handful of variables in our titanic data file. Below we show how we could get rid of the variables pclass and name.


data mylib.titanic;
set mylib.titanic;
drop pclass name;
run;
proc contents data=mylib.titanic;
run;
proc print data=mylib.titanic;
run;

KEEP variable

Continued Keeping variable
Our titanic data file had many many variables, say16, but we only care about just
a handful of them, pclass survived name. We can subset our data file to keep just those variables as
shown below.

data mylib.titanic;
set mylib.titanic;
keep pclass survived name;
run;
proc contents data=mylib.titanic;
run;
proc print data=mylib.titanic;
run;

Tuesday, March 15, 2011

Scatter diagram

Hai dear friends...Its me Anton...
actually I am not confident about blogging..and i dont konw is this the way to post some thing on blogg..
Any way please try this on SAS. its pretty intresting..
You just pick Titanic data from the library and follow this..
proc print data=____.titanic;
run;
proc gplot data=____.titanic;
plot fare*age;
run;


if you got thjis please reply me..
waiting for your reply..

Monday, March 7, 2011

Sorting of data


Sorting is a very important technique. To get certain information from our data we need to use sorting technique. Now we are going to study how to sort given data in SAS...
Consider the data give below
id         age      sex                   time
1          41        female             53
2          45        female             28
3          48        male                69
4          54        male                58
5          40        female             54
6          31        male                25
7          53        male                51
8          49        male                61
9          36        male                57
10        52        male                57

First you must create a library and save this data into a file. To sort data in SAS we can useproc sort”.Now  we are going to sort this data with respect to the variable “age”. Write down the following commands given below.

proc sort data=mylib.file1;
by age;
run;

This will sort our data in the ascending order. To sort in descending order use the following commands.

proc sort data=mylib.file1;
by descending age;
run;

You can sort this data with respect to more than one variables. That is, to sort this data with respect to “time” and “age” then try the codes below.

proc sort data=mylib.file1;
by time age;
run;

http://support.sas.com/documentation/onlinedoc/code.samples.htmltry this link you will get different data sets to download.

Wednesday, March 2, 2011

Data import


For accessing data which is in the form of "Excel spreadsheet(*.xls)",the following
procedure can be used in SAS.

libname test C:\mysas\; /*Assigns a library named test on location C:\mysas\*/
proc import out=test.data1/*set data1 as the name for new data on "test" library*/
datafile="C:\..\data.xls" /*Location of the datafile to be imported*/
dbms=excel replace;
sheet="sheet1$"; /*specifies the sheetname to be read*/
getnames=yes; /*to get the variable names from the first row of data*/
mixed=no; /*to convert numeric values to characters in a mixed type column*/
scantext=yes; /*to use the largest text size in a column as sas variable length*/
usedate=yes; /*use DATE. format for a Date/Time column*/
scantime=yes; /*use TIME. format if only time values found in a column*/
run;
Similarly,for accessing a Tab Delimited(*.txt) file:


libname test C:\mysas\; /*Assigns a library namest test on location C:\mysas\ */
proc import out=test.data1 /*set data1 as the name for new data*/
datafile="C:\..\data.txt" /*Location of the datafile to be imported*/
dbms=tab replace;
getnames=yes; /*to get the variable names from the first row of data*/
datarow=2; /*to select the row on which data starts*/

run;
And for getting a Comma Seperated Values(*.csv)file:

libname test C:\mysas\; /*Assigns a library namest test on location C:\mysas\ */
proc import out=test.data1 /*set data1 as the name for new data*/
datafile="C:\..\data.csv" /*Location of the datafile to be imported*/
dbms=csv replace;
getnames=yes; /*to get the variable names from the first row of data*/
datarow=2; /*to select the row on which data starts*/

run;

Tuesday, March 1, 2011

FORMAT command


Consider the data given below
Name              Gender
John                           1
Ram                            1
Geetha                      0
Nithin                        1
Meena                       0

In this data ‘0’ and ‘1’ represents “ Female” and “Male “ respectively. We need to change those 0’s and 1’s by their respective labels. For this we can use FORMAT command in SAS.

libname mylib "D:\sas\";/*Creating a library named mylib*/
data mylib.file1;/*Storing the given data into file1*/
input name$ gender;
datalines;
John 1
Ram 1
Geetha 0
Nithin 1
Meena 0
;
run;
proc format;/*Creating a format named sex*/
value sex
0='female'
1='male'
;
run;
proc print data=mylib.file1;
format gender sex.;/*Format is applying  to the variable gender*/
run;


These codes can only format the output. If you want to make changes in the file, use the follwing code given below.

data mylib.file1;
set mylib.file1;
format gender sex.;
run;

If you are using “ data mylib.file2; ” instead of “ data mylib.file1; ” then SAS will create a new file named ‘file2’ and a formatted version of this data will save into it.

Sunday, February 27, 2011

WHERE Statement

LIBNAME MYSAS 'C:\SASDATA\';

DATA mysas.auto ;
LENGTH make $ 20 ;
INPUT make $ 1-17 price mpg rep78 hdroom trunk weight
length turn displ gratio foreign ;
CARDS;
AMC Concord 4099 22 3 2.5 11 2930 186 40 121 3.58 0
AMC Pacer 4749 17 3 3.0 11 3350 173 40 258 2.53 0
AMC Spirit 3799 22 . 3.0 12 2640 168 35 121 3.08 0
Audi 5000 9690 17 5 3.0 15 2830 189 37 131 3.20 1
Audi Fox 6295 23 3 2.5 11 2070 174 36 97 3.70 1
BMW 320i 9735 25 4 2.5 12 2650 177 34 121 3.64 1
Buick Century 4816 20 3 4.5 16 3250 196 40 196 2.93 0
Buick Electra 7827 15 4 4.0 20 4080 222 43 350 2.41 0
Buick LeSabre 5788 18 3 4.0 21 3670 218 43 231 2.73 0
Buick Opel 4453 26 . 3.0 10 2230 170 34 304 2.87 0
Buick Regal 5189 20 3 2.0 16 3280 200 42 196 2.93 0
Buick Riviera 10372 16 3 3.5 17 3880 207 43 231 2.93 0
Buick Skylark 4082 19 3 3.5 13 3400 200 42 231 3.08 0
Cad. Deville 11385 14 3 4.0 20 4330 221 44 425 2.28 0
Cad. Eldorado 14500 14 2 3.5 16 3900 204 43 350 2.19 0
Cad. Seville 15906 21 3 3.0 13 4290 204 45 350 2.24 0
Chev. Chevette 3299 29 3 2.5 9 2110 163 34 231 2.93 0
Chev. Impala 5705 16 4 4.0 20 3690 212 43 250 2.56 0
Chev. Malibu 4504 22 3 3.5 17 3180 193 31 200 2.73 0
Chev. Monte Carlo 5104 22 2 2.0 16 3220 200 41 200 2.73 0
Chev. Monza 3667 24 2 2.0 7 2750 179 40 151 2.73 0
Chev. Nova 3955 19 3 3.5 13 3430 197 43 250 2.56 0
Datsun 200 6229 23 4 1.5 6 2370 170 35 119 3.89 1
Datsun 210 4589 35 5 2.0 8 2020 165 32 85 3.70 1
Datsun 510 5079 24 4 2.5 8 2280 170 34 119 3.54 1
Datsun 810 8129 21 4 2.5 8 2750 184 38 146 3.55 1
Dodge Colt 3984 30 5 2.0 8 2120 163 35 98 3.54 0
Dodge Diplomat 4010 18 2 4.0 17 3600 206 46 318 2.47 0
Dodge Magnum 5886 16 2 4.0 17 3600 206 46 318 2.47 0
Dodge St. Regis 6342 17 2 4.5 21 3740 220 46 225 2.94 0
Fiat Strada 4296 21 3 2.5 16 2130 161 36 105 3.37 1
Ford Fiesta 4389 28 4 1.5 9 1800 147 33 98 3.15 0
Ford Mustang 4187 21 3 2.0 10 2650 179 43 140 3.08 0
Honda Accord 5799 25 5 3.0 10 2240 172 36 107 3.05 1
Honda Civic 4499 28 4 2.5 5 1760 149 34 91 3.30 1
Linc. Continental 11497 12 3 3.5 22 4840 233 51 400 2.47 0
Linc. Mark V 13594 12 3 2.5 18 4720 230 48 400 2.47 0
Linc. Versailles 13466 14 3 3.5 15 3830 201 41 302 2.47 0
Mazda GLC 3995 30 4 3.5 11 1980 154 33 86 3.73 1
Merc. Bobcat 3829 22 4 3.0 9 2580 169 39 140 2.73 0
Merc. Cougar 5379 14 4 3.5 16 4060 221 48 302 2.75 0
Merc. Marquis 6165 15 3 3.5 23 3720 212 44 302 2.26 0
Merc. Monarch 4516 18 3 3.0 15 3370 198 41 250 2.43 0
Merc. XR-7 6303 14 4 3.0 16 4130 217 45 302 2.75 0
Merc. Zephyr 3291 20 3 3.5 17 2830 195 43 140 3.08 0
Olds 98 8814 21 4 4.0 20 4060 220 43 350 2.41 0
Olds Cutl Supr 5172 19 3 2.0 16 3310 198 42 231 2.93 0
Olds Cutlass 4733 19 3 4.5 16 3300 198 42 231 2.93 0
Olds Delta 88 4890 18 4 4.0 20 3690 218 42 231 2.73 0
Olds Omega 4181 19 3 4.5 14 3370 200 43 231 3.08 0
Olds Starfire 4195 24 1 2.0 10 2730 180 40 151 2.73 0
Olds Toronado 10371 16 3 3.5 17 4030 206 43 350 2.41 0
Peugeot 604 12990 14 . 3.5 14 3420 192 38 163 3.58 1
Plym. Arrow 4647 28 3 2.0 11 3260 170 37 156 3.05 0
Plym. Champ 4425 34 5 2.5 11 1800 157 37 86 2.97 0
Plym. Horizon 4482 25 3 4.0 17 2200 165 36 105 3.37 0
Plym. Sapporo 6486 26 . 1.5 8 2520 182 38 119 3.54 0
Plym. Volare 4060 18 2 5.0 16 3330 201 44 225 3.23 0
Pont. Catalina 5798 18 4 4.0 20 3700 214 42 231 2.73 0
Pont. Firebird 4934 18 1 1.5 7 3470 198 42 231 3.08 0
Pont. Grand Prix 5222 19 3 2.0 16 3210 201 45 231 2.93 0
Pont. Le Mans 4723 19 3 3.5 17 3200 199 40 231 2.93 0
Pont. Phoenix 4424 19 . 3.5 13 3420 203 43 231 3.08 0
Pont. Sunbird 4172 24 2 2.0 7 2690 179 41 151 2.73 0
Renault Le Car 3895 26 3 3.0 10 1830 142 34 79 3.72 1
Subaru 3798 35 5 2.5 11 2050 164 36 97 3.81 1
Toyota Celica 5899 18 5 2.5 14 2410 174 36 134 3.06 1
Toyota Corolla 3748 31 5 3.0 9 2200 165 35 97 3.21 1
Toyota Corona 5719 18 5 2.0 11 2670 175 36 134 3.05 1
Volvo 260 11995 17 5 2.5 14 3170 193 37 163 2.98 1
VW Dasher 7140 23 4 2.5 12 2160 172 36 97 3.74 1
VW Diesel 5397 41 5 3.0 15 2040 155 35 90 3.78 1
VW Rabbit 4697 25 4 3.0 15 1930 155 35 89 3.78 1
VW Scirocco 6850 25 4 2.0 16 1990 156 36 97 3.78 1
;
RUN;

The where statement allows us to run on a subset of records.
For example, instead of printing all records in the file,
the following program prints only cars where the value
for
rep78 is 3 or greater.


PRINT DATA=MYSAS.auto;
WHERE (rep78 >= 3);
VAR make rep78;
RUN;
Missing values and the where statement 

In the example above, note that some of the records print a '.' instead of a value for rep78. These are records where rep78 is missing. SAS stores missing values for numeric variables as '.' and treats them as negative infinity, or the lowest number possible. To exclude missing values, modify the where statement as follows (the rep78 ^= . indicates rep78 is not equal to missing).

PROC PRINT DATA=mysas.auto;
WHERE (rep78 <= 2) and (rep78 ^= .) ; VAR make price rep78 ; RUN;



This program generates summary statistics for price, but only for cars with repair histories of 1 or 2:

PROC MEANS DATA=auto;
WHERE (rep78 = 1) OR (rep78 = 2) ;
VAR price ;
RUN;

Here is the output from the proc means. By default, proc means will generate the following statistics: mean, minimum and maximum values, standard deviation, and the number of non-missing values for the analysis variable (in this case price).

Analysis Variable : price
N Mean Std Dev Minimum Maximum
----------------------------------------------------------
10 5687.00 3216.38 3667.00 14500.00
----------------------------------------------------------

To see summary statistics for price for cars with repair histories of 3, 4 or 5, modify the where statement accordingly:

PROC MEANS DATA=auto;
WHERE (rep78 = 3) or (rep78 = 4) or (rep78 = 5) ;
VAR price ;
RUN;

Or:

PROC MEANS DATA=auto;
WHERE (3 <= rep78 <= 5) ; VAR price ; RUN;


Analysis Variable : price
N Mean Std Dev Minimum Maximum
----------------------------------------------------------
59 6223.85 2880.45 3291.00 15906.00
----------------------------------------------------------

The where statement also works with the in operator as follows:

PROC MEANS DATA=auto;
WHERE rep78 in (3,4,5);
VAR price ;
RUN;