Advanced SAS:

PROC SQL and SAS Macros

What is SQL?

  • Structured Query Language
  • SQL is a widely used language to retrieve and update relational tables and databases.
  • Q: How are relational databases different from data sets we have used in this class?
  • Q: What is the different between a database and a table?

What can SQL do?

  • Execute queries from against a database
  • Retrieve data from a database
  • Insert/update records in a database
  • Create new databases/tables
  • Compute summary statistics
  • In SAS, the PROC SQL command can be applied to SAS data sets as an alternative to using DATA steps.

Database Lingo

  • RDBMS (Relational Database Management System) is the basis for SQL and is used in all modern database systems.
  • The data in a relational database is stored in object called tables.
  • Databases typically contain several tables.
  • Tables are analogous to datasets we have been using in class.
  • For efficient computing only relevant data are loaded into memory, rather than an entire database.
  • Tables contain columns called fields and rows called records.

SQL Syntax

A basic SQL cheatsheet is available at http://files.zeroturnaround.com/pdf/zt_sql_cheat_sheet.pdf.

There are many commands in SQL some of the most important are illustrated below:

  • SELECT col_name FROM table_name WHERE condition
  • UPDATE table_name SET col_name = 5 WHERE condition
  • INSERT INTO table_name (col_name1, col_name2,...) VALUES (value1, value2, ...)
  • DELETE FROM table_name WHERE condition;
  • SELECT COUNT(col_name) FROM table_name WHERE condition
  • SELECT table1.col1, table1.col2, table2.col1 FROM table1 INNER JOIN table2 on table1.key= table2.key

PROC SQL Syntax

Using SAS the PROC SQL procedure can be used to query datastate systems or SAS data sets. The syntax follows as:

PROC SQL;
    SQL statements;
RUN;

We will focus on SAS syntax, but there are minimal (or no) differences in other applications.

For this lecture and the lab, we will use a subset of baseball tables from https://www.kaggle.com/seanlahman/the-history-of-baseball.

  • all_star
  • batting
  • college
  • park
  • pitching
  • player
  • team
In [2]:
libname STAT408 '/folders/myfolders/';

/* Note when using SAS OnDemand set this libname to
LIBNAME STAT408 "/courses/d716b355ba27fe300";
*/
Out[2]:

574  ods listing close;ods html5 file=stdout options(bitmap_mode='inline') device=png; ods graphics on / outputfmt=png;
NOTE: Writing HTML5 Body file: STDOUT
575
576 libname STAT408 '/folders/myfolders/';
NOTE: Libref STAT408 was successfully assigned as follows:
Engine: V9
Physical Name: /folders/myfolders
577
578 /* Note when using SAS OnDemand set this libname to
579 LIBNAME STAT408 "/courses/d716b355ba27fe300";
580 */
581 ods html5 close;ods listing;

582

SELECT

Select is the workhorse command for SQL queries. Select can be used to query data from tables.

A select query must have a SELECT statement specifying which columns to query and a FROM statement to identify the table. There are additional options below, which must follow this order.

  1. SELECT (variables to include in query, * is a wildcard for all columns)
  2. FROM (name of table)
  3. WHERE (condition for variables, applies to each record)
  4. GROUP BY (can aggregate by a variable, when specify a function of selected variables)
  5. HAVING (condition for aggregated variables specified in group by statement)
  6. ORDER BY (allows sorting of output)

Q1. Create a query from STAT408.player to obtain all the player first and last names for all players born in 1993.

Q2. Select all rows from STAT408.park where the park is in California.

In [3]:
/* Create a query from STAT408.player to obtain all the player 
first and last names for all players born in 1993. */
PROC SQL;
    SELECT name_first, name_last 
    FROM STAT408.player
    WHERE birth_year = 1993;
RUN;
Out[3]:
SAS Output

The SAS System

name_first name_last
Miguel Almonte
Byron Buxton
Michael Conforto
Zach Davies
Michael Feliz
Brandon Finnegan
Joey Gallo
Oscar Hernandez
Adrian Houser
Keone Kela
Max Kepler
Jacob Lindgren
Francisco Lindor
Jorge Lopez
Ketel Marte
Lance McCullers
Frankie Montas
Aaron Nola
Daniel Norris
Jorge Polanco
Jurickson Profar
Eduardo Rodriguez
Joe Ross
Miguel Sano
Luis Sardinas
Kyle Schwarber
Pedro Severino
Trea Turner

Out[7]:

In [4]:
*Select all rows from STAT408.park where the park is in California.;
PROC SQL;
    SELECT * 
    FROM STAT408.park
    WHERE state = 'CA';
RUN;
Out[4]:
SAS Output

The SAS System

park_id park_name park_alias city state country
ANA01 Angel Stadium of Anaheim Edison Field; Anaheim Stadium Anaheim CA US
LOS01 Los Angeles Memorial Coliseum   Los Angeles CA US
LOS02 Wrigley Field   Los Angeles CA US
LOS03 Dodger Stadium Chavez Ravine Los Angeles CA US
OAK01 Oakland-Alameda County Coliseum Network Associates Coliseum Oakland CA US
SAN01 Qualcomm Stadium San Diego/Jack Murphy Stadium San Diego CA US
SAN02 PETCO Park   San Diego CA US
SFO01 Seals Stadium   San Francisco CA US
SFO02 Candlestick Park 3Com Park San Francisco CA US
SFO03 AT&T Park Pacific Bell Park; SBC Park San Francisco CA US

Using Group by and SQL aggregate functions

There are four main sets of aggregate type functions in SQL

  • COUNT (returns the number of rows)
  • SUM
  • AVG
  • MIN / MAX

Q: How do we write a query to calculate the total number of homeruns in each year since 1980?

In [5]:
*write a query to calculate the total number of homeruns in each year since 1980;
PROC SQL;
    SELECT year, SUM(hr)
    FROM STAT408.batting
    GROUP BY year
    HAVING year GT 1979
    ORDER BY year;
RUN;
Out[5]:
SAS Output

The SAS System

year  
1980 3087
1981 1781
1982 3379
1983 3301
1984 3258
1985 3602
1986 3813
1987 4458
1988 3180
1989 3083
1990 3317
1991 3383
1992 3038
1993 4030
1994 3306
1995 4081
1996 4962
1997 4640
1998 5064
1999 5528
2000 5693
2001 5458
2002 5059
2003 5207
2004 5451
2005 5017
2006 5386
2007 4957
2008 4878
2009 5042
2010 4613
2011 4552
2012 4934
2013 4661
2014 4186
2015 4909

Creating and Modify Tables

Thus far we have only been querying tables and producing output. Next we will see how to create and modify tables.

Create a new table
PROC SQL;
    CREATE TABLE table_name as
    SELECT col1, col2, col3
    FROM old_table;
RUN;

Q1. Create a table called born93 with info from all players born in 1993.

In [6]:
*\ Create a new table with info from players born in 1993 \*
PROC SQL;
    CREATE TABLE born93 as
    SELECT player_id, birth_country, birth_state, name_first, name_last FROM STAT408.player
    WHERE birth_year = 1993;
RUN;
Out[6]:
SAS Output
Modifying a table

There are several ways to modify an existing table:

  • INSERT INTO
  • UPDATE
  • DELETE
INSERT INTO

You can manually insert values into a table as follows:

INSERT INTO table_name (col1, col2, col3, ...)
VALUES (val1, val2, val3);

Note the values can be computed from a SELECT statement as well

INSERT INTO table_name (col1, col2, col3, ...)
SELECT val1, val2, val3 FROM table_name2;

Q: Write a PROC SQL command to add your name to the born93 table.

In [7]:
PROC SQL;
    INSERT INTO born93 (player_id, birth_country, birth_state, name_first, name_last)
    VALUES ('hoeghan01', 'USA','IA','Andrew','Hoegh');
RUN;
    
Out[7]:
SAS Output
In [8]:
PROC SQL;
    SELECT * 
    FROM  born93
    WHERE birth_state = 'IA';
RUN;
Out[8]:
SAS Output

The SAS System

player_id birth_country birth_state name_first name_last
hoeghan01 USA IA Andrew Hoegh
UPDATE

The update value allows you to update multiple records using a WHERE clause.

 UPDATE table_name 
 SET col1 = value
 WHERE col2= value;
In [9]:
PROC SQL;
    UPDATE born93 
    SET birth_state = 'Other'
    WHERE birth_country NE 'USA';
RUN;

PROC SQL;
    SELECT *
    FROM born93
    WHERE birth_state = 'Other';
RUN;
Out[9]:
SAS Output

The SAS System

player_id birth_country birth_state name_first name_last
almonmi01 D.R. Other Miguel Almonte
felizmi01 D.R. Other Michael Feliz
hernaos01 Venezuela Other Oscar Hernandez
keplema01 Germany Other Max Kepler
lindofr01 P.R. Other Francisco Lindor
lopezjo02 P.R. Other Jorge Lopez
marteke01 D.R. Other Ketel Marte
montafr01 D.R. Other Frankie Montas
polanjo01 D.R. Other Jorge Polanco
profaju01 Curacao Other Jurickson Profar
rodried05 Venezuela Other Eduardo Rodriguez
sanomi01 D.R. Other Miguel Sano
sardilu01 Venezuela Other Luis Sardinas
severpe01 D.R. Other Pedro Severino
DELETE

This allows you to delete columns, meeting a specified criteria, from a table.

DELETE FROM table
WHERE condition;
In [10]:
PROC SQL;
    DELETE FROM born93
    WHERE name_last LIKE "%egh";
RUN;

* query number of rows;
PROC SQL;
    SELECT COUNT(*)
    FROM born93;
RUN;
Out[10]:
SAS Output

The SAS System

 
28

Table Joins with SQL

There are several types of joins, some of which can be visualized below:

Inline-style: join

  • Inner join: returns only records in both tables
  • Outer join: returns all records in both tables
  • Left/Right (outer) join: returns all records in one table and matching records in other table
Join Syntax
SELECT l.var1, l.var2, r.var3, ...
FROM lefttab as l 
INNER JOIN righttab as r
ON l.key=r.key;

Other options for the join argument are:

  • LEFT JOIN
  • RIGHT JOIN
  • FULL JOIN
Inner Join

Q: Sketch out a SQL command to conduct an inner join on born93 and batting.

In [11]:
PROC SQL;
    SELECT * 
    FROM born93 as b
    INNER JOIN STAT408.batting as s
    ON b.player_id = s.player_id;
RUN;
Out[11]:
SAS Output

The SAS System

player_id birth_country birth_state name_first name_last player_id year stint team_id league_id g ab r h double triple hr rbi sb cs bb so ibb hbp sh sf g_idp
profaju01 Curacao Other Jurickson Profar profaju01 2012 1 TEX A 9 17 2 3 2 0 1 2 0 0 0 4 0 0 0 0 1
profaju01 Curacao Other Jurickson Profar profaju01 2013 1 TEX A 85 286 30 67 11 0 6 26 2 4 26 63 0 5 6 1 1
finnebr01 USA TX Brandon Finnegan finnebr01 2014 1 KCA A 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
norrida01 USA TN Daniel Norris norrida01 2014 1 TOR A 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
polanjo01 D.R. Other Jorge Polanco polanjo01 2014 1 MIN A 5 6 2 2 1 1 0 3 0 0 2 2 0 0 0 0 0
sardilu01 Venezuela Other Luis Sardinas sardilu01 2014 1 TEX A 43 115 12 30 6 0 0 8 5 1 5 21 0 2 3 0 5
almonmi01 D.R. Other Miguel Almonte almonmi01 2015 1 KCA A 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
buxtoby01 USA GA Byron Buxton buxtoby01 2015 1 MIN A 46 129 16 27 7 1 2 6 2 2 6 44 0 1 2 0 1
confomi01 USA WA Michael Conforto confomi01 2015 1 NYN N 56 174 30 47 14 0 9 26 0 1 17 39 0 1 0 2 4
davieza01 USA WA Zach Davies davieza01 2015 1 MIL N 6 13 1 1 0 0 0 0 0 0 0 3 0 0 1 0 0
felizmi01 D.R. Other Michael Feliz felizmi01 2015 1 HOU A 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
finnebr01 USA TX Brandon Finnegan finnebr01 2015 1 KCA A 14 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
finnebr01 USA TX Brandon Finnegan finnebr01 2015 2 CIN N 6 4 0 0 0 0 0 0 0 0 0 2 0 0 2 0 0
gallojo01 USA NV Joey Gallo gallojo01 2015 1 TEX A 36 108 16 22 3 1 6 14 3 0 15 57 3 0 0 0 0
hernaos01 Venezuela Other Oscar Hernandez hernaos01 2015 1 ARI N 18 31 4 5 1 0 0 1 0 0 3 15 0 1 1 0 0
housead01 USA OK Adrian Houser housead01 2015 1 MIL N 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
kelake01 USA CA Keone Kela kelake01 2015 1 TEX A 68 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
keplema01 Germany Other Max Kepler keplema01 2015 1 MIN A 3 7 0 1 0 0 0 0 0 0 0 3 0 0 0 0 0
lindgja01 USA MS Jacob Lindgren lindgja01 2015 1 NYA A 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
lindofr01 P.R. Other Francisco Lindor lindofr01 2015 1 CLE A 99 390 50 122 22 4 12 51 12 2 27 69 0 1 1 7 1
lopezjo02 P.R. Other Jorge Lopez lopezjo02 2015 1 MIL N 2 2 0 0 0 0 0 0 0 0 0 2 0 0 1 0 0
marteke01 D.R. Other Ketel Marte marteke01 2015 1 SEA A 57 219 25 62 14 3 2 17 8 4 24 43 0 0 2 2 1
mcculla02 USA FL Lance McCullers mcculla02 2015 1 HOU A 22 2 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
montafr01 D.R. Other Frankie Montas montafr01 2015 1 CHA A 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
nolaaa01 USA LA Aaron Nola nolaaa01 2015 1 PHI N 13 23 0 2 0 0 0 1 0 0 1 16 0 0 2 0 0
norrida01 USA TN Daniel Norris norrida01 2015 1 TOR A 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
norrida01 USA TN Daniel Norris norrida01 2015 2 DET A 8 2 1 1 0 0 1 2 0 0 1 1 0 0 0 0 0
polanjo01 D.R. Other Jorge Polanco polanjo01 2015 1 MIN A 4 10 1 3 0 0 0 1 1 0 2 1 0 0 0 0 0
rodried05 Venezuela Other Eduardo Rodriguez rodried05 2015 1 BOS A 21 2 0 0 0 0 0 0 0 0 0 2 0 0 1 0 0
rossjo01 USA CA Joe Ross rossjo01 2015 1 WAS N 17 27 2 3 0 0 0 0 0 0 0 11 0 1 2 0 1
sanomi01 D.R. Other Miguel Sano sanomi01 2015 1 MIN A 80 279 46 75 17 1 18 52 1 1 53 119 1 1 0 2 4
sardilu01 Venezuela Other Luis Sardinas sardilu01 2015 1 MIL N 36 97 8 19 0 1 0 4 0 0 6 25 1 0 1 1 3
schwaky01 USA OH Kyle Schwarber schwaky01 2015 1 CHN N 69 232 52 57 6 1 16 43 3 3 36 77 1 4 0 1 4
severpe01 D.R. Other Pedro Severino severpe01 2015 1 WAS N 2 4 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0
turnetr01 USA FL Trea Turner turnetr01 2015 1 WAS N 27 40 5 9 1 0 1 1 2 2 4 12 0 0 0 0 0

Left Join

Now conduct a query that is a left join on born93 (as the left table) with batting as the right table.

PROC SQL;
    SELECT * 
    FROM born93 as b
    LEFT JOIN STAT408.batting as s
    ON b.player_id = s.player_id;
RUN;
  • Q: does the query have more, the same, or less records than the previous query?
  • Q: how would the results change if batting was the left table?
In [12]:
PROC SQL;
    SELECT * 
    FROM born93 as b
    LEFT JOIN STAT408.batting as s
    ON b.player_id = s.player_id;
RUN;
Out[12]:
SAS Output

The SAS System

player_id birth_country birth_state name_first name_last player_id year stint team_id league_id g ab r h double triple hr rbi sb cs bb so ibb hbp sh sf g_idp
almonmi01 D.R. Other Miguel Almonte almonmi01 2015 1 KCA A 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
buxtoby01 USA GA Byron Buxton buxtoby01 2015 1 MIN A 46 129 16 27 7 1 2 6 2 2 6 44 0 1 2 0 1
confomi01 USA WA Michael Conforto confomi01 2015 1 NYN N 56 174 30 47 14 0 9 26 0 1 17 39 0 1 0 2 4
davieza01 USA WA Zach Davies davieza01 2015 1 MIL N 6 13 1 1 0 0 0 0 0 0 0 3 0 0 1 0 0
felizmi01 D.R. Other Michael Feliz felizmi01 2015 1 HOU A 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
finnebr01 USA TX Brandon Finnegan finnebr01 2015 1 KCA A 14 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
finnebr01 USA TX Brandon Finnegan finnebr01 2015 2 CIN N 6 4 0 0 0 0 0 0 0 0 0 2 0 0 2 0 0
finnebr01 USA TX Brandon Finnegan finnebr01 2014 1 KCA A 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
gallojo01 USA NV Joey Gallo gallojo01 2015 1 TEX A 36 108 16 22 3 1 6 14 3 0 15 57 3 0 0 0 0
hernaos01 Venezuela Other Oscar Hernandez hernaos01 2015 1 ARI N 18 31 4 5 1 0 0 1 0 0 3 15 0 1 1 0 0
housead01 USA OK Adrian Houser housead01 2015 1 MIL N 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
kelake01 USA CA Keone Kela kelake01 2015 1 TEX A 68 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
keplema01 Germany Other Max Kepler keplema01 2015 1 MIN A 3 7 0 1 0 0 0 0 0 0 0 3 0 0 0 0 0
lindgja01 USA MS Jacob Lindgren lindgja01 2015 1 NYA A 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
lindofr01 P.R. Other Francisco Lindor lindofr01 2015 1 CLE A 99 390 50 122 22 4 12 51 12 2 27 69 0 1 1 7 1
lopezjo02 P.R. Other Jorge Lopez lopezjo02 2015 1 MIL N 2 2 0 0 0 0 0 0 0 0 0 2 0 0 1 0 0
marteke01 D.R. Other Ketel Marte marteke01 2015 1 SEA A 57 219 25 62 14 3 2 17 8 4 24 43 0 0 2 2 1
mcculla02 USA FL Lance McCullers mcculla02 2015 1 HOU A 22 2 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
montafr01 D.R. Other Frankie Montas montafr01 2015 1 CHA A 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
nolaaa01 USA LA Aaron Nola nolaaa01 2015 1 PHI N 13 23 0 2 0 0 0 1 0 0 1 16 0 0 2 0 0
norrida01 USA TN Daniel Norris norrida01 2014 1 TOR A 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
norrida01 USA TN Daniel Norris norrida01 2015 2 DET A 8 2 1 1 0 0 1 2 0 0 1 1 0 0 0 0 0
norrida01 USA TN Daniel Norris norrida01 2015 1 TOR A 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
polanjo01 D.R. Other Jorge Polanco polanjo01 2014 1 MIN A 5 6 2 2 1 1 0 3 0 0 2 2 0 0 0 0 0
polanjo01 D.R. Other Jorge Polanco polanjo01 2015 1 MIN A 4 10 1 3 0 0 0 1 1 0 2 1 0 0 0 0 0
profaju01 Curacao Other Jurickson Profar profaju01 2013 1 TEX A 85 286 30 67 11 0 6 26 2 4 26 63 0 5 6 1 1
profaju01 Curacao Other Jurickson Profar profaju01 2012 1 TEX A 9 17 2 3 2 0 1 2 0 0 0 4 0 0 0 0 1
rodried05 Venezuela Other Eduardo Rodriguez rodried05 2015 1 BOS A 21 2 0 0 0 0 0 0 0 0 0 2 0 0 1 0 0
rossjo01 USA CA Joe Ross rossjo01 2015 1 WAS N 17 27 2 3 0 0 0 0 0 0 0 11 0 1 2 0 1
sanomi01 D.R. Other Miguel Sano sanomi01 2015 1 MIN A 80 279 46 75 17 1 18 52 1 1 53 119 1 1 0 2 4
sardilu01 Venezuela Other Luis Sardinas sardilu01 2014 1 TEX A 43 115 12 30 6 0 0 8 5 1 5 21 0 2 3 0 5
sardilu01 Venezuela Other Luis Sardinas sardilu01 2015 1 MIL N 36 97 8 19 0 1 0 4 0 0 6 25 1 0 1 1 3
schwaky01 USA OH Kyle Schwarber schwaky01 2015 1 CHN N 69 232 52 57 6 1 16 43 3 3 36 77 1 4 0 1 4
severpe01 D.R. Other Pedro Severino severpe01 2015 1 WAS N 2 4 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0
turnetr01 USA FL Trea Turner turnetr01 2015 1 WAS N 27 40 5 9 1 0 1 1 2 2 4 12 0 0 0 0 0

SAS Macros

SAS Macro code consists of two components:

  1. macro variables and
  2. macros.

Macro variables

Macro variables is a single variable that does not belong to a data set and the value is always the character. Names of macro variables are begin with an ampersand, such as &macrovar.

Macros

A macro is similar to an R function, but still bundles SAS DATA and PROC statements. Marco names start with a percent sign, such as %compile.

Macro Variables

Macro variables are particularly useful in industrial settings where procedures are run every week or month. Rather than modifying all of the code, a macro variable can be used to adjust the week or month.

Defining Macro Variables

Macro variables are defined with the following syntax

%LET year = 1993;

Q: Now write code to query the player data set to extract the count of players by birth year.

In [13]:
%LET year = 1993;

TITLE "Number of MLB players born in &year";
PROC SQL;
    SELECT COUNT(*) 
    FROM STAT408.player
    WHERE birth_year = &year;
RUN;
TITLE;
Out[13]:
SAS Output

Number of MLB players born in 1993

 
28
In [14]:
%LET year = 1983;

TITLE "Number of MLB players born in &year";
PROC SQL;
    SELECT COUNT(*) 
    FROM STAT408.player
    WHERE birth_year = &year;
RUN;
TITLE;
Out[14]:
SAS Output

Number of MLB players born in 1983

 
243

Macro Procedures

Similar to the last example, and functions in R, SAS macros can be used to streamline code for repeated procedures. As with R, debugging inside a macro environment can be more complicated. So make sure to that the code inserted in the macro statement is working before applying the macro commands.

The Little SAS Book describes macros as a sandwich,

%MACRO macro_name (parameter1= , ...);
    macro_text
%MEND macro_name;

where the %MACRO and %MEND statements are the bread. After being defined, the macro can then be run using the following statement %macro_name(val1); (if one parameter is defined).

Q: Now write a macro to print the number baseball players born in the years 1983, 1990, and 1993.

In [15]:
%MACRO yearborn (year =);
    libname STAT408 '/folders/myfolders/';

    TITLE "Number of MLB players born in &year";
    PROC SQL;
        SELECT COUNT(*) 
        FROM STAT408.player
        WHERE birth_year = &year;
    RUN;
    TITLE;
%MEND yearborn;

%yearborn(year = 1983);
%yearborn(year = 1990);
%yearborn(year = 1993);
Out[15]:
SAS Output

Number of MLB players born in 1983

 
243

Number of MLB players born in 1990

 
159

Number of MLB players born in 1993

 
28

Macros with Conditional Logic

Conditional logic can also be used inside a macro statements. For instance continuing the previous example, if we also want to include an option to create a table with the players names this can be done with conditional logic.

In [16]:
%MACRO yearborn (year =, create=);
    libname STAT408 '/folders/myfolders/';

    TITLE "Number of MLB players born in &year";
    PROC SQL;
        SELECT COUNT(*) 
        FROM STAT408.player
        WHERE birth_year = &year;
    RUN;
    TITLE;
    
    %IF &create = 'YES' %THEN %DO;
        PROC SQL;
            CREATE TABLE born&year as
            SELECT *
            FROM STAT408.player
            WHERE birth_year = &year;
        RUN;
    %END;
%MEND yearborn;

%yearborn(year = 1995, create= 'YES');
Out[16]:
SAS Output

Number of MLB players born in 1995

 
2
In [17]:
%yearborn(year = 1994, create = 'NO');

PROC PRINT DATA=born1994;
RUN;
Out[17]:

779  ods listing close;ods html5 file=stdout options(bitmap_mode='inline') device=png; ods graphics on / outputfmt=png;
NOTE: Writing HTML5 Body file: STDOUT
780
781 %yearborn(year = 1994, create = 'NO');
NOTE: Libref STAT408 was successfully assigned as follows:
Engine: V9
Physical Name: /folders/myfolders
NOTE: PROCEDURE SQL used (Total process time):
real time 1.70 seconds
cpu time 0.06 seconds

NOTE: PROC SQL statements are executed immediately; The RUN statement has no effect.
782
NOTE: PROCEDURE SQL used (Total process time):
real time 0.02 seconds
cpu time 0.02 seconds

783 PROC PRINT DATA=born1994;
ERROR: File WORK.BORN1994.DATA does not exist.
784 RUN;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds

785 ods html5 close;ods listing;

786
SAS Output

Number of MLB players born in 1994

 
8
In [18]:
PROC PRINT DATA=born1995;
RUN;
Out[18]:
SAS Output
Obs player_id birth_year birth_month birth_day birth_country birth_state birth_city death_year death_month death_day death_country death_state death_city name_first name_last name_given weight height bats throws debut final_game retro_id bbref_id
1 mondera02 1995 7 27 USA CA Los Angeles . . .       Raul Mondesi Raul Adalberto 185 73 B R . . mondr003 mondera02
2 osunaro01 1995 2 7 Mexico Sinaloa Juan Jose Rios . . .       Roberto Osuna Roberto 230 74 R R 2015-04-08 2015-10-03 osunr001 osunaro01

A brief overview of SAS PROC IML

IML stands for Interactive Matrix Language and provides some matrix algebra capacity. http://www.yorku.ca/pek/index_files/quickstart/IMLQuickStart.pdf

We are not going to cover it in detail, I only want you to know that it does exist.

In [19]:
proc iml;
a = I(3); * 3x3 identity matrix;
b = j(4,4,1); *4x4 matrix of 1's;
c=diag({1 2 4});
d=({1 2, 3 4});
print a; print b; print c; print d;
quit;
Out[19]:
SAS Output
a
1 0 0
0 1 0
0 0 1
b
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
c
1 0 0
0 2 0
0 0 4
d
1 2
3 4
In [20]:
proc iml;
X ={1 2, 3 4};
Y ={-4 3,-2 -1};
XY = X*Y; *matrix multiplication;
tX = t(X); *transpose using function;
DX = det(X); *matrix determinant;
X_inv = inv(X); *matrix inverse;
X_eigval = eigval(X); *eigen values of matrix;
X_eigvec = eigvec(X); *eigen vector of matrix;
print X; print Y; print XY; print tX; print DX; print X_inv; print X_eigval; print X_eigvec; quit;
Out[20]:
SAS Output
X
1 2
3 4
Y
-4 3
-2 -1
XY
-8 1
-20 5
tX
1 3
2 4
DX
-2
X_inv
-2 1
1.5 -0.5
X_eigval
5.3722813 0
-0.372281 0
X_eigvec
-0.415974 -0.824565
-0.909377 0.5657675
In [ ]: