Turn in one copy for each group. Group members who are not present in class must complete their own lab to receive credit. Please turn in a file that contains your PROC SQL statements for Q2 along with the results. For the R-related questions, turn in both the R Markdown file and its HTML output. This is due Tuesday, December 5th at 10:30 AM.

Q1. (40 points)

Visit the website http://www.montana.edu/marketing/about-msu/. Your goal is to write code that extracts the table containing the Top 10 Student Home States. In this case you could likely enter the table by hand more quickly, but if the table were much larger, scraping would be more efficient.
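
The general idea behind table scraping can be sketched as follows. This is only an illustration in Python using the standard library, not the R solution the lab asks for. The HTML string, the class name TableExtractor, and the helper extract_tables are all hypothetical; the markup of the actual page may differ, and the sketch does not handle nested tables.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in a document as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table>
        self._row = None   # row currently being read, or None
        self._cell = None  # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html):
    """Return all tables found in an HTML string."""
    parser = TableExtractor()
    parser.feed(html)
    return parser.tables


# Made-up stand-in for a downloaded page; the values are placeholders.
SAMPLE_HTML = """
<table>
  <tr><th>State</th><th>Students</th></tr>
  <tr><td>State A</td><td>100</td></tr>
  <tr><td>State B</td><td>50</td></tr>
</table>
"""

tables = extract_tables(SAMPLE_HTML)
# Pick the table whose header row mentions "State".
target = next(t for t in tables if t and "State" in t[0])
for row in target:
    print(row)
```

On a real page, the same two steps apply: download the HTML, then pick out the one table of interest by its header or position.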

Q2. (60 points)

For this question, a subset of the tables contained in the History of Baseball database is available in SAS. Additional details are available here: https://www.kaggle.com/seanlahman/the-history-of-baseball. The following tables have been added to the course folder as SAS data sets:

a. PROC SQL in SAS

Select players born in the State of Montana and compute:

  • the total number of players
  • the total number of home runs hit by these players. (This will require joining the batting and player tables.)
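
The shape of the two queries can be sketched on a tiny made-up data set. This is a hedged illustration using Python's built-in sqlite3 module rather than PROC SQL. The column names player_id, birth_state, and hr, and all of the rows, are assumptions made for the example and should be checked against the actual tables.

```python
import sqlite3

# In-memory toy database; table contents are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE player  (player_id TEXT PRIMARY KEY, birth_state TEXT);
CREATE TABLE batting (player_id TEXT, year INTEGER, hr INTEGER);
INSERT INTO player  VALUES ('p1', 'MT'), ('p2', 'MT'), ('p3', 'CA');
INSERT INTO batting VALUES ('p1', 2000, 5), ('p1', 2001, 7), ('p3', 2000, 30);
""")

# Count players from the player table alone, so that a player with
# several batting rows (or none) is still counted exactly once.
n_players = conn.execute(
    "SELECT COUNT(*) FROM player WHERE birth_state = 'MT'"
).fetchone()[0]

# Join batting to player to sum home runs over all seasons.
total_hr = conn.execute("""
    SELECT COALESCE(SUM(b.hr), 0)
    FROM player AS p
    JOIN batting AS b ON b.player_id = p.player_id
    WHERE p.birth_state = 'MT'
""").fetchone()[0]

print(n_players, total_hr)
conn.close()
```

Counting on the joined table instead would count batting rows rather than players, which is why the sketch keeps the two computations separate.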

Q3. Optional (20 points extra credit)

From the website https://www.kaggle.com/seanlahman/the-history-of-baseball, download the SQLite database. Then implement SQL code in R to compute the same quantities as Q2a.