---
title: "STAT 491 - Lecture 4"
date: January 23, 2018
output: pdf_document
---

```{r setup, include=FALSE}
library(knitr)
knitr::opts_chunk$set(echo = TRUE)
set.seed(01162018)
```

# Ch.3 Probability - continued

### Normal probability density function

The most common distribution is the normal (or Gaussian) distribution, which is associated with the bell curve.

\vfill

```{r, out.width = '400px',fig.align='center', echo = F}
x <- seq(-3, 3, by = .1)
plot(x, dnorm(x), type='l', main = 'Standard Normal Density', ylab = 'p(x)')
```

\vfill

The density function of a normal distribution can be written as:

\vfill

\vfill

where:

- \vfill
- \vfill

Note that the density of the distribution, $p(x)$, can be greater than one; to see this, let $x = \mu$.

\vfill

\vfill

\newpage

### Mean and Variance of the distribution

- *QUESTION:* Consider the distribution generated by rolling a die. What is the *mean* of the distribution and how do we interpret it?

\vfill

- _answer_:

\vfill

- *QUESTION:* Show mathematically the result above about the mean of the die rolls:

\vfill

- _answer_:

\vfill

- Expectation with a continuous distribution: now assume that $x$ is a continuous quantity such as height. There are now uncountably many possible values for $x$, so how do we rewrite the expectation formula?

\vfill

\newpage

- Consider an occupancy model, where the goal is to determine whether a species inhabits a specific region. We will later see that this translates to a coin-flip model. As a starting point, you have no information to inform your prior belief on the probability that bats reside at a specific location, so you use a uniform model.

```{r, echo=F}
x <- seq(0, 1, by = .01)
plot(x, dunif(x), type = 'l', ylab = 'p(x)', main='Uniform Density')
```

1. Write the density of the uniform distribution:

\vfill

2. Compute the expected value of this distribution:

\vfill

- We previously talked about the standard deviation in the context of the normal distribution. How does that relate to _variance_?

\vfill

- The variance is computed as:

\vfill

- We will do little calculation of variance in this class; rather, we will typically use known properties of distributions.

\newpage

### Highest Density Interval (HDI)

- Rather than summarizing a distribution with the mean and variance, we might use intervals. To motivate this idea, consider the two distributions below and guess the mean and variance.

```{r,fig.align='center', echo=F, out.width='400px'}
x <- seq(-10,10, by = .01)
par(mfcol=c(1,2))
plot(x, dnorm(x, mean = 2, sd = sqrt(4) ), type='l', ylab='p(x)', ylim=c(0,.5))
shape.val <- 1; scale.val <- 2
plot(x, dgamma(x, shape = shape.val, scale = scale.val ), type='l', ylab='p(x)', ylim=c(0,.5))
par(mfcol=c(1,1))
```

- \vfill

- Instead, we will use an interval to describe them (although the distribution as a whole is quite telling in this setting).

- The Highest Density Interval (HDI),

\vfill

- Include a rough sketch of the 95% HDI on the figures below. Note that $p(0) = .5$ and $p(x) = 0$ for $x < 0$ in the figure on the right:

```{r,fig.align='center', echo=F, out.width='400px'}
x <- seq(-10,10, by = .01)
par(mfcol=c(1,2))
plot(x, dnorm(x, mean = 2, sd = sqrt(4) ), type='l', ylab='p(x)', ylim=c(0,.5))
#abline(v = qnorm(.025, mean = 2, sd = sqrt(4)), col='gray',lwd=2, lty=3)
#abline(v = qnorm(.975, mean = 2, sd = sqrt(4)), col='gray',lwd=2, lty=3)
shape.val <- 1; scale.val <- 2
plot(x, dgamma(x, shape = shape.val, scale = scale.val ), type='l', ylab='p(x)', ylim=c(0,.5))
#abline(v = qgamma(.95, shape = shape.val, scale = scale.val), col='gray',lwd=2, lty=3)
#abline(v = 0, col='gray',lwd=2, lty=3)
par(mfcol=c(1,1))
```

\newpage

## Two-Way Distributions

What we have seen thus far have been primarily univariate distributions.
Now consider multivariate distributions such as:

- \vfill

Or consider the table below (from the textbook), which contains the joint proportions of hair color and eye color:

|                       |       | Hair Color |      |        |                      |
|:---------------------:|-------|:----------:|------|:------:|----------------------|
| Eye Color             | Black | Brunette   | Red  | Blond  | Marginal (eye color) |
| Brown                 | 0.11  | 0.20       | 0.04 | 0.01   | 0.37                 |
| Blue                  | 0.03  | 0.14       | 0.03 | 0.16   | 0.36                 |
| Hazel                 | 0.03  | 0.09       | 0.02 | 0.02   | 0.16                 |
| Green                 | 0.01  | 0.05       | 0.02 | 0.03   | 0.11                 |
| Marginal (hair color) | 0.18  | 0.48       | 0.12 | 0.21   | 1.0                  |

Using the table, calculate the following quantities:

- $p(\text{h} = \text{blond})$ =

\vfill

- $p(\text{e} = \text{blue})$ =

\vfill

- $p(\text{h} = \text{blond}, \text{e} = \text{blue})$ =

\vfill

- $p(\text{h} = \text{blond}) =$ , this is known as a _marginal_ distribution, which is obtained by marginalizing, or integrating out, the other variable.

\vfill

- Formally, the _marginal_ distribution can be defined as:
    - \vfill
    - \vfill

\newpage

### Conditional Probability

With multivariate distributions, researchers are often interested in the distribution of one variable given a specific value of another variable. For instance, consider:

- \vfill
- \vfill

\vfill

This is known as a conditional probability and will be an essential element in Bayesian data analysis.

\vfill

- *QUESTION:* Using the hair and eye color table, answer the following questions:
    - What is the probability of a person having red hair:

    \vfill

    - What is the probability of a person having blue eyes:

    \vfill

    - What is the probability of a person having blue eyes and red hair:

    \vfill

    - What is the probability of a person having blue eyes given that they have red hair:

    \vfill

    - What is the probability of a person having red hair given that they have blue eyes:

    \vfill

Thus far the questions were answered with intuition. How can we mathematically formulate conditional probability for two events, $A$ and $B$?
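
The table lookups above can be checked numerically. The chunk below is a minimal sketch, not part of the original notes: the object names `joint`, `p_hair`, and `p_eye` are illustrative, the values are typed in from the table, and a conditional probability is taken to be the joint probability divided by the marginal probability of the conditioning event.

```{r}
# Joint probabilities p(eye, hair), typed in from the table above
# (rows: eye color, columns: hair color). Object names are illustrative.
joint <- matrix(
  c(0.11, 0.20, 0.04, 0.01,
    0.03, 0.14, 0.03, 0.16,
    0.03, 0.09, 0.02, 0.02,
    0.01, 0.05, 0.02, 0.03),
  nrow = 4, byrow = TRUE,
  dimnames = list(eye  = c("Brown", "Blue", "Hazel", "Green"),
                  hair = c("Black", "Brunette", "Red", "Blond"))
)

# Marginals as printed in the table. The table is rounded, so the cells do
# not always sum exactly to these values (the Red column sums to 0.11).
p_hair <- c(Black = 0.18, Brunette = 0.48, Red = 0.12, Blond = 0.21)
p_eye  <- c(Brown = 0.37, Blue = 0.36, Hazel = 0.16, Green = 0.11)

# Conditional probability = joint probability / marginal of the given event
p_blue_given_red <- joint["Blue", "Red"] / p_hair[["Red"]]  # p(e = blue | h = red)
p_red_given_blue <- joint["Blue", "Red"] / p_eye[["Blue"]]  # p(h = red | e = blue)

round(c(blue_given_red = p_blue_given_red,
        red_given_blue = p_red_given_blue), 3)
```

The two conditional probabilities share the same numerator but divide by different marginals, which is why they generally differ.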
\vfill

\vfill

- This can also be sketched out using a Venn diagram.

\vfill

\newpage

### Independence of Attributes

Recall the two events we stated previously:

- \vfill
- \vfill

Do you anticipate that these probabilities are the same?

\vfill

Mathematically, two events ($A$ and $B$) are _independent_ if:

\vfill

In other words, if knowing $B$ gives you no additional insight on the distribution of $A$, then $A$ and $B$ are independent.

\vfill

Two events are also independent if

\vfill

- Are having blue eyes and red hair independent events?

\vfill
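
The last question can be explored numerically. The chunk below is a minimal sketch, not part of the original notes: the variable names are illustrative, the three values are read from the hair and eye color table, and independence is checked in the two standard ways (joint probability versus product of marginals, and conditional versus marginal probability).

```{r}
# Values read from the hair and eye color table (names are illustrative).
p_blue         <- 0.36   # p(e = blue), marginal
p_red          <- 0.12   # p(h = red), marginal
p_blue_and_red <- 0.03   # p(e = blue, h = red), joint

# Under independence, the joint probability equals the product of the marginals.
p_product <- p_blue * p_red

# Equivalently, conditioning on red hair would leave p(e = blue) unchanged.
p_blue_given_red <- p_blue_and_red / p_red

c(joint = p_blue_and_red, product = p_product,
  conditional = p_blue_given_red, marginal = p_blue)
```

If the two events were independent, `joint` would match `product` and `conditional` would match `marginal`, up to the rounding in the table.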