\documentclass[11pt,titlepage]{article}
\usepackage{amsmath}
\usepackage{graphicx}
\usepackage{verbatim}
\allowdisplaybreaks

\jot=.2in \pagestyle{plain}
\setlength{\topmargin}{-0.5in}
\setlength{\textheight}{9.5in}
\setlength{\oddsidemargin}{-0.2in}
\setlength{\evensidemargin}{0in}
\setlength{\textwidth}{6.7in}
\font\heada=cmbx10 scaled\magstep3
\font\headb=cmsl10 scaled\magstep1
\font\headc=cmr8
\pretolerance=10000
\setlength{\parindent}{2 em}
%\input macros

\newdimen\digitwidth
\newdimen\minuswidth
\setbox0=\hbox{\rm0}
\digitwidth=\wd0
\setbox1=\hbox{$-$}
\minuswidth=\wd1
\newdimen\starr
\setbox2=\hbox{${}^*$}
\starr=\wd2

{\catcode`?=\active
\def?{\kern\digitwidth}
\catcode`@=\active
\def@{\kern\minuswidth}
\catcode`|=\active
\def|{\kern\starr}}

\begin{document}

\noindent{\heada Project 4 Solutions}\\


\vspace{.1in}

This project is worth 25 points.


\begin{enumerate}
\item  \label{beautiful} From the November 2006 {\em Discover} article ``Do
Beautiful Parents have more daughters"

\begin{enumerate}
\item (1 pt) This is an observational study since no treatment was applied to
the individuals.

\item (1 pt) The individuals are the 18- to 28-year-old parents who were in the
National Longitudinal Study of Adolescent Health database.

\item (1 pt) Despite claiming that the study is ``nationally representative," the individuals in the
National Longitudinal Study of Adolescent Health database are not a
random sample from the US population.  First, all parents in the
study are between 18 and 28 years old, so a selection bias exists
against older parents.  Furthermore, since these parents were chosen
when they were children in school themselves, parents who were
either home-schooled or drop-outs could not have been
selected.  Lastly, since people can opt to take part in a study
or not, and can opt to fill out surveys or not, non-response bias is
a potential problem.

\item (2 pts) The explanatory variable is the ``desirability" of a parent, categorized on a scale of 1 (very unattractive) to 5 (very attractive).
The response variable is also categorical, whether or not a parent's
first child was a daughter.


\item (1 pt) The sample size of the parents in this study is $n=2972$.

\item (1 pt) Evolutionary psychologist Satoshi Kanazawa's comment that
``by marrying a beautiful spouse you are slightly increasing the
chance that you'll have a daughter" is speculative, and insinuates
cause and effect.  Cause-and-effect is not a valid conclusion from
an observational study.

\item (2 pts) Since this study is not based on an SRS of parents in the US, its conclusions cannot be generalized to the US population.
Since this is an observational study, cause-and-effect
conclusions are not appropriate either.


\item (1 pt) These are the responses given by the class:
Confounding variables could be environmental: If the beautiful
majority are from a particular geographic location, then
\underline{water quality} and/or \underline{pollution} could
increase the probability of daughters.  A parent's beauty could be a
consequence of \underline{wealth}, which implies
\underline{better nourishment}, which may increase the likelihood of
daughters.  Perhaps beautiful people have \underline{sex more
frequently}, which might favor daughters. If beautiful people have
more sex, they may try different \underline{sexual positions} and/or
have sex at \underline{times} (with respect to ovulation) which
favor daughters. The best possible confounding variable suggested is
that parents of sons may be more \underline{stressed and fatigued}
on average, and so are less beautiful than parents of daughters.


\end{enumerate}

\item Repeating the ``Beautiful Parent" study on a sample from the US
population: from the beautiful parents in the US, we take a SRS of
size 4; from the ``non-beautiful" parents in the US, we take a SRS
of size 4. Suppose that:

\begin{center}
\begin{tabular}{rcl}
  $P(\text{daughter first} \mid \text{beautiful parent})$ &$=$& $.56$\\
  $P(\text{daughter first} \mid \text{non-beautiful parent})$ &$=$& $.48$\\
\end{tabular}
\end{center}

\begin{enumerate}
\item   (1 pt) The sampling design is a stratified random sample.

\item (1 pt) The probability that one beautiful parent does not have a daughter as a first child
is $1 - .56 = .44$.  By independence, the probability that all 4
beautiful parents do not have daughters first is $.44^4\approx
.037$.

\item (2 pts) The probability that one non-beautiful parent does not have a
daughter first is $1 - .48 = .52$.  By independence, the probability
that all four non-beautiful parents do not have a daughter first is
$.52^4 \approx .073$.   Thus, the probability that at least one of
the 4 non-beautiful parents has a daughter first is $1 - .52^4 \approx 1 -
.073 = .927$.

\item (2 pts) The probability that at least one of the 4 beautiful parents has a daughter first is $1 - .44^4 \approx .963$.  The probability that
at least one of the 4 non-beautiful parents does not have a daughter first is
$1 - .48^4 \approx .947$, since $.48^4$ is the probability that all 4
of them have a daughter first.   Thus, by independence of the two samples, the probability that at least one
of the 4 beautiful parents has a daughter as their first
child AND that at least one of the 4 non-beautiful parents does not
have a daughter as their first child is $(1-.44^4)(1-.48^4) \approx
.911$.


\end{enumerate}
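
As a check on parts (b)--(d), the complement rule can be written out in one place. This is only a sketch: the symbols $D_B$ and $D_N$ (the number of first-born daughters among the 4 sampled beautiful parents and the 4 sampled non-beautiful parents) are notation introduced here, and the calculation assumes that the parents are independent of one another, both within and across the two samples.
\begin{align*}
P(D_B = 0) &= (1-.56)^4 = .44^4 \approx .0375\\
P(D_N \ge 1) &= 1 - P(D_N = 0) = 1 - .52^4 \approx 1 - .0731 = .9269\\
P(D_B \ge 1) &= 1 - .44^4 \approx .9625\\
P(D_N \le 3) &= 1 - P(D_N = 4) = 1 - .48^4 \approx 1 - .0531 = .9469\\
P(D_B \ge 1 \text{ and } D_N \le 3) &= P(D_B \ge 1)\,P(D_N \le 3) \approx .9625 \times .9469 \approx .9114
\end{align*}
Note that ``at least one does not have a daughter first'' is the complement of ``all 4 have a daughter first,'' which is why $.48^4$, and not $.52^4$, appears in the fourth line.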

\item \label{mosquito} (4 pts) Using the January 2006 {\em Discover} article
``Malaria Parasite Makes Humans Smell More Attractive to Mosquitoes"
\begin{enumerate}

\item Since the article states that ``Twice as many mosquitoes
gravitated toward kids carrying gametocytes as flew toward the other
two groups," $P($mosquito flies towards a kid with
gametocytes$)=\frac{2}{3}$.   Algebraically, if $x$ is the
number of mosquitos out of 100 that fly towards the other groups, then
solving $2x=100-x$ gives $x=\frac{100}{3}\approx 33$, so about 67 of the 100
mosquitos fly towards the kids carrying gametocytes.

\item \label{cond} When randomly selecting two of the 100 mosquitos
(without replacement), the probability that BOTH
mosquitos go to the tent with kids carrying gametocytes is
$\frac{67}{100}\left(\frac{66}{99}\right)\approx .4467$. This is
because $P($first mosquito flies to gametocyte
tent$)=\frac{67}{100}$ and $P($second mosquito flies to gametocyte
tent $\mid$ first mosquito flies to gametocyte tent$)=\frac{66}{99}$.


\item \label{ind} By independence, the probability that both of
the mosquitos fly towards the kids carrying gametocytes is
$\frac{2}{3}\left(\frac{2}{3}\right)=\frac{4}{9}=.\bar{4}$.

\item Since the sample size of 2 in problem \#\ref{cond} satisfies $2 <
.05(100)$, the conditional probability .4467 is very close to
$.\bar{4}$, the independent probability calculated in problem
\#\ref{ind}.


\end{enumerate}
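
To see how close the two answers are, both products can be reduced to exact fractions. This is a sketch under the same reading as above, namely that 67 of the 100 mosquitos fly to the gametocyte tent.
\begin{align*}
\frac{67}{100}\cdot\frac{66}{99} &= \frac{67}{100}\cdot\frac{2}{3} = \frac{67}{150} \approx .4467 && \text{(without replacement)}\\
\frac{2}{3}\cdot\frac{2}{3} &= \frac{4}{9} \approx .4444 && \text{(independence)}\\
\frac{67}{150}-\frac{4}{9} &= \frac{201-200}{450} = \frac{1}{450} \approx .0022
\end{align*}
The two calculations differ by roughly two thousandths, which suggests that treating the two draws as independent is a reasonable approximation here.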

\item Exercise 7.28 on page 316 of your textbook.


\begin{enumerate}

\item (1 pt) $P(X\le 14.8)=P(Z\le\frac{14.8-15}{.1}) \approx 0.02275$.   See the Appendix for
R-code.

\item (1 pt) $P(14.7< X < 15.1)= P(\frac{14.7-15}{.1}< Z < \frac{15.1-15}{.1})\approx 0.84$.   See the Appendix for
R-code.

\item (2 pts) The probability that one tank holds at most 15 gallons is
$P(X\le 15)=\frac{1}{2}$.   Assuming independence, the
probability that both of two tanks hold at most 15 gallons is
$\left(\frac{1}{2}\right)^2=\frac{1}{4}$.

\end{enumerate}
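
The standardizations in parts (a)--(c) can be laid out step by step. In this sketch, $\Phi$ denotes the standard normal cumulative distribution function (notation assumed here, not taken from the textbook), $X_1$ and $X_2$ are labels introduced for the two tanks, and the mean of 15 and standard deviation of .1 are read off from the $z$-scores used above.
\begin{align*}
P(X\le 14.8) &= \Phi\left(\frac{14.8-15}{.1}\right) = \Phi(-2) \approx .0228\\
P(14.7 < X < 15.1) &= \Phi\left(\frac{15.1-15}{.1}\right) - \Phi\left(\frac{14.7-15}{.1}\right) = \Phi(1) - \Phi(-3) \approx .8413 - .0013 = .8400\\
P(X_1 \le 15 \text{ and } X_2 \le 15) &= \Phi(0)^2 = \left(\frac{1}{2}\right)^2 = \frac{1}{4}
\end{align*}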




\item (2 pts) If the experiment is repeated many times, the sample proportion $p$ satisfies
    \[p ~\dot \sim ~N\left(\mu=\frac{2}{3},\sigma=\frac{\sqrt{2}}{30}\right)\]
when the data are from a large random sample.

\begin{enumerate}
\item The probability that the sample proportion of mosquitos that
fly towards the infected tent is over 70\% is
$P(p>.7)=P\left(Z>\frac{.7-\frac{2}{3}}{\frac{\sqrt{2}}{30}}\right)
\approx P(Z > 0.7071) \approx .2398$. See the Appendix for the R code.


\item The probability that the sample proportion of mosquitos that fly
towards the infected tent is exactly $.6$ is 0.  This is because, under the
normal approximation, $p$ is treated as a continuous variable, and the
probability that a continuous variable attains any single
value is always 0.

\end{enumerate}
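
The standard deviation quoted above follows from the usual large-sample formula for a sample proportion. The sketch below assumes a sample of $n=100$ mosquitos, which is the value consistent with $\sigma=\frac{\sqrt{2}}{30}$; the symbol $n$ is introduced here for illustration.
\begin{align*}
\sigma &= \sqrt{\frac{\mu(1-\mu)}{n}} = \sqrt{\frac{\frac{2}{3}\cdot\frac{1}{3}}{100}} = \sqrt{\frac{2}{900}} = \frac{\sqrt{2}}{30} \approx .0471\\
z &= \frac{.7-\frac{2}{3}}{\frac{\sqrt{2}}{30}} = \frac{\frac{1}{30}}{\frac{\sqrt{2}}{30}} = \frac{1}{\sqrt{2}} \approx .7071\\
P(p > .7) &\approx P(Z > .7071) \approx 1 - .7602 = .2398
\end{align*}
Under this assumption, $n\mu \approx 66.7$ and $n(1-\mu) \approx 33.3$ both exceed 10, a common rule of thumb for using the normal approximation.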




\end{enumerate}

\newpage
\noindent {\heada Appendix}

\verbatiminput{project4Rcode.txt}


\end{document}

