Math 3215, Prob and Stats, Summer 2009
Basically, we will have two more tests this semester -- the second
midterm and then the final. The two midterms will each count 30 percent
of the grade, and the final will count 40 percent.
Course Diary.
-
So far, I have lectured on moment generating functions
in section 2.5 of the text, and have said a few things about Poisson
random variables. It is good for you to attend class because not everything
I have talked about is in your textbook!
-
I would like to set a time for an office hour each week.
During tomorrow's lecture, I will take a poll to decide when to hold
it -- right now I am thinking Tuesday afternoon would be a good time.
-
Today, June 23, we finished talking about Poisson random variables.
I derived the moment generating function for a Poisson r.v., and then used
it to prove that if X and Y are independent and Poisson with parameters
lambda_1 and lambda_2, then their sum is also Poisson, but with parameter
lambda_1 + lambda_2. I then used this to analyze a few basic problems
concerning radioactive decay, as well as a problem about restaurant
drive-thru queueing. I moved on to talking about continuous random variables,
defining the probability density function (pdf for short) and the
cumulative distribution function (cdf for short). I then gave some examples
of continuous random variables. I finished the lecture with a puzzle.
Be sure to read and work through all the problems in the text regarding the
Poisson distribution.
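For reference, here is the mgf computation in brief (a sketch of what we
did on the board): if X is Poisson with parameter lambda, then

    M_X(t) = E(e^{tX}) = sum_{k>=0} e^{tk} e^{-lambda} lambda^k / k!
           = e^{-lambda} sum_{k>=0} (lambda e^t)^k / k!
           = exp(lambda (e^t - 1)).

For independent X and Y, M_{X+Y}(t) = M_X(t) M_Y(t)
= exp((lambda_1 + lambda_2)(e^t - 1)), which is the mgf of a Poisson with
parameter lambda_1 + lambda_2; since mgfs determine distributions, X + Y
is itself Poisson with parameter lambda_1 + lambda_2.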
-
-
Today, June 25, I talked about the solution to the ``two slips
of paper'' puzzle from last time. Then I defined percentile. Then
I talked about how to generate samples from an arbitrary distribution,
which is useful in Monte Carlo simulation (and this is not covered in your
book). Then I talked about how the distribution of the gaps between
Poisson events is exponentially distributed. Next, I talked about how
to derive the classical ``half-life formulas'' of a radioactive sample
from scratch, using (a) Bernoulli ``life functions'', (b) time independence,
and (c) the Law of Large Numbers. I finished the lecture with some
discussion of the standard normal random variable (checking that its
pdf really is a pdf), how it relates to the gamma function (the value of
Gamma(1/2)), and some more basic properties of the gamma function
(like Gamma(1) = 1 and x Gamma(x) = Gamma(x+1)).
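If you want to experiment with the sampling idea on a computer, here is a
minimal Python sketch of the inverse-cdf method (the code and names are mine,
not from the book): to sample from a distribution with cdf F, draw U uniform
on [0,1] and output F^{-1}(U). For the exponential with parameter lambda,
F(x) = 1 - e^{-lambda x}, so F^{-1}(u) = -ln(1 - u)/lambda.

    import math
    import random

    def sample_exponential(lam):
        # Inverse-cdf method: if U is uniform on [0,1], then F^{-1}(U) has
        # cdf F. For the exponential, F(x) = 1 - e^{-lam*x}, and inverting
        # gives F^{-1}(u) = -ln(1 - u)/lam.
        u = random.random()
        return -math.log(1.0 - u) / lam

    # Sanity check: an exponential with parameter lam has mean 1/lam.
    lam = 2.0
    samples = [sample_exponential(lam) for _ in range(100000)]
    print(sum(samples) / len(samples))  # should be close to 1/lam = 0.5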
-
Go here for an interesting stats-related
video.
-
Today, June 30, I handed out the solutions to some homework
problems. I then talked a bit more about the exponential random variable,
and worked out its moment generating function. I talked about how the
Gamma distribution is a type of continuous analogue of the negative
binomial distribution -- it gives one a way to compute probabilities like
the following: if events occur according to a Poisson process with rate
lambda, then the waiting time until the rth event has a Gamma distribution,
so one can compute the probability that the rth event occurs by time
delta. I then worked out the moment generating function
for the Gamma distribution. Then I defined the chi-squared random variable
with n degrees of freedom in terms of a sum of squares of n independent
standard normal random variables. Using this I computed the moment
generating function for a chi-squared random variable. Lastly, I briefly
mentioned some applications of chi-squared random variables to ``goodness
of fit of statistical models'' and the ``error in measurement in a RADAR
detector''. Be sure to do all the homeworks at the end of
the relevant chapters!
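For reference, here are the mgfs in the form we derived them (I am using the
parametrization in which the exponential has mean 1/lambda; your book may
write things in terms of theta = 1/lambda instead):

    Exponential(lambda):   M(t) = lambda/(lambda - t),         for t < lambda
    Gamma(alpha, lambda):  M(t) = (lambda/(lambda - t))^alpha, for t < lambda
    Chi-squared, n d.f.:   M(t) = (1 - 2t)^{-n/2},             for t < 1/2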
-
-
Today, July 2, we covered the chi-squared random variable in more
detail, and I worked through the example of how to use it for ``goodness
of fit'' of a model, and then used it in population sampling (though I didn't
go through in detail how to derive the ``chi-squared'' test statistic).
Here is an old note I wrote on these examples. I also mentioned that
if a chi-squared random variable has 2 degrees of freedom, then it has
an exponential distribution. First, I showed this via the formula relating
the chi-squared to the gamma distribution; then later in the lecture I did
it using integration over 2D random variables. I defined the joint pdf
for a 2D random variable, I defined the marginal pdfs, and I defined
independent random variables. I went through a few examples, and presented
a certain puzzle on 2D random variables with positive pdf on a triangle.
Then, I used some integration to show directly that a chi-squared random
variable with 2 degrees of freedom has an exponential distribution
(involves a 2D integral and a change to polar coordinates). I explained how
to find the distribution of a function U of a 2D random variable X,Y --
U itself is 1D, while the variables it depends upon form a 2D r.v.
Lastly, I talked about the expectation and variance of a function of a 2D r.v.
Be sure to do the relevant homework in the book!
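If you want to check numerically that a chi-squared r.v. with 2 degrees of
freedom is exponential, here is a minimal Python sketch (my own code, not from
the book): the claim is that Z1^2 + Z2^2 matches an exponential with parameter
1/2, which has mean 2 and cdf 1 - e^{-x/2}.

    import random

    # Simulate Z1^2 + Z2^2 for independent standard normals Z1, Z2, and
    # compare against the exponential distribution with parameter 1/2.
    n = 100000
    vals = [random.gauss(0, 1)**2 + random.gauss(0, 1)**2 for _ in range(n)]
    print(sum(vals) / n)                       # mean: should be near 2
    print(sum(1 for v in vals if v <= 2) / n)  # P(X <= 2): near 1 - e^{-1}, about 0.632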
-
On July 9, I started the lecture with conditional probability
density functions, conditional expectation, and conditional variance.
I then talked about, and proved, the tower property of conditional
expectation. Next, I talked about the independence of two or more random
variables, and showed that if X1, ..., Xk are independent random variables,
and u1, ..., uk are any functions, then the expectation
E(u1(X1) ... uk(Xk)) is the product E(u1(X1)) ... E(uk(Xk)) -- that is,
the expectation of a product is the product of the expectations, WHENEVER
THE RANDOM VARIABLES ARE INDEPENDENT. I used this to show that the
moment generating function for a sum of independent random variables
X1 + ... + Xk is the product of moment generating functions of the
individual variables. I mentioned in passing that this will be useful
when we go to try to prove the ``Central Limit Theorem'', one of the most
important theorems in probability theory (and mathematics, for that matter).
After that I showed that if X1, ..., Xk are independent, then
V(X1 + ... + Xk) = V(X1) + ... + V(Xk); that is, the variance of the
sum of INDEPENDENT r.v. is the sum of variances. Finally, I used this
to prove the ``Weak Law of Large Numbers''.
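In case you missed it, the essence of that last argument fits in two lines:
with X^bar = (X_1 + ... + X_n)/n, the variance rule above gives
V(X^bar) = sigma^2/n, and then Chebyshev's inequality gives

    P(|X^bar - mu| >= eps) <= V(X^bar)/eps^2 = sigma^2/(n eps^2) --> 0

as n --> infinity, for any fixed eps > 0, which is the Weak Law.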
-
-
Click here for a practice
exam. Your exam next Tuesday will be similar, except that
it will have a definition question in place of one of the computations.
-
Here is some info about the exam: first, I want to hold the exam
for only 1 hour -- from 10:05-11:05 -- and it will have only 5 questions
(in multiple parts), each worth 20 points.
You may bring a simple calculator
to the exam, no programmable ones; I will supply any tables that you will
need for the exam. There will be
a ``definition question'', where you will be asked to define 5 terms;
there will be two relatively straightforward computation questions --
one will ask you to compute a constant that makes a certain 2D function
into a pdf; and one will ask you to compute some probabilities associated
with Poisson, chi-squared, or Gamma random variables, and/or marginal pdfs,
conditional pdfs, or conditional expectations.
One of your questions will be related to moment generating functions --
perhaps you will be asked to compute the mgf for some r.v., and then
use it to compute, say, the 6th moment of the r.v. Finally, one of your
questions will ask you to give a proof of a standard fact from class;
for example, you might be asked to show that the pdf for the standard
normal r.v. is indeed a pdf (recall that this required converting the
integral into a 2D one, and then using polar coordinates), or you may be
asked to prove something like that the expectation of a Poisson with
parameter lambda is, in fact, lambda -- if you look back through your
notes you will see there are lots of basic proofs I did in class.
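To jog your memory on the moment computation: E(X^k) is the kth derivative of
the mgf at t = 0, or equivalently k! times the coefficient of t^k in the mgf's
power series. For example, for the standard normal,
M(t) = exp(t^2/2) = sum_{k>=0} t^{2k}/(2^k k!), so the coefficient of t^6 is
1/(2^3 3!) = 1/48, and hence E(X^6) = 6!/48 = 15.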
-
For a copy of the exam, click
here. NOTE: I WILL NOT HOLD MY USUAL 2:00 OFFICE HOURS TODAY, JULY 14.
-
Today, July 16, 2009, we first talked about some applications of
the Law of Large Numbers to ``noise cancellation'' and to the analysis of
investment portfolios. Click here
and here for some notes on these
(the second note has some more advanced material on ``maximum likelihood
estimators''). We then talked about the normal distribution (also known
as the Gaussian distribution), and introduced the notation N(mu, sigma^2)
for the normal distribution with mean mu and variance sigma^2. We
worked out its pdf using the pdf for the standard normal N(0,1). We
introduced the notation Phi(x) for the cdf of the standard normal, and
talked about a few of its properties, such as Phi(x) + Phi(-x) = 1.
I mentioned the language ``within 1 standard deviation'', and explained
what it meant. Next, we computed the moment generating function for an
N(0,1) r.v. -- it turned out to be exp(t^2/2). I mentioned, though I didn't
prove, that using moment generating functions in an obvious way, one can
show that if X_1, ..., X_n are independent r.v.s with X_i ~ N(mu_i, sigma^2_i),
then the sum X_1 + ... + X_n has a normal distribution (which therefore
means it has mean mu_1 + ... + mu_n and variance sigma^2_1 + ... + sigma^2_n).
This is useful in, for example, the theory of ``noise cancellation'', as I
discussed. Lastly, I introduced the Central Limit Theorem, ONE OF THE MOST
IMPORTANT THEOREMS IN ALL OF MATHEMATICS. IT IS DIFFICULT TO OVERSTATE JUST
HOW IMPORTANT A THEOREM IT IS. I finished the lecture by providing a rough
outline of its proof (there are lots of details I left out). I will not
expect you to know how to prove CLT for the final, but I will hold you
accountable for knowing how to manipulate moment generating functions to
find the distribution of sums of r.v.'s (in fact, count on having one problem
about this on your final exam). Be sure to work all the relevant HWs from
your textbook in chapter 5.
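The ``obvious way'' with mgfs, in case you want to work it out yourself: the
mgf of N(mu, sigma^2) is exp(mu t + sigma^2 t^2/2), so for independent
X_1, ..., X_n the mgf of the sum is the product

    exp(mu_1 t + sigma^2_1 t^2/2) ... exp(mu_n t + sigma^2_n t^2/2)
      = exp((mu_1 + ... + mu_n) t + (sigma^2_1 + ... + sigma^2_n) t^2/2),

which is exactly the mgf of N(mu_1 + ... + mu_n, sigma^2_1 + ... + sigma^2_n).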
-
During the July 21, 2009 lecture I first went over the problems
from the exam (actually, I didn't discuss problem 1 as it can be looked up
in the book). I then started talking about how one can use the Central
Limit Theorem to approximate the cdf of a binomial r.v. using the normal
distribution. I mentioned that one doesn't actually need CLT to do this --
through a messy application of Stirling's formula one can prove it
directly, but the CLT approach is much more conceptual. Next, I talked
about how one can use this approximation of binomial r.v.'s to do some
hypothesis testing. I gave an example of how to test a newspaper's
claim that ``20% of adult Georgians smoke'' by polling 1000 randomly
selected people. Of course, we have seen a statistical test for this
before, using the chi-squared distribution, in something I called a
``chi-squared test''. Then, I asked what the probability is that the
test incorrectly rejects the claim ``20% of Georgians smoke''; that is,
given that 20% of Georgians smoke, what is the prob. that
the test erroneously concludes that ``the claim should be rejected''?
I mentioned that such an error is called a Type I error, and that we would
talk about it further during the next lecture. Just before the 5 min.
break, I talked about how CLT can be used to estimate parameters, and I
introduced the concept of a ``confidence interval''. I then talked about
the following standard ``pedagogical example'': suppose that X is a r.v.
having known variance sigma^2, but unknown mean mu. And suppose we have
access to, say, 100 independent samples of values of X, call them
X_1, ..., X_100. Treat these samples as i.i.d. random variables, and
then to estimate the parameter mu, simply average them
X^bar = (X_1 + ... + X_100)/100. Using CLT, one can find delta such that
the probability that mu is in [X^bar - delta, X^bar + delta] is 95%.
This interval [X^bar - delta, X^bar + delta] is called a
``95% confidence interval for mu''. Of course, we have assumed here
that our sample size (i.e. 100) is big
enough for CLT to closely approximate X^bar with a normal distribution,
and we have assumed that sigma^2 is known. This is why it is only a
``pedagogical example'' of confidence intervals.
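To make the computation of delta concrete, here is a minimal Python sketch
(the function name and the data are made up for illustration): with known
sigma and n samples, delta = 1.96 sigma/sqrt(n), since 1.96 is the 97.5th
percentile of the standard normal.

    import math

    def confidence_interval_95(samples, sigma):
        # 95% CI for mu when sigma is KNOWN and n is large enough for the
        # CLT: [xbar - delta, xbar + delta], delta = 1.96 * sigma / sqrt(n).
        n = len(samples)
        xbar = sum(samples) / n
        delta = 1.96 * sigma / math.sqrt(n)
        return (xbar - delta, xbar + delta)

    # Made-up data, with sigma assumed known to be 0.5:
    data = [9.8, 10.1, 10.4, 9.7, 10.0, 10.3]
    print(confidence_interval_95(data, sigma=0.5))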
-
Click here for a copy
of an alternate version of midterm 2 that I gave to some people who
missed the original in-class exam (i.e. it is a ``makeup'' exam).
-
-
In the July 23 lecture, I started by discussing the student-t
distribution, and how it is used to obtain a confidence interval for
the mean of a r.v. X, which we assume is normal with mean mu and variance
sigma^2 (not a bad assumption as normal r.v.'s are all-pervasive, due to
Central Limit-type phenomena). Then, I discussed how the chi-squared
distribution can be used to obtain a confidence interval for sigma^2.
Next, I talked about the difference between biased and unbiased estimators,
and showed that the sample mean (of a group of i.i.d. r.v.'s) is unbiased,
as is the sample variance sum_{i=1}^k (X_i - X^bar)^2/(k-1) -- note that
we divide by k-1 (if we divided by k, the sample variance would be
biased!). After the break I discussed hypothesis testing, and talked about
the concepts of the null hypothesis H_0, the alternate hypothesis H_a,
Type I error and how to compute its probability of occurrence, and Type II
error. I explained how H_0 and H_a can be expressed in terms of the
test statistic z -- say H_0 becomes ``z = z_0''. In reality, one usually
expresses H_0 in terms of something called the ``p-value'', which is computed
from the z-value, but since I didn't have time to cover it, we will stick
with ``z=z_0''.
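Here is a small Python experiment (my own code, not from the book) showing the
bias issue empirically: dividing by k-1 gives an unbiased estimate of the
variance, while dividing by k systematically underestimates it.

    import random

    # Draw many size-k samples from N(0,1) (true variance 1), and average
    # the two candidate variance estimates over all of the samples.
    k, trials = 5, 200000
    avg_unbiased = avg_biased = 0.0
    for _ in range(trials):
        xs = [random.gauss(0, 1) for _ in range(k)]
        xbar = sum(xs) / k
        ss = sum((x - xbar)**2 for x in xs)
        avg_unbiased += ss / (k - 1) / trials
        avg_biased += ss / k / trials
    print(avg_unbiased)  # should be near 1
    print(avg_biased)    # should be near (k-1)/k = 0.8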
-
Here is some info about your exam: the exam will have 10 questions:
1 will be a definition; 3 will come from the material that Christian
Houdre covered; 3 will come from the material I covered up to the second
exam; and 3 will come from the material I covered since the second exam.
1 of your questions will be ``prove that int_{-infty}^infty e^{-x^2/2} dx =
sqrt(2 pi),'' as I think it is important enough that you should know it.
1 question will be about hypothesis testing -- perhaps I will ask you
to compute the probability of making a Type I error, using table lookups.
1 problem will be about computing a 95% confidence interval, perhaps using
the student-t tables or chi^2 tables (to compute the CI for sigma^2) or
maybe just using a standard normal approximation. 1 of your problems will
ask you to find a constant c that makes a certain function f(x,y) into
a probability density function (just like on your second midterm).
1 of your problems will involve Bayes's Theorem -- you will be asked to
compute some conditional probability. I will post a study sheet later
today.
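Since that Gaussian-integral proof is promised above, here is its skeleton
for review: let I = int_{-infty}^infty e^{-x^2/2} dx. Then

    I^2 = int int e^{-(x^2 + y^2)/2} dx dy
        = int_0^{2 pi} int_0^infty e^{-r^2/2} r dr dtheta   (polar coordinates)
        = 2 pi,

so I = sqrt(2 pi), since I > 0.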
-
I have written a study sheet for the final exam. Click
here to see it.
-
Here is an old exam I gave in
a previous class (Prob and Stats Math 3770) that covers roughly the
material that Houdre taught in the first half of our course (well, maybe
except for the stats problem, which is problem 5). And
here is a copy of the solutions
sheet to that exam.
-
And here is a copy of an
old final exam from that same class Math 3770. Your final will be slightly
easier, and will have more focus on random variables than that exam.
-
You are encouraged to bring a calculator to the exam; however,
only simple ones will be allowed. The calculator must not have programmable
features.
-
Click here for a copy
of the solutions sheet to the ``alternate midterm 2'' listed above.
-
70 was the median score on your midterm 2, so I will add 5 points
to everyone's score on that exam, to raise the median to 75.
-
For a copy of the final exam, click
here.