# Thread: Statistics Marathon & Questions

1. ## Re: University Statistics Discussion Marathon

Me too. I've been thinking about it for a few days... I'm guessing it would be lower. I read somewhere that if you increase the sample size, the p-value decreases slightly.

2. ## Re: University Statistics Discussion Marathon

Originally Posted by davidgoes4wce
Me too. I've been thinking about it for a few days... I'm guessing it would be lower. I read somewhere that if you increase the sample size, the p-value decreases slightly.
If the mean or SD didn't change, that means the Z score probably didn't change, right?

If that doesn't change, the p-value doesn't change either.

3. ## Re: University Statistics Discussion Marathon

Originally Posted by nerdasdasd
If the mean or SD didn't change, that means the Z score probably didn't change, right?

If that doesn't change, the p-value doesn't change either.
OK, I'll take your word for it. I'll read up on it a bit further, as I have never actually thought about it in much detail.

4. ## Re: University Statistics Discussion Marathon

My stats knowledge is minimal, but wouldn't it decrease?

I assume that what's going on is that you have the hypothesis that the average sleeping hours of a student are normally distributed with mean 8 and variance V. We are doing a one-tailed test with our sample mean as the test statistic.

If our p-value is 0.1, that means the observed sample mean m of our initial sample of 20 students is less than 8. (So, under the hypothesis, the probability of a random sample of 20 students averaging less than m hours is only 0.1.)

But the distribution of sample means in samples of size n is given by N(8,V/n). So as n increases, the probability of the test statistic being smaller than m decreases, i.e. the p-value decreases.

Is there some other choice of hypothesis/test statistic here? Without specifying such a choice, the one I have made above seems natural to me.
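A minimal numerical sketch of this argument, in case it helps. It assumes a known standard deviation of 2 hours (so V = 4, a made-up value, not something given in the question) and picks m so that the p-value is exactly 0.1 at n = 20. The names `MU0`, `SIGMA` and `p_value` are just for illustration.

```python
from math import sqrt
from statistics import NormalDist

MU0 = 8.0    # hypothesised mean hours of sleep (from the post above)
SIGMA = 2.0  # ASSUMED population s.d., so V = 4 (illustrative only)

# Choose the observed mean m so that the one-tailed p-value is 0.1 at n = 20.
z_at_20 = NormalDist().inv_cdf(0.1)
m = MU0 + z_at_20 * SIGMA / sqrt(20)


def p_value(n):
    """P(sample mean <= m) when the sample mean is N(MU0, SIGMA**2 / n)."""
    return NormalDist(MU0, SIGMA / sqrt(n)).cdf(m)


for n in (20, 50, 100, 500):
    print(n, p_value(n))
```

With m held fixed, the printed p-values shrink as n grows, which is the behaviour argued for above.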

5. ## Re: University Statistics Discussion Marathon

Originally Posted by seanieg89
My stats knowledge is minimal, but wouldn't it decrease?

I assume that what's going on is that you have the hypothesis that the average sleeping hours of a student are normally distributed with mean 8 and variance V. We are doing a one-tailed test with our sample mean as the test statistic.

If our p-value is 0.1, that means the observed sample mean m of our initial sample of 20 students is less than 8. (So, under the hypothesis, the probability of a random sample of 20 students averaging less than m hours is only 0.1.)

But the distribution of sample means in samples of size n is given by N(8,V/n). So as n increases, the probability of the test statistic being smaller than m decreases, i.e. the p-value decreases.

Is there some other choice of hypothesis/test statistic here? Without specifying such a choice, the one I have made above seems natural to me.
These MCQ questions were given to UNSW business statistics students.

In this section of the quiz, the majority of the questions were based on the t-test.

$t=\frac{\bar{x}-\mu}{s \div \sqrt{n}}$

6. ## Re: University Statistics Discussion Marathon

The more I think about it, the more I think the P-value actually decreases.

I'll provide 2 cases where we claim that the students undersleep (let's say they average 6 hours). I've always liked using real numbers, so we assume that $\mu=8$ and $s=2$ do not change between the two cases.

Case 1:

$\bar{x}=6, \mu=8, s=2, n =20$

$t=\frac{\bar{x}-\mu}{\frac{s}{\sqrt{n}}}=\frac{6-8}{2 \div \sqrt{20}}=-4.472$

The $t_{stat}=-4.472$ at 19 degrees of freedom gives a P-value of 0.000131.

Case 2:

$\bar{x}=6, \mu=8, s=2, n=100$

$t=\frac{\bar{x}-\mu}{\frac{s}{\sqrt{n}}}=\frac{6-8}{2 \div \sqrt{100}}=-10.00$

The $t_{stat}=-10$ at 99 degrees of freedom gives a one-tailed P-value of roughly $5 \times 10^{-17}$.

So what I can interpret from this is it decreases.

7. ## Re: University Statistics Discussion Marathon

Also, I will admit I did not know how to calculate a P-value manually until last week (having studied high school stats and uni stats for around 6 years). If you guys are looking to get better at statistics with Excel, I highly recommend this book by Mark Berenson, David Levine and Kathryn Szabat: BUSS1020 Quantitative Business Analysis.

I personally feel they explain stats better than the Science, Advanced Science, Engineering or Psychology way of statistics.

8. ## Re: University Statistics Discussion Marathon

Sure, if you are using the t-test then use that formula instead, it will still decrease. You can compute the p-value in terms of n by using the t-distribution and show it decreases by using calculus or whatever else you like.

Just think about it intuitively, a sample of two students who on average undersleep by H hours is far less significant than a sample of 1,000,000 students who on average undersleep by the same amount. (With both samples having the same s.d.)

The latter is far greater evidence of a trend of undersleeping students, and any reasonable statistical test should reflect this.
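If anyone wants to check this numerically without a stats package, here is a rough sketch. It uses the figures floated earlier in the thread ($\bar{x}=6$, $\mu=8$, $s=2$) and gets the one-tailed p-value by integrating the t density with Simpson's rule. The helper names (`t_pdf`, `lower_tail_p`, `t_stat`) and the cut-off `width` are my own choices, not anything from the quiz.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def lower_tail_p(t, df, width=200.0, steps=40000):
    """Approximate P(T <= t) for t < 0 using Simpson's rule.

    By symmetry this is the integral of the density over [|t|, |t| + width];
    the mass beyond the cut-off is negligible for the df used here.
    """
    a, b = abs(t), abs(t) + width
    h = (b - a) / steps  # steps must be even for Simpson's rule
    total = t_pdf(a, df) + t_pdf(b, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(a + k * h, df)
    return total * h / 3


def t_stat(xbar, mu, s, n):
    """One-sample t statistic."""
    return (xbar - mu) / (s / math.sqrt(n))


for n in (20, 100):
    t = t_stat(6, 8, 2, n)
    print(n, round(t, 3), lower_tail_p(t, n - 1))
```

The statistic scales with $\sqrt{n}$ when the mean and s.d. are held fixed, so the tail probability drops very quickly.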

9. ## Re: University Statistics Discussion Marathon

Consider a sample of size $n=10$ where the sample mean is 8 and the sample variance is 13. Consider a second sample that is exactly the same as the first, except that there is an additional observation that takes on the value 8. What is the sample variance for this larger sample with $n=11$ observations? (Your answer should be correct to 1 decimal place.)

Ans : 11.7

10. ## Re: University Statistics Discussion Marathon

This is the theory behind the above question

It seems that in the 2nd line of working they did an expansion. I, for one, could not see the expansion of

$\sum(y_{i}-\bar{y})^2=\sum \ y_{i} ^2 -2 \bar{y} \sum \ y_i + n \bar{y}^2$

Would it be OK to write the summation like this?

$\sum(y_{i}-\bar{y})^2=\sum \ y_{i} ^2 -2 \sum \bar{y} \ y_i + n \bar{y}^2$

I got a bit lost with the 'n' chucked in the third term of the expansion as well.

11. ## University Statistics Discussion Marathon

Expand the square and note that the mean (and also its square) is a constant.

In the second summand there is a common factor which can be factorised out.

In the third summand note that you are summing the same constant value n times.
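Written out step by step, those three hints would go something like this (using $\sum y_i = n\bar{y}$, which is just the definition of the mean):

$\sum_{i=1}^n (y_i-\bar{y})^2=\sum_{i=1}^n y_i^2-2\bar{y}\sum_{i=1}^n y_i+\sum_{i=1}^n \bar{y}^2$

$=\sum_{i=1}^n y_i^2-2\bar{y}(n\bar{y})+n\bar{y}^2$

$=\sum_{i=1}^n y_i^2-n\bar{y}^2$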

12. ## Re: University Statistics Discussion Marathon

Originally Posted by davidgoes4wce
Consider a sample of size $n=10$ where the sample mean is 8 and the sample variance is 13. Consider a second sample that is exactly the same as the first, except that there is an additional observation that takes on the value 8. What is the sample variance for this larger sample with $n=11$ observations? (Your answer should be correct to 1 decimal place.)

Ans : 11.7

I managed to figure it out.

$s^2=\frac{1}{n-1} \sum ^{10} _{i=1} (y_{i}-\bar{y})^2=\frac{1}{n-1} (\sum_{i=1} ^{10} y_i ^2 -n \bar{y}^2)$

With $s^2=13, \bar{y}=8, n=10$, we can then substitute:

$13=\frac{1}{9} (\sum_{i=1} ^{10} y_i ^2 -10 (8)^2)$

$\sum_{i=1} ^{10} y_i ^2 =757$

With n=11 our new sum of squares is $\sum_{i=1} ^{11} y_i ^2 =757+8^2=821$, with $n=11, \bar{y}=8$

$s^2=\frac{1}{n-1} (\sum_{i=1} ^{11} y_i ^2 -n \bar{y}^2)$

$=\frac{1}{10} (821-11(8)^2)=11.7$
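The same working can be sketched in a few lines of Python as a sanity check. The function names are made up for this example; the only inputs are the numbers given in the question.

```python
def sum_of_squares(s2, mean, n):
    """Recover sum(y_i^2) from the sample variance, mean and sample size."""
    return (n - 1) * s2 + n * mean ** 2


def variance_after_adding(s2, mean, n, new_value):
    """Sample variance after appending one observation to the sample."""
    total = n * mean + new_value
    sum_sq = sum_of_squares(s2, mean, n) + new_value ** 2
    n_new = n + 1
    mean_new = total / n_new
    return (sum_sq - n_new * mean_new ** 2) / (n_new - 1)


print(round(variance_after_adding(13, 8, 10, 8), 1))  # prints 11.7
```

Because the new observation equals the old mean, the sum of squared deviations stays at 117 and only the divisor changes from 9 to 10.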

13. ## Re: University Statistics Discussion Marathon

Here is something more theoretical.

Suppose $X_j\sim \mathcal{N}(\mu,\sigma^2)$ for j=1,...,n are i.i.d random variables for some unknown parameters $\mu,\sigma^2$.

1. Define $\overline{X}:=\frac{1}{n}\sum_{j=1}^n X_j, s^2:=\frac{1}{n-1}\sum_{j=1}^n (X_j-\overline{X})^2.$

Compute $\mathbb{E}(\overline{X}),\mathbb{E}(s^2), \textrm{Var}(\overline{X}) , \textrm{Var}(s^2).$

2. Hence define in terms of $\overline{X},s,n$ a random variable that has expected value $\mu$ and variance $1$. Compute the pdf of this random variable in terms of special functions.

3. What happens as $n\rightarrow\infty$? Prove this.

(This question outlines some of the theory behind something used several times in this thread.)

14. ## Re: University Statistics Discussion Marathon

Can you guys stop dropping the big guns... you're making statistics look hard.

15. ## Re: University Statistics Discussion Marathon

Originally Posted by BlueGas
Can you guys stop dropping the big guns... you're making statistics look hard.
Huh? Not all of statistics is just plugging numbers into memorised formulae, where do you think these formulae come from?

As always, you have the option of ignoring any question not to your taste.

In any case, this particular question is easier than most of the mathematical questions posted in these forums.

16. ## Re: University Statistics Discussion Marathon

Originally Posted by seanieg89
Here is something more theoretical.

Suppose $X_j\sim \mathcal{N}(\mu,\sigma^2)$ for j=1,...,n are i.i.d random variables for some unknown parameters $\mu,\sigma^2$.

1. Define $\overline{X}:=\frac{1}{n}\sum_{j=1}^n X_j, s^2:=\frac{1}{n-1}\sum_{j=1}^n (X_j-\overline{X})^2.$

Compute $\mathbb{E}(\overline{X}),\mathbb{E}(s^2), \textrm{Var}(\overline{X}) , \textrm{Var}(s^2).$

2. Hence define in terms of $\overline{X},s,n$ a random variable that has expected value $\mu$ and variance $1$. Compute the pdf of this random variable in terms of special functions.

3. What happens as $n\rightarrow\infty$? Prove this.

(This question outlines some of the theory behind something used several times in this thread.)
I'll have a crack at Question 1.

Defining $\bar{X}$ and $s^2$:

If the n observations in a sample are denoted by $x_1, x_2, \ldots, x_n$, the sample mean is $\bar{x}=\frac{x_1+x_2+\cdots+x_n}{n}=\frac{\sum _{i=1}^n x_i}{n} \quad \textcircled{1}$

Computation of $s^2$. The computation of $s^2$ requires calculation of $\bar{x}$, n subtractions, and n squaring and adding operations. If the original observations or the deviations $x_i-\bar{x}$ are not integers, the deviations $x_i-\bar{x}$ may be tedious to work with, and several decimals may have to be carried to ensure numerical accuracy. A more efficient computational formula for the sample variance is obtained as follows:

$s^2=\frac{\sum_{i=1}^n (x_i-\bar{x})^2}{n-1} =\frac{\sum_{i=1}^n (x_{i}^2+\bar{x}^2-2\bar{x}x_{i})}{n-1}$

and since $\bar{x}=\frac{1}{n} \sum_{i=1}^n x_i$, the last equation reduces to

$s^2=\frac{\sum_{i=1}^n x_{i}^2-\frac{(\sum_{i=1}^n x_i)^2}{n}}{n-1}$

Taking the expectation of both sides in $\textcircled{1}$, with the observations now treated as random variables $X_i$:

$E[\bar{X}]=\frac{1}{n} \sum_{i=1}^nE[X_i] \quad \textcircled{2}$

Since $X_1,X_2,\ldots,X_n$ are independent and identically distributed (i.i.d.) RVs, we have $E[X_1]=E[X_2]=\cdots=E[X_n]=\mu_x$. Substituting these values into $\textcircled{2}$, we find that the sample mean variable satisfies

$E[\bar{X}]=\mu_x$

which asserts that $\bar{x}$ is an unbiased estimate of $\mu_x$. An unbiased estimate is one that is, on average, right on target.

$Var[\bar{X}]=E[(\bar{X}-E[\bar{X}])^2]$

where $\bar{X}-E[\bar{X}]=\bar{X}-\mu_x$ can be rewritten as

$\bar{X}-E[\bar{X}]=\frac{1}{n} \sum_{i=1} ^n (X_i-\mu_x)=\frac{1}{n} \sum_{i=1}^n Y_i$

where

$Y_i \triangleq \ X_i-\mu_x, i=1,2,....,n$

$\therefore, Var[\bar{X}]=E[(\frac{1}{n} \sum_{i=1}^n Y_i)^2]= \frac{1}{n^2} \sum_{i=1}^n E[Y_i ^2] +\frac{1}{n^2} \sum_{i=1}^n \sum_{j=1 \ (j \neq i)}^n E[Y_i Y_j]$

Since the random variables $\{Y_i; 1 \leq i \leq n\}$ are statistically independent with zero mean and variance $\sigma_x ^2$, the cross terms $E[Y_i Y_j]$ vanish for $i \neq j$ and we have

$Var[\bar{X}]=\frac{\sigma_x ^2}{n}$

Thus the variance of the sample mean variable is the population variance divided by the sample size.

The deviations of the individual observations from the sample mean provide information about the dispersion of the $x_i$ about $\bar{x}$. We define the sample variance $s_x^2$ by

$s_x^2 \triangleq \frac{1}{n-1} \sum_{i=1} ^n (x_i-\bar{x})^2 \quad \textcircled{3}$

This quantity can be viewed as an instance of the sample variance variable $S_x^2$, which is also commonly called the sample variance. We find, after some rearrangement

$S_x^2=\frac{1}{n} \sum_{i=1}^n Y_i ^2-\frac{1}{n(n-1)} \sum_{i=1}^n \sum_{j=1 (j \neq i)} ^n Y_i Y_j$

Taking expectations (the cross terms again vanish by independence), we have

$E[S_x ^2]=\frac{1}{n} \sum_{i=1}^n E[Y_i ^2]=\sigma_x ^2$

The reason for using n-1 rather than n as a divisor in $\textcircled{3}$ is to make $E[S_x^2]$ equal to $\sigma_x^2$; that is, to make $s^2$ an unbiased estimate of $\sigma_x^2$. The positive square root of the sample variance, $s_x$, is called the sample standard deviation.
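For anyone who wants to see these three results empirically, here is a small simulation sketch. The parameter values ($\mu=8$, $\sigma=2$, n=10) and the number of trials are arbitrary choices of mine, not part of the question.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable
MU, SIGMA, N, TRIALS = 8.0, 2.0, 10, 20000  # illustrative values only

means, variances = [], []
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    means.append(statistics.fmean(sample))
    variances.append(statistics.variance(sample))  # divides by n - 1

# Compare with mu, sigma^2 / n and sigma^2 respectively.
print("mean of sample means     :", statistics.fmean(means))
print("variance of sample means :", statistics.pvariance(means))
print("mean of sample variances :", statistics.fmean(variances))
```

The three printed values should land close to 8, 0.4 and 4, matching $E[\bar{X}]=\mu_x$, $Var[\bar{X}]=\sigma_x^2/n$ and $E[S_x^2]=\sigma_x^2$.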

17. ## Re: University Statistics Discussion Marathon

Yep good stuff, it remains to compute Var(s^2), but what you have done is enough to motivate the later parts of the question so don't worry about that if you don't want to.

18. ## Re: University Statistics Discussion Marathon

Screen Shot 2016-06-04 at 11.14.13 pm.zip

Can someone explain for me please

19. ## Re: University Statistics Discussion Marathon

Originally Posted by edwardjoh2
Screen Shot 2016-06-04 at 11.14.13 pm.zip

Can someone explain for me please
That's dodgy. Please leave the screenshot in the form of an actual image. I don't want to risk getting an infection.

20. ## Re: University Statistics Discussion Marathon

A regression analysis was performed to determine which, if any, of protein (grams), fiber (grams), carbohydrates (grams), and sugars (grams) are associated with calories in a sample of 34 breakfast cereals. Which of the following is the null hypothesis for the F-test in this problem?

Choose the correct answer below.

A) All of protein, total fat, fiber, carbohydrates, or sugars help to explain calories in breakfast cereals.

B) At least one of protein, total fat, fiber, carbohydrates, or sugars helps to explain calories in breakfast cereals.

C) None of protein, total fat, fiber, carbohydrates, or sugars helps to explain calories in breakfast cereals.

D) Exactly one of protein, total fat, fiber, carbohydrates, or sugars helps to explain calories in breakfast cereals.

E) $\beta_i=0$

21. ## Re: University Statistics Discussion Marathon

I'm thinking the answer is E (I don't have the solution on me). Here is my explanation and thinking.

When there is more than one independent variable, we need a method to test the overall utility of the model. The technique is a version of the analysis of variance.

To test the utility of the regression model, we specify the following hypotheses:

$H_0:\beta_1=\beta_2=\cdots=\beta_k=0$

$H_a$: At least one $\beta_i$ is not equal to zero

If the null hypothesis is true, none of the independent variables $x_1,x_2,\ldots,x_k$ is linearly related to y, and therefore the model is useless.
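For reference, a sketch of the test statistic usually paired with these hypotheses, assuming the standard multiple regression set-up with n observations and k independent variables (SSR and SSE here stand for the regression and error sums of squares, which are my notation and not defined earlier in the thread):

$F=\frac{SSR/k}{SSE/(n-k-1)}$

Large values of F count as evidence against $H_0$.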

22. ## Re: University Statistics Discussion Marathon

If all the data points fall on the least-squares regression line in simple linear regression, which of the following is true?

A) SS(Regression Model)=SS(Trial)

B) F-Statistic=0

C) MS(Regression Model)=100%

D) SS(Total)=0

E) None of the above statements are true.

23. ## Re: University Statistics Discussion Marathon

A sample of 50 observations is taken from a normal population with a mean of 100 and a standard deviation of 10. The population is finite with N=250.

Find

P(xbar > 103)

24. ## Re: University Statistics Discussion Marathon

Originally Posted by Rhinoz8142
A sample of 50 observations is taken from a normal population with a mean of 100 and a standard deviation of 10. The population is finite with N=250.

Find

P(xbar > 103)
As the population is normally distributed, $\bar{x}$ is normally distributed. Therefore, as the population standard deviation $\sigma$ is also known, the standardised test statistic Z has a standard normal distribution.

Test statistic:

$Z=\frac{\bar{X}-\mu}{\sigma \div \sqrt{n}}$

$Z=\frac{103-100}{10/ \sqrt{50}}=2.12$

$P(Z > 2.12) =0.017$

25. ## Re: University Statistics Discussion Marathon

Originally Posted by Rhinoz8142
A sample of 50 observations is taken from a normal population with a mean of 100 and a standard deviation of 10. The population is finite with N=250.

Find

P(xbar > 103)
Since $X\sim N(100,100)$, we have $\bar{X}\sim N\left(100,\frac{100}{50}\right)$, so

$P(\bar{X} > 103)=P\left(\dfrac{\bar{X}-100}{\sqrt{\frac{100}{50}}}>\dfrac{103-100}{\sqrt{\frac{100}{50}}}\right)=P\left(Z> \dfrac{3}{\sqrt{2}}\right)$

where $Z\sim N(0,1)$,

then compute the value from there. Note that the population size is not relevant here as we are referring to the distribution of the sample mean.
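A quick sketch of that last step in Python. The first number follows the working above and ignores N. The second is only a comparison under a different assumption: some textbooks apply a finite population correction $\sqrt{(N-n)/(N-1)}$ to the standard error when sampling without replacement and n/N is not small. The question doesn't say which is intended, so treat the corrected figure as an assumption on my part.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n, N = 100, 10, 50, 250  # values from the question

# Standard error of the sample mean, ignoring the population size.
se = sigma / sqrt(n)
p_plain = 1 - NormalDist().cdf((103 - mu) / se)

# ASSUMPTION: sampling without replacement, so a finite population
# correction factor is applied to the standard error.
fpc = sqrt((N - n) / (N - 1))
p_fpc = 1 - NormalDist().cdf((103 - mu) / (se * fpc))

print("without correction:", p_plain)
print("with correction   :", p_fpc)
```

The uncorrected value agrees with the 0.017 found in the earlier reply; the corrected one comes out smaller because the standard error shrinks.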
