--- title: "Hypothesis test for a mean" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Hypothesis test for a mean} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE,comment=NA,fig.width=7,fig.height=5) library(interpretCI) library(glue) ``` ```{r,echo=FALSE,message=FALSE} x=meanCI(mtcars,mpg,mu=23) two.sided<-greater<-less<-FALSE if(x$result$alternative=="two.sided") two.sided=TRUE if(x$result$alternative=="less") less=TRUE if(x$result$alternative=="greater") greater=TRUE twoS="The null hypothesis will be rejected if the sample mean is too big or if it is too small." lessS="The null hypothesis will be rejected if the sample mean is too small." greaterS="The null hypothesis will be rejected if the sample mean is too big." ``` This document is prepared automatically using the following R command. ```{r,echo=FALSE} call=paste0(deparse(x$call),collapse="") x1=paste0("library(interpretCI)\nx=",call,"\ninterpret(x)") textBox(x1,italic=TRUE,bg="grey95",lcolor="grey50") ``` ## Given Problem : `r ifelse(two.sided,"Two","One")`-Tailed Test ```{r,echo=FALSE} string=glue("An inventor has developed a new, energy-efficient lawn mower engine. He claims that the engine will run continuously for {round(x$result$mu,2)} minutes on a single gallon of regular gasoline. From his stock of 2000 engines, the inventor selects a simple random sample of {x$result$n} engines for testing. The engines run for an average of {round(x$result$m,2)} minutes, with a standard deviation of {round(x$result$s,2)} minutes. Test the null hypothesis that the mean run time {ifelse(two.sided,'is',ifelse(less, 'greater than','less than'))} {x$result$mu} minutes against the alternative hypothesis that the mean run time {ifelse(two.sided,'is not',ifelse(less, 'less than','greater than'))} {round(x$result$mu,2)} minutes. Use a {x$result$alpha} level of significance. (Assume that run times for the population of engines are normally distributed.)") textBox(string) ``` ## Hypothesis Test for a Mean This lesson explains how to conduct a hypothesis test of a mean, when the following conditions are met: - The sampling method is **simple random sampling**. - The sampling distribution is **normal** or **nearly normal**. Generally, the sampling distribution will be approximately normally distributed if any of the following conditions apply. - The population distribution is normal. - The population distribution is symmetric, unimodal, without outliers, and the sample size is 15 or less. - The population distribution is moderately skewed, unimodal, without outliers, and the sample size is between 16 and 40. - The sample size is greater than 40, without outliers. ## This approach consists of four steps: - state the hypotheses - formulate an analysis plan - analyze sample data - interpret results. ### 1. State the hypotheses The first step is to state the null hypothesis and an alternative hypothesis. $$Null\ hypothesis(H_0): \mu `r ifelse(two.sided,"=",ifelse(less,">=","<="))` `r x$result$mu`$$ $$Alternative\ hypothesis(H_1): \mu `r ifelse(two.sided, "\\neq" ,ifelse(less,"<",">"))` `r x$result$mu`$$ Note that these hypotheses constitute a `r ifelse(two.sided,"two","one")`-tailed test. `r ifelse(two.sided,twoS,ifelse(less,lessS,greaterS))`. ### 2. Formulate an analysis plan For this analysis, the significance level is `r (1-x$result$alpha)*100`%. The test method is a **one-sample t-test**. ### 3. Analyze sample data. Using sample data, we compute the standard error (SE), degrees of freedom (DF), and the t statistic test statistic (t). $$SE = \frac{s}{\sqrt{n}} = \frac{`r x$result$s`}{\sqrt{`r x$result$n`}} = `r round(x$result$se,2)`$$ $$DF=n-1=`r x$result$n`-1=`r round(x$result$DF,2)`$$ $$t = (\bar{x} - \mu) / SE = (`r x$result$m` - `r x$result$mu`)/`r round(x$result$se,2)` = `r round(x$result$t,3)`$$ where **s** is the standard deviation of the sample, $\bar{x}$ is the sample mean, $\mu$ is the hypothesized population mean, and **n** is the sample size. We can visualize the confidence interval of mean. ```{r} plot(x) ``` Since we have a `r ifelse(two.sided,"two","one")`-tailed test, the P-value is the probability that the t statistic having `r round(x$result$DF,2)` degrees of freedom is `r if(!greater) "less than"` `r if(!greater) round(-abs(x$result$t),2)` `r if(!less) "or greater than "` `r if(!less) round(abs(x$result$t),2)`. We use the t Distribution curve to find p value. ```{r} draw_t(DF=x$result$DF,t=x$result$t,alternative=x$result$alternative) ``` $$pt(`r round(x$result$t,3)`,`r x$result$DF`) =`r round(x$result$p,3)` $$ ### 4. Interpret results. Since the P-value (`r round(x$result$p,3)`) is `r ifelse(x$result$p>x$result$alpha,"greater","less")` than the significance level (`r x$result$alpha`), we can`r if(x$result$p>x$result$alpha) "not"` reject the null hypothesis. ### Result of meanCI() ```{r,echo=FALSE} print(x) ``` ### Reference The contents of this document are modified from StatTrek.com. Berman H.B., "AP Statistics Tutorial", [online] Available at: https://stattrek.com/hypothesis-test/mean.aspx?tutorial=AP URL[Accessed Data: 1/23/2022].