---
title: "Confidence interval for a mean"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Confidence interval for a mean}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE,comment=NA,fig.width=7,fig.height=4)
library(interpretCI)
library(glue)
```

```{r,echo=FALSE,message=FALSE}
x=meanCI(mtcars,mpg)

two.sided<-greater<-less<-FALSE
if(x$result$alternative=="two.sided") two.sided=TRUE
if(x$result$alternative=="less") less=TRUE
if(x$result$alternative=="greater") greater=TRUE

twoS="The null hypothesis will be rejected if the sample mean is too big or if it is too small."
lessS="The null hypothesis will be rejected if the sample mean is too small."
greaterS="The null hypothesis will be rejected if the sample mean is too big."
```

This document is prepared automatically using the following R command.

```{r,echo=FALSE}
call=paste0(deparse(x$call),collapse="")
x1=paste0("library(interpretCI)\nx=",call,"\ninterpret(x)")
textBox(x1,italic=TRUE,bg="grey95",lcolor="grey50",width=6)
```

## Problem


```{r,echo=FALSE}

string=glue("An inventor has developed a new, energy-efficient lawn mower engine. From his stock of  {x$result$n*100} engines, the inventor selects a simple random sample of {x$result$n} engines for testing. The engines run for an average of {round(x$result$m,2)} minutes on a single gallon of regular gasoline, with a standard deviation of {round(x$result$s,2)} minutes. What is the {(1-x$result$alpha)*100}% confidence interval for the average minutes? (Assume that run times for the population of engines are normally distributed.")

textBox(string)
```


## Confidence interval of mean

The approach that we used to solve this problem is valid when the following conditions are met.

- The sampling method must be simple random sampling. 

- The sampling distribution should be approximately normally distributed.

Since the above requirements are satisfied, we can use the following four-step approach to construct a confidence interval of mean.


### Raw data

`r ifelse(is.na(x$data[1,1]),"Raw data is not provided.","The first 10 rows of the provided data is as follows.")`

```{r,echo=FALSE}
if(!is.na(x$data[1,1])) {
     head(x$data,10)
}
```

### Sample statistics

The sample size is `r x$result$n`, the sample mean is `r round(x$result$m,2)` and the standard error of sample is `r round(x$result$s,2)`. The confidence level is `r (1-x$result$alpha)*100` %.

### Find the margin of error

Since we do not know the standard deviation of the population, we cannot compute the standard deviation of the sample mean; instead, we compute the standard error (SE). Because the sample size is much smaller than the population size, we can use the "approximate" formula for the standard error.


$$ SE= \frac{s}{\sqrt{n}}$$
where **s** is the standard deviation of the sample, **n** is the sample size.


$$SE=\frac{`r round(x$result$s,2)`}{\sqrt{`r x$result$n`}}=`r round(x$result$se,2)`$$
Find the critical probability(p*):

```{r,echo=FALSE}
if(x$result$alternative=="two.sided"){
    string=glue("$$p*=1-\\alpha/2=1-{x$result$alpha}/2={1- x$result$alpha/2}$$")
} else{
     string=glue("$$p*=1-\\alpha=1-{x$result$alpha}$$")
}
```
`r string`

The **degree of freedom**(df) is:
$$df=n-1=`r x$result$n`-1=`r x$result$DF`$$


The **critical value** is the t statistic having `r x$result$DF` degrees of freedom and a cumulative probability equal to `r ifelse(x$result$alternative=="two.sided",1- x$result$alpha/2,1- x$result$alpha)`. From the t Distribution table, we find that the critical value is `r round(x$result$critical,3)`.

```{r,echo=FALSE}
show_t_table(DF=x$result$DF,p=x$result$alpha,alternative=x$result$alternative)
```

```{r,results='asis',echo=FALSE}
if(x$result$alternative=="two.sided"){
  string=glue("$$qt(p,df)=qt({1- x$result$alpha/2},{x$result$DF})={round(x$result$critical,3)}$$")
} else {
    string=glue("$$qt(p,df)=qt({1- x$result$alpha},{x$result$DF})={round(x$result$critical,3)}$$") 
} 
```

`r string`

The graph shows the $\alpha$ values are the tail areas of the distribution. 

```{r,echo=FALSE}
draw_t(DF=x$result$DF,p=x$result$alpha,alternative=x$result$alternative)
```

Compute **margin of error**(ME):

$$ME=critical\ value \times SE$$
$$ME=`r round(x$result$critical,3)` \times `r round(x$result$se,3)`=`r round(x$result$ME,3)`$$

```{r,results='asis',echo=FALSE}
if(two.sided) {
     string="The range of the confidence interval is defined by the sample statistic $\\pm$margin of error."
} else if(less){
     string="The range of the confidence interval is defined by the -$\\infty$(infinite) and the sample statistic + margin of error."
} else{
     string="The range of the confidence interval is defined by the sample statistic - margin of error and the $\\infty$(infinite)."
}
```

Specify the confidence interval. `r string` And the uncertainty is denoted by the confidence level.

### Confidence interval of the mean
 

```{glue,results='asis',echo=FALSE}
Therefore, the {(1-x$result$alpha)*100}% confidence interval is {round(x$result$lower,2)} to {round(x$result$upper,2)}. That is, we are {(1-x$result$alpha)*100}% confident that the true population mean is in the range {round(x$result$lower,2)} to {round(x$result$upper,2)}.
```


### Plot

You can visualize the mean difference:

```{r,echo=FALSE}
plot(x)
```

### Result of meanCI()

```{r,echo=FALSE}
print(x)
```

### Reference

The contents of this document are modified from StatTrek.com.
Berman H.B., "AP Statistics Tutorial", [online] Available at:  https://stattrek.com/estimation/confidence-interval-mean.aspx?tutorial=AP URL[Accessed Data: 1/23/2022].