﻿

# Excel

## Installing Data Analysis Toolpak

1. Click File, then Options.

2. Select Add-ins from the menu on the left. At the bottom of the window, select Excel Add-ins for Manage and click Go.

3. Check the box for Analysis ToolPak and click OK.

4. You will now have a button for Data Analysis in the Analyze section of the Data tab.

## ANOVA

### One-Way

1. Make sure you have the Data Analysis Toolpak installed.

2. Enter the data for the different groups into separate columns.

3. Navigate to the Data tab, and select Data Analysis in the Analyze section, then Anova: Single Factor. Select OK. Enter the Input Range (A1:C4 in this example), enter a value for Alpha, select Output Range, and select the first cell in an available row. Select OK.

### Fisher's LSD

Note: These instructions assume a sample size of 4. Adjust the rows in the instructions to fit your data.

1. Put the data for the samples in columns A, B, and C.

2. Follow the instructions to calculate a One-Way ANOVA and place the output in cell E1.

3. Follow the instructions to find the t critical value corresponding to 1-α/2 and the degrees of freedom nτ - k, where nτ is the total number of observations and k is the number of treatments. For this example, α is 0.1 and nτ - k is 9, so the formula is "=T.INV(0.95,9)". Enter this in cell B8.

4. Calculate Fisher's LSD for comparing the means of Salesperson 1 and Salesperson 2 by typing "=B8*SQRT(H13*((1/COUNT(A2:A5))+(1/COUNT(B2:B5))))" in cell B9.

5. Calculate $|\bar{x}_i - \bar{x}_j|$, the absolute difference between the sample means, for comparing Salesperson 1 to Salesperson 2 by typing "=ABS(H5-H6)" in cell B10.

6. Compare cell B10 to B9 to determine if the means are different for Salesperson 1 and Salesperson 2.
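
The arithmetic in steps 4 through 6 can be checked outside Excel with a short Python sketch. The mean square within groups and the two sample means below are made-up values for illustration, not results from the example data; the t critical value is the approximate value of T.INV(0.95, 9).

```python
import math

def fisher_lsd(t_crit, mse, n_i, n_j):
    """Least significant difference for comparing two treatment means."""
    return t_crit * math.sqrt(mse * (1 / n_i + 1 / n_j))

t_crit = 1.833               # approximate t critical value for 0.95 with 9 df
mse = 4.0                    # assumed mean square within groups (hypothetical)
lsd = fisher_lsd(t_crit, mse, 4, 4)

mean_diff = abs(12.5 - 9.0)  # assumed sample means (hypothetical)
means_differ = mean_diff > lsd
print(round(lsd, 4), means_differ)
```

The means are judged different when the absolute difference exceeds the LSD, which mirrors the comparison of cell B10 to cell B9.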

### Tukey's HSD

Note: These instructions assume a sample size of 4. Adjust the rows in the instructions to fit your data.

1. Put the data for the samples in columns A, B, and C.

2. Follow the instructions to calculate a One-Way ANOVA and place the output in cell E1.

3. Use a table to find the q critical value corresponding to α, the number of treatments k, and the degrees of freedom nτ - k, where nτ is the total number of observations. For this example, α is 0.05, the number of treatments is 3, and the degrees of freedom are 9, so the associated q critical value is 3.949. Enter this in cell B8.

4. Calculate Tukey's HSD for comparing the means of Salesperson 1 and Salesperson 2 by typing "=B8*SQRT((H13/2)*((1/COUNT(A2:A5))+(1/COUNT(B2:B5))))" in cell B9.

5. Calculate $|\bar{x}_i - \bar{x}_j|$, the absolute difference between the sample means, for comparing Salesperson 1 to Salesperson 2 by typing "=ABS(H5-H6)" in cell B10.

6. Compare cell B10 to B9 to determine if the means are different for Salesperson 1 and Salesperson 2.
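
As a rough cross-check of the formula in step 4, the same HSD calculation can be sketched in Python. The q critical value is the one quoted in step 3; the mean square within groups is an assumed value chosen only to make the arithmetic easy to follow.

```python
import math

def tukey_hsd(q_crit, mse, n_i, n_j):
    """Tukey's honestly significant difference for two treatment means."""
    return q_crit * math.sqrt((mse / 2) * (1 / n_i + 1 / n_j))

q_crit = 3.949  # q critical value from step 3 (alpha = 0.05, k = 3, df = 9)
mse = 4.0       # assumed mean square within groups (hypothetical)
hsd = tukey_hsd(q_crit, mse, 4, 4)
print(hsd)
```

With equal group sizes of 4 and this assumed mean square, the square root term equals 1, so the HSD equals the q critical value.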

### Two-Way

1. Make sure you have the Data Analysis Toolpak installed.

2. Enter your data in the following format.

3. Navigate to the Data tab, and select Data Analysis in the Analyze section, then either Anova: Two-Factor With Replication (Factorial Design) or Anova: Two-Factor Without Replication (Randomized Block Design), depending on the wording of the problem. Select OK. Enter the Input Range (A1:E7 in this example), enter the number of Rows per sample (With Replication only), enter a value for Alpha, select the Output Range, and select the first cell in an available row. Select OK.

## Binomial Distribution

### Binomial Probability (pdf)

1. Enter a value in cell A1 for the number of successes, x.

2. In cell A2, type the formula "=BINOM.DIST(A1, 4, 1/6, FALSE)" which corresponds to BINOM.DIST(number of successes (x), number of trials (n), probability of success (p), FALSE).

Note: You may enter a 0 in place of FALSE.

3. The result for BINOM.DIST(2, 4, 1/6, FALSE) is 0.115741.
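
The value returned with FALSE is the binomial probability $\binom{n}{x}p^x(1-p)^{n-x}$. A minimal Python sketch of that formula, using the same inputs as the example, is shown here; the function name `binom_pmf` is just an illustrative choice.

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

prob = binom_pmf(2, 4, 1 / 6)
print(round(prob, 6))  # 0.115741
```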

### Binomial Probability Distribution

For $n=4$, $p=\frac{1}{6}$, $x=0, 1, 2, 3, 4$.

1. Label your first column x. Below the label, list the possible number of successes from 0 to n.

2. Label your second column Probability. In cell B2 type the formula "=BINOM.DIST(A2, 4, 1/6, FALSE)" which corresponds to BINOM.DIST(number of successes, number of trials, probability of success, TRUE for the cumulative probability/FALSE for the probability of exactly that number of successes).

3. Drag cell B2 down to calculate the probabilities for all values of x in column A.

### Binomial Probability (cdf)

1. Enter a value in cell A1 for the number of successes, x.

2. In cell A2, type the formula "=BINOM.DIST(A1, 20, 0.4, TRUE)" which corresponds to BINOM.DIST(number of successes (x), number of trials (n), probability of success (p), TRUE).

Note: You may enter a 1 in place of TRUE.

3. The result for BINOM.DIST(11, 20, 0.4, TRUE) is 0.943474.
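
With TRUE, the function adds up the probabilities of 0 through x successes. The sketch below reproduces that sum in Python for the example inputs; the helper names are illustrative only.

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """Probability of at most x successes: sum of the pmf from 0 to x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

cumulative = binom_cdf(11, 20, 0.4)
print(round(cumulative, 6))
```

The printed value should agree with the 0.943474 reported in step 3.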

## Chi-Square Distribution

### Critical Value

1. Type "=CHISQ.INV.RT(probability, degrees of freedom)".

2. Press ENTER.

3. The chi-square critical value with area to the right equal to the probability entered is returned.

### Left Tailed Probability (cdf)

Used for finding the p-value corresponding to a $\chi^2$ test statistic.

1. Type "=CHISQ.DIST(x, deg_freedom, TRUE)".

2. Press ENTER.

3. The area to the left of the value is returned.

### Right Tailed Probability (cdf)

Used for finding the p-value corresponding to a $\chi^2$ test statistic.

1. Type "=CHISQ.DIST.RT(x, deg_freedom)".

2. Press ENTER.

3. The area to the right of the value is returned.

### Test for Association

1. Enter the given contingency table of observed values in cells A1 through F4.

Note the cell numbers given in these instructions are based on a test of four categories that are divided into two groups. To perform a test on a different number of categories or groups, use the appropriate number of columns and rows.

2. Calculate the expected value for each cell in the contingency table.

1. In cell B7 enter: =B$4*$F2/$F$4

2. Copy and paste the formula above into cells B7 through E8.

Note that the $ symbol “locks” the column or row that follows it when a formula is copied from one cell and pasted into another.

3. Calculate the p-value.

1. In cell B9 enter: =CHISQ.TEST(B2:E3,B7:E8)
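
The expected-value formula in step 2 multiplies each column total by the row total and divides by the grand total. A small Python sketch of that calculation follows; the observed counts are invented for illustration. It stops at the test statistic, since the Python standard library has no chi-square distribution; the p-value still comes from CHISQ.TEST.

```python
# Hypothetical 2 x 4 table of observed counts (two groups, four categories).
observed = [
    [20, 30, 25, 25],
    [30, 20, 25, 25],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count = (row total * column total) / grand total.
expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(expected, chi_sq, df)
```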

### Test for Goodness of Fit

1. Enter the observed value and the expected value for each category in cells B1 through E3 as follows.

1. In cells B1 through E1 enter the category name.

2. In cells B2 through E2 enter the observed value for the category above.

3. In cells B3 through E3 enter the expected value for the category above.

Note the cell numbers given in these instructions are based on a test on four categories. To perform a test on a different number of categories, use the appropriate number of columns.

2. Calculate the test statistic ($\chi^2$).

1. Compute $\frac{{\left(\text{Observed Value}-\text{Expected Value}\right)}^{2}}{\text{Expected Value}}$ for each category.

1. In cell B4 enter: =(B2-B3)^2/B3

2. Drag across the formula in B4 to cells C4:E4.

2. Add together all the values computed in the previous step.

1. In cell B6 enter: =SUM(B4:E4)

3. Calculate the $\chi^2$ critical value.

1. In cell B7 enter: =CHISQ.INV.RT(alpha, df)

1. Substitute a value for ‘alpha’.

2. ‘df’ is the number of categories minus 1.
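
The test statistic built in step 2 can be sketched in a few lines of Python. The observed and expected counts below are hypothetical. The critical value is not computed here because the Python standard library has no chi-square distribution; it still comes from CHISQ.INV.RT.

```python
# Hypothetical observed and expected counts for four categories.
observed = [18, 22, 30, 30]
expected = [25, 25, 25, 25]

# (Observed - Expected)^2 / Expected for each category, as in cells B4:E4.
components = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

chi_sq = sum(components)  # plays the role of =SUM(B4:E4)
df = len(observed) - 1    # number of categories minus 1
print(components, chi_sq, df)
```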

## Confidence Intervals

### t-Interval

Construct the 90% confidence interval for the population mean of a normal population if the sample standard deviation is 900, the sample mean is 425, and the sample size is 100.

1. Type "=CONFIDENCE.T(0.1,900,100)" and press enter.

2. The result is 149.4352, the error of estimation. Subtract and add this value to the sample mean, 425, to find the confidence interval, (275.5648, 574.4352).

### z-Interval

Construct the 90% confidence interval for the population mean of a normal population if the population standard deviation is 900, the sample mean is 425, and the sample size is 100.

1. Type "=CONFIDENCE.NORM(0.1,900,100)" and press enter.

2. The result is 148.03683, the error of estimation. Subtract and add this value to the sample mean, 425, to find the confidence interval, (276.96317, 573.03683).
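
CONFIDENCE.NORM returns $z_{\alpha/2}\,\sigma/\sqrt{n}$. The sketch below recomputes the margin and the interval for this example with Python's `statistics.NormalDist`, as a way to check the spreadsheet result.

```python
from math import sqrt
from statistics import NormalDist

alpha, sigma, n, sample_mean = 0.10, 900, 100, 425

z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # standard normal critical value
margin = z_crit * sigma / sqrt(n)             # error of estimation
interval = (sample_mean - margin, sample_mean + margin)
print(round(margin, 5), interval)
```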

### Two Sample Proportions z-Interval

Construct the 95% confidence interval for the difference in two population proportions if the sample proportions are 0.05 and 0.04 for Population 1 and Population 2, respectively. The sample size for each sample proportion is 200.

1. The margin of error for the confidence interval can be computed as follows:

1. Type “=NORM.S.INV(confidence level)*SQRT((p1*(1-p1))/n1+(p2*(1-p2))/n2)” and press enter.

Note that the confidence level used in the function to get the correct normal value for a 95% confidence interval is 1 – (0.05/2).

2. The result is 0.040619, the error of estimation. Subtract and add this value to the point estimate, the difference between the sample proportions (0.05-0.04=0.01) to obtain the lower and upper endpoints of the confidence interval, (-0.030619, 0.050619).
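
The margin of error above is $z_{\alpha/2}\sqrt{p_1(1-p_1)/n_1 + p_2(1-p_2)/n_2}$. A short Python sketch with the example's numbers can serve as a cross-check of the spreadsheet formula.

```python
from math import sqrt
from statistics import NormalDist

p1, p2, n1, n2 = 0.05, 0.04, 200, 200
alpha = 0.05

z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # same role as NORM.S.INV(0.975)
margin = z_crit * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

point_estimate = p1 - p2
interval = (point_estimate - margin, point_estimate + margin)
print(round(margin, 6), interval)
```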

## Counting

### Combination

1. Use the COMBIN function to calculate the number of combinations.

2. Input "=COMBIN()" (without the quotations). Insert the total number of items as the first parameter, and the number of items in each combination as the second parameter.

1. Input: =COMBIN(36,5)

2. Output: 376992

### Factorial

1. Use the FACT function to calculate the factorial.

2. Input "=FACT()" (without the quotations) and insert the desired number within the parentheses.

1. Input: =FACT(5)

2. Output: 120

### Permutation

1. Use the PERMUT() function to calculate the number of permutations.

2. Input "=PERMUT()" (without quotations). Insert the number of total items as the first parameter, and then input the number of items in each permutation as the second parameter.

1. Input: =PERMUT(7,3)

2. Output: 210
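
For comparison, Python's `math` module offers direct counterparts to COMBIN, FACT, and PERMUT. The sketch below repeats the three examples from this section.

```python
from math import comb, factorial, perm

combinations = comb(36, 5)   # counterpart of =COMBIN(36,5)
five_factorial = factorial(5)  # counterpart of =FACT(5)
permutations = perm(7, 3)    # counterpart of =PERMUT(7,3)
print(combinations, five_factorial, permutations)  # 376992 120 210
```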

## Descriptive Statistics

### One Variable

1. Make sure you have the Data Analysis Toolpak installed.

2. Enter the data arranged in a column.

3. Navigate to the Data tab, and select Data Analysis in the Analyze section.

4. In the window that appears, select Descriptive Statistics and click OK.

5. Select the input range by clicking and dragging your cursor over every cell containing data.

6. Check the option for Summary Statistics.

7. Choose the desired output location.

8. Click OK.

### Two Variable

#### Grouped Data

1. Enter the data in two separate columns: class midpoints in column A and class frequencies in column B.

2. Use the formula: =SUMPRODUCT([midpoint array], [frequency array])/SUM([frequency array])

3. Press ENTER.

#### Weighted Mean

1. Enter the data in two separate columns: weights in column A and data values in column B.

2. Use the formula: =SUMPRODUCT([data array], [weight array])/SUM([weight array])

3. Press ENTER.
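
The SUMPRODUCT/SUM pattern multiplies each value by its weight, adds the products, and divides by the total weight. A minimal Python sketch of the same calculation, with made-up scores and weights, looks like this:

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

scores = [80, 90, 70]  # hypothetical data values
weights = [2, 3, 5]    # hypothetical weights
result = weighted_mean(scores, weights)  # (160 + 270 + 350) / 10
print(result)  # 78.0
```

The grouped-data formula works the same way, with class midpoints as the values and class frequencies as the weights.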

## F-Distribution

### Critical Value

1. Type "=F.INV.RT(probability, numerator degrees of freedom, denominator degrees of freedom)".

2. Press ENTER.

3. The F critical value with area to the right equal to the probability entered is returned.

### F-Probability (cdf)

#### Left Tail Probability

1. Type "=F.DIST(x, numerator degrees of freedom, denominator degrees of freedom, cumulative)". (Set cumulative to TRUE for the cdf; FALSE returns the pdf.)

2. Press ENTER.

3. The area to the left of the x value is returned.

#### Right Tail Probability

1. Type "=F.DIST.RT(x, numerator degrees of freedom, denominator degrees of freedom)".

2. Press ENTER.

3. The area to the right of the x value is returned.

## Frequency Distribution

### Qualitative Frequency Distribution

1. Select the column of data you wish to create a frequency distribution with, including the column header.

2. With the data highlighted, under the Insert tab, click PivotTable.

3. In the PivotTable dialogue box, make sure that the correct range of data is selected, and select the location where you want your PivotTable to appear. Click OK.

4. Now, a blank PivotTable will appear in the specified location, and a pane titled PivotTable Fields will be shown on the right side of the window. The name of the highlighted column will appear in the upper portion of the Fields pane.

5. To create a frequency distribution of the qualitative variable you selected, drag the column name in the upper part of the Fields pane down to the area in the lower part of the pane with the label Rows. In the PivotTable, you will now see a list of the possible unique values within the data you selected.

6. Now, drag the same column name from the upper portion of the pane to the lower portion with the label Values. Make sure the variable in the Values area is summarized by count. This can be specified by clicking the dropdown arrow next to the variable name, and opening the Value Field Settings dialogue box. Once the correct functions are set, the count, or frequency, of each unique value in your selected data column will now be displayed in the table, thus making it a qualitative frequency distribution.

### Quantitative Frequency Distribution

1. Select the column of data you wish to create a frequency distribution with, including the column header.

2. With the data highlighted, under the Insert tab, click PivotTable.

3. In the PivotTable dialogue box, make sure that the correct range of data is selected, and select the location where you want your PivotTable to appear. Click OK.

4. Now, a blank PivotTable will appear in the specified location. When you click on the PivotTable, a pane titled PivotTable Fields will be shown on the right side of the window. The name of the selected data column will appear in the upper portion of the Fields pane.

5. To create a frequency distribution of the quantitative variable you selected, drag the column name in the upper part of the Fields pane down to the area in the lower part of the pane with the label Rows. In the PivotTable, you will now see a list of the possible unique values within the data you selected. If your data is continuous, this table will probably have more rows than desired. This will be fixed when we group the data at the end.

6. Now, drag the same column name from the upper portion of the pane to the lower portion into the box with the label Values. Make sure the variable in the Values box is summarized by count. This can be specified by clicking the dropdown arrow next to the variable name in the Values box, and opening the Value Field Settings dialogue box. Select Count and click OK.

7. Once the correct functions are set, the count, or frequency, of each unique value in your selected data column will now be displayed in the table. However, since our data is continuous, we need to group the data by creating bins. Right click on a cell in the Row Labels column of the PivotTable, and select the Group option.

8. In the Group dialogue box, specify your desired starting and ending values as well as the class width, denoted by the term By:.

9. Click OK. Now, your data should be grouped into classes, and the frequency count of each class should be displayed in the PivotTable.

## Graphs

### Bar Charts

1. Organize the data into 2 columns, the labels on the left and the values for each label on the right.

2. Select all the data. Then under the Insert tab, in the Charts group, select the Bar Chart symbol. Choose either 2-D Column > Clustered Column (vertical) or 2-D Bar > Clustered Bar (horizontal).

3. You may edit the chart title by clicking on the text of the title.

4. To add axis labels, click on the chart to display the Chart Design tab. Use Add Chart Element to select Axis Titles > Primary Horizontal or Axis Titles > Primary Vertical.

### Side by Side Bar Charts

1. Organize the data into columns with the labels for the groups that you want to appear on the x-axis in the first (leftmost) column and the values for each group label in the columns to the right with labels for the legend in the first row.

2. Select all the data. Then under the Insert tab, in the Charts group, select Recommended Charts. Choose the Clustered Column chart shown.

3. You may edit the chart title by clicking on the text of the title. To add an axis title, click anywhere on the chart, select the + icon to the right of the chart, check Axis Titles, and then Primary Horizontal and Primary Vertical. You may edit the axis titles by clicking on the text of the title.

### Stacked Bar Charts

1. Organize the data into columns with the labels for the groups that appear on the x-axis in the first row and the data values for each label in the column below. The labels for the legend should be in the first column.

2. Select all the data. Then under the Insert tab, in the Charts group, select Recommended Charts. Choose the Stacked Column chart shown.

3. To change the colors used in the graph, right click on the appropriate portion of the bar/column, select Fill and choose the desired color. Repeat for the other segments of the bar/column.

4. You may edit the chart title by clicking on the text of the title. To add an axis title, click anywhere on the chart, select the + icon to the right of the chart, check Axis Titles, and then Primary Horizontal and Primary Vertical. You may edit the axis titles by clicking on the text of the title.

### Box Plot

1. Organize the data for each box plot in a separate column.

2. To create the box plot, select Insert and then Recommended Charts. Go to the All Charts tab and select Box & Whisker, then OK.

3. You may edit the chart title by clicking on the text of the title.

4. If you are displaying more than one box plot, then you should add a legend. On the Chart Design tab, click Add Chart Element and select Legend. You have several options of where to place the legend on the chart.

5. The value of 1 along the horizontal axis can be removed by selecting it and deleting it.

### Dot Plot

1. Organize the data for your dot plot into a single column.

2. Highlight the entire column. Under the Home tab, click the Sort & Filter dropdown (on the right side of the toolbar) and select Sort Smallest to Largest. The smallest values should now be at the top of the column.

3. Create a new column next to your data column titled "Frequency". In the first cell of the Frequency column, enter the number 1. In the second cell, enter the following formula (cell references may vary depending on the location of the data in your spreadsheet).

=IF(A3=A2,B2+1, 1)

4. Since our data is sorted, identical values will be in adjacent cells. The above formula will count the number of occurrences of each value. Once you have finished typing the formula into the cell, press Enter, and then double click the small box at the lower right corner of the cell to apply the formula to the whole column.

5. Now, we will create a scatter plot using our two columns. Highlight all the data in both columns (excluding column headers), then navigate to the Insert tab, and insert a Scatter plot.

6. Right click on the horizontal axis, select Format Axis, and then change the Bounds to fit the range of your data (choose values slightly lower than the minimum and higher than the maximum to leave a small cushion of whitespace on either side). You can also specify the increment of the x-axis by changing the Major Units variable.

7. To remove the y-axis, right click on the vertical axis and select Delete.

8. To remove the grid lines, right click on a vertical grid line in the chart area and select Delete; repeat the process to delete the horizontal grid lines.

9. Move the cursor over the middle of the bottom side of the chart area until the double arrow appears. Click and drag to resize the graph so that our dots appear to be stacked on top of one another.

10. You may edit the chart title by clicking on the text of the title. To add a horizontal axis title, click on the chart, select the + icon to the right of the chart, check Axis Titles, and then Primary Horizontal. You may edit the horizontal axis title by clicking on the text of the title.
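
The Frequency column from step 3 is a running count that restarts at 1 whenever the sorted value changes, which is what stacks the dots vertically. The following Python sketch reproduces that logic on a small made-up data set.

```python
data = sorted([3, 1, 2, 3, 2, 3])  # hypothetical data, sorted smallest to largest

# First cell is 1; each later cell mirrors =IF(A3=A2, B2+1, 1).
counts = [1]
for previous, current in zip(data, data[1:]):
    counts.append(counts[-1] + 1 if current == previous else 1)

points = list(zip(data, counts))  # (x, y) pairs for the scatter plot
print(points)
# [(1, 1), (2, 1), (2, 2), (3, 1), (3, 2), (3, 3)]
```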

### Histogram

1. Organize the data into a column.

2. Select all the data (excluding the column header). Then under the Insert tab, in the Charts group, select the Histogram symbol.

3. To adjust the width of the bins/classes, right click on the horizontal axis, select Format Axis, and then change the Bin width to a desired value. There is also an option to adjust the histogram by indicating the Number of bins. Note: The bins are automatically labelled using intervals with ) or ] to indicate the sorting for the endpoints.

4. You may edit the chart title by clicking on the text of the title. To add an axis title, click on the chart, select the + icon to the right of the chart, check Axis Titles, and then Primary Horizontal and Primary Vertical. You may edit the axis titles by clicking on the text of the title.

5. If you want to display the values for the column frequencies, click on the chart, select the + icon to the right of the chart, check Data Labels, and then select the location for the label to appear.

### Line Graph

1. Organize the data in adjacent columns, so that corresponding paired data values are in the same row. The values that will be graphed along the horizontal axis appear in the first (left) column and the values for the vertical axis appear in the second (right) column. (The labels are not required.)

2. Select the second (right) column of data (excluding the column header). Then under the Insert tab, in the Charts section, select the Insert Line or Area Chart and the 2-D Line.

3. After inserting the graph, right click on the horizontal axis data labels and choose Select Data. Under the Horizontal (Category) Axis Labels press Edit and select the range of data in the first (left) column. Click OK and OK.

4. You may edit the chart title by clicking on the text of the title. To add axis titles, click on the chart, select the + icon to the right of the chart, check Axis Titles, and then Primary Horizontal and Primary Vertical. You may edit the axis titles by clicking on the text of the title.

### Multivariate/Multidimensional

1. Organize three columns of quantitative data where the first (left) column will be graphed on the horizontal axis, the second (middle) column will be graphed on the vertical axis, and the third (right) column will be represented by the size of the bubbles.

2. Select the data (excluding the column headers). Then under the Insert tab, in the Charts section, select Insert Scatter (X,Y) or Bubble Chart and choose Bubble chart.

3. Adjust the scale on the axes to “zoom” in on the ranges of the displayed data.

Right click on the vertical axis, select Format Axis, and then change the Bounds to fit the range of your data (choose values slightly lower than the minimum and higher than the maximum to leave a small cushion of whitespace). You can also specify the increment of the axis by changing the Major Units variable.

Repeat this process to adjust the horizontal axis, if necessary.

4. If the bubbles overlap too much you can scale them down by right-clicking the bubbles, selecting Format Data Series and reducing the value in Scale bubble size to.

5. You may edit the chart title by clicking on the text of the title. To add axis titles, click on the chart, select the + icon to the right of the chart, check Axis Titles, and then Primary Horizontal and Primary Vertical. You may edit the axis titles by clicking on the text of the title.

### Normal Probability Plot

1. Enter a column header (label for the data) in A1 and the data below in a single column starting in cell A2.

2. Select Column A. In the Editing portion of the Home menu, use Sort & Filter to Sort Smallest to Largest.

3. Starting in cell B2, assign a rank to each row of data: enter 1 in cell B2, then use the shortcut formula "=B2+1" in cell B3. Press Enter and drag down for all your rows of data. The last entry in this column is equal to n, the number of data points.

4. In cell C2, enter the formula "=(B2-0.5)/n", replacing n with the number of data points which can be found as the last entry in the previous column (see Step 3). This will calculate the percentile. Press Enter and drag down for all your rows of data.

5. In D2 enter "=NORM.INV(C2,0,1)" to calculate the z-score corresponding to each percentile. Press Enter and drag down for all your rows of data.

6. Highlight column A. Then hold the Ctrl key and highlight column D. Under the Insert tab, in the Charts section, select Insert Scatter (X, Y) or Bubble Chart and choose Scatter.

7. After inserting the graph, right click on the vertical axis data labels and choose Format Axis. Under Axis Options > Horizontal axis crosses, select Axis value and enter the smallest value shown on the current vertical axis.

8. To add a trendline, click on the chart, select the + icon to the right of the chart, check Trendline, and then Linear. The data should closely follow a linear trendline if it is approximately normally distributed.
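
Steps 2 through 5 amount to sorting the data, converting each rank to a percentile with (rank − 0.5)/n, and looking up the matching standard normal z-score. The sketch below walks through the same steps in Python on a small made-up sample.

```python
from statistics import NormalDist

data = sorted([12.1, 9.8, 11.4, 10.3, 10.9])  # hypothetical sample
n = len(data)

# Percentile for each rank, as in =(B2-0.5)/n.
percentiles = [(rank - 0.5) / n for rank in range(1, n + 1)]

# z-score for each percentile, as in =NORM.INV(C2,0,1).
z_scores = [NormalDist(0, 1).inv_cdf(p) for p in percentiles]

# Pairs of (data value, z-score) are what the scatter plot displays.
plot_points = list(zip(data, z_scores))
print(plot_points)
```

If the sample is roughly normal, these points fall close to a straight line.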

### Pareto Chart

1. Organize the data into 2 columns, the labels on the left and the data values corresponding to each label on the right.

2. If the values are not sorted, then select the data in both columns (excluding the column headers), choose Sort & Filter, Custom Sort. Sort the data by the column with the data values (Tickets Sold in the example) and sort largest to smallest.

3. Once the data is sorted, select all the data and then under the Insert tab, in the Charts group, select the Bar Chart symbol. Choose either 2-D Column > Clustered Column (vertical) or 2-D Bar > Clustered Bar (horizontal).

4. You can edit the Chart Title by clicking on it and typing in a new title.

5. To add axis labels, use the Chart Design tab, Add Chart Element and select Axis Titles > Primary Horizontal or Axis Titles > Primary Vertical.

### Pie Chart

1. Organize the data into 2 columns, the labels of the categories in the left column and the data values corresponding to each label in the right column. Format these values as percentages if you intend to label the percentages on the graph.

2. Select all the data including the column headers. Then under the Insert tab, in the Charts group, select the 2-D Pie chart.

3. To add labels showing the percentages, right click on the pie chart and select Add Data Labels.

4. You may edit the chart title by clicking on the text of the title.

### Scatterplot

1. Organize the data in adjacent columns so that corresponding paired data values are in the same row. Typically, the independent/explanatory variable is located in the first (or left) column and the dependent/response variable is located in the second (or right) column.

2. Select the data (excluding the column headers). Then under the Insert tab, in the Charts section, select the Scatter chart.

3. Adjust the scale on the axes to “zoom” in on the ranges of the displayed data.

Right click on the vertical axis, select Format Axis, and then change the Bounds to fit the range of your data (choose values slightly lower than the minimum and higher than the maximum to leave a small cushion of whitespace). You can also specify the increment of the axis by changing the Major Units variable.

Repeat this process to adjust the horizontal axis, if necessary.

4. You may edit the chart title by clicking on the text of the title. To add axis titles, click on the chart, select the + icon to the right of the chart, check Axis Titles, and then Primary Horizontal and Primary Vertical. You may edit the axis titles by clicking on the text of the title.

## Hypergeometric Distribution

1. Enter a value in cell A2 for the number of successes, x, then type the formula "=HYPGEOM.DIST(A2, 2, 16, 30, FALSE)" which corresponds to HYPGEOM.DIST(number of successes (x), number of trials (n), number of successes in the population (k), population size (N), TRUE for the probability of at most x successes/FALSE for the probability of getting exactly x successes).

Note: You may enter a 1 in place of TRUE and a 0 in place of FALSE.
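
With FALSE, the function returns $\binom{k}{x}\binom{N-k}{n-x}/\binom{N}{n}$. A short Python sketch of that formula, using the same n = 2, k = 16, and N = 30 as the example formula, is given here; the function name is illustrative.

```python
from math import comb

def hypergeom_pmf(x, n, k, N):
    """Probability of exactly x successes in n draws without replacement."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Probabilities for every possible number of successes in 2 draws.
probs = [hypergeom_pmf(x, 2, 16, 30) for x in range(3)]
print(probs)
```

Adding the entries up to and including a given x gives the value returned with TRUE.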

## Hypothesis Testing

### One Proportion z-Test

1. Enter the summary statistics in cells B1 through B3 as follows.

1. In cell B1 enter the sample proportion, $\hat{p}$.

2. In cell B2 enter the population proportion, $p$.

3. In cell B3 enter the sample size, $n$.

2. Calculate the test statistic (z).

1. In cell B5 enter: =(B1-B2)/SQRT(B2*(1-B2)/B3)

3. Calculate the p-value. Enter the appropriate formula below in cell B6.

1. For a left tailed test: =NORM.S.DIST(B5, TRUE)

2. For a right tailed test: =1-NORM.S.DIST(B5, TRUE)

3. For a two-tailed test: =2*(1-NORM.S.DIST(ABS(B5), TRUE))
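
The test statistic and the three p-value formulas translate directly to Python, where `NormalDist().cdf` plays the role of NORM.S.DIST(z, TRUE). The sample proportion, hypothesized proportion, and sample size below are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

p_hat, p0, n = 0.56, 0.50, 100  # hypothetical inputs for cells B1, B2, B3

# Test statistic, as in =(B1-B2)/SQRT(B2*(1-B2)/B3).
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

phi = NormalDist().cdf  # standard normal cdf
p_left = phi(z)
p_right = 1 - phi(z)
p_two = 2 * (1 - phi(abs(z)))
print(round(z, 4), round(p_left, 4), round(p_right, 4), round(p_two, 4))
```

The same three p-value expressions apply to any z statistic, including the one computed in the z-Test for a mean.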

### z-Test

1. Enter the summary statistics in cells B1 through B4 as follows.

1. In cell B1 enter the sample mean, $\bar{x}$.

2. In cell B2 enter the population mean, µ.

3. In cell B3 enter the population standard deviation, σ.

4. In cell B4 enter the sample size, n.

2. Calculate the test statistic (z).

1. In cell B6 enter: =(B1-B2)/(B3/SQRT(B4))

3. Calculate the p-value. Enter the appropriate formula below in cell B7.

1. For a left tailed test: =NORM.S.DIST(B6, TRUE)

2. For a right tailed test: =1-NORM.S.DIST(B6, TRUE)

3. For a two-tailed test: =2*(1-NORM.S.DIST(ABS(B6), TRUE))

### t-Test

1. Enter the summary statistics in cells B1 through B5 as follows.

1. In cell B1 enter the sample mean, $\bar{x}$.

2. In cell B2 enter the population mean, µ.

3. In cell B3 enter the sample standard deviation, $s$.

4. In cell B4 enter the sample size, $n$.

5. In cell B5 enter the alpha level, $\alpha$.

2. Calculate the test statistic (t).

1. In cell B7 enter: =(B1-B2)/(B3/SQRT(B4))

3. Compute the t critical value with n−1 degrees of freedom.

1. For a left tailed test: =T.INV(B5, B4-1)

2. For a right tailed test: =T.INV(1-B5, B4-1)

3. For a two-tailed test: =T.INV.2T(B5, B4-1)

4. Compute the p-value with n−1 degrees of freedom. The example shown below is a one-tailed test.

1. For a one-tailed test: =T.DIST.RT(ABS(B7), B4-1)

2. For a two-tailed test: =T.DIST.2T(ABS(B7), B4-1)

### Two Sample t-Test (Independent Samples)

1. Enter the statistics for each variable in cells B2 through C4. Enter a value for alpha in cell B5.

2. Compute the test statistic by entering the following formula in cell B7.

1. If assuming equal variances: =(B2-C2-0)/(SQRT(((B4-1)*B3^2+(C4-1)*C3^2)/(B4+C4-2))*SQRT(1/B4+1/C4))

2. If assuming unequal variances: =(B2-C2-0)/SQRT((B3^2/B4)+(C3^2/C4))

Note that the '0' in the formulas above is the presumed value of the difference between the two population means from the null hypothesis.

3. Compute the t critical value with n1 + n2 − 2 degrees of freedom.

1. For a left tailed test: =T.INV(B5, B4+C4-2)

2. For a right tailed test: =T.INV(1-B5, B4+C4-2)

3. For a two-tailed test: =T.INV.2T(B5, B4+C4-2)

4. Compute the p-value with n1 + n2 − 2 degrees of freedom. The example shown below is a one-tailed test.

1. For a one-tailed test: =T.DIST.RT(ABS(B7), B4+C4-2)

2. For a two-tailed test: =T.DIST.2T(ABS(B7), B4+C4-2)
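
The two test statistic formulas in step 2 differ only in the standard error. The Python sketch below computes both for a set of invented summary statistics. Critical values and p-values are left to Excel's T.INV and T.DIST functions, since the Python standard library has no t distribution.

```python
from math import sqrt

def pooled_t(m1, s1, n1, m2, s2, n2, diff0=0.0):
    """t statistic assuming equal population variances."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return (m1 - m2 - diff0) / (sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2))

def unpooled_t(m1, s1, n1, m2, s2, n2, diff0=0.0):
    """t statistic without assuming equal population variances."""
    return (m1 - m2 - diff0) / sqrt(s1**2 / n1 + s2**2 / n2)

# Hypothetical means, standard deviations, and sample sizes.
t_equal = pooled_t(50, 4, 10, 46, 5, 12)
t_unequal = unpooled_t(50, 4, 10, 46, 5, 12)
print(round(t_equal, 4), round(t_unequal, 4))
```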

### Two Sample t-Test (Dependent Samples, Paired Difference)

1. Make sure you have the Data Analysis Toolpak Add-in installed.

2. Enter the data for Variable 1 in column A and the data for Variable 2 in column B.

3. Under the Data tab, select the Data Analysis option. In the dialogue box that appears, select the t-Test: Paired Two Sample for Means option and click OK.

4. Enter the Variable 1 Range, the Variable 2 Range, and the Hypothesized Mean Difference.

Note that Excel calculates the paired differences by subtracting the values for Variable 2 from the values for Variable 1, which is the opposite of what we do when we calculate them by hand or using a TI-83/84 Plus calculator.

5. Select Labels if the first cells in your variable ranges are data labels. Enter a value for Alpha. Select an output option:

1. Output Range: A cell within your spreadsheet where Excel will enter the results.

2. New Worksheet Ply: The name of a new worksheet where Excel will enter the results.

3. New Workbook: Excel will enter the results in a new workbook.

6. Click OK.

### Two Sample z-Test

1. Make sure you have the Data Analysis Toolpak Add-in installed.

2. Enter the data for Variable 1 in column A and the data for Variable 2 in column B.

3. Under the Data tab, select the Data Analysis option. In the dialogue box that appears, scroll down to select the z-Test: Two Sample for Means option and click OK.

4. Enter the Variable 1 Range, the Variable 2 Range, the Hypothesized Mean Difference, the Variable 1 Variance (known), and the Variable 2 Variance (known).

5. Select Labels if the first cells in your variable ranges are data labels. Enter a value for Alpha. Select an output option:

1. Output Range: A cell within your spreadsheet where Excel will enter the results.

2. New Worksheet Ply: The name of a new worksheet where Excel will enter the results.

3. New Workbook: Excel will enter the results in a new workbook.

6. Click OK.

### Two Proportion z-Test

1. Enter the statistics for each variable in cells B2 through C3.

2. Calculate ${\stackrel{^}{p}}_{1}$ and ${\stackrel{^}{p}}_{2}$.

1. In cell B4 enter: =B2/B3

2. In cell C4 enter: =C2/C3

3. Calculate $\stackrel{_}{p}$, $1-\stackrel{_}{p}$ and the test statistic (z).

1. In cell B6 enter: =(B2+C2)/(B3+C3)

2. In cell B7 enter: =1-B6

3. In cell B8 enter: =(B4-C4-0)/SQRT(B6*B7*(1/B3+1/C3))

Note that the '0' in the formula above is the presumed value of the difference between the two population proportions from the null hypothesis.

4. Calculate the p-value. Enter the appropriate formula below in cell B9.

1. For a left tailed test: =NORM.S.DIST(B8, TRUE)

2. For a right tailed test: =1-NORM.S.DIST(B8, TRUE)

3. For a two-tailed test: =2*(1-NORM.S.DIST(ABS(B8), TRUE))
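The spreadsheet steps above can be cross-checked with a minimal Python sketch. It assumes, as the formulas in cells B4 and C4 suggest, that row 2 holds the number of successes and row 3 the sample size for each variable. The function name and the counts in the usage line are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2, tail="two"):
    """Pooled two-proportion z-test with a hypothesized difference of 0."""
    p1_hat = x1 / n1                   # cell B4
    p2_hat = x2 / n2                   # cell C4
    p_bar = (x1 + x2) / (n1 + n2)      # cell B6, pooled proportion
    q_bar = 1 - p_bar                  # cell B7
    z = (p1_hat - p2_hat - 0) / sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))  # cell B8
    cdf = NormalDist().cdf             # plays the role of NORM.S.DIST(z, TRUE)
    if tail == "left":
        p_value = cdf(z)
    elif tail == "right":
        p_value = 1 - cdf(z)
    else:
        p_value = 2 * (1 - cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 45 successes out of 100 versus 30 out of 100
z, p = two_proportion_z_test(45, 100, 30, 100)
print(round(z, 4), round(p, 4))
```

The `tail` argument selects among the same three p-value formulas listed for cell B9.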

### Two Sample F-Test

1. Enter the statistics for each variable in cells B2 through C3. Additionally enter an alpha value in cell B4.

2. Compute test statistic (F).

1. In cell B6 enter: =B2/C2

3. Compute the F critical value(s). Enter the appropriate formula below in cell B7.

1. For a left tailed test: =F.INV(B4,B3-1,C3-1)

2. For a right tailed test: =F.INV.RT(B4,B3-1,C3-1)

3. For a two-tailed test, calculate both the left and right critical values (using Excel formulas shown above) using half of the original alpha value for each.

4. Calculate the p-value. Enter the appropriate formula below in cell B8. The example shown below is a left tailed test.

1. For a left tailed test: =F.DIST(B6,B3-1,C3-1, TRUE)

2. For a right tailed test: =F.DIST.RT(B6,B3-1,C3-1)

## Nonparametrics

### Kruskal-Wallis Test

Note: These instructions assume 3 samples with sizes of 6, 5, and 7. Adjust the rows and columns in the instructions to fit your data.

1. Put the data for the samples in columns A, C, and E (leaving blank columns in between the samples where we will put the ranks in the next step).

2. Determine the ranks (from smallest to largest) for each of the three samples and put them in columns B, D, and F, respectively.

3. Compute the sum of the ranks for the first sample, R1, in column B by entering "=SUM(B2:B7)" in cell B10. Do the same thing for the ranks of the second and third samples in columns D and F.

4. Enter the sample size for the first sample, n1, in cell B11 under the value for R1. Do the same thing for the sample sizes for the other two samples in cells D11 and F11.

5. Enter the squared sum of ranks divided by the sample size, $R_1^2/n_1$, for the first sample by typing '=(B10^2)/B11' in cell B12. Do the same thing for the other two samples in cells D12 and F12.

6. Enter the total number of observations in the samples, N, in cell B14 by typing '=SUM(B11,D11,F11)'.

7. In cell B16, type "=12/(B14*(B14+1))*SUM(B12,D12,F12)-3*(B14+1)" to calculate the test statistic.

8. Press ENTER.

9. The value of the test statistic for the Kruskal-Wallis test is shown in cell B16.
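As a cross-check on the worksheet, the same statistic can be sketched in Python. This is an illustration only: the helper names and the sample values are hypothetical, and ties receive the average rank (the same convention as Excel's RANK.AVG).

```python
def rank_avg(values):
    """Rank from smallest to largest; tied values share their average rank."""
    ordered = sorted(values)
    # first 1-based position of v, plus half the number of extra tied copies
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def kruskal_wallis_h(samples):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)."""
    pooled = [v for sample in samples for v in sample]
    ranks = rank_avg(pooled)           # ranks are taken over all samples combined
    n_total = len(pooled)              # cell B14
    total = 0.0
    start = 0
    for sample in samples:
        rank_sum = sum(ranks[start:start + len(sample)])  # R_i (row 10)
        total += rank_sum ** 2 / len(sample)              # R_i^2 / n_i (row 12)
        start += len(sample)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical samples with sizes 6, 5, and 7
sample_1 = [23, 41, 54, 66, 78, 35]
sample_2 = [45, 55, 60, 70, 72]
sample_3 = [18, 30, 34, 40, 44, 50, 28]
print(kruskal_wallis_h([sample_1, sample_2, sample_3]))
```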

### Sign Test

Note: These instructions assume a sample size greater than 25.

1. Enter the values for X and the sample size into cells A1 and A2, respectively.

2. To calculate the test statistic, in cell B3, input the formula "=(A1+0.5-A2/2)/(SQRT(A2)/2)".

3. Press ENTER.

4. To calculate the p-value, in cell A4, input the formula "=NORM.S.DIST(B3,1)".

5. Press ENTER.
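The two formulas above amount to a normal approximation with a continuity correction, which can be sketched in Python as follows. The function name and the example values are hypothetical. Like the Excel formula in step 4, the p-value returned is the area to the left of the test statistic.

```python
from math import sqrt
from statistics import NormalDist

def sign_test_z(x, n):
    """Large-sample sign test: z statistic and left-tail p-value."""
    z = (x + 0.5 - n / 2) / (sqrt(n) / 2)   # cell B3
    p_left = NormalDist().cdf(z)            # cell A4, like NORM.S.DIST(z, 1)
    return z, p_left

# Hypothetical example: x = 10 with a sample size of 36
z, p = sign_test_z(10, 36)
print(z, round(p, 4))
```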

### Spearman Rank Correlation Test

1. Put the data for x and y in columns A and B, respectively.

2. Calculate the rank for each data value in the first column (A) by typing "=RANK.AVG(A2,$A$2:$A$13,1)" in cell C2. Click the small square at the lower-right corner of the cell and drag it down to the bottom row so that the ranks are calculated for all of the values in the first column. Repeat this procedure for ranking the second column (B), "=RANK.AVG(B2,$B$2:$B$13,1)" in cell D2.

3. Compute the squared differences by entering "=(C2-D2)^2" in cell E2. Click the small square at the lower-right corner of the cell and drag it down so that the formula is replicated for all the rows in the table.

4. In a new cell, type "=1-((6*SUM(E2:E13))/(12*(12^2-1)))" to calculate Spearman's rho. Note: If your data do not have 12 data points, adjust the range E2:E13 and the sample size in the denominator accordingly.

5. Press ENTER.

6. The value of Spearman's rho is shown.
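The ranking and the rho formula can also be sketched in Python to check a worksheet result. The helper names and data below are hypothetical. The sketch uses the same shortcut formula as step 4 and takes the sample size from the data rather than hard-coding 12.

```python
def rank_avg(values):
    """Rank from smallest to largest; tied values share their average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def spearman_rho(x, y):
    """rho = 1 - 6 * sum(d^2) / (n (n^2 - 1)), with d the rank differences."""
    n = len(x)
    ranks_x = rank_avg(x)                    # column C
    ranks_y = rank_avg(y)                    # column D
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))  # column E
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

# Hypothetical paired data
x = [3.1, 4.7, 2.2, 5.9, 4.0]
y = [12, 15, 9, 20, 16]
print(spearman_rho(x, y))
```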

### Wilcoxon Rank-Sum Test

Note: These instructions assume samples of size 10. Adjust the number of rows in the instructions to fit your data.

1. Put the data for the samples in columns A and B. Put the name of each sample in the first row and put the data values corresponding to each sample below the column labels.

2. Calculate the rank for each of the data values by typing "=RANK.AVG(A2,$A$2:$B$11,1)" in cell C2. Click the small square at the lower-right corner of the cell and drag it down to the bottom row and to the right by one so that the ranks are calculated for all of the values in the first two columns.

3. In cell C13, sum the ranks of the first sample by entering "=SUM(C2:C11)". Do the same for the second sample by entering "=SUM(D2:D11)" in cell D13.

4. Count the sample size by entering "=COUNT(A2:A11)" in cell C14. Do the same for the second sample by entering "=COUNT(B2:B11)" in cell D14.

5. Since our test is two-tailed and n1 ≤ 10, we know T is the rank sum of the sample with the fewest members. Since both the sample sizes in our example are the same, we will arbitrarily choose the rank sum for the sample in column A. In cell B16, enter the label "T" and then in cell C16 enter the value of T, which is 110 in this example.

6. Compare this value with the values for $T_L$ and $T_U$ found in the table for critical values of the Wilcoxon Rank-Sum Test.
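The rank sums in cells C13 and D13 can be reproduced with a short Python sketch. The function names and data are hypothetical. As in step 2, the ranks are computed over both samples combined.

```python
def rank_avg(values):
    """Rank from smallest to largest; tied values share their average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def rank_sums(sample_a, sample_b):
    """Rank the pooled data, then sum the ranks belonging to each sample."""
    pooled = list(sample_a) + list(sample_b)
    ranks = rank_avg(pooled)
    sum_a = sum(ranks[:len(sample_a)])       # cell C13
    sum_b = sum(ranks[len(sample_a):])       # cell D13
    return sum_a, sum_b

# Hypothetical samples
sum_a, sum_b = rank_sums([12, 15, 11, 19], [14, 22, 25, 17])
print(sum_a, sum_b)
```

The two rank sums always add up to N(N + 1)/2, where N is the total number of observations, which is a quick way to catch a ranking mistake.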

### Wilcoxon Signed-Rank Test

Note: These instructions assume samples of size 9. Adjust the number of rows in the instructions to fit your data.

1. Put the data for the samples in columns A and B. Put the name of each sample in the first row and put the data values corresponding to each sample below the column labels.

2. Enter the title Difference in cell C1. Calculate the difference for each of the pairs by typing "=A2-B2" in cell C2. Click the small square at the lower-right corner of the cell and drag it down so that the formula is replicated for all the rows in the table.

3. Enter the title |Difference| in cell D1. Calculate the absolute value of the differences by typing "=ABS(C2)" in cell D2 and copy the formula down the column.

4. Enter the title Rank(|Difference|) in cell E1. Rank the nonzero absolute differences from smallest to largest (ignore any 0 values and assign the average rank to ties), and put the ranks in column E.

5. Enter the title Positive Ranks in cell F1. Determine the ranks associated with positive differences by typing "=IF(C2>0, E2, "")" in cell F2, and copy the formula down the column.

6. Enter the title Negative Ranks in cell G1. Determine the ranks associated with negative differences by typing "=IF(C2<0, E2, "")" in cell G2, and copy the formula down the column.

7. Calculate the test statistic in a new cell by entering the formula "=MIN(SUM(F2:F10),SUM(G2:G10))".

8. Calculate the sample size in a new cell by entering the formula "=COUNT(F2:G10)".

9. Use the sample size and desired significance level α to look up the critical value for the Wilcoxon Signed-Rank Test in the appropriate table. Compare this to the test statistic calculated in Step 7.
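The column-by-column procedure above can be sketched in Python as a cross-check. The function names and data are hypothetical. Zero differences are dropped before ranking, so the returned sample size counts only nonzero differences, matching the COUNT formula in step 8.

```python
def rank_avg(values):
    """Rank from smallest to largest; tied values share their average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def signed_rank_statistic(sample_a, sample_b):
    """Return (test statistic, number of nonzero differences)."""
    diffs = [a - b for a, b in zip(sample_a, sample_b) if a != b]  # column C, zeros dropped
    ranks = rank_avg([abs(d) for d in diffs])                      # columns D and E
    positive = sum(r for d, r in zip(diffs, ranks) if d > 0)       # column F total
    negative = sum(r for d, r in zip(diffs, ranks) if d < 0)       # column G total
    return min(positive, negative), len(diffs)

# Hypothetical paired data
statistic, n = signed_rank_statistic([10, 12, 9, 15, 11], [8, 13, 9, 11, 14])
print(statistic, n)
```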

## Normal Distribution

### Inverse Normal

#### Standard Normal

1. Type "=NORM.S.INV(probability)".

2. Press ENTER.

3. The z-score with area to the left equal to the probability entered is returned.

#### Non-standard Normal

1. Type "=NORM.INV(probability, mean, standard deviation)".

2. Press ENTER.

3. The x-value with area to its left equal to the probability entered is returned.

### Normal Probability (cdf)

#### Standard Normal

1. Type "=NORM.S.DIST(z, cumulative)". Set cumulative to TRUE for the cdf or FALSE for the pdf.

2. Press ENTER.

3. The area to the left of the z-score is returned.

#### Non-standard Normal

1. Type "=NORM.DIST(x, mean, standard deviation, cumulative)". Set cumulative to TRUE for the cdf or FALSE for the pdf.

2. Press ENTER.

3. The area to the left of the x-value is returned.
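For readers who want to verify a result outside Excel, Python's standard library has a `NormalDist` class with the same left-area convention. The sketch below pairs each call with the Excel function it corresponds to. The mean of 100 and standard deviation of 15 are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

standard = NormalDist()                  # mean 0, standard deviation 1
area_left = standard.cdf(1.96)           # like =NORM.S.DIST(1.96, TRUE)
z_score = standard.inv_cdf(0.975)        # like =NORM.S.INV(0.975)

scores = NormalDist(mu=100, sigma=15)    # hypothetical non-standard normal
area_below_115 = scores.cdf(115)         # like =NORM.DIST(115, 100, 15, TRUE)
x_value = scores.inv_cdf(0.90)           # like =NORM.INV(0.90, 100, 15)

print(round(area_left, 4), round(z_score, 4))
print(round(area_below_115, 4), round(x_value, 2))
```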

## Poisson Distribution

### Poisson Probability Distribution

1. Type the column headers. In cell A1, type "x" as the label of the first column. In cell B1, type "Probability" as the label of the second column.

2. In column A, starting with "0" in cell A2, type whole numbers for the value of the discrete random variable.

3. The probabilities for the Poisson distribution are calculated using the Excel formula "=POISSON.DIST(x, mean, cumulative)" where

x = the number of successes in the sample,

mean = the mean number of successes, and

cumulative = a logical value that is either TRUE (Excel returns the cumulative probability, cdf) or FALSE (Excel returns the probability that there are exactly x successes, pdf).

Note: You may enter a 1 in place of TRUE and a 0 in place of FALSE. In cell B2 type the formula "=POISSON.DIST(A2,0.3, FALSE)" to calculate the probability of exactly zero successes for a distribution with a mean of 0.3. Press ENTER.

4. Click the lower-right corner of cell B2 and drag down the column to populate the remaining cells in column B.

### Poisson Probability (cdf)

1. Use the formula "=POISSON.DIST(x, mean, TRUE)". Press Enter to calculate.

Note: You may enter a 1 in place of TRUE.

### Poisson Probability (pdf)

1. Use the formula "=POISSON.DIST(x, mean, FALSE)". Press Enter to calculate.

Note: You may enter a 0 in place of FALSE.
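Both forms of POISSON.DIST follow directly from the Poisson formula, which the following Python sketch spells out. The function names are hypothetical. The mean of 0.3 is the value used in the probability distribution example above.

```python
from math import exp, factorial

def poisson_pmf(x, mean):
    """Probability of exactly x successes, like POISSON.DIST(x, mean, FALSE)."""
    return exp(-mean) * mean ** x / factorial(x)

def poisson_cdf(x, mean):
    """Probability of at most x successes, like POISSON.DIST(x, mean, TRUE)."""
    return sum(poisson_pmf(k, mean) for k in range(x + 1))

# Rebuild the x / Probability table for x = 0, 1, ..., 4 with a mean of 0.3
for x in range(5):
    print(x, round(poisson_pmf(x, 0.3), 4), round(poisson_cdf(x, 0.3), 4))
```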

## Regression

### Confidence Intervals for Slope and y-Intercept

1. Make sure you have the Data Analysis Toolpak Add-in installed.

2. Organize the data into two columns. Enter the independent variable (X) in the first column and the dependent variable (Y) in the second column.

3. Under the Data tab, select the Data Analysis option. In the dialogue box that appears, select the Regression option and click OK.

4. Enter the Input Y Range and the Input X Range. Check the Labels box if you included the variable labels in the input range. Check the Confidence Level box and enter the desired numeric value for confidence level percentage. Check the Output Range and select the first cell in an available row. Then select OK.

5. In the example regression output, the Lower 95.0% and the Upper 95.0% columns give the lower and upper endpoints of the 95% confidence intervals for the y-intercept and slope for the independent variable Age (Years).

### Correlation Coefficient

1. Organize your data into two columns with each row representing an ordered pair.

2. In a separate cell, use the formula "=CORREL(array 1, array 2)" where the arrays are the x and y variable columns of your data, respectively. Press Enter to calculate.

### Coefficient of Determination

#### Simple Linear Regression

1. Organize your data into two columns with each row representing an ordered pair.

2. In a separate cell, use the formula "=RSQ(array 1, array 2)". In this case, the array for the y variable comes first and the array for the x variable is second. Press Enter to calculate.
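For simple linear regression, the coefficient of determination is the square of the correlation coefficient, so CORREL and RSQ can be checked together. The Python sketch below computes both from their definitions. The function name and data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample correlation coefficient, the quantity CORREL returns."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical ordered pairs
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(x, y)
r_squared = r ** 2        # the quantity RSQ returns for the same two columns
print(round(r, 4), round(r_squared, 4))
```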

#### Multiple Regression

1. Make sure you have the Data Analysis Toolpak Add-in installed.

2. Enter your response values into the first column and each of your predictor values in the next columns. Label your columns.

3. Under the Data tab, select the Data Analysis option. In the dialogue box that appears, select the "Regression" option and click OK.

4. Enter the Input Y Range and the Input X Range. Check the Labels box, check the Output Range and select the first cell in an available row. Then select OK. (Note the X-variables must be in contiguous columns.)

### Simple Linear Regression

1. Make sure you have the Data Analysis Toolpak Add-in installed.

2. Organize the data into two columns. Enter the independent variable in the first column (left) and the dependent variable in the second column (right).

3. Under the Data tab, select the Data Analysis option. In the dialogue box that appears, select the Regression option and click OK.

4. Select the second column for the Y values, and the first column for the X values. If you have labels in the first cell of the columns, check the box that says Labels. Choose a desired Output location, and check the box at the bottom that says Line Fit Plots.
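The slope and intercept that appear in the Coefficients column of the regression output are the least-squares estimates, which can be sketched in Python as a check. The function name and data are hypothetical. The sketch reproduces only the fitted line, not the rest of the output table.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: independent variable first, dependent variable second
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = least_squares_line(x, y)
print(round(slope, 4), round(intercept, 4))
```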

## t-Distribution

### Inverse t

#### Area in one tail

1. Use the formula "=T.INV(probability, degrees of freedom)". Press Enter to calculate.

#### Area in two tails

1. Use the formula "=T.INV.2T(probability, degrees of freedom)". Press Enter to calculate. The result gives the positive t-value with half the given area to its right. Note that the area in two tails is one minus the area between symmetric t-values.

### t-Probability (cdf)

1. Type "=T.DIST(x, degrees of freedom, cumulative)". Set cumulative to TRUE for the cdf or FALSE for the pdf.

2. Press ENTER.

3. The area to the left of the x-value is returned.

## Data Manipulation

### Sorting

1. Enter the values in a single column.

2. Select the data, right-click, and select Sort > Sort Smallest to Largest or Sort > Sort Largest to Smallest. Another option is to select the data, and under Home > Editing choose Sort & Filter > Sort Smallest to Largest or Sort & Filter > Sort Largest to Smallest.

### Filtering

1. Reserve the first row for headers (or leave it empty), then enter the data in a column. (If your data covers multiple parameters, enter the data for each parameter in a separate column.)

2. Select the header/empty row and under Home > Editing click Sort & Filter > Filter. (Alternatively, navigate to the Data tab and click Filter under Sort & Filter.)

3. Click the small arrow in the column you wish to filter by to access your filtering options.

1. A list will be shown of all the unique values in the column (up to 10,000). When the checkbox for a value is selected, the rows where this column has that value will be displayed. You can select any number of these checkboxes at any time.

2. Depending on the type of data, Number Filters or Text Filters are also available. These are useful for selecting all values that fit certain criteria.

### Subset Calculations

Say you have the numbers 1, 2, 3, …, 20 in a column of data and you have filtered this column to only include the even values (2, 4, 6, …, 20).

If you sum these values by typing =SUM( and selecting the visible range, you will end up with =SUM(A3:A21), which equals 209 because it also includes the non-visible cells. To avoid this issue, use the SUBTOTAL function: =SUBTOTAL(109,A3:A21) equals 110, the sum of only the visible cells. The 109 is the code for the SUM function applied to visible cells. SUBTOTAL has several such functions for visible subsets built in.

101 – AVERAGE

102 – COUNT

103 – COUNTA

104 – MAX

105 – MIN

106 – PRODUCT

107 – STDEV.S

108 – STDEV.P

109 – SUM

110 – VAR.S

111 – VAR.P
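The difference between the two totals in this example can be confirmed with a few lines of Python. The sketch assumes the numbers 1 through 20 sit in cells A2:A21, so that the range A3:A21 holds 2 through 20. The variable names are hypothetical.

```python
values = list(range(1, 21))                   # 1, 2, ..., 20 in cells A2:A21
visible = [v for v in values if v % 2 == 0]   # rows the filter leaves visible

plain_sum = sum(values[1:])     # like =SUM(A3:A21): hidden odd values are included
visible_sum = sum(visible)      # like =SUBTOTAL(109, A3:A21): visible cells only

print(plain_sum, visible_sum)
```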

## Sampling

### Random Samples

1. In cell A1 type "=RANDBETWEEN(1, 897)" for example. 1 represents the smallest possible number to generate and 897 the largest.

2. Place the cursor over the bottom right-hand corner of cell A1, click and drag the box down as many rows as random numbers you desire.
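A comparable list of random integers can be generated in Python. In this minimal sketch the function name and the optional `seed` argument are hypothetical additions. `randint` includes both endpoints, as RANDBETWEEN does, and either approach may return the same number more than once.

```python
import random

def random_integers(count, low=1, high=897, seed=None):
    """Return `count` random integers between low and high, inclusive."""
    rng = random.Random(seed)    # pass a seed to make the list repeatable
    return [rng.randint(low, high) for _ in range(count)]

print(random_integers(10))
```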

## Time Series

1. Enter the values for the time period and the response variable in columns A and B, respectively, and label them in row 1 of the spreadsheet.

2. Create the columns named Forecast, Trend, and Adjusted Forecast in cells C1, D1, and E1. Enter the initial forecast in cell C2 as 9, trend in cell D2 as 0, and adjusted forecast in cell E2 as 9.

3. Using the formula for the forecast, $F_2 = \alpha D_1 + (1 - \alpha) AF_1$, write a formula with α = 0.3 in cell C3 using the cell reference B2 in place of $D_1$ and E2 in place of $AF_1$, as shown below. Press Enter.

4. The forecast for February is displayed as 9.

5. Using the formula for trend, $T_t = T_{t-1} + \beta (F_t - AF_{t-1})$, write a formula with β = 0.2 in cell D3 using cell reference D2 in place of $T_{t-1}$ and C3 and E2 in place of $F_t$ and $AF_{t-1}$, respectively. Press Enter.

6. The trend for February is 0.

7. In cell E3, enter "=SUM(C3:D3)" to find the adjusted forecast for February and press Enter.

8. The adjusted forecast for February is 9.

9. We have performed the calculation for the first row, which needs to be repeated for all other rows. Select the cells from C3 to E3 and double-click on the bottom-right corner of E3, which will autofill the values to E7, as shown below.

10. The adjusted forecast for June in this example is 11.81.
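The row-by-row calculation above can be sketched in Python. The smoothing constants (α = 0.3, β = 0.2) and the initial forecast of 9 come from the steps. The function name and the demand values are hypothetical, so the sketch will not reproduce the 11.81 from the example worksheet.

```python
def trend_adjusted_forecasts(demand, alpha, beta, initial_forecast, initial_trend=0.0):
    """Return the Forecast, Trend, and Adjusted Forecast columns as lists."""
    forecast = [initial_forecast]                   # cell C2
    trend = [initial_trend]                         # cell D2
    adjusted = [initial_forecast + initial_trend]   # cell E2
    for t in range(1, len(demand)):
        f = alpha * demand[t - 1] + (1 - alpha) * adjusted[t - 1]   # column C
        tr = trend[t - 1] + beta * (f - adjusted[t - 1])            # column D
        forecast.append(f)
        trend.append(tr)
        adjusted.append(f + tr)                                     # column E
    return forecast, trend, adjusted

# Hypothetical demand for six periods (January through June)
demand = [9, 12, 10, 14, 13, 15]
forecast, trend, adjusted = trend_adjusted_forecasts(demand, 0.3, 0.2, 9)
for row in zip(forecast, trend, adjusted):
    print([round(value, 2) for value in row])
```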

### Mean Absolute Percentage Error (MAPE)

1. Enter the values for the Period (t), Actual Data (Di), and Forecast (Fi) in Columns A, B, and C, respectively.

2. Calculate the Absolute Error, |Di-Fi|. In cell D2, enter "=ABS(B2-C2)". Press Enter.

3. Double-click on the bottom-right corner of cell D2 so that the formula gets auto-applied on the remaining cells (D3-D7), as shown below.

4. Calculate the Absolute Percentage Error, 100*|Di-Fi|/Di. In cell E2, enter "=100*(D2/B2)". Press Enter.

5. Double-click on the bottom-right corner of cell E2 so that the formula gets auto-applied on the remaining cells (E3-E7), as shown below.

6. In cell E9, enter "=AVERAGE(E2:E7)". Press Enter.

7. The MAPE for this example is 2.8939.

### Mean Squared Error (MSE)

1. Enter the values for Period (t), Actual Data (Di), and Forecast (Fi) in Columns A, B, and C, respectively.

2. Calculate the Absolute Error, |Di-Fi|. In cell D2, enter "=ABS(B2-C2)". Press Enter.

3. Double-click on the bottom-right corner of cell D2 so that the formula gets auto-applied on the remaining cells (D3-D7), as shown below.

4. Now, calculate the Squared Error for each period by entering "=D2^2" in cell E2. Press Enter.

5. Double-click on the bottom-right corner of cell E2 so that the formula gets auto-applied on the remaining cells (E3-E7), as shown below.

6. In cell E9, enter "=AVERAGE(E2:E7)". Press Enter.

7. The result for the MSE in this example is 35.3333.
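The MAPE and MSE calculations above follow the same pattern (a per-period error column, then an average), which the Python sketch below mirrors. The function names and data are hypothetical, so the printed values will not match the 2.8939 and 35.3333 from the example worksheets.

```python
def mape(actual, forecast):
    """Mean absolute percentage error: average of 100 * |D - F| / D."""
    errors = [100 * abs(d - f) / d for d, f in zip(actual, forecast)]   # column E
    return sum(errors) / len(errors)

def mse(actual, forecast):
    """Mean squared error: average of (D - F)^2."""
    errors = [(d - f) ** 2 for d, f in zip(actual, forecast)]           # column E
    return sum(errors) / len(errors)

# Hypothetical actual data and forecasts for six periods
actual = [120, 130, 125, 140, 135, 150]
forecast = [118, 127, 131, 136, 139, 144]
print(round(mape(actual, forecast), 4), round(mse(actual, forecast), 4))
```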

### Simple Exponential Smoothing

#### Method 1

1. Enter the values for the time period and the response variable in Columns A and B, respectively, and label them in row 1 of the spreadsheet. Additionally, enter the weight or smoothing constant (α) in cell B9.

2. Create a column C labelled Forecast for each month. Enter the initial forecast (in this example 9) in cell C2.

3. In cell C3, enter "=$B$9*B2+(1-$B$9)*C2". Press Enter to calculate.

4. Click and drag the bottom right corner of cell C3 to populate the remaining values in column C, including the forecasted value of the last time period.
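The recurrence entered in cell C3 can be sketched in Python as follows. The initial forecast of 9 comes from step 2. The function name, the demand values, and α = 0.3 in the usage line are assumptions for illustration.

```python
def exponential_smoothing(demand, alpha, initial_forecast):
    """Return one forecast per period, starting from the initial forecast."""
    forecasts = [initial_forecast]                 # cell C2
    for d in demand[:-1]:
        # Next forecast = alpha * previous actual + (1 - alpha) * previous forecast
        forecasts.append(alpha * d + (1 - alpha) * forecasts[-1])
    return forecasts

# Hypothetical demand for six periods
demand = [9, 12, 10, 14, 13, 15]
print([round(f, 3) for f in exponential_smoothing(demand, 0.3, 9)])
```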

#### Method 2

1. Make sure you have the Data Analysis Toolpak Add-in installed.

2. Enter the values for the time period and the response variable in Columns A and B, respectively, and label them in row 1 of the spreadsheet. In cell C1, insert the label "Forecast".

3. Under the Data tab, select the Data Analysis option. In the dialogue box that appears, select the Exponential Smoothing option, and press OK.

4. Provide the Input Range (in this example, cells B2:B7), enter the Damping factor, 1-α (in this example, α = 0.3, so the damping factor is 0.7), and provide the Output Range (in this example, cells C2:C7). Press OK.

### Simple Moving Average

1. Enter the values for the time period and the response variable in Columns A and B, respectively, and label them in row 1 of the spreadsheet.

2. To calculate a 3-month moving average for June, enter the formula "=AVERAGE(B4:B6)" in cell C7. Press Enter to calculate.

Note: For a different time period, change the cell range accordingly inside the Average function.

### Weighted Moving Average

1. Enter the values for the time period and the response variable in Columns A and B, respectively, and label them in row 1 of the spreadsheet. Create a column for Weight in column C as shown below. (Note that the weights in Column C should add to 1.)

2. In cell D7, enter "=SUMPRODUCT(B4:B6,C4:C6)" and press Enter.

3. The desired Weighted Moving Average for June for a period of 3 will appear. For this example the weighted moving average is 11.3.
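SUMPRODUCT multiplies each value by its weight and adds the results, which the Python sketch below spells out. The function name, values, and weights are hypothetical, so the output differs from the 11.3 in the example. With equal weights of 1/3, the same function gives the simple 3-period moving average.

```python
def weighted_moving_average(values, weights):
    """Sum of value * weight; the weights are expected to add to 1."""
    if abs(sum(weights) - 1) > 1e-9:
        raise ValueError("weights should add to 1")
    return sum(v * w for v, w in zip(values, weights))

# Hypothetical values for the three most recent periods (oldest first)
recent = [10, 12, 11]
print(weighted_moving_average(recent, [0.2, 0.3, 0.5]))
print(weighted_moving_average(recent, [1 / 3, 1 / 3, 1 / 3]))
```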

### Mean Absolute Deviation

1. Enter the values for Period (t), Actual Data (Di), and Forecast (Fi) in Columns A, B, and C, respectively.

2. In cell D1, type the header Absolute Error |Di - Fi | for column D. In cell D2, enter the formula "=ABS(B2-C2)". Press Enter.

3. Select cell D2 and double-click the bottom right corner of the cell to populate the remaining values in column D.

4. To compute the MAD, choose an empty cell and enter the formula "=AVERAGE(D2:D7)". Press Enter.