Choosing which statistical test to use.
There are many different tests you can use in statistics.
Sometimes it can be quite difficult to know which is the correct test to use.
This video will talk about seven tests you are likely to use,
involving means, proportions and relationships.
When you are trying to work out which is the most appropriate test
there are three questions you should ask:
1. What level of measurement was used for the data we are analyzing?
2. How many samples do we have?
3. What is the purpose of our analysis?
I will now explain each of these questions
1. Data or level of measurement
Is our data nominal or interval/ratio?
Nominal data is also called categorical, qualitative
or nonparametric
Examples of nominal data are color
whether parts are defective or not,
or preferred type of chocolate.
Nominal summary values are usually stated as frequencies, proportions or
percentages.
The tests that involve nominal data are:
Test for a proportion
Difference of two proportions
and chi-squared test for independence
The other type of data
is interval/ratio
also called quantitative
Examples of interval/ratio data are
daily sales figures for choconutties
weight of peanuts or temperature
The most common summary value for interval/ratio data is a mean.
Tests that involve interval/ratio data are:
Test for a mean
difference of two means - independent samples
difference of two means - paired
and regression analysis.
For more help on levels of measurement see our video:
"Types of data nominal, ordinal, interval/ratio"
Ordinal data can be classified with nominal or interval/ratio
depending on the circumstances.
2. Samples
Next we ask how many samples are involved
Is there one sample for which we are testing the relevant statistic
against a hypothesized value
or are there two samples
which are being compared with each other
or
is there one sample, but each observation has a measure or score
for more than one variable?
That is, the same sample is measured twice.
If we wish to compare a proportion or a mean against a given value,
this will involve one sample.
If we're comparing two different lots of people or things such as men and women
or people from two different departments
then we would have two samples.
If we have two sets of information on the same people or things
we would say we have one sample with two variables.
An example is one set of days and information on how many choconutties
are sold and what the temperature was.
Or - one set of people and information on their gender and preferred type of chocolate.
3. Purpose
Finally we ask:
What is the purpose of the analysis?
We can be testing against a hypothesized value,
comparing two statistics
or looking for a relationship.
Chi-squared test for independence and regression are similar
in that they are looking at the relationship between two variables
The difference between them is in the kind of data.
If you would summarize the data in a table,
you would use a chi-squared test for independence
whereas if you would put it on a scatter plot
you would use regression analysis.
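The three questions can be sketched as a small lookup table. This is only an illustration of the decision process described here: the function name `choose_test` and the label strings used as keys are assumptions made for this sketch, not part of the video.

```python
# Hypothetical lookup: (data level, samples, purpose) -> test name.
# The label strings are chosen for this sketch only.
TESTS = {
    ("nominal", "one sample", "against a value"):
        "test for a proportion",
    ("nominal", "two samples", "compare"):
        "difference of two proportions",
    ("nominal", "one sample, two variables", "relationship"):
        "chi-squared test for independence",
    ("interval/ratio", "one sample", "against a value"):
        "test for a mean",
    ("interval/ratio", "two samples", "compare"):
        "difference of two means - independent samples",
    ("interval/ratio", "one sample, two variables", "compare"):
        "difference of two means - paired",
    ("interval/ratio", "one sample, two variables", "relationship"):
        "regression analysis",
}


def choose_test(data, samples, purpose):
    """Return the test name for the three answers, or None if not covered."""
    return TESTS.get((data, samples, purpose))


print(choose_test("interval/ratio", "one sample, two variables", "relationship"))
```

Any combination of answers that is not one of the seven tests covered here returns `None`, which is a reminder that other tests exist.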
Here is an example for each of these tests.
They relate back to our other videos teaching about hypothesis testing.
After each description of the scenario pause the video
and see if you can identify the correct test before we tell you the answer.
Helen is still selling choconutties.
Example one:
sufficient nuts.
Helen was concerned whether the quantity of nuts was sufficient in her choconutties.
She took a sample of twenty packets and found the weight of nuts in
each packet
Pause the video
1. Data
The weight was interval/ratio data.
2. Samples
There was just one sample of twenty packets of choconutties.
3. Purpose. Helen was comparing against a given value.
Thus, the test she needs to use is Test for a mean.
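A test for a mean could be sketched as below, assuming the usual one-sample t statistic. The video gives neither the packet weights nor the target weight, so the numbers here are hypothetical and only show the shape of the calculation. The function name `one_sample_t` is also an assumption of this sketch.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for testing the mean against mu0."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)  # standard error of the sample mean
    return (mean(sample) - mu0) / se, n - 1


# HYPOTHETICAL nut weights in grams and a HYPOTHETICAL target of 50 g;
# the video does not state the data or the target value.
weights = [51.2, 49.8, 50.5, 48.9, 50.1, 49.5]
t, df = one_sample_t(weights, 50.0)
print(f"t = {t:.2f} on {df} degrees of freedom")
```

The t statistic would then be compared with a t distribution on n - 1 degrees of freedom, for example from a table, since the standard library has no t distribution.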
Example Two
Prize tickets
In a promotional campaign twenty percent of all packs of choconutties should
include tickets for free prizes.
Helen takes a sample of fifty packets and finds that seven of them
have winning tickets
Pause the video
1. Data. For each packet we are recording only yes or no, according to whether or not
there is a ticket.
This is nominal data from which we get a sample proportion of seven out of fifty
Or 0.14
2. Samples
There is one sample of fifty packets.
3. Purpose.
Helen is comparing the sample value against a given value: twenty percent
We conclude that the test she needs to use is test for a proportion.
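Using the figures from this example (seven winning tickets in fifty packets, against a claimed twenty percent), the calculation might look like the sketch below. It assumes the large-sample normal approximation and a two-sided test. The video does not say which version of the test it has in mind, so treat both as assumptions.

```python
from math import sqrt
from statistics import NormalDist

n = 50        # packets sampled (from the example)
winners = 7   # packets with a winning ticket (from the example)
p0 = 0.20     # claimed proportion of winning packets (from the example)

p_hat = winners / n                      # sample proportion, 0.14
se = sqrt(p0 * (1 - p0) / n)             # standard error under the claimed proportion
z = (p_hat - p0) / se                    # test statistic
p_value = 2 * NormalDist().cdf(-abs(z))  # two-sided p-value (normal approximation)

print(f"z = {z:.2f}, p-value = {p_value:.2f}")
```

The z value comes out at about -1.06, which is inside the range of plus or minus 1.96 used for a two-sided test at the five percent level.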
Example three
Bar longevity compared with nuttabars.
Helen thinks her choconutties last longer than the competition, nuttabars.
She gets 36 people to eat one of each, and records their eating times.
Pause now
1. Data. Helen collects times taken in seconds
so this is interval/ratio data.
2. Samples
There is one sample of thirty-six people but with two scores for each person
the time for the choconuttie and the time for the nuttabar.
3. Purpose
She is looking at whether there is a difference in the amount of time taken
for each of the bars.
Thus the test is difference of two means, paired sample.
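A paired test works on the difference between the two scores for each person. Here is a minimal sketch, assuming the usual paired t statistic. The eating times are hypothetical (the video does not list them), and only five people are shown rather than thirty-six. The function name `paired_t` is an assumption of this sketch.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired scores."""
    diffs = [a - b for a, b in zip(first, second)]  # one difference per person
    n = len(diffs)
    se = stdev(diffs) / sqrt(n)  # standard error of the mean difference
    return mean(diffs) / se, n - 1


# HYPOTHETICAL eating times in seconds for five people;
# the video does not give the recorded times.
choconuttie = [62, 58, 71, 66, 60]
nuttabar = [55, 59, 64, 61, 57]
t, df = paired_t(choconuttie, nuttabar)
print(f"t = {t:.2f} on {df} degrees of freedom")
```

Because each person eats both bars, the two columns are not independent, which is why the differences are analyzed as a single sample.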
Example four
Defective wrapping from two wrapping machines
Helen thinks there is a difference in performance between
the two wrapping machines in her factory. She checks 200 bars from
one machine and 150 bars from the other.
For each bar she is seeing if the wrapping is satisfactory or not
She finds that ten out of two hundred bars from the first machine
and nine out of 150 bars from the second machine
are badly wrapped.
Pause the video
1. Data. The information for each bar is OK or not OK.
This is nominal data.
It has been summarized as frequencies.
2. Samples. There are two independent samples,
one sample from each of the two machines
3. Purpose
Helen is comparing the proportions from the two samples
We can see that the test is
difference of two proportions.
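With the counts from this example (ten of 200 and nine of 150 badly wrapped), the comparison could be sketched as follows. The sketch assumes the pooled two-proportion z statistic with a normal approximation and a two-sided test. The video does not specify these details, so they are assumptions.

```python
from math import sqrt
from statistics import NormalDist

bad1, n1 = 10, 200  # badly wrapped bars and bars checked, first machine (from the example)
bad2, n2 = 9, 150   # badly wrapped bars and bars checked, second machine (from the example)

p1 = bad1 / n1                           # 0.05
p2 = bad2 / n2                           # 0.06
pooled = (bad1 + bad2) / (n1 + n2)       # overall proportion if the machines do not differ
se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se                       # test statistic
p_value = 2 * NormalDist().cdf(-abs(z))  # two-sided p-value (normal approximation)

print(f"p1 = {p1:.2f}, p2 = {p2:.2f}, z = {z:.2f}, p-value = {p_value:.2f}")
```

The z value is about -0.41, well inside plus or minus 1.96, so under these assumptions the sample proportions of five and six percent are not far enough apart to show a difference at the five percent level.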
Example five
Do stickers help sales?
Helen is exploring whether having free stickers makes a difference to sales.
She has the sales figures for thirteen days when she did offer free stickers
and ten days when she did not. Pause and decide on the test
Data. For each day Helen has a number or value corresponding to the sales for that day
This is interval/ratio data
It is summarized as a mean number of sales.
2. Samples
There are two samples one sample for days with stickers
and one sample for days without.
3. Purpose
Helen is comparing the average sales figures for the two treatments
we conclude that the test to use is...
Difference of two means independent samples
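For two independent samples, one possible calculation is sketched below. It uses Welch's version of the t statistic, which does not assume equal variances. That choice is an assumption of this sketch, since the video does not say whether a pooled or unpooled version is intended. The sales figures are hypothetical and shorter than the thirteen and ten days in the example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic for the difference of two independent means."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))  # unpooled standard error
    return (mean(a) - mean(b)) / se


# HYPOTHETICAL daily sales; the video does not list the sales figures.
with_stickers = [120, 135, 128, 140, 131]
without_stickers = [118, 122, 115, 125]
t = welch_t(with_stickers, without_stickers)
print(f"t = {t:.2f}")
```

The two lists may have different lengths, as in the example, because the days with stickers and the days without are separate samples.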
Example six
Are sales affected by temperature?
Helen wants to see if there is a relationship between the daily
temperature and sales of choconutties.
She has data on sales and temperature
for thirty weekdays of sales
Pause!
Data. Sales and temperature are both interval/ratio variables.
Samples
There is one sample of thirty days with two measures or scores for each day.
Purpose.
Helen is interested in the relationship between sales and temperature
This leads us to decide that the test is regression.
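Regression fits a line through the points you would put on a scatter plot. Here is a minimal sketch of a least-squares line, with temperature as the explanatory variable. The temperatures and sales below are hypothetical, since the video does not list the thirty days of data, and the function name `fit_line` is an assumption of this sketch.

```python
from statistics import mean


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # co-variation of x and y
    sxx = sum((xi - mx) ** 2 for xi in x)                     # variation of x
    slope = sxy / sxx
    return slope, my - slope * mx


# HYPOTHETICAL daily temperatures (degrees Celsius) and sales;
# the video does not give the recorded values.
temperature = [18, 21, 24, 27, 30]
sales = [150, 142, 131, 120, 112]
slope, intercept = fit_line(temperature, sales)
print(f"sales = {intercept:.1f} + {slope:.2f} * temperature")
```

The slope estimates how much sales change for each extra degree. A full regression analysis would also test whether that slope differs from zero.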
Example seven
Men and women and chocolate preference
Helen is thinking of selling dark chocolate, milk chocolate and white chocolate
choconutties.
She thinks that men and women might have different preferences with regard to type.
She collects data from fifty customers, noting down if they are men or women
and asking them which variety they prefer.
Pause the video and decide.
Data. Helen records the type of chocolate and sex of person.
These are both nominal variables.
Samples.
There is one sample of fifty customers
but with two measures or variables.
Purpose.
Helen is looking at whether there is a relationship between the two
variables.
Thus the test is chi-squared test for independence.
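The chi-squared statistic compares the counts in the table with the counts expected if the two variables were unrelated. The sketch below assumes the usual Pearson statistic. The counts for the fifty customers are hypothetical, because the video does not give the breakdown. The function name `chi_squared` is an assumption of this sketch.

```python
def chi_squared(table):
    """Return (chi-squared statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# HYPOTHETICAL counts for fifty customers; columns are dark, milk, white.
preferences = [
    [8, 12, 5],  # men
    [6, 14, 5],  # women
]
stat, df = chi_squared(preferences)
print(f"chi-squared = {stat:.2f} on {df} degrees of freedom")
```

With two rows and three columns there are two degrees of freedom, and the statistic would be compared with a chi-squared distribution on that basis.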
Those are seven examples of the seven tests outlined here.
There are numerous other statistical tests and other things may need to be
considered,
but this summary will help you to understand what these seven basic tests do
and what to look for when deciding on which test to choose.