Which nonparametric test is analogous to a one-factor ANOVA?




















A new approach to prenatal care is proposed for pregnant women living in a rural community. The new program involves in-home visits during the course of pregnancy in addition to the usual or regularly scheduled visits. A pilot randomized trial with 15 pregnant women is designed to evaluate whether women who participate in the program deliver healthier babies than women receiving usual care.

Recall that APGAR scores range from 0 to 10, with scores of 7 or higher considered normal (healthy), 4 to 6 low, and 0 to 3 critically low. Is there statistical evidence of a difference in APGAR scores between women receiving the new, enhanced prenatal care and women receiving usual care? We run the test using the five-step approach. H 0 : The two populations are equal versus H 1 : The two populations are not equal. The test statistic is U, the smaller of U 1 and U 2.

The appropriate critical value can be found in a table of critical values of U. The first step is to assign ranks of 1 through 15 to the smallest through largest values in the total sample.
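The ranking step and the resulting U statistic can be sketched in a few lines of Python. This is a minimal illustration, not the module's own calculation: the scores are hypothetical (they are not the trial data), the function names are invented for this example, and the formulas for U 1 and U 2 are the ones commonly given for what is usually called the Mann-Whitney U test, with tied values receiving the mean of the ranks they span.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest; ties get their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at sorted position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + 1 + j + 1) / 2  # sorted positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return (R1, R2, U1, U2, U), where U is the smaller of U1 and U2."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))  # rank the total sample
    r1 = sum(ranks[:n1])  # sum of ranks in group 1
    r2 = sum(ranks[n1:])  # sum of ranks in group 2
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return r1, r2, u1, u2, min(u1, u2)


# Hypothetical scores for two small groups (illustration only).
group_a = [3, 5, 8, 8]
group_b = [6, 8, 9, 10, 12]
r1, r2, u1, u2, u = mann_whitney_u(group_a, group_b)
print(f"R1={r1}, R2={r2}, U1={u1}, U2={u2}, U={u}")
```

A useful check is that U 1 + U 2 always equals n 1 × n 2. The observed U is then compared with the tabled critical value.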

Next, we sum the ranks in each group. We reject H 0 because 9.

A clinical trial is run to assess the effectiveness of a new anti-retroviral therapy for patients with HIV. Patients are randomized to receive a standard anti-retroviral therapy (usual care) or the new anti-retroviral therapy and are monitored for 3 months.

The primary outcome is viral load, which represents the number of HIV copies per milliliter of blood. A total of 30 participants are randomized and the data are shown below. Is there statistical evidence of a difference in viral load in patients receiving the standard versus the new anti-retroviral therapy? Because viral load measures are not normally distributed (with outliers as well as limits of detection), a nonparametric test is appropriate. The first step is to assign ranks of 1 through 30 to the smallest through largest values in the total sample.

Note in the table below that the "undetectable" measurement is listed first in the ordered values (smallest) and assigned a rank of 1. We now compute U 1 and U 2. We do not have sufficient evidence to conclude that the treatment groups differ in viral load.

This section describes nonparametric tests to compare two groups with respect to a continuous outcome when the data are collected on matched or paired samples.

The parametric procedure for doing this was presented in the modules on hypothesis testing for the situation in which the continuous outcome was normally distributed.

This section describes procedures that should be used when the outcome cannot be assumed to follow a normal distribution. There are two popular nonparametric tests to compare outcomes between two matched or paired groups. Recall that when data are matched or paired, we compute difference scores for each individual and analyze difference scores. The same approach is followed in nonparametric tests. In nonparametric tests, the null hypothesis is that the median difference is zero.

Consider a clinical investigation to assess the effectiveness of a new drug designed to reduce repetitive behaviors in children affected with autism. If the drug is effective, children will exhibit fewer repetitive behaviors on treatment as compared to when they are untreated.

A total of 8 children with autism enroll in the study. Each child is observed by the study psychologist for a period of 3 hours both before treatment and then again after taking the new drug for 1 week. The time that each child is engaged in repetitive behavior during each 3-hour observation period is measured. Repetitive behavior is scored on a scale of 0 to 100, and scores represent the percent of the observation time in which the child is engaged in repetitive behavior.

For example, a score of 0 indicates that during the entire observation period the child did not engage in repetitive behavior, while a score of 100 indicates that the child was constantly engaged in repetitive behavior. Looking at the data, it appears that some children improve while others get worse. Is there statistically significant improvement in repetitive behavior after 1 week of treatment?

Because the before and after treatment measures are paired, we compute difference scores for each child.

In this example, we subtract the assessment of repetitive behaviors after treatment from that measured before treatment so that difference scores represent improvement in repetitive behavior. The question of interest is whether there is significant improvement after treatment. In this small sample, the observed difference (or improvement) scores vary widely and are subject to extremes. Thus, a nonparametric test is appropriate to test whether there is significant improvement in repetitive behavior before versus after treatment.

The hypotheses are H 0 : The median difference is zero, versus H 1 : The median difference is positive. In this example, the null hypothesis is that there is no difference in scores before versus after treatment.

If the null hypothesis is true, we expect to see some positive differences (improvement) and some negative differences (worsening). If the research hypothesis is true, we expect to see more positive differences after treatment as compared to before.

The Sign Test is the simplest nonparametric test for matched or paired data. The approach is to analyze only the signs of the difference scores. If the research hypothesis is true, we expect to see more positive differences. The test statistic for the Sign Test is the number of positive signs or number of negative signs, whichever is smaller.

In this example, we observe 2 negative and 6 positive signs. Is this evidence of significant improvement or simply due to chance? Determining whether the observed test statistic supports the null or research hypothesis is done following the same approach used in parametric testing.

Specifically, we determine a critical value such that if the smaller of the number of positive or negative signs is less than or equal to that critical value, then we reject H 0 in favor of H 1 and if the smaller of the number of positive or negative signs is greater than the critical value, then we do not reject H 0.

Notice that this is a one-sided decision rule corresponding to our one-sided research hypothesis (the two-sided situation is discussed in the next example). In essence, we could use the critical value to decide whether to reject the null hypothesis.

Another alternative would be to calculate the p-value, as described below. With the Sign Test we can readily compute a p-value based on our observed test statistic. Under the null hypothesis, the number of negative signs follows a binomial distribution with n = 8 and p = 0.5. Recall that a p-value is the probability of observing a test statistic as or more extreme than that observed.

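We observed 2 negative signs. The one-sided p-value is therefore the probability of observing 2 or fewer negative signs out of 8, P(x ≤ 2) = 37/256 ≈ 0.1445. Recall the critical value for our test was 1, based on the table of critical values for the Sign Test. Because 2 exceeds 1, we do not reject H 0. In the example looking for differences in repetitive behaviors in autistic children, we used a one-sided test (i.e., we hypothesized improvement after treatment).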

A two-sided test can be used if we hypothesize a difference in repetitive behavior after taking the drug as compared to before. From the table of critical values for the Sign Test, we can determine a two-sided critical value and again reject H 0 if the smaller of the number of positive or negative signs is less than or equal to that two-sided critical value. Alternatively, we can compute a two-sided p-value.

If the research hypothesis is a two-sided alternative (i.e., H 1 : The median difference is not zero), the p-value must account for extremes in both directions. Recall that in two-sided tests, we reject the null hypothesis if the test statistic is extreme in either direction. Thus, in the Sign Test, a two-sided p-value is the probability of observing few or many positive or negative signs.

Here we observe 2 negative signs and thus 6 positive signs. The two-sided p-value is the probability of observing a test statistic as or more extreme in either direction (i.e., 2 or fewer, or 6 or more, negative signs).
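The Sign Test p-values for the autism example (6 positive and 2 negative signs) can be checked with a short Python sketch. The function name is invented for this illustration. It assumes no zero differences remain, takes the smaller count as the test statistic, and gets the two-sided value by doubling the lower tail, which is valid because the binomial distribution with p = 0.5 is symmetric. The one-sided value is only meaningful when the smaller count lies in the direction the research hypothesis predicts to be rare.

```python
from math import comb


def sign_test_p_values(n_positive, n_negative):
    """Return (test statistic, one-sided p-value, two-sided p-value)."""
    n = n_positive + n_negative          # number of non-zero difference scores
    x = min(n_positive, n_negative)      # test statistic: the smaller count
    # P(X <= x) for X ~ Binomial(n, 0.5): sum of C(n, k) / 2^n for k = 0..x
    lower_tail = sum(comb(n, k) for k in range(x + 1)) / 2 ** n
    one_sided = lower_tail
    two_sided = min(1.0, 2 * lower_tail)  # cap at 1 when the counts are balanced
    return x, one_sided, two_sided


x, p_one, p_two = sign_test_p_values(n_positive=6, n_negative=2)
print(f"statistic={x}, one-sided p={p_one:.4f}, two-sided p={p_two:.4f}")
# statistic=2, one-sided p=0.1445, two-sided p=0.2891
```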

There is a special circumstance that needs attention when implementing the Sign Test, which arises when one or more participants have difference scores of zero (i.e., their paired measurements are identical). If there is just one difference score of zero, some investigators drop that observation and reduce the sample size by 1 (i.e., the sample size for the binomial distribution becomes n - 1).

This is a reasonable approach if there is just one zero. However, if there are two or more zeros, an alternative approach is preferred.

A new chemotherapy treatment is proposed for patients with breast cancer. Investigators are concerned with patients' ability to tolerate the treatment and assess their quality of life (QOL) both before and after receiving the new chemotherapy treatment. The question of interest is whether there is a difference in QOL after chemotherapy treatment as compared to before. The appropriate critical value for the Sign Test can be found in the table of critical values for the Sign Test.

Because the before and after treatment measures are paired, we compute difference scores for each patient. In this example, we subtract the QOL measured before treatment from that measured after. We now capture the signs of the difference scores and, because there are two zeros, we randomly assign one of them a negative sign and the other a positive sign. Again, the two-sided p-value is the probability of observing few or many positive or negative signs.

Here we observe 3 negative signs and thus 9 positive signs. We can compute the p-value using the binomial formula or a statistical computing package.

In each of the two previous examples, we failed to show statistical significance because the p-value was not less than the stated level of significance.
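For the chemotherapy example, the binomial calculation can be written out explicitly. With 3 negative signs out of n = 12 and p = 0.5 under the null hypothesis, a sketch of the arithmetic is given below. It doubles the lower tail, which is valid here because the binomial distribution with p = 0.5 is symmetric.

```latex
P(x \le 3) = \sum_{k=0}^{3} \binom{12}{k} (0.5)^{12}
           = \frac{1 + 12 + 66 + 220}{4096}
           = \frac{299}{4096} \approx 0.0730

p_{\text{two-sided}} = 2 \, P(x \le 3) \approx 0.1460
```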

While the test statistic for the Sign Test is easy to compute, it does not take much of the information in the sample data into account. It records only the sign of the difference in each participant's scores and does not account for the magnitude of those differences.

Another popular nonparametric test for matched or paired data is called the Wilcoxon Signed Rank Test. Like the Sign Test, it is based on difference scores, but in addition to analyzing the signs of the differences, it also takes into account the magnitude of the observed differences. Let's use the Wilcoxon Signed Rank Test to re-analyze the data from the autism example above. Recall that this study assessed the effectiveness of a new drug designed to reduce repetitive behaviors in children affected with autism.

A total of 8 children with autism enroll in the study, and the amount of time that each child is engaged in repetitive behavior during 3-hour observation periods is measured both before treatment and then again after taking the new medication for a period of 1 week. The next step is to rank the difference scores. We first order the absolute values of the difference scores and assign ranks from 1 through n to the smallest through largest absolute values, assigning the mean rank when there are ties in the absolute values of the difference scores.
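The ranking procedure just described can be sketched in Python. The scores below are hypothetical, not the study data, and the function name is invented for this example. The sketch also makes two assumptions that the text does not spell out: zero differences are dropped before ranking, and the test statistic W is taken as the smaller of the sum of positive ranks (W+) and the sum of negative ranks (W-), which is the usual convention.

```python
def signed_rank_sums(before, after):
    """Return (W_plus, W_minus, W) for paired before/after scores."""
    # Difference = before - after, so positive values represent improvement.
    diffs = [b - a for b, a in zip(before, after)]
    diffs = [d for d in diffs if d != 0]  # assumption: drop zero differences

    # Order the absolute differences and give tied values their mean rank.
    abs_sorted = sorted(abs(d) for d in diffs)
    rank_of = {}
    for value in set(abs_sorted):
        first = abs_sorted.index(value) + 1   # 1-based rank of first occurrence
        count = abs_sorted.count(value)       # how many values are tied
        rank_of[value] = first + (count - 1) / 2

    # Attach the sign of each difference to its rank and sum by sign.
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    w_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    return w_plus, w_minus, min(w_plus, w_minus)


# Hypothetical before/after scores for six participants (illustration only).
before = [60, 45, 70, 50, 80, 30]
after = [50, 50, 40, 50, 60, 40]
w_plus, w_minus, w = signed_rank_sums(before, after)
print(f"W+={w_plus}, W-={w_minus}, W={w}")
```

As a check, W+ and W- together sum to n(n + 1)/2, where n is the number of non-zero differences.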

Similar to the Sign Test, hypotheses for the Wilcoxon Signed Rank Test concern the population median of the difference scores. The research hypothesis can be one- or two-sided. Here we consider a one-sided test. If the null hypothesis is true, we expect to see similar numbers of lower and higher ranks that are both positive and negative.

If the research hypothesis is true, we expect to see more higher and positive ranks (in this example, more children with substantial improvement in repetitive behavior after treatment as compared to before). Next we must determine whether the observed test statistic W supports the null or research hypothesis.

Specifically, we determine a critical value of W such that if the observed value of W is less than or equal to the critical value, we reject H 0 in favor of H 1 , and if the observed value of W exceeds the critical value, we do not reject H 0.

When more than two independent groups are compared, the nonparametric analog of one-factor ANOVA is the Kruskal-Wallis test, whose test statistic is H. The H statistic is computed from ranks assigned in the combined sample. We must now determine whether the observed test statistic H supports the null or research hypothesis. Once again, this is done by establishing a critical value of H.

If the observed value of H is greater than or equal to the critical value, we reject H 0 in favor of H 1 ; if the observed value of H is less than the critical value we do not reject H 0.
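The computation of H can be sketched in Python. This uses the standard Kruskal-Wallis formula, H = 12 / (N(N + 1)) × Σ(R j ² / n j ) - 3(N + 1), where R j is the sum of ranks in group j, n j is that group's size, and N is the total sample size. The data are hypothetical, the function name is invented for this example, and no correction for ties is applied, so treat it as a sketch, not a full implementation.

```python
def kruskal_wallis_h(groups):
    """Compute the Kruskal-Wallis H statistic (no tie correction)."""
    pooled = sorted(v for g in groups for v in g)  # order the combined sample
    n_total = len(pooled)

    def mean_rank(value):
        # 1-based rank of the value; tied values share the mean of their ranks.
        first = pooled.index(value) + 1
        count = pooled.count(value)
        return first + (count - 1) / 2

    total = 0.0
    for g in groups:
        rank_sum = sum(mean_rank(v) for v in g)    # R_j for this group
        total += rank_sum ** 2 / len(g)            # R_j^2 / n_j
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical measurements for three groups of three (illustration only).
groups = [[12, 15, 18], [20, 21, 25], [27, 30, 33]]
h = kruskal_wallis_h(groups)
print(f"H = {h:.2f}")  # compare with the tabled critical value
```

With these values the rank sums are 6, 15 and 24, giving H = 7.2. Following the decision rule above, H 0 would be rejected only if this value were greater than or equal to the critical value for the chosen group sizes and significance level.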

The critical value of H can be found in the table below. For this example the critical value is 5. Notice that Table 8 contains critical values for the Kruskal-Wallis test for tests comparing 3, 4 or 5 groups with small sample sizes.

A personal trainer is interested in comparing the anaerobic thresholds of elite athletes.

Anaerobic threshold is defined as the point at which the muscles cannot get more oxygen to sustain activity or the upper limit of aerobic exercise. It is a measure also related to maximum heart rate. The following data are anaerobic thresholds for distance runners, distance cyclists, distance swimmers and cross-country skiers.

The critical value is 7. To conduct the test we assign ranks using the procedures outlined above. The first step in assigning ranks is to order the data from smallest to largest. This is done on the combined or total sample (i.e., pooling the data from all comparison groups). The table below shows the ordered data.

Before we introduce you to these four assumptions, do not be surprised if, when analysing your own data using SPSS Statistics, one or more of these assumptions is violated (i.e., is not met). This is not uncommon when working with real-world data rather than textbook examples, which often only show you how to carry out a Kruskal-Wallis H test when everything goes well!

Even when your data fails certain assumptions, there is often a solution to overcome this. As the Kruskal-Wallis H test does not assume normality in the data and is much less sensitive to outliers, it can be used when these assumptions have been violated and the use of a one-way ANOVA is inappropriate. However, the Kruskal-Wallis H test does come with an additional data consideration, Assumption 4, which concerns whether the distributions in each group have the same shape.

You should also check that your data meets assumptions 1, 2 and 3, which you can do without using SPSS Statistics. Just remember that if you do not check assumption 4, you will not know whether you are able to compare medians or just mean ranks, meaning that you might incorrectly interpret and report the result of the Kruskal-Wallis H test.

In the Test Procedure in SPSS Statistics section of this "quick start" guide, we illustrate the SPSS Statistics procedure to perform a Kruskal-Wallis H test assuming that your distributions are not the same shape and you have to interpret mean ranks rather than medians.

A medical researcher has heard anecdotal evidence that certain anti-depressive drugs can have the positive side-effect of lowering neurological pain in those individuals with chronic, neurological back pain, when administered in doses lower than those prescribed for depression. The medical researcher would like to investigate this anecdotal evidence with a study. The researcher identifies 3 well-known, anti-depressive drugs which might have this positive side effect, and labels them Drug A, Drug B and Drug C.

The researcher then recruits a group of 60 individuals with a similar level of back pain and randomly assigns them to one of three groups — Drug A, Drug B or Drug C treatment groups — and prescribes the relevant drug for a 4 week period.

At the end of the 4 week period, the researcher asks the participants to rate their back pain on a scale of 1 to 10, with 10 indicating the greatest level of pain. The researcher wants to compare the levels of pain experienced by the different groups at the end of the drug treatment period.

At the end of these eight steps, we show you how to interpret the results from your Kruskal-Wallis H test. If you want to find out where the differences between your groups lie (i.e., using post hoc tests), note that this procedure has the disadvantage of not automatically running post hoc tests. However, the procedure is identical.

Note: The Kruskal-Wallis H checkbox in the Test Type area should be selected by default, but if it is not, make sure to check this option.


