The Mann-Whitney U test, also known as the Wilcoxon rank-sum test, is a non-parametric statistical test used in biostatistics and other fields to compare two independent samples.
This test is particularly useful when the data do not meet the assumptions required for the parametric t-test, especially concerning normality.
Assumptions
Independence of Samples: The two samples being compared must be independent of each other.
Ordinal or Continuous Data: The test can be applied to ordinal data (ranked data) or continuous data.
Identical Shape and Scale: To interpret a significant result as a difference in medians, the distributions of both groups are assumed to have the same shape and scale, differing only in location; they need not be normally distributed.
Steps to Perform the Mann-Whitney U Test
1. Prepare Your Data:
Ensure your data meet the assumptions: two independent samples, and ordinal or continuous data.
2. Rank the Combined Data:
Merge and rank the data from both groups, handling any ties appropriately.
3. Calculate 𝑈1 and 𝑈2:
Use the rank-sum formulas in the Calculation section to compute the U statistic for each group.
4. Determine the Test Statistic:
Use the smaller of 𝑈1 or 𝑈2 as your test statistic.
5. Calculate the P-value:
Depending on your sample size, use either exact distribution tables (small samples) or the normal approximation (large samples) to find the p-value.
6. Interpret the Results:
Compare the p-value to your significance level (commonly 0.05): if the p-value is smaller, reject the null hypothesis (no difference between the groups); otherwise, do not reject it.
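The six steps above can be sketched end to end in plain Python. This is only an illustration under stated assumptions: the function names (average_ranks, mann_whitney_exact) and the sample values are hypothetical, and the exact p-value is obtained by enumerating every possible split of the pooled ranks rather than by looking up a printed table, which is practical only for small samples.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from smallest to largest (1-based); ties get the average rank."""
    sorted_vals = sorted(values)
    rank_of = {}
    for v in set(values):
        first = sorted_vals.index(v)   # 0-based position of the first copy of v
        count = sorted_vals.count(v)   # number of tied copies of v
        rank_of[v] = first + (count + 1) / 2
    return [rank_of[v] for v in values]


def mann_whitney_exact(sample1, sample2):
    """Return (U, exact two-sided p-value), where U is the smaller of U1 and U2."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))

    def smaller_u(rank_sum_1):
        u1 = rank_sum_1 - n1 * (n1 + 1) / 2
        return min(u1, n1 * n2 - u1)   # U2 = n1*n2 - U1

    observed = smaller_u(sum(ranks[:n1]))

    # Under the null hypothesis every choice of n1 pooled ranks for sample 1
    # is equally likely, so count the splits at least as extreme as observed.
    total = 0
    as_extreme = 0
    for chosen in combinations(ranks, n1):
        total += 1
        if smaller_u(sum(chosen)) <= observed + 1e-9:
            as_extreme += 1
    return observed, as_extreme / total


# Hypothetical measurements for two independent groups.
treatment = [12, 15, 18, 21]
control = [3, 5, 7, 9]

u, p_value = mann_whitney_exact(treatment, control)
alpha = 0.05
reject_null = p_value < alpha
print(f"U = {u}, p = {p_value:.4f}, reject null: {reject_null}")
```

In this made-up data set the two groups do not overlap at all, so the smaller U is 0 and only 2 of the 70 possible rank splits are as extreme as the observed one.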
Calculation
The Mann-Whitney U test compares the ranks of the data from both groups, rather than the actual values.
1. Combine and Rank the Data:
Merge the two samples into a single set, then rank all observations from the smallest to the largest.
If there are ties (equal values), assign the average rank to the tied values.
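As a minimal sketch of this ranking step, the helper below pools two groups and assigns average ranks to ties. The function name average_ranks and the data values are hypothetical, chosen only for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    # Indices of the observations, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of equal values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Average of the 1-based positions i+1 .. j+1.
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


# Hypothetical samples; the value 4.5 appears three times across the groups.
group_a = [3.1, 4.5, 4.5, 6.0]
group_b = [2.0, 4.5, 5.2]

combined = group_a + group_b
ranks = average_ranks(combined)
print(ranks)  # [2.0, 4.0, 4.0, 7.0, 1.0, 4.0, 6.0]
```

The three tied values would occupy positions 3, 4, and 5, so each receives the average rank 4. The ranks always sum to N(N + 1)/2 for N pooled observations, which is a quick sanity check.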
2. Calculate 𝑈 for Each Group:
Let 𝑛1 and 𝑛2 be the sizes of the two samples.
Let 𝑅1 be the sum of the ranks in the first sample, and 𝑅2 be the sum of the ranks in the second sample.
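The U statistic for each sample can be computed as:
𝑈1 = 𝑅1 − 𝑛1(𝑛1 + 1)/2
𝑈2 = 𝑅2 − 𝑛2(𝑛2 + 1)/2
Alternatively, because 𝑈1 + 𝑈2 = 𝑛1𝑛2, once one statistic is known the other can also be computed as:
𝑈2 = 𝑛1𝑛2 − 𝑈1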
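A short sketch of this step, assuming the common rank-sum form U = R − n(n + 1)/2 for each group. Some texts instead write U1 = n1·n2 + n1(n1 + 1)/2 − R1, which swaps the labels of the two statistics but leaves the smaller one unchanged. The helper names and data are hypothetical.

```python
def average_ranks(values):
    """Rank values from smallest to largest (1-based); ties get the average rank."""
    sorted_vals = sorted(values)
    rank_of = {}
    for v in set(values):
        first = sorted_vals.index(v)   # 0-based position of the first copy of v
        count = sorted_vals.count(v)   # number of tied copies of v
        rank_of[v] = first + (count + 1) / 2
    return [rank_of[v] for v in values]


def u_statistics(sample1, sample2):
    """Return (U1, U2) computed from the rank sums R1 and R2."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    r1 = sum(ranks[:n1])   # rank sum of the first sample
    r2 = sum(ranks[n1:])   # rank sum of the second sample
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = r2 - n2 * (n2 + 1) / 2
    return u1, u2


# Hypothetical samples, including tied values.
group_a = [3.1, 4.5, 4.5, 6.0]
group_b = [2.0, 4.5, 5.2]

u1, u2 = u_statistics(group_a, group_b)
u = min(u1, u2)
print(u1, u2, u)  # 7.0 5.0 5.0
```

Here R1 = 17 and R2 = 11, so U1 = 17 − 10 = 7 and U2 = 11 − 6 = 5, and the two statistics sum to n1·n2 = 12 as expected.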
3. Calculate the Test Statistic 𝑈:
The Mann-Whitney U test statistic is the smaller of 𝑈1 and 𝑈2.
4. Significance Testing:
For large samples, the distribution of 𝑈 can be approximated by a normal distribution.
Under the null hypothesis, 𝑈 has mean 𝑛1𝑛2/2 and standard deviation √(𝑛1𝑛2(𝑛1 + 𝑛2 + 1)/12) when there are no ties; these are used to standardize 𝑈 into a z-score, which is then used to determine the p-value.
For small samples, use exact tables of the Mann-Whitney U distribution to determine the p-value.
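The large-sample case can be sketched as follows. The function name normal_approx and the input values are hypothetical, and the sketch assumes no correction for ties and no continuity correction, both of which statistical packages commonly apply. It uses the mean n1·n2/2 and standard deviation √(n1·n2(n1 + n2 + 1)/12) of U under the null hypothesis.

```python
import math


def normal_approx(u, n1, n2):
    """Return (z, two-sided p-value) for U under the normal approximation."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    # Two-sided p-value: 2 * P(Z >= |z|) for a standard normal Z,
    # written with the complementary error function.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Hypothetical result: U = 120 from two samples of 20 observations each.
z, p = normal_approx(u=120, n1=20, n2=20)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

When U equals its expected value n1·n2/2, the z-score is 0 and the two-sided p-value is 1, which reflects no evidence against the null hypothesis.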
This test is especially valuable in biostatistics for comparing the effects of treatments or conditions when the data cannot be assumed to be normally distributed.