Parametric vs. Non-Parametric Tests
Introduction
In the literal meaning of the terms, a parametric statistical test is one that makes assumptions
about the parameters (defining properties) of the population distribution(s) from which one's data
are drawn, while a non-parametric test is one that makes no such assumptions. In this strict
sense, "non-parametric" is essentially a null category, since virtually all statistical tests assume
one thing or another about the properties of the source population(s).
For practical purposes, you can think of "parametric" as referring to tests, such as t-tests and the
analysis of variance, that assume the underlying source population(s) to be normally distributed;
they generally also assume that one's measures derive from an equal-interval scale. And you can
think of "non-parametric" as referring to tests that do not make on these particular assumptions.
Examples of non-parametric tests include
o the various forms of chi-square tests
o the Fisher Exact Probability test
o the Mann-Whitney Test
o the Wilcoxon Signed-Rank Test
o the Kruskal-Wallis Test
o The Friedman Test
Non-parametric tests are sometimes spoken of as "distribution-free" tests, although this too is
something of a misnomer.
Meaning of Parametric Test
The parametric test is a hypothesis test which provides generalizations for making statements
about the mean of the parent population. A t-test based on Student’s t-statistic is often used in
this regard. The t-statistic rests on the underlying assumption that the variable is normally
distributed and that the mean is known (or assumed to be known). The population variance is
estimated from the sample. It is also assumed that the variables of interest in the population are
measured on an interval scale.
Meaning of Nonparametric Test
The nonparametric test is defined as a hypothesis test which is not based on underlying
distributional assumptions, i.e. it does not require the population’s distribution to be
characterized by specific parameters. The test is mainly based on differences in medians; hence,
it is also known as a distribution-free test. The test assumes that the variables are measured on a
nominal or ordinal level. It is used when the independent variables are non-metric.
Key Differences between Parametric and Nonparametric Tests
The fundamental differences between parametric and nonparametric tests are discussed in the
following points:
1. A statistical test in which specific assumptions are made about the population parameters is
known as a parametric test. A statistical test used in the case of non-metric independent
variables is called a nonparametric test.
2. In the parametric test, the test statistic is based on a specific distribution. In the nonparametric
test, the distribution of the test statistic is not assumed in advance.
3. In the parametric test, it is assumed that the variables of interest are measured on an interval or
ratio scale. In the nonparametric test, the variables of interest are measured on a nominal or
ordinal scale.
4. In general, the measure of central tendency in the parametric test is the mean, while in the
nonparametric test it is the median.
5. In the parametric test, there is complete information about the population. Conversely, in the
nonparametric test, there is no (or only limited) information about the population.
6. Parametric tests apply to variables only, whereas nonparametric tests apply to both variables
and attributes.
7. For measuring the degree of association between two quantitative variables, Pearson’s
coefficient of correlation is used in the parametric case, while Spearman’s rank correlation is used
in the nonparametric case.
Hypothesis Tests Hierarchy
Parametric Tests
T-test
A t-test is an analysis of two population means through the use of statistical examination; a
two-sample t-test is commonly used with small sample sizes, testing the difference between
the samples when the variances of the two normal distributions are not known.
The t-test (also called Student’s t-test) compares two averages (means) and tells you whether they
are different from each other. The t-test also tells you how significant the differences are; in other
words, it lets you know whether those differences could have happened by chance.
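As an illustration, the minimal sketch below (assuming Python with the SciPy library, and using made-up score data) runs an independent two-sample t-test:

```python
# Minimal sketch of an independent two-sample t-test (hypothetical data).
from scipy import stats

group_a = [72, 75, 68, 80, 77, 74]   # made-up scores for group A
group_b = [65, 70, 62, 68, 71, 66]   # made-up scores for group B

t_stat, p_value = stats.ttest_ind(group_a, group_b)  # assumes equal variances by default
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")        # a small p-value suggests the means differ
```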
Z-test
A z-test is a statistical test used to determine whether two population means are different when
the variances are known and the sample size is large. The test statistic is assumed to have a
normal distribution, and nuisance parameters such as the standard deviation should be known for
an accurate z-test to be performed.
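Because the z-test assumes the population standard deviations are known, the statistic can be computed directly from the normal distribution. The sketch below is a rough illustration, assuming Python with SciPy and invented summary statistics:

```python
# Two-sample z-test from summary statistics (hypothetical numbers; sigmas assumed known).
import math
from scipy.stats import norm

mean1, sigma1, n1 = 102.0, 15.0, 200   # sample 1: mean, known population SD, sample size
mean2, sigma2, n2 = 98.5, 15.0, 220    # sample 2

se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)  # standard error of the difference in means
z = (mean1 - mean2) / se
p_value = 2 * norm.sf(abs(z))                    # two-tailed p-value
print(f"z = {z:.2f}, p = {p_value:.4f}")
```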
Paired T-test
The purpose of the test is to determine whether there is statistical evidence that the mean
difference between paired observations on a particular outcome is significantly different from
zero. The Paired Samples t Test is a parametric test. This test is also known as Dependent t Test.
Like many statistical procedures, the paired sample t-test has two competing hypotheses, the null
hypothesis and the alternative hypothesis. The null hypothesis assumes that the true mean
difference between the paired samples is zero. Under this model, all observable differences are
explained by random variation. Conversely, the alternative hypothesis assumes that the true
mean difference between the paired samples is not equal to zero. The alternative hypothesis can
take one of several forms depending on the expected outcome. If the direction of the difference
does not matter, a two-tailed hypothesis is used. Otherwise, an upper-tailed or lower-tailed
hypothesis can be used to increase the power of the test. The null hypothesis remains the same
for each type of alternative hypothesis.
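A minimal sketch of a paired t-test, assuming Python with SciPy and hypothetical before/after measurements on the same subjects:

```python
# Paired (dependent) samples t-test on hypothetical before/after measurements.
from scipy import stats

before = [140, 132, 128, 150, 145, 138, 142]   # e.g. blood pressure before treatment (made up)
after  = [135, 130, 125, 144, 140, 136, 139]   # same subjects after treatment (made up)

t_stat, p_value = stats.ttest_rel(before, after)  # tests H0: true mean difference = 0
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")     # two-tailed by default
```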
2-group T-test
It calculates a confidence interval and performs a hypothesis test of the difference between two
population means when the standard deviations are unknown and the samples are drawn
independently of each other. This procedure is based on the t-distribution, and for small samples
it works best if the data were drawn from distributions that are normal or close to normal. You
can have increasing confidence in the results as the sample sizes increase.
To do a 2-sample t-test, the two populations must be independent; in other words, the
observations from the first sample must not have any bearing on the observations from the
second sample. For example, test scores of two separate groups of students are independent, but
before-and-after measurements on the same group of students are not independent, although both
of these examples have two samples. If you cannot support the assumption of sample
independence, reconstruct your experiment to use the paired t-test for dependent populations.
The 2-sample t-test also works well when the assumption of normality is violated, but only if the
underlying distribution is not highly skewed. With non-normal and highly skewed distributions,
it might be more appropriate to use a nonparametric test.
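The sketch below illustrates this two-sample procedure in its unequal-variances (Welch) form, together with a hand-computed 95% confidence interval for the difference in means; the Python/SciPy environment and the data are assumed for illustration only:

```python
# Welch two-sample t-test plus a 95% CI for the difference in means (hypothetical data).
import numpy as np
from scipy import stats

x = np.array([23.1, 25.4, 22.8, 24.9, 26.0, 23.7])
y = np.array([20.5, 21.9, 19.8, 22.4, 21.1, 20.7])

t_stat, p_value = stats.ttest_ind(x, y, equal_var=False)   # Welch's t-test

diff = x.mean() - y.mean()
se = np.sqrt(x.var(ddof=1)/len(x) + y.var(ddof=1)/len(y))  # standard error of the difference
# Welch-Satterthwaite approximation for the degrees of freedom
df = se**4 / ((x.var(ddof=1)/len(x))**2/(len(x)-1) + (y.var(ddof=1)/len(y))**2/(len(y)-1))
ci = diff + np.array([-1, 1]) * stats.t.ppf(0.975, df) * se
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, 95% CI for the difference = {ci}")
```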
Advantages of Parametric Tests
1. They do not require much data.
2. They are quite easy to calculate.
3. They give you all the necessary information (e.g., means and confidence intervals).
4. They generally have greater statistical power than their nonparametric counterparts.
Limitations of Parametric Tests
1. They are not valid when their assumptions (such as normality) are violated.
2. They usually require a reasonably large sample size.
3. They are not appropriate when what you are studying is better represented by the median than by the mean.
4. They cannot handle ranked (ordinal) data or outliers that you just can’t remove.
Non-Parametric Tests
Chi Square test
It was first used by Karl Pearson. It is the simplest and most widely used non-parametric test in
statistical work. The chi-square statistic is calculated using the formula
χ² = Σ (O − E)² / E
where O = observed frequencies and E = expected frequencies.
The greater the discrepancy between observed and expected frequencies, the greater the value of
χ². The calculated value of χ² is compared with the table value of χ² for the given degrees of freedom.
Applications of the chi-square test:
• Test of association (smoking & cancer, treatment & outcome of disease, vaccination & immunity)
• Test of proportions (compare frequencies of diabetics & non-diabetics in groups weighing 40-50 kg, 50-60 kg, 60-70 kg & >70 kg)
• Chi-square test for goodness of fit (determine whether actual numbers are similar to the expected/theoretical numbers)
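A rough illustration of both uses, assuming Python with SciPy and invented counts:

```python
# Chi-square sketches with made-up counts.
import numpy as np
from scipy import stats

# 1) Test of association: 2x2 table of smoking vs. disease outcome.
table = np.array([[30, 70],    # smokers: diseased / not diseased
                  [15, 85]])   # non-smokers
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"association: chi2 = {chi2:.2f}, p = {p:.4f}")

# 2) Goodness of fit: do observed counts match the expected (theoretical) counts?
observed        = [18, 22, 20, 40]
expected_counts = [25, 25, 25, 25]   # totals must match the observed total
chi2, p = stats.chisquare(observed, expected_counts)
print(f"goodness of fit: chi2 = {chi2:.2f}, p = {p:.4f}")
```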
Fisher’s Exact Test
It is used when the total number of cases is <20 or the expected number of cases in any cell is ≤1
or more than 25% of the cells have expected frequencies <5.
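A minimal sketch, assuming Python with SciPy and a hypothetical 2x2 table with small cell counts:

```python
# Fisher's exact test on a small 2x2 table (hypothetical counts).
from scipy import stats

table = [[3, 1],    # treated: improved / not improved
         [1, 5]]    # control: improved / not improved
odds_ratio, p_value = stats.fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4f}")
```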
McNemar Test
The McNemar test is used to compare before-and-after findings in the same individuals, or to
compare findings in a matched analysis (for dichotomous variables). Example: comparing
medical students’ confidence in statistical analysis before and after an intensive statistics
course.
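A minimal sketch, assuming Python with the statsmodels library and an invented 2x2 table of paired yes/no responses:

```python
# McNemar test on paired before/after dichotomous responses (hypothetical 2x2 table).
from statsmodels.stats.contingency_tables import mcnemar

# Rows: "confident before" yes/no; columns: "confident after" yes/no.
table = [[20,  5],
         [15, 10]]
result = mcnemar(table, exact=True)   # exact binomial version, suitable for small counts
print(f"statistic = {result.statistic}, p = {result.pvalue:.4f}")
```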
Sign Test
It is used for paired data, which can be ordinal or continuous. It is simple and easy to interpret,
makes no assumptions about the distribution of the data, and is not very powerful. To evaluate H0
we only need to know the signs of the differences. If half the differences are positive and half are
negative, then the median = 0 (H0 is true). If the signs are more unbalanced, that is evidence
against H0.
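Because the sign test uses only the signs of the paired differences, it reduces to a binomial test on the number of positive differences. A minimal sketch, assuming Python with SciPy and made-up paired data:

```python
# Sign test via a binomial test on the signs of paired differences (hypothetical data).
from scipy.stats import binomtest

before = [10, 12, 9, 14, 11, 13, 10, 15]
after  = [12, 15, 9, 16, 10, 14, 13, 18]

diffs = [a - b for a, b in zip(after, before) if a != b]  # drop zero differences
n_pos = sum(d > 0 for d in diffs)
result = binomtest(n_pos, n=len(diffs), p=0.5)            # H0: + and - differences equally likely
print(f"positive differences = {n_pos}/{len(diffs)}, p = {result.pvalue:.4f}")
```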
Wilcoxon signed-rank test
It is a nonparametric equivalent of the paired t-test and is similar to the sign test, but it takes into
consideration the magnitude of the difference among the pairs of values. (The sign test considers
only the direction of the differences, not their magnitude.)
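A minimal sketch, assuming Python with SciPy and the same kind of hypothetical paired data:

```python
# Wilcoxon signed-rank test on hypothetical paired data.
from scipy import stats

before = [10, 12, 9, 14, 11, 13, 10, 15]
after  = [12, 15, 9, 16, 10, 14, 13, 18]

stat, p_value = stats.wilcoxon(before, after)  # uses both sign and magnitude of the differences
print(f"W = {stat}, p = {p_value:.4f}")
```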
Mann-Whitney U test
The Mann-Whitney U test is similar to the Wilcoxon signed-rank test except that the samples are
independent rather than paired. The null hypothesis is that the two groups come from the same
population (i.e. their locations are the same). The combined data values for the two groups are
ranked and the rank total in each group is found. The U value is then calculated using the formula
U = N1·N2 + Nx(Nx + 1)/2 − Rx
where Rx is the larger of the two rank totals and Nx is the size of the corresponding group.
To be statistically significant, the obtained U has to be equal to or less than the critical value from
the U table for the given sample sizes.
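A minimal sketch, assuming Python with SciPy and invented scores for two independent groups (SciPy reports the U statistic and a p-value rather than comparing against a table):

```python
# Mann-Whitney U test for two independent samples (hypothetical scores).
from scipy import stats

group1 = [3, 4, 2, 6, 2, 5]
group2 = [9, 7, 5, 10, 6, 8]

u_stat, p_value = stats.mannwhitneyu(group1, group2, alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.4f}")
```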
Kruskal-Wallis One-way ANOVA
It is more powerful than the chi-square test. It is computed exactly like the Mann-Whitney test,
except that there are more than two groups. It is applied to independent samples with the
same distribution shape (but not necessarily normal).
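A minimal sketch, assuming Python with SciPy and three invented independent groups:

```python
# Kruskal-Wallis test across three independent groups (hypothetical data).
from scipy import stats

g1 = [27, 2, 4, 18, 7, 9]
g2 = [20, 8, 14, 36, 21, 22]
g3 = [34, 31, 3, 23, 30, 6]

h_stat, p_value = stats.kruskal(g1, g2, g3)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```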
Friedman ANOVA
When a matched-subjects or repeated-measures design is used and the hypothesis of a
difference among three or more (k) treatments is to be tested, the Friedman ANOVA by ranks
test can be used.
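A minimal sketch, assuming Python with SciPy and hypothetical repeated measures on the same subjects under three treatments:

```python
# Friedman test for a repeated-measures design: same subjects under three treatments.
from scipy import stats

treatment_a = [7, 9, 6, 5, 8, 7]   # hypothetical scores; each position is one subject
treatment_b = [5, 7, 4, 4, 6, 5]
treatment_c = [8, 9, 7, 6, 9, 8]

chi2, p_value = stats.friedmanchisquare(treatment_a, treatment_b, treatment_c)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```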
Spearman rank-order correlation
It is used to assess the relationship between two ordinal variables or two skewed continuous
variables, and is considered the nonparametric equivalent of the Pearson correlation. It is a relative
measure which varies from -1 (perfect negative relationship) to +1 (perfect positive relationship).
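A minimal sketch, assuming Python with SciPy and two invented rank variables:

```python
# Spearman rank-order correlation between two ordinal variables (hypothetical ranks).
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7, 8]   # e.g. rank on one measure
y = [2, 1, 4, 3, 6, 5, 8, 7]   # e.g. rank on another measure

rho, p_value = stats.spearmanr(x, y)
print(f"rho = {rho:.2f}, p = {p_value:.4f}")   # rho ranges from -1 to +1
```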
Cochran's Q test
Cochran's Q test is a non-parametric statistical test used to verify whether k treatments have
identical effects when the response variable can take only two possible outcomes (coded as 0 and 1).
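A minimal sketch, assuming Python with the statsmodels library and an invented table of 0/1 responses (rows = subjects, columns = treatments):

```python
# Cochran's Q test: k binary (0/1) responses per subject (hypothetical data).
import numpy as np
from statsmodels.stats.contingency_tables import cochrans_q

# Rows = subjects, columns = 3 treatments; 1 = success, 0 = failure.
responses = np.array([[1, 1, 0],
                      [1, 0, 0],
                      [1, 1, 1],
                      [0, 0, 0],
                      [1, 1, 0],
                      [1, 0, 1]])
result = cochrans_q(responses)
print(f"Q = {result.statistic:.2f}, p = {result.pvalue:.4f}")
```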
Advantages of non-parametric tests
• These tests are distribution-free.
• They are easier to calculate and less time-consuming than parametric tests when the sample size is small.
• They can be used with any type of data.
• Many non-parametric methods make it possible to work with very small samples, which is
particularly helpful when collecting pilot-study data or when a medical researcher is working with a rare disease.
Limitations of non-parametric methods
• Statistical methods which require no assumptions about populations are usually less efficient.
• As the sample size gets larger, the data manipulations required for non-parametric tests become
laborious.
• A collection of tabulated critical values for a variety of non-parametric tests, covering various
sample sizes, is not readily available.
Conclusion
Choosing between a parametric and a nonparametric test is not easy for a researcher
conducting statistical analysis. For hypothesis testing, if the information about the
population is completely known, by way of its parameters, then the test is said to be a parametric
test, whereas if there is no such knowledge about the population and the hypothesis still needs to
be tested, then the test conducted is a nonparametric test.