Statistics - Kolmogorov Smirnov Test
The Kolmogorov-Smirnov (K-S) test is used when an observed sample distribution has to be compared with a theoretical distribution.
K-S One Sample Test
This test is used as a test of goodness of fit and is ideal when the sample size is small. It compares the observed cumulative distribution function of a variable with a specified theoretical distribution. The null hypothesis assumes no difference between the observed and theoretical distributions, and the value of the test statistic D is calculated as:
Formula
$D = \max |F_o(X)-F_T(X)|$
Where −
$F_o(X)$ = Observed cumulative frequency distribution of a random sample of n observations,
and $F_o(X) = \frac{k}{n}$ = (No. of observations ≤ X)/(Total no. of observations).
$F_T(X)$ = The theoretical cumulative frequency distribution.
The critical value of ${D}$ is found from the K-S table values for one sample test.
Acceptance Criteria: If the calculated value is less than the critical value, accept the null hypothesis.
Rejection Criteria: If the calculated value is greater than the critical value, reject the null hypothesis.
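The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `ks_one_sample_d` and `uniform_cdf`, the uniform distribution, and the three data points are all assumptions made up for the example. It evaluates the difference only at the observed values, as the formula states; library routines such as `scipy.stats.kstest` also compare the theoretical curve with the observed cumulative frequency just below each step, so they can report a slightly larger D for continuous data.

```python
from bisect import bisect_right


def ks_one_sample_d(sample, theoretical_cdf):
    """Return D = max |F_o(x) - F_T(x)| evaluated at each observed value."""
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for x in ordered:
        # F_o(x) = k / n, where k counts the observations <= x (ties included).
        k = bisect_right(ordered, x)
        d = max(d, abs(k / n - theoretical_cdf(x)))
    return d


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (illustrative choice only)."""
    return min(max(x, 0.0), 1.0)


sample = [0.1, 0.4, 0.7]  # made-up data for illustration
d_stat = ks_one_sample_d(sample, uniform_cdf)
print(round(d_stat, 3))
```

The resulting D would then be compared with the critical value from the K-S table for the chosen significance level.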
Example
Problem Statement:
In a study covering various streams of a college, 60 students, with an equal number drawn from each stream, were interviewed and their intention to join the college Drama Club was noted.
Stream | B.Sc. | B.A. | B.Com | M.A. | M.Com |
---|---|---|---|---|---|
No. in each class | 5 | 9 | 11 | 16 | 19 |
It was expected that 12 students from each class would join the Drama Club. Use the K-S test to determine whether there is any difference among student classes with regard to their intention of joining the Drama Club.
Solution:
${H_o}$: There is no difference among students of different streams with respect to their intention of joining the drama club.
We develop the cumulative frequencies for observed and theoretical distributions.
Streams | Observed no. interested in joining (O) | Theoretical no. interested in joining (T) | $F_o(X)$ | $F_T(X)$ | $\lvert F_o(X)-F_T(X) \rvert$ |
---|---|---|---|---|---|
B.Sc. | 5 | 12 | 5/60 | 12/60 | 7/60 |
B.A. | 9 | 12 | 14/60 | 24/60 | 10/60 |
B.Com | 11 | 12 | 25/60 | 36/60 | 11/60 |
M.A. | 16 | 12 | 41/60 | 48/60 | 7/60 |
M.Com | 19 | 12 | 60/60 | 60/60 | 0 |
Total | n = 60 | 60 | | | |
Test statistic $D$ is calculated as:
$D = \max |F_o(X)-F_T(X)| = \frac{11}{60} = 0.183$
The table value of D at the 5% significance level is given by
$D_{0.05} = \frac{1.36}{\sqrt{n}} = \frac{1.36}{\sqrt{60}} \approx 0.176$
Since the calculated value is greater than the critical value, we reject the null hypothesis and conclude that there is a difference among students of different streams in their intention of joining the Club.
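The arithmetic of this example can be checked with a short Python sketch. The variable names are illustrative assumptions; the data and the 5% critical value $1.36/\sqrt{n}$ are taken from the example itself. Exact fractions are used so that the maximum difference appears as 11/60 rather than a rounded decimal.

```python
from fractions import Fraction
from itertools import accumulate
from math import sqrt

observed = [5, 9, 11, 16, 19]    # students interested in joining, per stream
expected = [12, 12, 12, 12, 12]  # 12 expected from each stream
n = sum(observed)                # total number of students (60)

# Cumulative relative frequencies for the observed and theoretical counts.
f_o = [Fraction(c, n) for c in accumulate(observed)]
f_t = [Fraction(c, n) for c in accumulate(expected)]

# D is the largest absolute gap between the two cumulative columns.
d_stat = max(abs(o - t) for o, t in zip(f_o, f_t))

# Critical value at the 5% level, using the formula quoted in the example.
critical = 1.36 / sqrt(n)
reject_h0 = d_stat > critical

print(d_stat, round(float(d_stat), 3), round(critical, 3), reject_h0)
```

Here `reject_h0` is true because D exceeds the critical value, which matches the conclusion drawn in the solution.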
K-S Two Sample Test
When there are two independent samples instead of one, the K-S two sample test can be used to test the agreement between the two cumulative distributions. The null hypothesis states that there is no difference between the two distributions. The D-statistic is calculated in the same manner as in the K-S One Sample Test.
Formula
$D = \max |F_{n_1}(X)-F_{n_2}(X)|$
Where −
$n_1$ = Number of observations in the first sample.
$n_2$ = Number of observations in the second sample.
A large maximum deviation $D$ between the cumulative distributions indicates a difference between the two sample distributions.
When $n_1 = n_2$ and both are ≤ 40, the critical value of D is taken from the K-S table for the two sample case. When $n_1$ and/or $n_2$ exceeds 40, the K-S table for large samples of the two sample test should be used. The null hypothesis is accepted if the calculated value is less than the table value, and rejected otherwise.
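The two sample D-statistic can be sketched in Python as follows. The helper names `ecdf` and `ks_two_sample_d` and the sample values are assumptions chosen for illustration, and both samples are assumed to be non-empty. Because both cumulative distributions are step functions that change only at observed values, it is enough to compare them at every value in the pooled samples.

```python
from bisect import bisect_right


def ecdf(sorted_values, x):
    """Observed cumulative frequency: share of observations <= x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def ks_two_sample_d(sample_1, sample_2):
    """Return D = max |F_n1(x) - F_n2(x)| over all pooled observations."""
    s1, s2 = sorted(sample_1), sorted(sample_2)
    return max(abs(ecdf(s1, x) - ecdf(s2, x)) for x in s1 + s2)


first = [1, 2, 3, 4]   # made-up data for illustration
second = [3, 4, 5, 6]  # made-up data for illustration
d_stat = ks_two_sample_d(first, second)
print(d_stat)
```

The sketch stops at the statistic itself; the critical value still has to be read from the appropriate K-S two sample table.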
Thus, these nonparametric tests help a researcher test the significance of their results when the characteristics of the target population are unknown or no assumptions have been made about them.