- Statistics - Discussion
- Z table
- Weak Law of Large Numbers
- Venn Diagram
- Variance
- Type I & II Error
- Trimmed Mean
- Transformations
- Ti 83 Exponential Regression
- T-Distribution Table
- Sum of Square
- Student T Test
- Stratified sampling
- Stem and Leaf Plot
- Statistics Notation
- Statistics Formulas
- Statistical Significance
- Standard normal table
- Standard Error ( SE )
- Standard Deviation
- Skewness
- Simple random sampling
- Signal to Noise Ratio
- Shannon Wiener Diversity Index
- Scatterplots
- Sampling methods
- Sample planning
- Root Mean Square
- Residual sum of squares
- Residual analysis
- Required Sample Size
- Reliability Coefficient
- Relative Standard Deviation
- Regression Intercept Confidence Interval
- Rayleigh Distribution
- Range Rule of Thumb
- Quartile Deviation
- Qualitative Data Vs Quantitative Data
- Quadratic Regression Equation
- Process Sigma
- Process Capability (Cp) & Process Performance (Pp)
- Probability Density Function
- Probability Bayes Theorem
- Probability Multiplecative Theorem
- Probability Additive Theorem
- Probability
- Power Calculator
- Pooled Variance (r)
- Poisson Distribution
- Pie Chart
- Permutation with Replacement
- Permutation
- Outlier Function
- One Proportion Z Test
- Odd and Even Permutation
- Normal Distribution
- Negative Binomial Distribution
- Multinomial Distribution
- Means Difference
- Mean Deviation
- Mcnemar Test
- Logistic Regression
- Log Gamma Distribution
- Linear regression
- Laplace Distribution
- Kurtosis
- Kolmogorov Smirnov Test
- Inverse Gamma Distribution
- Interval Estimation
- Individual Series Arithmetic Mode
- Individual Series Arithmetic Median
- Individual Series Arithmetic Mean
- Hypothesis testing
- Hypergeometric Distribution
- Histograms
- Harmonic Resonance Frequency
- Harmonic Number
- Harmonic Mean
- Gumbel Distribution
- Grand Mean
- Goodness of Fit
- Geometric Probability Distribution
- Geometric Mean
- Gamma Distribution
- Frequency Distribution
- Factorial
- F Test Table
- F distribution
- Exponential distribution
- Dot Plot
- Discrete Series Arithmetic Mode
- Discrete Series Arithmetic Median
- Discrete Series Arithmetic Mean
- Deciles Statistics
- Data Patterns
- Data collection - Case Study Method
- Data collection - Observation
- Data collection - Questionaire Designing
- Data collection
- Cumulative Poisson Distribution
- Cumulative plots
- Correlation Co-efficient
- Co-efficient of Variation
- Cumulative Frequency
- Continuous Series Arithmetic Mode
- Continuous Series Arithmetic Median
- Continuous Series Arithmetic Mean
- Continuous Uniform Distribution
- Comparing plots
- Combination with replacement
- Combination
- Cluster sampling
- Circular Permutation
- Chi Squared table
- Chi-squared Distribution
- Central limit theorem
- Boxplots
- Black-Scholes model
- Binomial Distribution
- Beta Distribution
- Best Point Estimation
- Bar Graph
- Arithmetic Range
- Arithmetic Mode
- Arithmetic Median
- Arithmetic Mean
- Analysis of Variance
- Adjusted R-Squared
- Home
Selected Reading
- Who is Who
- Computer Glossary
- HR Interview Questions
- Effective Resume Writing
- Questions and Answers
- UPSC IAS Exams Notes
Statistics - Residual analysis
Residual analysis is used to assess the appropriateness of a pnear regression model by defining residuals and examining the residual plot graphs.
Residual
Residual($ e $) refers to the difference between observed value($ y $) vs predicted value ($ hat y $). Every data point have one residual.
${ residual = observedValue - predictedValue \[7pt] e = y - hat y }$
Residual Plot
A residual plot is a graph in which residuals are on tthe vertical axis and the independent variable is on the horizontal axis. If the dots are randomly dispersed around the horizontal axis then a pnear regression model is appropriate for the data; otherwise, choose a non-pnear model.
Types of Residual Plot
Following example shows few patterns in residual plots.
In first case, dots are randomly dispersed. So pnear regression model is preferred. In Second and third case, dots are non-randomly dispersed and suggests that a non-pnear regression method is preferred.
Example
Problem Statement:
Check where a pnear regression model is appropriate for the following data.
$ x $ | 60 | 70 | 80 | 85 | 95 |
---|---|---|---|---|---|
$ y $ (Actual Value) | 70 | 65 | 70 | 95 | 85 |
$ hat y $ (Predicted Value) | 65.411 | 71.849 | 78.288 | 81.507 | 87.945 |
Solution:
Step 1: Compute residuals for each data point.
$ x $ | 60 | 70 | 80 | 85 | 95 |
---|---|---|---|---|---|
$ y $ (Actual Value) | 70 | 65 | 70 | 95 | 85 |
$ hat y $ (Predicted Value) | 65.411 | 71.849 | 78.288 | 81.507 | 87.945 |
$ e $ (Residual) | 4.589 | -6.849 | -8.288 | 13.493 | -2.945 |
Step 2: - Draw the residual plot graph.
Step 3: - Check the randomness of the residuals.
Here residual plot exibits a random pattern - First residual is positive, following two are negative, the fourth one is positive, and the last residual is negative. As pattern is quite random which indicates that a pnear regression model is appropriate for the above data.
Advertisements