- R - Data Reshaping
- R - Packages
- R - Data Frames
- R - Factors
- R - Arrays
- R - Matrices
- R - Lists
- R - Vectors
- R - Strings
- R - Functions
- R - Loops
- R - Decision Making
- R - Operators
- R - Variables
- R - Data Types
- R - Basic Syntax
- R - Environment Setup
- R - Overview
- R - Home
R Data Interfaces
- R - Database
- R - Web Data
- R - JSON Files
- R - XML Files
- R - Binary Files
- R - Excel Files
- R - CSV Files
R Charts & Graphs
R Statistics Examples
- R - Chi Square Tests
- R - Survival Analysis
- R - Random Forest
- R - Decision Tree
- R - Nonlinear Least Square
- R - Time Series Analysis
- R - Analysis of Covariance
- R - Poisson Regression
- R - Binomial Distribution
- R - Normal Distribution
- R - Logistic Regression
- R - Multiple Regression
- R - Linear Regression
- R - Mean, Median & Mode
R Useful Resources
Selected Reading
- Who is Who
- Computer Glossary
- HR Interview Questions
- Effective Resume Writing
- Questions and Answers
- UPSC IAS Exams Notes
R - Normal Distribution
In a random collection of data from independent sources, it is generally observed that the distribution of data is normal. Which means, on plotting a graph with the value of the variable in the horizontal axis and the count of the values in the vertical axis we get a bell shape curve. The center of the curve represents the mean of the data set. In the graph, fifty percent of values pe to the left of the mean and the other fifty percent pe to the right of the graph. This is referred as normal distribution in statistics.
R has four in built functions to generate normal distribution. They are described below.
dnorm(x, mean, sd) pnorm(x, mean, sd) qnorm(p, mean, sd) rnorm(n, mean, sd)
Following is the description of the parameters used in above functions −
x is a vector of numbers.
p is a vector of probabipties.
n is number of observations(sample size).
mean is the mean value of the sample data. It s default value is zero.
sd is the standard deviation. It s default value is 1.
dnorm()
This function gives height of the probabipty distribution at each point for a given mean and standard deviation.
# Create a sequence of numbers between -10 and 10 incrementing by 0.1. x <- seq(-10, 10, by = .1) # Choose the mean as 2.5 and standard deviation as 0.5. y <- dnorm(x, mean = 2.5, sd = 0.5) # Give the chart file a name. png(file = "dnorm.png") plot(x,y) # Save the file. dev.off()
When we execute the above code, it produces the following result −
pnorm()
This function gives the probabipty of a normally distributed random number to be less that the value of a given number. It is also called "Cumulative Distribution Function".
# Create a sequence of numbers between -10 and 10 incrementing by 0.2. x <- seq(-10,10,by = .2) # Choose the mean as 2.5 and standard deviation as 2. y <- pnorm(x, mean = 2.5, sd = 2) # Give the chart file a name. png(file = "pnorm.png") # Plot the graph. plot(x,y) # Save the file. dev.off()
When we execute the above code, it produces the following result −
qnorm()
This function takes the probabipty value and gives a number whose cumulative value matches the probabipty value.
# Create a sequence of probabipty values incrementing by 0.02. x <- seq(0, 1, by = 0.02) # Choose the mean as 2 and standard deviation as 3. y <- qnorm(x, mean = 2, sd = 1) # Give the chart file a name. png(file = "qnorm.png") # Plot the graph. plot(x,y) # Save the file. dev.off()
When we execute the above code, it produces the following result −
rnorm()
This function is used to generate random numbers whose distribution is normal. It takes the sample size as input and generates that many random numbers. We draw a histogram to show the distribution of the generated numbers.
# Create a sample of 50 numbers which are normally distributed. y <- rnorm(50) # Give the chart file a name. png(file = "rnorm.png") # Plot the histogram for this sample. hist(y, main = "Normal DIstribution") # Save the file. dev.off()
When we execute the above code, it produces the following result −
Advertisements