Seaborn - Linear Relationships
Most of the time, we use datasets that contain multiple quantitative variables, and the goal of an analysis is often to relate those variables to each other. This can be done through regression lines.
While building regression models, we often check for multicollinearity: we look at the correlation between all combinations of continuous variables and take the necessary action to remove multicollinearity if it exists. In such cases, the following techniques help.
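As a rough illustration of the correlation check described above, the sketch below computes a pairwise correlation matrix with pandas and lists the variable pairs whose absolute correlation exceeds a cutoff. The data is synthetic, the column names only mimic a tips-style table, and the 0.7 threshold is an arbitrary assumption, not a rule from this tutorial.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (an assumption, not the real "tips" dataset).
rng = np.random.default_rng(0)
n = 200
total_bill = rng.normal(20, 8, n)
df = pd.DataFrame({
    "total_bill": total_bill,
    # tip is built to depend strongly on total_bill
    "tip": 0.15 * total_bill + rng.normal(0, 0.5, n),
    # size is drawn independently of the other two columns
    "size": rng.integers(1, 7, n),
})

# Pearson correlation between every pair of numeric columns
corr = df.corr()

# Collect each pair once (upper triangle) when |r| exceeds the cutoff
threshold = 0.7  # illustrative cutoff, chosen arbitrarily
pairs = [
    (a, b, float(corr.loc[a, b]))
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if abs(corr.loc[a, b]) > threshold
]

print(corr.round(2))
print(pairs)
```

Pairs that show up in the list are candidates for closer inspection with the regression plots discussed in this chapter.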
Functions to Draw Linear Regression Models
There are two main functions in Seaborn to visualize a linear relationship determined through regression. These functions are regplot() and lmplot().
regplot vs lmplot
regplot | lmplot |
---|---|
accepts the x and y variables in a variety of formats including simple numpy arrays, pandas Series objects, or as references to variables in a pandas DataFrame | has data as a required parameter and the x and y variables must be specified as strings. This data format is called “long-form” data |
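The difference in accepted inputs can be sketched as follows. The data here is synthetic and the column names are only borrowed from the tips example, so treat this as an illustration under those assumptions. The Agg backend is selected so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window is opened
import numpy as np
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt

# Synthetic data (an assumption, not the real "tips" dataset)
rng = np.random.default_rng(1)
x = rng.uniform(5, 50, 100)
y = 1.0 + 0.15 * x + rng.normal(0, 1, 100)

# regplot() accepts plain numpy arrays and returns a matplotlib Axes
ax = sb.regplot(x=x, y=y)

# lmplot() needs a DataFrame passed as data, with x and y given as
# column names; it returns a seaborn FacetGrid that owns its own figure
df = pd.DataFrame({"total_bill": x, "tip": y})
grid = sb.lmplot(x="total_bill", y="tip", data=df)

print(type(ax), type(grid))
plt.close("all")
```

Because regplot() works at the axes level, it can be drawn into an existing subplot, whereas lmplot() always manages a figure of its own.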
Let us now draw the plots.
Example
This example plots the same data first with regplot() and then with lmplot().
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
# Load the example "tips" dataset bundled with seaborn
df = sb.load_dataset('tips')
# Axes-level regression plot
sb.regplot(x = "total_bill", y = "tip", data = df)
# Figure-level regression plot of the same variables
sb.lmplot(x = "total_bill", y = "tip", data = df)
plt.show()
Output
You can see the difference in size between the two plots: regplot() draws onto the current matplotlib axes, while lmplot() creates its own figure.
We can also fit a linear regression when one of the variables takes discrete values.
Example
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('tips')
# "size" takes only a few discrete integer values
sb.lmplot(x = "size", y = "tip", data = df)
plt.show()
Output
Fitting Different Kinds of Models
The simple linear regression model used above is easy to fit, but in many cases the data is non-linear and a straight regression line cannot describe it well.
Let us use Anscombe's dataset with the regression plots:
Example
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('anscombe')
# Keep only the rows belonging to dataset I
sb.lmplot(x = "x", y = "y", data = df.query("dataset == 'I'"))
plt.show()
In this case, the data is a good fit for a linear regression model, with little variance around the line.
Let us see another example, in which the data deviates strongly from the line and the line of best fit is therefore a poor description.
Example
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('anscombe')
# Keep only the rows belonging to dataset II
sb.lmplot(x = "x", y = "y", data = df.query("dataset == 'II'"))
plt.show()
Output
The plot shows the high deviation of the data points from the regression line. Such non-linear, higher-order relationships can be visualized using lmplot() and regplot(). Through the order parameter, both can fit a polynomial regression model to explore simple kinds of nonlinear trends in the dataset:
Example
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('anscombe')
# order = 2 fits a second-degree polynomial instead of a straight line
sb.lmplot(x = "x", y = "y", data = df.query("dataset == 'II'"), order = 2)
plt.show()
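To put a number on the improvement, the sketch below compares the residual sum of squares of a first-order and a second-order polynomial fit. It uses numpy.polyfit as a stand-in for the fit that the plot draws, and it hard-codes the commonly published values of Anscombe's dataset II so that it runs without downloading anything. Both choices are assumptions made for illustration, and the helper name residual_sum_of_squares is hypothetical.

```python
import numpy as np

# Commonly published values of Anscombe's dataset II, hard-coded here
# (assumed to match what sb.load_dataset('anscombe') returns).
x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
y = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
              6.13, 3.10, 9.13, 7.26, 4.74])

def residual_sum_of_squares(order):
    """Fit a polynomial of the given order and return its squared error."""
    coeffs = np.polyfit(x, y, deg=order)
    fitted = np.polyval(coeffs, x)
    return float(np.sum((y - fitted) ** 2))

rss_linear = residual_sum_of_squares(1)     # straight line
rss_quadratic = residual_sum_of_squares(2)  # second-degree polynomial

print("linear RSS:   ", rss_linear)
print("quadratic RSS:", rss_quadratic)
```

The quadratic fit leaves a far smaller residual than the straight line, which is consistent with the curve following the points closely in the order = 2 plot.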