Scikit Learn - Extended Linear Modeling
This chapter focuses on the polynomial features and pipelining tools in Sklearn.
Introduction to Polynomial Features
Linear models trained on non-linear functions of the data generally keep the fast performance of linear methods while fitting a much wider range of data. That is why linear models trained on non-linear functions of the inputs are widely used in machine learning.
For example, a simple linear regression can be extended by constructing polynomial features from the input features.
Mathematically, suppose we have a standard linear regression model; for 2-D data it would look like this −
$$Y=W_{0}+W_{1}X_{1}+W_{2}X_{2}$$

Now, we can combine the features in second-order polynomials, and our model will look as follows −

$$Y=W_{0}+W_{1}X_{1}+W_{2}X_{2}+W_{3}X_{1}X_{2}+W_{4}X_{1}^{2}+W_{5}X_{2}^{2}$$

The above is still a linear model: it is linear in the weights W, with the polynomial terms acting as new features. The resulting polynomial regression therefore belongs to the same class of linear models and can be solved in the same way.
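To do so, scikit-learn provides a transformer class named PolynomialFeatures in the sklearn.preprocessing module. It transforms an input data matrix into a new data matrix containing all polynomial combinations of the input features up to the given degree.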
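To make the "still a linear model" point concrete, here is a minimal sketch that builds the six second-order columns by hand and recovers the weights with ordinary least squares. The random data and the weight values in `true_w` are made up purely for illustration; only NumPy is used.

```python
import numpy as np

# Synthetic 2-D inputs (illustrative data, not from the tutorial).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
x1, x2 = X[:, 0], X[:, 1]

# Hypothetical weights W0..W5, chosen only for this demonstration.
true_w = np.array([1.0, 2.0, -3.0, 0.5, 4.0, -1.5])

# Design matrix in the same term order as the equation:
# 1, X1, X2, X1*X2, X1^2, X2^2
A = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1 ** 2, x2 ** 2])
y = A @ true_w

# The model is linear in the weights, so plain least squares solves it.
w_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(w_hat, 6))
```

Because the targets here are noise-free, the estimated weights match `true_w` up to floating-point error. With noisy data the same procedure would give a least-squares estimate instead.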
Parameters
The following table lists the parameters used by the PolynomialFeatures class −
Sr.No | Parameter & Description |
---|---|
1 | degree − int, default = 2. It represents the degree of the polynomial features. |
2 | interaction_only − bool, default = False. If set to True, only interaction features are produced, i.e. features that are products of at most degree distinct input features. |
3 | include_bias − bool, default = True. It includes a bias column, i.e. the feature in which all polynomial powers are zero (a column of ones). |
4 | order − str in {‘C’, ‘F’}, default = ‘C’. This parameter represents the order of the output array in the dense case. ‘F’ order is faster to compute, but it may slow down subsequent estimators. |
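The effect of `interaction_only` and `include_bias` is easiest to see from the shape of the output. The sketch below uses a small 4x2 integer array chosen only for illustration and compares three configurations.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(8).reshape(4, 2)  # rows: [0, 1], [2, 3], [4, 5], [6, 7]

# Default: bias, both inputs, and all degree-2 terms.
full = PolynomialFeatures(degree=2).fit_transform(X)

# Only products of distinct inputs are kept; the squared terms are dropped.
inter = PolynomialFeatures(degree=2, interaction_only=True).fit_transform(X)

# Same as the default, but without the leading column of ones.
no_bias = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)

print(full.shape)     # (4, 6)
print(inter.shape)    # (4, 4)
print(no_bias.shape)  # (4, 5)
print(inter[1])       # row [2, 3] becomes 1, 2, 3, 2*3
```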
Attributes
The following table lists the attributes of a fitted PolynomialFeatures object −
Sr.No | Attributes & Description |
---|---|
1 | powers_ − array, shape (n_output_features, n_input_features). powers_[i, j] is the exponent of the jth input in the ith output. |
2 | n_input_features_ − int. As the name suggests, it gives the total number of input features. Recent scikit-learn releases expose this value as n_features_in_ instead. |
3 | n_output_features_ − int. As the name suggests, it gives the total number of polynomial output features. |
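As a rough illustration of how `powers_` describes the output columns, the sketch below fits the transformer on a small example array and then rebuilds the transformed matrix from the exponents alone. The array and the `rebuilt` helper variable are illustrative choices, not part of the library.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(8).reshape(4, 2)
poly = PolynomialFeatures(degree=2).fit(X)

print(poly.n_output_features_)  # number of generated columns
print(poly.powers_)             # one row of exponents per output column

# Rebuild each output column as the product of inputs raised to its exponents:
# column i = prod_j X[:, j] ** powers_[i, j]
rebuilt = np.prod(X[:, np.newaxis, :] ** poly.powers_[np.newaxis, :, :], axis=2)

print(np.allclose(rebuilt, poly.transform(X)))
```

Each row of `powers_` corresponds to one output column, so a row of `[1, 1]` marks the interaction term and a row of `[0, 0]` marks the bias column.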
Implementation Example
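The following Python script reshapes the integers 0 to 7 into an array of shape (4, 2) and uses the PolynomialFeatures transformer to expand it into degree-2 polynomial features −

    from sklearn.preprocessing import PolynomialFeatures
    import numpy as np

    # Input matrix with 4 samples and 2 features
    Y = np.arange(8).reshape(4, 2)

    # Expand to all polynomial terms up to degree 2
    poly = PolynomialFeatures(degree=2)
    poly.fit_transform(Y)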
Output
    array([[ 1.,  0.,  1.,  0.,  0.,  1.],
           [ 1.,  2.,  3.,  4.,  6.,  9.],
           [ 1.,  4.,  5., 16., 20., 25.],
           [ 1.,  6.,  7., 36., 42., 49.]])
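For each input row [a, b], the six columns in the output above are 1, a, b, a², ab and b². The number of columns grows quickly with the degree: for n input features and degree d (with the bias column included), there are C(n + d, d) monomials of degree at most d. The short sketch below checks that count against the transformer for a few degrees; the loop range is an arbitrary choice for illustration.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(8).reshape(4, 2)
n_features = X.shape[1]

counts = {}
for degree in (1, 2, 3, 4):
    n_cols = PolynomialFeatures(degree=degree).fit_transform(X).shape[1]
    counts[degree] = n_cols
    # Compare with the closed-form count of monomials of degree <= d.
    print(degree, n_cols, comb(n_features + degree, degree))
```

This growth is worth keeping in mind before choosing a high degree on data with many features.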
Streamlining using Pipeline tools
The above sort of preprocessing, i.e. transforming an input data matrix into a new data matrix of a given degree, can be streamlined with the Pipeline tools, which are used to chain multiple estimators into one.
Example
The Python script below uses Scikit-learn’s Pipeline tools to streamline the preprocessing and fit an order-3 polynomial to the data.

    # First, import the necessary packages.
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import Pipeline
    import numpy as np

    # Next, create a Pipeline object that chains the two steps.
    Stream_model = Pipeline([('poly', PolynomialFeatures(degree=3)),
                             ('linear', LinearRegression(fit_intercept=False))])

    # Build sample data from a known order-3 polynomial and fit the model.
    x = np.arange(5)
    y = 3 - 2 * x + x ** 2 - x ** 3
    Stream_model = Stream_model.fit(x[:, np.newaxis], y)

    # Inspect the fitted coefficients of the linear step.
    Stream_model.named_steps['linear'].coef_
Output
array([ 3., -2., 1., -1.])
The above output shows that the linear model trained on polynomial features is able to recover the exact input polynomial coefficients.
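Once fitted, the pipeline behaves like a single estimator, so new inputs can be passed straight to `predict` without expanding them by hand. The sketch below rebuilds the same kind of pipeline under the illustrative name `model` and checks its predictions on a few unseen points (`x_new`, chosen arbitrarily) against the generating polynomial. Note that `fit_intercept=False` is used because the bias column from PolynomialFeatures already plays the role of the intercept.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Chain the feature expansion and the linear fit into one estimator.
model = Pipeline([
    ("poly", PolynomialFeatures(degree=3)),
    ("linear", LinearRegression(fit_intercept=False)),
])

# Training data generated from a known cubic polynomial.
x = np.arange(5)
y = 3 - 2 * x + x ** 2 - x ** 3
model.fit(x[:, np.newaxis], y)

# Predict on inputs the model has not seen; the pipeline applies
# the polynomial expansion automatically before the linear step.
x_new = np.array([5, 6, 7])
y_pred = model.predict(x_new[:, np.newaxis])
y_true = 3 - 2 * x_new + x_new ** 2 - x_new ** 3

print(np.allclose(y_pred, y_true))
```

The predictions agree with the true polynomial here only because the data is noise-free and the chosen degree matches the generating polynomial; real data would not extrapolate this cleanly.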