Scikit Learn - Extended Linear Modeling
  • Date: 2024-12-22


This chapter focuses on the polynomial features and pipelining tools in scikit-learn.

Introduction to Polynomial Features

Linear models trained on non-linear functions of the data generally keep the fast performance of linear methods while fitting a much wider range of data. That is why linear models trained on non-linear features are widely used in machine learning.

For example, a simple linear regression can be extended by constructing polynomial features from the input features.

Mathematically, a standard linear regression model for 2-D data looks like this:

$$Y=W_{0}+W_{1}X_{1}+W_{2}X_{2}$$

Now, we can combine the features into second-order polynomials, and the model becomes:

$$Y=W_{0}+W_{1}X_{1}+W_{2}X_{2}+W_{3}X_{1}X_{2}+W_{4}X_1^2+W_{5}X_2^2$$

The above is still a linear model, because it is linear in the weights W; only the features are non-linear in X. The resulting polynomial regression therefore belongs to the same class of linear models and can be solved with the same techniques.
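
The point that the second-order model is still linear can be checked with a small NumPy sketch. The data and the weights in true_w below are made up for illustration; the columns follow the order of the equation above, and ordinary least squares is used as one possible solver.

```python
import numpy as np

# Hypothetical 2-D data, generated only for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
x1, x2 = X[:, 0], X[:, 1]

# Assumed "true" weights W0..W5, in the order used in the equation above.
true_w = np.array([1.0, 2.0, -3.0, 0.5, 4.0, -1.5])

# Design matrix with columns 1, X1, X2, X1*X2, X1^2, X2^2.
Z = np.column_stack([np.ones(len(X)), x1, x2, x1 * x2, x1 ** 2, x2 ** 2])
y = Z @ true_w  # noise-free targets

# The model is linear in W, so ordinary least squares solves it directly.
w_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
print(np.round(w_hat, 6))
```

Because the targets are noise-free, the fitted weights should match true_w up to floating-point error.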

To do so, scikit-learn provides a transformer class named PolynomialFeatures in the sklearn.preprocessing module. It transforms an input data matrix into a new data matrix containing all polynomial combinations of the features up to a given degree.

Parameters

The following table lists the parameters used by the PolynomialFeatures class.

Sr.No   Parameter & Description

1       degree: integer, default = 2
        The degree of the polynomial features.

2       interaction_only: Boolean, default = False
        If set to True, only interaction features are produced, i.e. features that are products of at most degree distinct input features. Pure powers such as X1^2 are then left out.

3       include_bias: Boolean, default = True
        If True, a bias column is included, i.e. the feature in which all polynomial powers are zero (a column of ones).

4       order: str in {'C', 'F'}, default = 'C'
        The order of the output array in the dense case. 'F' order is faster to compute, but it may slow down subsequent estimators.
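
The effect of interaction_only and include_bias is easiest to see on a single sample. The following is a minimal sketch; the input values 2 and 3 are chosen arbitrarily.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with two features, a = 2 and b = 3 (arbitrary values).
X = np.array([[2.0, 3.0]])

# Defaults: columns 1, a, b, a^2, a*b, b^2
full = PolynomialFeatures(degree=2).fit_transform(X)

# interaction_only=True drops the pure powers a^2 and b^2: 1, a, b, a*b
inter = PolynomialFeatures(degree=2, interaction_only=True).fit_transform(X)

# include_bias=False drops the leading column of ones: a, b, a^2, a*b, b^2
no_bias = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)

print(full)     # [[1. 2. 3. 4. 6. 9.]]
print(inter)    # [[1. 2. 3. 6.]]
print(no_bias)  # [[2. 3. 4. 6. 9.]]
```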

Attributes

The following table lists the attributes exposed by a fitted PolynomialFeatures object.

Sr.No   Attributes & Description

1       powers_: array, shape (n_output_features, n_input_features)
        powers_[i, j] is the exponent of the jth input in the ith output.

2       n_input_features_: int
        The total number of input features. Recent scikit-learn releases expose this value as n_features_in_ instead.

3       n_output_features_: int
        The total number of polynomial output features.
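
As a quick illustration of these attributes, the sketch below fits the transformer on a small made-up array and inspects powers_ and n_output_features_. It leaves out n_input_features_ because that name is not available in every scikit-learn version.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(6).reshape(3, 2)  # 3 samples, 2 input features
poly = PolynomialFeatures(degree=2).fit(X)

# Each row describes one output feature; each column holds the exponent
# applied to the corresponding input feature.
print(poly.powers_)
print(poly.n_output_features_)  # 6
```

For two inputs and degree 2, the rows of powers_ read [0 0], [1 0], [0 1], [2 0], [1 1], [0 2], which corresponds to the output columns 1, a, b, a^2, a*b, b^2.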

Implementation Example

The following Python script arranges an array of 8 values into shape (4, 2) and uses the PolynomialFeatures transformer to expand it into degree-2 polynomial features of shape (4, 6):


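from sklearn.preprocessing import PolynomialFeatures
import numpy as np

# Arrange the 8 values 0..7 as 4 samples with 2 features each.
X = np.arange(8).reshape(4, 2)

# Expand each row [a, b] into [1, a, b, a^2, a*b, b^2].
poly = PolynomialFeatures(degree=2)
poly.fit_transform(X)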

Output


array(
   [
      [ 1., 0., 1., 0., 0., 1.],
      [ 1., 2., 3., 4., 6., 9.],
      [ 1., 4., 5., 16., 20., 25.],
      [ 1., 6., 7., 36., 42., 49.]
   ]
)
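
To see where each output column comes from, the same matrix can be rebuilt by hand. This is only a sketch to check the column order 1, a, b, a^2, a*b, b^2 for degree 2; the variable names are illustrative.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(8).reshape(4, 2)
a, b = X[:, 0], X[:, 1]

# Hand-built columns in the order scikit-learn emits them for degree 2.
manual = np.column_stack([np.ones(len(X)), a, b, a ** 2, a * b, b ** 2])

transformed = PolynomialFeatures(degree=2).fit_transform(X)
print(np.array_equal(manual, transformed))
```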

Streamlining using Pipeline tools

This kind of preprocessing, i.e. transforming an input data matrix into a new data matrix of a given degree, can be streamlined with the Pipeline tools, which chain multiple estimators into one.

Example

The Python script below uses scikit-learn's Pipeline tools to streamline the preprocessing and fit a model to data generated from an order-3 polynomial.


# First, import the necessary packages.
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
import numpy as np

# Next, create a Pipeline object that chains the two steps.
# fit_intercept=False because PolynomialFeatures already adds the bias column.
Stream_model = Pipeline([('poly', PolynomialFeatures(degree=3)), ('linear', LinearRegression(fit_intercept=False))])

# Generate data from a known order-3 polynomial and fit the model.
x = np.arange(5)
y = 3 - 2 * x + x ** 2 - x ** 3
Stream_model = Stream_model.fit(x[:, np.newaxis], y)

# Inspect the fitted coefficients of the linear step.
Stream_model.named_steps['linear'].coef_

Output


array([ 3., -2., 1., -1.])

The above output shows that the linear model trained on polynomial features recovers the exact coefficients of the input polynomial, listed from the constant term up to the cubic term.
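
Once fitted, the pipeline can also be used for prediction, with new inputs passing through both steps automatically. The sketch below rebuilds the same pipeline under the illustrative name model and checks its predictions on two arbitrary new values against the generating polynomial.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

model = Pipeline([
    ("poly", PolynomialFeatures(degree=3)),
    ("linear", LinearRegression(fit_intercept=False)),
])

# Same training data as in the example above.
x = np.arange(5)
y = 3 - 2 * x + x ** 2 - x ** 3
model.fit(x[:, np.newaxis], y)

# Unseen inputs (arbitrary values) are expanded and then fed to the regressor.
x_new = np.array([5.0, 6.5])
y_pred = model.predict(x_new[:, np.newaxis])
y_true = 3 - 2 * x_new + x_new ** 2 - x_new ** 3
print(np.allclose(y_pred, y_true))
```

Calling predict on the pipeline avoids transforming the new data by hand before passing it to the regressor.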
