Linear model for non-linear data: Polynomial Regression

Mustafa Sidhpuri
FAUN — Developer Community 🐾
3 min read · May 24, 2021


In the Linear Regression article, you learned how the algorithm works and how to implement it using sklearn. However, it only works well when the data is linear and performs poorly on non-linear data. Polynomial Regression overcomes this limitation. It will be easy to follow if you already know how to implement Linear Regression; if you do not, please read the article below before moving on.

Polynomial Regression

So what is Polynomial Regression? It is a linear model that fits non-linear data. When the data is more complex than a single straight line can capture, we can add polynomial features. A simple way to do this is to add powers of each feature as new features, then train a linear model on this extended set of features. This technique is called Polynomial Regression.
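To make the idea concrete, here is a minimal sketch of doing that expansion by hand, before reaching for sklearn's helper. The seeded generator and the hstack step are my additions for illustration; the data mirrors the example generated below.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)  # seeded so the sketch is reproducible
m = 100
X = 6 * rng.random((m, 1)) - 3                        # one feature in [-3, 3)
y = 0.5 * X**2 + X + 2 + rng.standard_normal((m, 1))  # quadratic + noise

# "Add the powers of each feature as new features":
# the single feature X gets its square as a second column.
X_expanded = np.hstack([X, X**2])

# An ordinary linear model trained on the expanded feature set
lin_reg = LinearRegression().fit(X_expanded, y)
print(lin_reg.intercept_, lin_reg.coef_)  # roughly 2 and [1, 0.5]
```

The model is still linear in its parameters; only the inputs were made non-linear.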

Let’s generate some non-linear data.

import numpy as np
import matplotlib.pyplot as plt

m = 100
X = 6 * np.random.rand(m, 1) - 3
y = 0.5 * X**2 + X + 2 + np.random.randn(m, 1)
# plotting data
plt.plot(X, y, 'b.')
plt.show()
[Figure: Generated data]

Using sklearn, we can easily add polynomial features with the help of PolynomialFeatures(). Let’s implement it.

from sklearn.preprocessing import PolynomialFeatures
poly_features = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly_features.fit_transform(X)
print(X[0])
# O/P: [-0.75275929]
print(X_poly[0])
# O/P: [-0.75275929  0.56664654]

We have imported PolynomialFeatures from sklearn.preprocessing and initialized a poly_features object with degree=2. After that, we simply call fit_transform, passing our dataset as a parameter.

X_poly now contains the original feature of X plus the square of that feature. Now you can fit a LinearRegression model to this extended training data.

from sklearn.linear_model import LinearRegression

lin_reg = LinearRegression()
lin_reg.fit(X_poly, y)
print(lin_reg.intercept_, lin_reg.coef_)
# O/P: [1.78134581] [[0.93366893 0.56456263]]
[Figure: Best fit line]

Not bad: the model estimates ŷ = 0.56x₁² + 0.93x₁ + 1.78, when in fact the original function was y = 0.5x₁² + 1.0x₁ + 2.0 + Gaussian noise.
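To draw a smooth curve like the one in the figure, you can predict over a dense grid of x values, transforming the grid with the same poly_features object used in training. The seed and the grid bounds below are my choices for reproducibility, not from the original.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

np.random.seed(42)  # assumption: seed added for reproducibility
m = 100
X = 6 * np.random.rand(m, 1) - 3
y = 0.5 * X**2 + X + 2 + np.random.randn(m, 1)

poly_features = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly_features.fit_transform(X)
lin_reg = LinearRegression().fit(X_poly, y)

# Predict on a dense grid to trace the fitted parabola
X_new = np.linspace(-3, 3, 100).reshape(-1, 1)
y_new = lin_reg.predict(poly_features.transform(X_new))
# plt.plot(X, y, 'b.'); plt.plot(X_new, y_new, 'r-')  # overlay as in the figure
```

Note that the grid goes through transform, not fit_transform, so the new points are expanded exactly as the training data was.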

That is all about polynomial regression.

Thank you.
