Fitting Nonlinear Data with Polynomial Regression

How Polynomial Regression Works

Reference blog post: Polynomial Regression
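
The idea is that a degree-d polynomial model is still linear in its coefficients, so after expanding the feature x into [1, x, x², …] it can be fit with ordinary least squares. A minimal sketch of this, using only NumPy on a noise-free quadratic chosen for illustration:

```python
import numpy as np

# Fit y = w0 + w1*x + w2*x^2 by least squares on the expanded features
x = np.linspace(-1, 1, 50)
y = 1.0 + 2.0 * x + 3.0 * x ** 2  # noise-free quadratic for illustration

# Build the design matrix [1, x, x^2] by hand
A = np.column_stack([np.ones_like(x), x, x ** 2])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
print(w)  # recovers [1.0, 2.0, 3.0]
```

`PolynomialFeatures` below automates exactly this design-matrix construction.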

Polynomial Regression Demo with sklearn

Imports

from sklearn.preprocessing import PolynomialFeatures as PF
from sklearn.linear_model import LinearRegression
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Prepare the Data

rnd = np.random.RandomState(1) # set the random seed
X = rnd.uniform(-3, 3, size=100)
y = np.sin(X) + rnd.normal(size=len(X)) / 3 # add noise

X = X.reshape(-1, 1) # sklearn expects 2-D input
X = pd.DataFrame(X)
X.columns = ['x']

X.head()

(1.png: output of X.head())

Feature Transformation

d = 3
poly = PF(degree=d)
X_ = poly.fit_transform(X) # expands x into columns [1, x, x^2, x^3]

X_ = pd.DataFrame(X_)
X_.columns = poly.get_feature_names_out(X.columns) # get_feature_names in sklearn < 1.0
X_.head()

(2.png: output of X_.head())
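
To make the expansion concrete, here is a small check (with hypothetical input values) that `PolynomialFeatures(degree=3)` on a single column produces exactly the columns [1, x, x², x³]:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x = np.array([[2.0], [3.0]])
poly = PolynomialFeatures(degree=3)
expanded = poly.fit_transform(x)
# Each row is [1, x, x^2, x^3]
print(expanded)  # [[1. 2. 4. 8.], [1. 3. 9. 27.]]
```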

Test Data

line = np.linspace(-3, 3, 1000, endpoint=False).reshape(-1, 1)
line_ = poly.transform(line) # reuse the fitted transformer; no need to refit

Training a Plain Linear Regression

model_linear = LinearRegression().fit(X, y)
print(model_linear.score(X, y)) # ≈ 0.53 (R² on the training data)
print(model_linear.score(line, np.sin(line))) # ≈ 0.68 (R² against the true sine curve)
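
The numbers printed by `score` are the coefficient of determination R². As a sanity check, a sketch (using synthetic data with an assumed seed, not the exact data above) showing that `score` matches the closed form R² = 1 − SS_res / SS_tot:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=100)

model = LinearRegression().fit(X, y)
pred = model.predict(X)

# R^2 = 1 - SS_res / SS_tot, which is what .score() computes
ss_res = np.sum((y - pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot
print(np.isclose(r2_manual, model.score(X, y)))  # True
```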

Polynomial Regression Fit

model_poly = LinearRegression().fit(X_, y)
print(model_poly.score(X_, y)) # ≈ 0.84
print(model_poly.score(line_, np.sin(line))) # ≈ 0.99
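
The transform-then-fit steps above can also be chained into a single estimator with `make_pipeline`, which keeps the feature expansion and the regression in sync. A sketch under the same data-generating setup as this post:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(1)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sin(X).ravel() + rng.normal(size=100) / 3

# Chain feature expansion and regression in one estimator
pipe = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
pipe.fit(X, y)

line = np.linspace(-3, 3, 1000).reshape(-1, 1)
print(pipe.score(line, np.sin(line).ravel()))  # high R² against the noise-free sine
```

The pipeline also avoids the easy mistake of calling `fit_transform` again on test data.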

Plotting

# Set up the figure
fig, ax1 = plt.subplots(1)

# Feed the test data to predict() and plot each model's fit
ax1.plot(line, model_linear.predict(line), linewidth=2, color='green', label="linear regression")
ax1.plot(line, model_poly.predict(line_), linewidth=2, color='red', label="polynomial regression")

# Overlay the original data points
ax1.plot(X.iloc[:, 0], y, 'o', c='k')

# Other plot options
ax1.legend(loc="best")
ax1.set_ylabel("Regression output")
ax1.set_xlabel("Input feature")
ax1.set_title("Linear regression: ordinary vs. polynomial")
plt.tight_layout()
plt.show()

(3.png: scatter of the data with the linear and polynomial fits overlaid)