Plotting the iris dataset from sklearn.datasets

Importing the data

from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
import pandas as pd
import warnings
warnings.filterwarnings('ignore')

data = load_iris()
df = pd.DataFrame(data['data'], columns=data['feature_names'])
df['species'] = data['target']

df.head()
   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  species
0                5.1               3.5                1.4               0.2        0
1                4.9               3.0                1.4               0.2        0
2                4.7               3.2                1.3               0.2        0
3                4.6               3.1                1.5               0.2        0
4                5.0               3.6                1.4               0.2        0
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   sepal length (cm)  150 non-null    float64
 1   sepal width (cm)   150 non-null    float64
 2   petal length (cm)  150 non-null    float64
 3   petal width (cm)   150 non-null    float64
 4   species            150 non-null    int32  
dtypes: float64(4), int32(1)
memory usage: 5.4 KB
df.describe()
       sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)     species
count         150.000000        150.000000         150.000000        150.000000  150.000000
mean            5.843333          3.057333           3.758000          1.199333    1.000000
std             0.828066          0.435866           1.765298          0.762238    0.819232
min             4.300000          2.000000           1.000000          0.100000    0.000000
25%             5.100000          2.800000           1.600000          0.300000    0.000000
50%             5.800000          3.000000           4.350000          1.300000    1.000000
75%             6.400000          3.300000           5.100000          1.800000    2.000000
max             7.900000          4.400000           6.900000          2.500000    2.000000
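
The species column above holds the integer codes 0, 1 and 2. As a minimal sketch, these codes can be mapped to the names stored in data['target_names']. The `named` frame and the `species_name` column are illustrative names and do not appear in the original post.

```python
from sklearn.datasets import load_iris
import pandas as pd

data = load_iris()
df = pd.DataFrame(data['data'], columns=data['feature_names'])
df['species'] = data['target']

# `named` and `species_name` are hypothetical names used only in this sketch;
# the original `df` is left unchanged.
named = df.copy()
named['species_name'] = pd.Categorical.from_codes(
    data['target'], categories=data['target_names']
)

# Count how many rows belong to each species.
counts = named['species_name'].value_counts()
print(counts)
```

Keeping the numeric column and adding a labelled one leaves the later plotting calls on `df` untouched, while the labelled column can be used for legends.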

Data visualization

df.plot(kind='box')  # box plot
df.plot(kind='line')  # line plot
df.plot(kind='hist', bins=50)  # histogram
df.plot(kind='bar')  # bar chart
df.plot(kind='kde')  # kernel density estimate
df.plot(kind='scatter', x='sepal length (cm)', y='sepal width (cm)')  # scatter plot
<matplotlib.axes._subplots.AxesSubplot at 0x20a352cbcc8>
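
The scatter plot above draws all 150 points in one color. One possible way to tell the species apart is to draw each class separately with plain matplotlib, as sketched below. The output file name and the forced 'Agg' backend are assumptions made so the sketch runs without a display; neither comes from the original post.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import load_iris

data = load_iris()
df = pd.DataFrame(data['data'], columns=data['feature_names'])
df['species'] = data['target']

fig, ax = plt.subplots(figsize=(6, 4))

# One scatter call per species code, so each class gets its own color and legend entry.
for code, name in enumerate(data['target_names']):
    subset = df[df['species'] == code]
    ax.scatter(
        subset['sepal length (cm)'],
        subset['sepal width (cm)'],
        label=str(name),
        alpha=0.7,
    )

ax.set_xlabel('sepal length (cm)')
ax.set_ylabel('sepal width (cm)')
ax.legend(title='species')

# Hypothetical output path, chosen only for this example.
fig.savefig('iris_scatter_by_species.png')
```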

Adding subplots with matplotlib

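fig, axs = plt.subplots(5, 1)
df.plot(kind='line', subplots=True, ax=axs)
fig.subplots_adjust(hspace=1)  # the axes are stacked vertically, so hspace (not wspace) controls the gap
fig, axs = plt.subplots(1, 5, sharey=True)  # pandas ignores sharey when axes are passed in, so set it here
df.plot(kind='kde', subplots=True, ax=axs, legend=False)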
array([<matplotlib.axes._subplots.AxesSubplot object at 0x0000020A3978FE48>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x0000020A3993AA88>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x0000020A3994F6C8>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x0000020A3998DE88>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x0000020A39935288>],
      dtype=object)
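
As an alternative to creating the axes with plt.subplots first, pandas can build the grid itself through the layout argument of df.plot, in which case sharey is passed to df.plot directly. The following is a rough sketch of that variant. The figure size, titles, output file name and forced 'Agg' backend are assumptions for illustration.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend; omit this line in a notebook
import pandas as pd
from sklearn.datasets import load_iris

data = load_iris()
df = pd.DataFrame(data['data'], columns=data['feature_names'])
df['species'] = data['target']

# Let pandas create a 1x5 grid of axes that share one y axis.
axes = df.plot(
    kind='kde',
    subplots=True,
    layout=(1, 5),
    sharey=True,
    legend=False,
    figsize=(15, 3),
)

# With the legend turned off, label each panel with its column name.
for ax, column in zip(axes.ravel(), df.columns):
    ax.set_title(column, fontsize=8)

# Hypothetical output path, chosen only for this example.
axes.ravel()[0].figure.savefig('iris_kde_panels.png')
```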

Drawing subplots with matplotlib.pyplot.GridSpec

x = df['sepal length (cm)'].values
y = df['sepal width (cm)'].values
plt.figure(figsize=(6,6))
# 4x4 grid: scatter plot in the upper-right 3x3 block, marginal plots on the left and at the bottom
grid = plt.GridSpec(4, 4, wspace=0.5, hspace=0.5)
main_ax = plt.subplot(grid[0:3,1:4])
df.plot(kind='scatter',x='sepal length (cm)',y='sepal width (cm)',ax=main_ax)

# left margin: horizontal histogram of sepal width, sharing the y axis with the scatter plot
y_hist = plt.subplot(grid[0:3,0],xticklabels=[],sharey=main_ax)
plt.hist(y,60,color='green',orientation='horizontal',alpha=0.5)
y_hist.invert_xaxis()

# bottom margin: kernel density estimate of sepal length, drawn upside down
x_hist = plt.subplot(grid[3,1:4],yticklabels=[])
df['sepal length (cm)'].plot(kind='kde',color='red',alpha=0.5,ax=x_hist)
x_hist.invert_yaxis()

plt.subplots_adjust(hspace=1.2, wspace=1)

plt.show()
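
The slices used with the grid above decide which cells each subplot occupies. As a small sketch, the rows and columns covered by each slice can be inspected without drawing anything. The dictionary keys are descriptive labels invented for this example.

```python
from matplotlib.gridspec import GridSpec

# Same 4x4 grid as in the figure above.
grid = GridSpec(4, 4, wspace=0.5, hspace=0.5)

# Descriptive labels (hypothetical) mapped to the slices used for each subplot.
cells = {
    'main scatter': grid[0:3, 1:4],
    'left histogram': grid[0:3, 0],
    'bottom density': grid[3, 1:4],
}

# rowspan and colspan are range objects listing the grid rows/columns a cell covers.
for name, spec in cells.items():
    print(name, 'rows', list(spec.rowspan), 'cols', list(spec.colspan))
```

The main plot takes rows 0 to 2 and columns 1 to 3, the histogram takes the same rows in column 0, and the density plot takes row 3 under the main plot, which is why the marginal plots line up with the scatter plot.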

Next: drawing more complex figures with seaborn and related libraries
