Python Data Science: A Concise Tutorial
Python - Correlation
Correlation refers to a statistical relationship involving dependence between two data sets. Simple examples of dependent phenomena include the correlation between the physical appearance of parents and that of their offspring, and the correlation between the price of a product and the quantity supplied.
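The text above does not name a specific measure of correlation. One common choice is the Pearson coefficient, which ranges from -1 (perfect inverse linear relationship) to 1 (perfect direct linear relationship). The following is a minimal sketch of that formula in plain Python; the function name `pearson` and the price and supply figures are hypothetical, chosen only for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalised covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical illustrative values, not taken from any data set.
price = [1.0, 2.0, 3.0, 4.0, 5.0]
supplied = [2.0, 4.0, 6.0, 8.0, 10.0]  # rises exactly linearly with price
print(pearson(price, supplied))
```

Because the supplied quantity in this made-up example is an exact linear function of price, the coefficient sits at the upper end of the range. Real data would normally give a value strictly between -1 and 1.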
As an example, we take the iris data set available through the seaborn Python library. Using it, we try to establish the correlation between the length and width of the sepals and petals of three species of iris flower. Based on the correlations found, a model could be built that distinguishes one species from another.
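import matplotlib.pyplot as plt
import seaborn as sns

# Load the iris example data set as a pandas DataFrame
df = sns.load_dataset('iris')

# Pairwise scatter plots of the numeric columns, without regression lines
sns.pairplot(df, kind="scatter")
plt.show()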
Its output is as follows −
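A pair plot shows the relationships visually. A numeric summary can complement it. The sketch below assumes pandas is installed (seaborn depends on it) and uses the `DataFrame.corr` method to build a table of pairwise Pearson coefficients. The helper name `correlation_table` and the small `sample` frame are hypothetical, used here so the example runs without downloading anything.

```python
import pandas as pd

def correlation_table(frame: pd.DataFrame) -> pd.DataFrame:
    """Pairwise Pearson correlations between the numeric columns of frame."""
    # Drop non-numeric columns such as the species label before correlating
    numeric = frame.select_dtypes(include="number")
    return numeric.corr(method="pearson")

# Tiny illustrative frame shaped like two of the iris columns plus the label.
sample = pd.DataFrame({
    "petal_length": [1.4, 4.7, 6.0, 5.1],
    "petal_width": [0.2, 1.4, 2.5, 1.9],
    "species": ["setosa", "versicolor", "virginica", "virginica"],
})
print(correlation_table(sample))
```

With the `df` loaded by the plotting code, `correlation_table(df)` would return a 4 x 4 table covering the sepal and petal measurements, with the `species` column excluded.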