Python Pandas: A Concise Tutorial

Python Pandas - Series

A Series is a one-dimensional labeled array capable of holding data of any type (integers, strings, floats, Python objects, and so on). The axis labels are collectively called the index.
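As a small illustration of the definition above, the following sketch builds a Series from values of mixed types and inspects its labels. The variable names are chosen for illustration only.

```python
import pandas as pd

# A Series can hold mixed Python objects; pandas stores them with dtype object.
mixed = pd.Series([1, "two", 3.0])

# The axis labels are exposed through the .index attribute.
labels = list(mixed.index)  # default integer labels
print(mixed.dtype)
print(labels)
```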

pandas.Series

A pandas Series can be created using the following constructor:

pandas.Series(data, index, dtype, copy)

The parameters of the constructor are as follows:

Sr.No | Parameter | Description
----- | --------- | -----------
1 | data | The data to store. Accepts various forms such as an ndarray, list, dict, or scalar constant.
2 | index | Index labels. Values must be hashable and the index must have the same length as data; duplicate labels are allowed. Defaults to the integers 0 to n-1, as in np.arange(n), if no index is passed.
3 | dtype | The data type. If None, the data type is inferred.
4 | copy | Whether to copy the input data. Default False.
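To make the parameters above concrete, here is a minimal sketch that passes data, index, and dtype together. The labels and values are arbitrary and chosen only for illustration.

```python
import pandas as pd

# data: a plain list; index: one label per element; dtype: forces float storage.
s = pd.Series([1, 2, 3], index=["x", "y", "z"], dtype="float64")
print(s)
```

Without the dtype argument, pandas would infer an integer type from this list.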

A Series can be created from various inputs, such as:

  1. Array

  2. Dict

  3. Scalar value or constant

Create an Empty Series

The most basic Series that can be created is an empty Series.

Example

# Import the pandas library, aliased as pd
import pandas as pd

# Pass dtype explicitly; recent pandas versions no longer default to float64
s = pd.Series(dtype="float64")
print(s)

Its output is as follows:

Series([], dtype: float64)

Create a Series from ndarray

If data is an ndarray, any index passed must have the same length as the data. If no index is passed, the default index is range(n), where n is the array length, i.e., [0, 1, 2, ..., len(array) - 1].

Example 1

# Import the pandas library, aliased as pd
import pandas as pd
import numpy as np

data = np.array(['a', 'b', 'c', 'd'])
s = pd.Series(data)
print(s)

Its output is as follows:

0   a
1   b
2   c
3   d
dtype: object

We did not pass an index, so by default the Series was assigned index values from 0 to len(data) - 1, i.e., 0 to 3.

Example 2

# Import the pandas library, aliased as pd
import pandas as pd
import numpy as np

data = np.array(['a', 'b', 'c', 'd'])
s = pd.Series(data, index=[100, 101, 102, 103])
print(s)

Its output is as follows:

100  a
101  b
102  c
103  d
dtype: object

Here we passed the index values, and the output shows the customized index.
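The length requirement stated at the start of this section can be checked directly. The sketch below, using made-up data, shows the default labels and then catches the ValueError that pandas raises when the index length does not match the data.

```python
import numpy as np
import pandas as pd

data = np.array([10, 20, 30])

# With no index, labels default to 0 .. len(data) - 1.
default_labels = list(pd.Series(data).index)
print(default_labels)

# An index whose length differs from the data is rejected.
try:
    pd.Series(data, index=["a", "b"])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
print(type(mismatch_error).__name__)
```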

Create a Series from dict

A dict can be passed as input. If no index is specified, the dictionary keys are used to construct the index; pandas 0.23 and later (on Python 3.6+) keep the keys in insertion order, whereas earlier versions sorted them. If an index is passed, the values in data corresponding to the labels in the index are pulled out.

Example 1

# Import the pandas library, aliased as pd
import pandas as pd

data = {'a': 0., 'b': 1., 'c': 2.}
s = pd.Series(data)
print(s)

Its output is as follows:

a 0.0
b 1.0
c 2.0
dtype: float64

Observe: Dictionary keys are used to construct the index.

Example 2

# Import the pandas library, aliased as pd
import pandas as pd

data = {'a': 0., 'b': 1., 'c': 2.}
# 'd' is not a key in data, so its value will be missing
s = pd.Series(data, index=['b', 'c', 'd', 'a'])
print(s)

Its output is as follows:

b 1.0
c 2.0
d NaN
a 0.0
dtype: float64

Observe: The order of the passed index is preserved, and the element with no matching key is filled with NaN (Not a Number).
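As a follow-up to the NaN behaviour noted above, this hedged sketch shows one way to find and drop the labels that had no matching dictionary key, using the standard isna() and dropna() methods. The variable names are illustrative.

```python
import pandas as pd

data = {"a": 0.0, "b": 1.0, "c": 2.0}
s = pd.Series(data, index=["b", "c", "d", "a"])

# isna() flags labels that had no matching dictionary key.
missing = s[s.isna()].index.tolist()
print(missing)

# dropna() returns a new Series without the missing entries.
clean = s.dropna()
print(clean.index.tolist())
```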

Create a Series from Scalar

If data is a scalar value, an index must be provided. The value is repeated to match the length of the index.

# Import the pandas library, aliased as pd
import pandas as pd

# The scalar 5 is repeated once for each index label
s = pd.Series(5, index=[0, 1, 2, 3])
print(s)

Its output is as follows:

0  5
1  5
2  5
3  5
dtype: int64

Accessing Data from Series with Position

Data in a Series can be accessed by position, much as in an ndarray.

Example 1

Retrieve the first element. Positions are counted from zero, so the first element is stored at position zero, and so on.

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# Retrieve the first element by position; iloc avoids the
# deprecated positional fallback of s[0] on a labeled index
print(s.iloc[0])

Its output is as follows:

1

Example 2

Retrieve the first three elements of the Series. Slices use a colon: start: extracts all items from position start onward, :stop extracts all items before position stop, and start:stop extracts the items between the two positions. In every case the stop position itself is excluded.

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# Retrieve the first three elements
print(s[:3])

Its output is as follows:

a  1
b  2
c  3
dtype: int64
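The slice forms used in this example can be compared side by side. The sketch below uses iloc, which always slices by position regardless of the index labels; the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

first_three = s.iloc[:3]  # positions 0, 1, 2
from_third = s.iloc[2:]   # position 2 through the end
middle = s.iloc[1:4]      # positions 1, 2, 3 (stop position 4 is excluded)

print(first_three.tolist())
print(from_third.tolist())
print(middle.tolist())
```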

Example 3

Retrieve the last three elements.

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# Retrieve the last three elements
print(s[-3:])

Its output is as follows:

c  3
d  4
e  5
dtype: int64

Retrieve Data Using Label (Index)

A Series is like a fixed-size dict in that you can get and set values by index label.
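The examples in this section only read values, so here is a minimal sketch of the "set" half of that statement. It assigns a new value to an existing label; the values are arbitrary.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

# Get a value by its label.
before = s["b"]

# Set a value by its label; the Series is modified in place.
s["b"] = 20
after = s["b"]
print(before, after)
```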

Example 1

Retrieve a single element using its index label.

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# Retrieve a single element
print(s['a'])

Its output is as follows:

1

Example 2

Retrieve multiple elements using a list of index labels.

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# Retrieve multiple elements
print(s[['a', 'c', 'd']])

Its output is as follows:

a  1
c  3
d  4
dtype: int64

Example 3

If a label is not present in the index, a KeyError is raised.

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# Try to retrieve a label that does not exist
print(s['f'])

Its output is as follows:

…
KeyError: 'f'
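When a label may be absent, the KeyError shown above can be avoided. The sketch below shows two options that pandas provides: a membership test with `in`, and the get() method with a fallback value. The default of -1 is an arbitrary choice for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

# Membership tests on a Series check the index labels, not the values.
has_f = "f" in s

# get() returns the supplied default instead of raising KeyError.
value = s.get("f", default=-1)
print(has_f, value)
```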