pandas Chinese Reference Guide

Time deltas

Timedeltas are differences between times, expressed in units such as days, hours, minutes, and seconds. They can be positive or negative.

Timedelta is a subclass of datetime.timedelta and behaves similarly, but it is also compatible with np.timedelta64 types and adds a host of custom representations, parsing, and attributes.

Parsing

You can construct a Timedelta scalar from various arguments, including ISO 8601 Duration strings.

# the examples below also assume: import numpy as np; import pandas as pd
In [1]: import datetime

# strings
In [2]: pd.Timedelta("1 days")
Out[2]: Timedelta('1 days 00:00:00')

In [3]: pd.Timedelta("1 days 00:00:00")
Out[3]: Timedelta('1 days 00:00:00')

In [4]: pd.Timedelta("1 days 2 hours")
Out[4]: Timedelta('1 days 02:00:00')

In [5]: pd.Timedelta("-1 days 2 min 3us")
Out[5]: Timedelta('-2 days +23:57:59.999997')

# like datetime.timedelta
# note: these MUST be specified as keyword arguments
In [6]: pd.Timedelta(days=1, seconds=1)
Out[6]: Timedelta('1 days 00:00:01')

# integers with a unit
In [7]: pd.Timedelta(1, unit="d")
Out[7]: Timedelta('1 days 00:00:00')

# from a datetime.timedelta/np.timedelta64
In [8]: pd.Timedelta(datetime.timedelta(days=1, seconds=1))
Out[8]: Timedelta('1 days 00:00:01')

In [9]: pd.Timedelta(np.timedelta64(1, "ms"))
Out[9]: Timedelta('0 days 00:00:00.001000')

# negative Timedeltas have this string repr
# to be more consistent with datetime.timedelta conventions
In [10]: pd.Timedelta("-1us")
Out[10]: Timedelta('-1 days +23:59:59.999999')

# a NaT
In [11]: pd.Timedelta("nan")
Out[11]: NaT

In [12]: pd.Timedelta("nat")
Out[12]: NaT

# ISO 8601 Duration strings
In [13]: pd.Timedelta("P0DT0H1M0S")
Out[13]: Timedelta('0 days 00:01:00')

In [14]: pd.Timedelta("P0DT0H0M0.000000123S")
Out[14]: Timedelta('0 days 00:00:00.000000123')

DateOffsets (Day, Hour, Minute, Second, Milli, Micro, Nano) can also be used in construction.

In [15]: pd.Timedelta(pd.offsets.Second(2))
Out[15]: Timedelta('0 days 00:00:02')

Further, operations among the scalars yield another scalar Timedelta.

In [16]: pd.Timedelta(pd.offsets.Day(2)) + pd.Timedelta(pd.offsets.Second(2)) + pd.Timedelta(
   ....:     "00:00:00.000123"
   ....: )
   ....:
Out[16]: Timedelta('2 days 00:00:02.000123')
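
As a quick cross-check of the constructor forms shown in this section, the following sketch builds the same 26-hour duration in several ways and compares the results. The variable names are illustrative only and do not come from the pandas documentation.

```python
import datetime

import numpy as np
import pandas as pd

# Six spellings of the same duration (1 day and 2 hours).
from_string = pd.Timedelta("1 days 2 hours")
from_keywords = pd.Timedelta(days=1, hours=2)
from_integer = pd.Timedelta(26, unit="h")
from_stdlib = pd.Timedelta(datetime.timedelta(hours=26))
from_numpy = pd.Timedelta(np.timedelta64(26, "h"))
from_iso = pd.Timedelta("P1DT2H0M0S")

# Compare every alternative spelling against the string form.
candidates = [from_keywords, from_integer, from_stdlib, from_numpy, from_iso]
all_equal = all(td == from_string for td in candidates)
print(all_equal)
```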

to_timedelta

Using the top-level pd.to_timedelta, you can convert a scalar, array, list, or Series from a recognized timedelta format or value into a Timedelta type. It returns a Series if the input is a Series, a scalar if the input is scalar-like, and a TimedeltaIndex otherwise.

You can parse a single string to a Timedelta:

In [17]: pd.to_timedelta("1 days 06:05:01.00003")
Out[17]: Timedelta('1 days 06:05:01.000030')

In [18]: pd.to_timedelta("15.5us")
Out[18]: Timedelta('0 days 00:00:00.000015500')

or a list/array of strings:

In [19]: pd.to_timedelta(["1 days 06:05:01.00003", "15.5us", "nan"])
Out[19]: TimedeltaIndex(['1 days 06:05:01.000030', '0 days 00:00:00.000015500', NaT], dtype='timedelta64[ns]', freq=None)

If the input is numeric, the unit keyword argument specifies the unit of the Timedelta:

In [20]: pd.to_timedelta(np.arange(5), unit="s")
Out[20]:
TimedeltaIndex(['0 days 00:00:00', '0 days 00:00:01', '0 days 00:00:02',
                '0 days 00:00:03', '0 days 00:00:04'],
               dtype='timedelta64[ns]', freq=None)

In [21]: pd.to_timedelta(np.arange(5), unit="d")
Out[21]: TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'], dtype='timedelta64[ns]', freq=None)
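
The unit keyword also applies to a single number and to floating-point input. The sketch below is a small illustration under that assumption; the names ninety_seconds and half_hours are made up for this example.

```python
import numpy as np
import pandas as pd

# A numeric scalar plus a unit gives a Timedelta scalar.
ninety_seconds = pd.to_timedelta(90, unit="s")

# A numeric array plus a unit gives a TimedeltaIndex.
half_hours = pd.to_timedelta(np.array([0.5, 1.0, 1.5]), unit="h")

print(ninety_seconds)
print(half_hours)
```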

Warning

If a string or array of strings is passed as input, the unit keyword argument is ignored. If a string without units is passed, the default unit of nanoseconds is assumed.

Timedelta limitations

pandas represents Timedeltas in nanosecond resolution using 64-bit integers. As such, the 64-bit integer limits determine the Timedelta limits.

In [22]: pd.Timedelta.min
Out[22]: Timedelta('-106752 days +00:12:43.145224193')

In [23]: pd.Timedelta.max
Out[23]: Timedelta('106751 days 23:47:16.854775807')
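
To make the link to the 64-bit integer limits concrete, the sketch below compares the nanosecond counts behind Timedelta.min and Timedelta.max with NumPy's int64 bounds. It assumes the .value attribute (the underlying nanosecond integer) behaves as in pandas 1.x and 2.x.

```python
import numpy as np
import pandas as pd

int64_limits = np.iinfo(np.int64)

# .value is the underlying count of nanoseconds.
max_ns = pd.Timedelta.max.value
min_ns = pd.Timedelta.min.value

# The largest Timedelta sits exactly at the int64 maximum.
print(max_ns == int64_limits.max)

# The smallest Timedelta sits one step above the int64 minimum,
# because that lowest integer is the sentinel used for NaT.
print(min_ns == int64_limits.min + 1)
```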

Operations

You can operate on Series/DataFrames and construct timedelta64[ns] Series by subtracting datetime64[ns] Series or Timestamps.

In [24]: s = pd.Series(pd.date_range("2012-1-1", periods=3, freq="D"))

In [25]: td = pd.Series([pd.Timedelta(days=i) for i in range(3)])

In [26]: df = pd.DataFrame({"A": s, "B": td})

In [27]: df
Out[27]:
           A      B
0 2012-01-01 0 days
1 2012-01-02 1 days
2 2012-01-03 2 days

In [28]: df["C"] = df["A"] + df["B"]

In [29]: df
Out[29]:
           A      B          C
0 2012-01-01 0 days 2012-01-01
1 2012-01-02 1 days 2012-01-03
2 2012-01-03 2 days 2012-01-05

In [30]: df.dtypes
Out[30]:
A     datetime64[ns]
B    timedelta64[ns]
C     datetime64[ns]
dtype: object

In [31]: s - s.max()
Out[31]:
0   -2 days
1   -1 days
2    0 days
dtype: timedelta64[ns]

In [32]: s - datetime.datetime(2011, 1, 1, 3, 5)
Out[32]:
0   364 days 20:55:00
1   365 days 20:55:00
2   366 days 20:55:00
dtype: timedelta64[ns]

In [33]: s + datetime.timedelta(minutes=5)
Out[33]:
0   2012-01-01 00:05:00
1   2012-01-02 00:05:00
2   2012-01-03 00:05:00
dtype: datetime64[ns]

In [34]: s + pd.offsets.Minute(5)
Out[34]:
0   2012-01-01 00:05:00
1   2012-01-02 00:05:00
2   2012-01-03 00:05:00
dtype: datetime64[ns]

In [35]: s + pd.offsets.Minute(5) + pd.offsets.Milli(5)
Out[35]:
0   2012-01-01 00:05:00.005
1   2012-01-02 00:05:00.005
2   2012-01-03 00:05:00.005
dtype: datetime64[ns]

Operations with scalars from a timedelta64[ns] Series:

In [36]: y = s - s[0]

In [37]: y
Out[37]:
0   0 days
1   1 days
2   2 days
dtype: timedelta64[ns]

Series of timedeltas with NaT values are supported:

In [38]: y = s - s.shift()

In [39]: y
Out[39]:
0      NaT
1   1 days
2   1 days
dtype: timedelta64[ns]

Elements can be set to NaT using np.nan, analogously to datetimes:

In [40]: y[1] = np.nan

In [41]: y
Out[41]:
0      NaT
1      NaT
2   1 days
dtype: timedelta64[ns]

Operands can also appear in reversed order (a single object operated on with a Series):

In [42]: s.max() - s
Out[42]:
0   2 days
1   1 days
2   0 days
dtype: timedelta64[ns]

In [43]: datetime.datetime(2011, 1, 1, 3, 5) - s
Out[43]:
0   -365 days +03:05:00
1   -366 days +03:05:00
2   -367 days +03:05:00
dtype: timedelta64[ns]

In [44]: datetime.timedelta(minutes=5) + s
Out[44]:
0   2012-01-01 00:05:00
1   2012-01-02 00:05:00
2   2012-01-03 00:05:00
dtype: datetime64[ns]

min, max, and the corresponding idxmin, idxmax operations are supported on frames:

In [45]: A = s - pd.Timestamp("20120101") - pd.Timedelta("00:05:05")

In [46]: B = s - pd.Series(pd.date_range("2012-1-2", periods=3, freq="D"))

In [47]: df = pd.DataFrame({"A": A, "B": B})

In [48]: df
Out[48]:
                  A       B
0 -1 days +23:54:55 -1 days
1   0 days 23:54:55 -1 days
2   1 days 23:54:55 -1 days

In [49]: df.min()
Out[49]:
A   -1 days +23:54:55
B   -1 days +00:00:00
dtype: timedelta64[ns]

In [50]: df.min(axis=1)
Out[50]:
0   -1 days
1   -1 days
2   -1 days
dtype: timedelta64[ns]

In [51]: df.idxmin()
Out[51]:
A    0
B    0
dtype: int64

In [52]: df.idxmax()
Out[52]:
A    2
B    0
dtype: int64

min, max, idxmin, and idxmax operations are supported on Series as well. A scalar result from min or max will be a Timedelta.

In [53]: df.min().max()
Out[53]: Timedelta('-1 days +23:54:55')

In [54]: df.min(axis=1).min()
Out[54]: Timedelta('-1 days +00:00:00')

In [55]: df.min().idxmax()
Out[55]: 'A'

In [56]: df.min(axis=1).idxmin()
Out[56]: 0

You can use fillna on timedeltas, passing a timedelta as the fill value.

In [57]: y.fillna(pd.Timedelta(0))
Out[57]:
0   0 days
1   0 days
2   1 days
dtype: timedelta64[ns]

In [58]: y.fillna(pd.Timedelta(10, unit="s"))
Out[58]:
0   0 days 00:00:10
1   0 days 00:00:10
2   1 days 00:00:00
dtype: timedelta64[ns]

In [59]: y.fillna(pd.Timedelta("-1 days, 00:00:05"))
Out[59]:
0   -1 days +00:00:05
1   -1 days +00:00:05
2     1 days 00:00:00
dtype: timedelta64[ns]

You can also negate, multiply, and use abs on Timedeltas:

In [60]: td1 = pd.Timedelta("-1 days 2 hours 3 seconds")

In [61]: td1
Out[61]: Timedelta('-2 days +21:59:57')

In [62]: -1 * td1
Out[62]: Timedelta('1 days 02:00:03')

In [63]: -td1
Out[63]: Timedelta('1 days 02:00:03')

In [64]: abs(td1)
Out[64]: Timedelta('1 days 02:00:03')

Reductions

Numeric reduction operations on timedelta64[ns] return Timedelta objects. As usual, NaT values are skipped during evaluation.

In [65]: y2 = pd.Series(
   ....:     pd.to_timedelta(["-1 days +00:00:05", "nat", "-1 days +00:00:05", "1 days"])
   ....: )
   ....:

In [66]: y2
Out[66]:
0   -1 days +00:00:05
1                 NaT
2   -1 days +00:00:05
3     1 days 00:00:00
dtype: timedelta64[ns]

In [67]: y2.mean()
Out[67]: Timedelta('-1 days +16:00:03.333333334')

In [68]: y2.median()
Out[68]: Timedelta('-1 days +00:00:05')

In [69]: y2.quantile(0.1)
Out[69]: Timedelta('-1 days +00:00:05')

In [70]: y2.sum()
Out[70]: Timedelta('-1 days +00:00:10')
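
The NaT-skipping behaviour can be checked on a smaller Series. This is a minimal sketch with made-up data; it compares the reductions with the same reductions on the Series after dropping NaT explicitly.

```python
import pandas as pd

durations = pd.Series(pd.to_timedelta(["1 hours", "nat", "3 hours"]))

total = durations.sum()           # NaT is skipped
average = durations.mean()        # averaged over the two valid entries only
valid_count = durations.count()   # counts non-missing entries

print(total, average, valid_count)
```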

Frequency conversion

Timedelta Series, TimedeltaIndex, and Timedelta can be converted to other frequencies by using astype with a specific timedelta dtype.

In [71]: december = pd.Series(pd.date_range("20121201", periods=4))

In [72]: january = pd.Series(pd.date_range("20130101", periods=4))

In [73]: td = january - december

In [74]: td[2] += datetime.timedelta(minutes=5, seconds=3)

In [75]: td[3] = np.nan

In [76]: td
Out[76]:
0   31 days 00:00:00
1   31 days 00:00:00
2   31 days 00:05:03
3                NaT
dtype: timedelta64[ns]

# to seconds
In [77]: td.astype("timedelta64[s]")
Out[77]:
0   31 days 00:00:00
1   31 days 00:00:00
2   31 days 00:05:03
3                NaT
dtype: timedelta64[s]

For timedelta64 resolutions other than the supported "s", "ms", "us", and "ns", an alternative is to divide by another timedelta object. Note that division by the NumPy scalar is true division, while astyping is equivalent to floor division.

# to days
In [78]: td / np.timedelta64(1, "D")
Out[78]:
0    31.000000
1    31.000000
2    31.003507
3          NaN
dtype: float64
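
The same idea works with a pandas Timedelta as the divisor. The sketch below, using made-up data, contrasts true division (fractional hours) with floor division (whole days). Treat it as an illustration rather than the only way to convert units.

```python
import pandas as pd

spans = pd.Series(pd.to_timedelta(["31 days", "31 days 00:05:03", "nat"]))

# True division keeps the fractional part; NaT becomes NaN.
hours = spans / pd.Timedelta(hours=1)

# Floor division drops the remainder.
whole_days = spans // pd.Timedelta(days=1)

print(hours)
print(whole_days)
```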

Dividing or multiplying a timedelta64[ns] Series by an integer or integer Series yields another timedelta64[ns] dtype Series.

In [79]: td * -1
Out[79]:
0   -31 days +00:00:00
1   -31 days +00:00:00
2   -32 days +23:54:57
3                  NaT
dtype: timedelta64[ns]

In [80]: td * pd.Series([1, 2, 3, 4])
Out[80]:
0   31 days 00:00:00
1   62 days 00:00:00
2   93 days 00:15:09
3                NaT
dtype: timedelta64[ns]

Rounded division (floor division) of a timedelta64[ns] Series by a scalar Timedelta gives a Series of whole numbers (float64 in the output below, because the NaT entry becomes NaN).

In [81]: td // pd.Timedelta(days=3, hours=4)
Out[81]:
0    9.0
1    9.0
2    9.0
3    NaN
dtype: float64

In [82]: pd.Timedelta(days=3, hours=4) // td
Out[82]:
0    0.0
1    0.0
2    0.0
3    NaN
dtype: float64

The mod (%) and divmod operations are defined for Timedelta when operating with another timedelta-like or with a numeric argument.

In [83]: pd.Timedelta(hours=37) % datetime.timedelta(hours=2)
Out[83]: Timedelta('0 days 01:00:00')

# divmod against a timedelta-like returns a pair (int, Timedelta)
In [84]: divmod(datetime.timedelta(hours=2), pd.Timedelta(minutes=11))
Out[84]: (10, Timedelta('0 days 00:10:00'))

# divmod against a numeric returns a pair (Timedelta, Timedelta)
In [85]: divmod(pd.Timedelta(hours=25), 86400000000000)
Out[85]: (Timedelta('0 days 00:00:00.000000001'), Timedelta('0 days 01:00:00'))
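
One practical use of divmod against a timedelta-like is splitting a duration into hours, minutes, and seconds. The following is a small sketch of that pattern; the label format is arbitrary.

```python
import pandas as pd

elapsed = pd.Timedelta(hours=2, minutes=15, seconds=30)

# Each divmod returns (whole count, remaining Timedelta).
whole_hours, rest = divmod(elapsed, pd.Timedelta(hours=1))
whole_minutes, rest = divmod(rest, pd.Timedelta(minutes=1))

label = f"{whole_hours}h {whole_minutes}m {rest.seconds}s"
print(label)
```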

Attributes

You can access various components of the Timedelta or TimedeltaIndex directly using the attributes days, seconds, microseconds, and nanoseconds. These are identical to the values returned by datetime.timedelta: for example, the .seconds attribute represents the number of seconds >= 0 and < 1 day. These are signed according to whether the Timedelta is signed.

These operations can also be accessed directly via the .dt property of the Series.

Note that the attributes are NOT the displayed values of the Timedelta. Use .components to retrieve the displayed values.

For a Series:

In [86]: td.dt.days
Out[86]:
0    31.0
1    31.0
2    31.0
3     NaN
dtype: float64

In [87]: td.dt.seconds
Out[87]:
0      0.0
1      0.0
2    303.0
3      NaN
dtype: float64

You can access the value of the fields for a scalar Timedelta directly.

In [88]: tds = pd.Timedelta("31 days 5 min 3 sec")

In [89]: tds.days
Out[89]: 31

In [90]: tds.seconds
Out[90]: 303

In [91]: (-tds).seconds
Out[91]: 86097

You can use the .components property to access a reduced form of the timedelta. This returns a DataFrame indexed similarly to the Series. These are the displayed values of the Timedelta.

In [92]: td.dt.components
Out[92]:
   days  hours  minutes  seconds  milliseconds  microseconds  nanoseconds
0  31.0    0.0      0.0      0.0           0.0           0.0          0.0
1  31.0    0.0      0.0      0.0           0.0           0.0          0.0
2  31.0    0.0      5.0      3.0           0.0           0.0          0.0
3   NaN    NaN      NaN      NaN           NaN           NaN          NaN

In [93]: td.dt.components.seconds
Out[93]:
0    0.0
1    0.0
2    3.0
3    NaN
Name: seconds, dtype: float64
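
The difference between the attributes, the displayed components, and the signed total is easiest to see on a negative scalar. The sketch below reuses the "-1 days 2 hours 3 seconds" value from earlier in this document. On a scalar, .components is a namedtuple rather than a DataFrame.

```python
import pandas as pd

td = pd.Timedelta("-1 days 2 hours 3 seconds")  # displayed as -2 days +21:59:57

days = td.days                      # signed day count
seconds = td.seconds                # always >= 0 and < 1 day
shown = td.components               # the displayed fields
signed_total = td.total_seconds()   # the whole duration as one signed number

print(days, seconds, shown.hours, shown.minutes, shown.seconds, signed_total)
```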

You can convert a Timedelta to an ISO 8601 Duration string with the .isoformat method:

In [94]: pd.Timedelta(
   ....:     days=6, minutes=50, seconds=3, milliseconds=10, microseconds=10, nanoseconds=12
   ....: ).isoformat()
   ....:
Out[94]: 'P6DT0H50M3.010010012S'
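
Because the Timedelta constructor also parses ISO 8601 Duration strings, the output of .isoformat can be fed back in. This is a minimal round-trip sketch using the same value as the example above.

```python
import pandas as pd

original = pd.Timedelta(
    days=6, minutes=50, seconds=3, milliseconds=10, microseconds=10, nanoseconds=12
)

iso_text = original.isoformat()     # serialize to an ISO 8601 Duration string
restored = pd.Timedelta(iso_text)   # parse the string back

print(iso_text)
print(restored == original)
```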

TimedeltaIndex

To generate an index of time deltas, you can use either the TimedeltaIndex or the timedelta_range() constructor.

Using TimedeltaIndex, you can pass string-like, Timedelta, timedelta, or np.timedelta64 objects. Passing np.nan/pd.NaT/nat will represent missing values.

In [95]: pd.TimedeltaIndex(
   ....:     [
   ....:         "1 days",
   ....:         "1 days, 00:00:05",
   ....:         np.timedelta64(2, "D"),
   ....:         datetime.timedelta(days=2, seconds=2),
   ....:     ]
   ....: )
   ....:
Out[95]:
TimedeltaIndex(['1 days 00:00:00', '1 days 00:00:05', '2 days 00:00:00',
                '2 days 00:00:02'],
               dtype='timedelta64[ns]', freq=None)

The string "infer" can be passed to set the frequency of the index to the inferred frequency upon creation:

In [96]: pd.TimedeltaIndex(["0 days", "10 days", "20 days"], freq="infer")
Out[96]: TimedeltaIndex(['0 days', '10 days', '20 days'], dtype='timedelta64[ns]', freq='10D')

Generating ranges of time deltas

Similar to date_range(), you can construct regular ranges of a TimedeltaIndex using timedelta_range(). The default frequency for timedelta_range is calendar day:

In [97]: pd.timedelta_range(start="1 days", periods=5)
Out[97]: TimedeltaIndex(['1 days', '2 days', '3 days', '4 days', '5 days'], dtype='timedelta64[ns]', freq='D')

Various combinations of start, end, and periods can be used with timedelta_range:

In [98]: pd.timedelta_range(start="1 days", end="5 days")
Out[98]: TimedeltaIndex(['1 days', '2 days', '3 days', '4 days', '5 days'], dtype='timedelta64[ns]', freq='D')

In [99]: pd.timedelta_range(end="10 days", periods=4)
Out[99]: TimedeltaIndex(['7 days', '8 days', '9 days', '10 days'], dtype='timedelta64[ns]', freq='D')

The freq parameter can be passed a variety of frequency aliases:

In [100]: pd.timedelta_range(start="1 days", end="2 days", freq="30min")
Out[100]:
TimedeltaIndex(['1 days 00:00:00', '1 days 00:30:00', '1 days 01:00:00',
                '1 days 01:30:00', '1 days 02:00:00', '1 days 02:30:00',
                '1 days 03:00:00', '1 days 03:30:00', '1 days 04:00:00',
                '1 days 04:30:00', '1 days 05:00:00', '1 days 05:30:00',
                '1 days 06:00:00', '1 days 06:30:00', '1 days 07:00:00',
                '1 days 07:30:00', '1 days 08:00:00', '1 days 08:30:00',
                '1 days 09:00:00', '1 days 09:30:00', '1 days 10:00:00',
                '1 days 10:30:00', '1 days 11:00:00', '1 days 11:30:00',
                '1 days 12:00:00', '1 days 12:30:00', '1 days 13:00:00',
                '1 days 13:30:00', '1 days 14:00:00', '1 days 14:30:00',
                '1 days 15:00:00', '1 days 15:30:00', '1 days 16:00:00',
                '1 days 16:30:00', '1 days 17:00:00', '1 days 17:30:00',
                '1 days 18:00:00', '1 days 18:30:00', '1 days 19:00:00',
                '1 days 19:30:00', '1 days 20:00:00', '1 days 20:30:00',
                '1 days 21:00:00', '1 days 21:30:00', '1 days 22:00:00',
                '1 days 22:30:00', '1 days 23:00:00', '1 days 23:30:00',
                '2 days 00:00:00'],
               dtype='timedelta64[ns]', freq='30min')

In [101]: pd.timedelta_range(start="1 days", periods=5, freq="2D5h")
Out[101]:
TimedeltaIndex(['1 days 00:00:00', '3 days 05:00:00', '5 days 10:00:00',
                '7 days 15:00:00', '9 days 20:00:00'],
               dtype='timedelta64[ns]', freq='53h')

Specifying start, end, and periods will generate a range of evenly spaced timedeltas from start to end inclusive, with periods elements in the resulting TimedeltaIndex:

In [102]: pd.timedelta_range("0 days", "4 days", periods=5)
Out[102]: TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'], dtype='timedelta64[ns]', freq=None)

In [103]: pd.timedelta_range("0 days", "4 days", periods=10)
Out[103]:
TimedeltaIndex(['0 days 00:00:00', '0 days 10:40:00', '0 days 21:20:00',
                '1 days 08:00:00', '1 days 18:40:00', '2 days 05:20:00',
                '2 days 16:00:00', '3 days 02:40:00', '3 days 13:20:00',
                '4 days 00:00:00'],
               dtype='timedelta64[ns]', freq=None)
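
With start, end, and periods all given, the spacing works out to (end - start) / (periods - 1). The sketch below checks this for the ten-element range shown above. The helper names are illustrative.

```python
import pandas as pd

start, end, periods = pd.Timedelta("0 days"), pd.Timedelta("4 days"), 10
idx = pd.timedelta_range(start, end, periods=periods)

# Both endpoints are included, so there are periods - 1 gaps.
expected_step = (end - start) / (periods - 1)

# Differences between consecutive elements.
steps = idx[1:] - idx[:-1]

print(expected_step)
print((steps == expected_step).all())
```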

Using the TimedeltaIndex

As with the other datetime-like indices, DatetimeIndex and PeriodIndex, you can use TimedeltaIndex as the index of pandas objects.

In [104]: s = pd.Series(
   .....:     np.arange(100),
   .....:     index=pd.timedelta_range("1 days", periods=100, freq="h"),
   .....: )
   .....:

In [105]: s
Out[105]:
1 days 00:00:00     0
1 days 01:00:00     1
1 days 02:00:00     2
1 days 03:00:00     3
1 days 04:00:00     4
                   ..
4 days 23:00:00    95
5 days 00:00:00    96
5 days 01:00:00    97
5 days 02:00:00    98
5 days 03:00:00    99
Freq: h, Length: 100, dtype: int64

Selections work similarly, with coercion on string-likes and slices:

In [106]: s["1 day":"2 day"]
Out[106]:
1 days 00:00:00     0
1 days 01:00:00     1
1 days 02:00:00     2
1 days 03:00:00     3
1 days 04:00:00     4
                   ..
2 days 19:00:00    43
2 days 20:00:00    44
2 days 21:00:00    45
2 days 22:00:00    46
2 days 23:00:00    47
Freq: h, Length: 48, dtype: int64

In [107]: s["1 day 01:00:00"]
Out[107]: 1

In [108]: s[pd.Timedelta("1 day 1h")]
Out[108]: 1

Furthermore, you can use partial string selection and the range will be inferred:

In [109]: s["1 day":"1 day 5 hours"]
Out[109]:
1 days 00:00:00    0
1 days 01:00:00    1
1 days 02:00:00    2
1 days 03:00:00    3
1 days 04:00:00    4
1 days 05:00:00    5
Freq: h, dtype: int64
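
The same selections can be written with explicit Timedelta objects and .loc, which some readers may find less ambiguous than strings. This is a sketch on a small, made-up Series; as with the slices above, both endpoints are included.

```python
import numpy as np
import pandas as pd

# Six hourly readings indexed by elapsed time.
index = pd.to_timedelta(np.arange(6), unit="h")
readings = pd.Series(np.arange(6) * 10, index=index)

# Label-based slicing with Timedelta objects.
window = readings.loc[pd.Timedelta(hours=1):pd.Timedelta(hours=3)]

# The equivalent slice with strings, which are coerced to Timedeltas.
same_window = readings.loc["1 hours":"3 hours"]

print(window)
```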

Operations

Finally, the combination of TimedeltaIndex with DatetimeIndex allows certain combination operations that are NaT preserving:

In [110]: tdi = pd.TimedeltaIndex(["1 days", pd.NaT, "2 days"])

In [111]: tdi.to_list()
Out[111]: [Timedelta('1 days 00:00:00'), NaT, Timedelta('2 days 00:00:00')]

In [112]: dti = pd.date_range("20130101", periods=3)

In [113]: dti.to_list()
Out[113]:
[Timestamp('2013-01-01 00:00:00'),
 Timestamp('2013-01-02 00:00:00'),
 Timestamp('2013-01-03 00:00:00')]

In [114]: (dti + tdi).to_list()
Out[114]: [Timestamp('2013-01-02 00:00:00'), NaT, Timestamp('2013-01-05 00:00:00')]

In [115]: (dti - tdi).to_list()
Out[115]: [Timestamp('2012-12-31 00:00:00'), NaT, Timestamp('2013-01-01 00:00:00')]

Conversions

Similarly to frequency conversion on a Series above, you can convert these indices to yield another Index.

In [116]: tdi / np.timedelta64(1, "s")
Out[116]: Index([86400.0, nan, 172800.0], dtype='float64')

In [117]: tdi.astype("timedelta64[s]")
Out[117]: TimedeltaIndex(['1 days', NaT, '2 days'], dtype='timedelta64[s]', freq=None)

Scalar type operations work as well. These can potentially return a different type of index.

# adding a timedelta and a date -> datelike
In [118]: tdi + pd.Timestamp("20130101")
Out[118]: DatetimeIndex(['2013-01-02', 'NaT', '2013-01-03'], dtype='datetime64[ns]', freq=None)

# subtraction of a date and a timedelta -> datelike
# note that trying to subtract a date from a Timedelta will raise an exception
In [119]: (pd.Timestamp("20130101") - tdi).to_list()
Out[119]: [Timestamp('2012-12-31 00:00:00'), NaT, Timestamp('2012-12-30 00:00:00')]

# timedelta + timedelta -> timedelta
In [120]: tdi + pd.Timedelta("10 days")
Out[120]: TimedeltaIndex(['11 days', NaT, '12 days'], dtype='timedelta64[ns]', freq=None)

# division yields a TimedeltaIndex if the divisor is an integer
In [121]: tdi / 2
Out[121]: TimedeltaIndex(['0 days 12:00:00', NaT, '1 days 00:00:00'], dtype='timedelta64[ns]', freq=None)

# or a float64 Index if the divisor is a Timedelta
In [122]: tdi / tdi[0]
Out[122]: Index([1.0, nan, 2.0], dtype='float64')

Resampling

Similar to timeseries resampling, we can resample with a TimedeltaIndex.

In [123]: s.resample("D").mean()
Out[123]:
1 days    11.5
2 days    35.5
3 days    59.5
4 days    83.5
5 days    97.5
Freq: D, dtype: float64