Python Data Science Concise Tutorial

Python - Time Series

A time series is a sequence of data points, each associated with a timestamp. A simple example is the price of a stock at different times during a given trading day. Another is the amount of rainfall in a region in each month of a year.
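To make the idea concrete, here is a minimal sketch of a time series as a pandas Series whose index holds the timestamps. The three dates and prices are copied from the first rows of the sample data shown later on this page. The variable names `timestamps` and `prices` are illustrative only, and the dates are assumed to be written day-first (DD-MM-YYYY).

```python
import pandas as pd

# Dates and prices copied from the first three rows of the sample data.
# Assumption: the dates are day-first (DD-MM-YYYY).
timestamps = pd.to_datetime(
    ["01-01-2018", "02-01-2018", "03-01-2018"], format="%d-%m-%Y"
)

# Each price is attached to its timestamp through the index.
prices = pd.Series([1042.05, 1033.55, 1029.7], index=timestamps, name="Price")

print(prices)

# Look up a single data point by its timestamp.
print(prices.loc[pd.Timestamp("2018-01-02")])
```

Because the index is made of timestamps rather than plain strings, pandas can order, select, and plot the values by date.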

In the example below, we take the daily price of a particular stock symbol over one quarter. The values are stored in a csv file and loaded into a dataframe with the pandas library. We then convert the ValueDate column to datetime values, make it the index of the dataframe, and delete the original column.

Sample Data

Below is sample data for the price of the stock on different days of a given quarter. The data is saved in a file named stock.csv:

ValueDate,Price
01-01-2018,1042.05
02-01-2018,1033.55
03-01-2018,1029.7
04-01-2018,1021.3
05-01-2018,1015.4
...
...
...
...
23-03-2018,1161.3
26-03-2018,1167.6
27-03-2018,1155.25
28-03-2018,1154
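To try the loading step without a file on disk, the visible rows can be read from an in-memory string. This is only a sketch under stated assumptions: the elided rows are left out rather than guessed, the text is assumed to be plain comma-separated, and the dates are assumed to be day-first (DD-MM-YYYY), which the final rows such as 23-03-2018 suggest. The name `csv_text` is hypothetical.

```python
import io

import pandas as pd

# Only the rows visible in the sample are included; elided rows are omitted.
# Assumption: plain comma-separated text with no stray whitespace.
csv_text = """ValueDate,Price
01-01-2018,1042.05
02-01-2018,1033.55
03-01-2018,1029.7
04-01-2018,1021.3
05-01-2018,1015.4
23-03-2018,1161.3
26-03-2018,1167.6
27-03-2018,1155.25
28-03-2018,1154
"""

# read_csv accepts any file-like object, so StringIO stands in for stock.csv.
df = pd.read_csv(io.StringIO(csv_text))

# Assumption: dates are day-first, so the format is stated explicitly.
df["ValueDate"] = pd.to_datetime(df["ValueDate"], format="%d-%m-%Y")
df = df.set_index("ValueDate")

print(df.index.min(), df.index.max())
print(df["Price"].dtype)
```

Printing the first and last index values is a quick way to confirm that the dates were parsed into the expected quarter.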

Creating Time Series

import pandas as pd
import matplotlib.pyplot as plt

# Load the csv file into a dataframe
df = pd.read_csv('path_to_file/stock.csv')

# Parse the dates, which are written day-first (DD-MM-YYYY) in the sample
df['ValueDate'] = pd.to_datetime(df['ValueDate'], format='%d-%m-%Y')

# Set the date as the index and drop the original column
df = df.set_index('ValueDate')

# Plot the price against the date index
df.plot(figsize=(15, 6))
plt.show()

Its output is as follows:

[Figure: timeseries, a line plot of Price against ValueDate]
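One detail to check when converting ValueDate: the sample dates appear to be day-first, since rows such as 23-03-2018 cannot be read month-first. The sketch below shows how the same string yields two different dates depending on the format passed to `pd.to_datetime`. The variable names are illustrative.

```python
import pandas as pd

raw = "02-01-2018"  # second row of the sample data

# Interpreted as day-month-year: 2 January 2018
day_first = pd.to_datetime(raw, format="%d-%m-%Y")

# Interpreted as month-day-year: 1 February 2018
month_first = pd.to_datetime(raw, format="%m-%d-%Y")

print(day_first.date())    # 2018-01-02
print(month_first.date())  # 2018-02-01
```

Stating the format explicitly avoids relying on inference, which can silently shift early-month dates or fail on later rows.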