PyBrain Concise Tutorial
PyBrain - Importing Data For Datasets
In this chapter, we will learn how to get data to use with PyBrain datasets.
The most commonly used sources of data for datasets are:
- Using sklearn
- From CSV file
Using sklearn
Details of the datasets that ship with sklearn are available here: https://scikit-learn.org/stable/datasets/toy_dataset.html
Here are a few examples of how to use datasets from sklearn:
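The examples themselves are not reproduced in this copy of the page. As a minimal sketch of the idea, assuming scikit-learn and PyBrain are both installed, the code below copies the Iris toy dataset into a PyBrain ClassificationDataSet. The names iter_samples and iris_to_classification_dataset are illustrative helpers, not part of either library.

```python
def iter_samples(features, labels):
    """Yield (input_row, [label]) pairs in the shape addSample expects."""
    for row, label in zip(features, labels):
        yield row, [label]


def iris_to_classification_dataset():
    """Load the Iris toy dataset and copy it into a PyBrain dataset.

    The imports are deferred so that this module can be loaded even when
    scikit-learn or PyBrain is missing.
    """
    from sklearn.datasets import load_iris
    from pybrain.datasets import ClassificationDataSet

    iris = load_iris()
    n_inputs = iris.data.shape[1]        # 4 features per flower
    n_classes = len(iris.target_names)   # 3 species

    # One target column holding the class index of each sample.
    dataset = ClassificationDataSet(n_inputs, 1, nb_classes=n_classes)
    for row, target in iter_samples(iris.data, iris.target):
        dataset.addSample(row, target)
    return dataset
```

Calling iris_to_classification_dataset() returns a dataset with one sample per row of the Iris data. The same pattern should carry over to the other toy datasets, with the input size and number of classes adjusted to match.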
From CSV file
We can also use data from a CSV file, as follows.
Here is the sample data for the XOR truth table: datasettest.csv
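The contents of datasettest.csv are not shown in this copy of the page. Judging from the script and its printed output later in this chapter, the file has a header row, the target in the first column, and the two inputs in the remaining columns. The sketch below writes such a file using only the standard library. The header names are placeholders chosen for illustration, not the names used in the original file.

```python
import csv
import io
import os

# XOR truth table: target first, then the two inputs.
# The header names are assumptions; the script only needs a header row to exist.
ROWS = [
    ("output", "input1", "input2"),
    (0, 0, 0),
    (1, 0, 1),
    (1, 1, 0),
    (0, 1, 1),
]


def xor_csv_text():
    """Return the XOR truth table as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(ROWS)
    return buffer.getvalue()


def write_xor_csv(path):
    """Write the XOR table to path, creating the parent folder if needed."""
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)
    with open(path, "w", newline="") as handle:
        handle.write(xor_csv_text())
    return path
```

Calling write_xor_csv('data/datasettest.csv') places the file at the path the example script reads from.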
Here is a working example that reads data from the .csv file to build a dataset.
Example
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure import TanhLayer
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer
import pandas as pd

print('Read data...')
# Read the CSV file; the first row is treated as the header.
df = pd.read_csv('data/datasettest.csv', header=0).head(1000)
data = df.values

# The first column holds the target; the remaining columns hold the inputs.
train_output = data[:, 0]
train_data = data[:, 1:]
print(train_output)
print(train_data)

# Create a network with two inputs, three hidden neurons, and one output.
nn = buildNetwork(2, 3, 1, bias=True, hiddenclass=TanhLayer)

# Create a dataset that matches the network's input and output sizes.
_gate = SupervisedDataSet(2, 1)

# Add the input and target values of the XOR truth table to the dataset.
for i in range(len(train_output)):
    _gate.addSample(train_data[i], train_output[i])

# Train the network on the dataset _gate.
trainer = BackpropTrainer(nn, _gate)

# Run 1000 training epochs.
for epoch in range(1000):
    trainer.train()

# Test the trained network on the same data and print the per-sample errors.
trainer.testOnData(dataset=_gate, verbose=True)
As shown in the example, pandas is used to read the data from the CSV file.
Output
C:\pybrain\pybrain\src>python testcsv.py
Read data...
[0 1 1 0]
[[0 0]
 [0 1]
 [1 0]
 [1 1]]
Testing on data:
('out: ', '[0.004 ]')
('correct:', '[0 ]')
error: 0.00000795
('out: ', '[0.997 ]')
('correct:', '[1 ]')
error: 0.00000380
('out: ', '[0.996 ]')
('correct:', '[1 ]')
error: 0.00000826
('out: ', '[0.004 ]')
('correct:', '[0 ]')
error: 0.00000829
('All errors:', [7.94733477723902e-06, 3.798267582566822e-06, 8.260969076585322e-06, 8.286246525558165e-06])
('Average error:', 7.073204490487332e-06)
('Max error:', 8.286246525558165e-06, 'Median error:', 8.260969076585322e-06)
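As a quick check on how to read the summary lines, the short sketch below recomputes them from the four per-sample errors listed in the output. It uses only the values printed above. Note that the reported median equals the upper of the two middle values, not their mean; the indexing used to reproduce it here is inferred from the printed numbers, not taken from the PyBrain source.

```python
# Per-sample errors copied from the "All errors" line of the output.
errors = [
    7.94733477723902e-06,
    3.798267582566822e-06,
    8.260969076585322e-06,
    8.286246525558165e-06,
]

average = sum(errors) / len(errors)
largest = max(errors)
# Upper-middle element of the sorted list; this matches the printed median.
upper_middle = sorted(errors)[len(errors) // 2]

print("Average error:", average)
print("Max error:", largest)
print("Median error (as reported):", upper_middle)
```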