NumPy Concise Tutorial
NumPy - Statistical Functions
NumPy has quite a few useful statistical functions for finding the minimum, maximum, percentile, standard deviation, variance, etc. of the elements in a given array. The functions are explained as follows −
numpy.amin() and numpy.amax()
These functions return the minimum and the maximum from the elements in the given array along the specified axis.
import numpy as np
a = np.array([[3,7,5],[8,4,3],[2,4,9]])
print('Our array is:')
print(a)
print('\n')
# Minimum of each row (reduce along axis 1)
print('Applying amin() function:')
print(np.amin(a, 1))
print('\n')
# Minimum of each column (reduce along axis 0)
print('Applying amin() function again:')
print(np.amin(a, 0))
print('\n')
# Maximum over the whole array
print('Applying amax() function:')
print(np.amax(a))
print('\n')
# Maximum of each column
print('Applying amax() function again:')
print(np.amax(a, axis = 0))
It will produce the following output −
Our array is:
[[3 7 5]
 [8 4 3]
 [2 4 9]]
Applying amin() function:
[3 3 2]
Applying amin() function again:
[2 4 3]
Applying amax() function:
9
Applying amax() function again:
[8 7 9]
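As a quick cross-check of the axis behaviour shown above, the following sketch compares np.amin() and np.amax() with the equivalent array methods. The variable names are illustrative, and the keepdims argument is shown only as an optional extra.

```python
import numpy as np

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])

# axis=1 reduces across the columns, giving one value per row
row_min = np.amin(a, axis=1)
# axis=0 reduces across the rows, giving one value per column
col_max = np.amax(a, axis=0)

# The array methods compute the same results
assert np.array_equal(row_min, a.min(axis=1))
assert np.array_equal(col_max, a.max(axis=0))

# keepdims=True keeps the reduced axis with length 1, which helps with broadcasting
row_min_2d = np.amin(a, axis=1, keepdims=True)

print(row_min)           # [3 3 2]
print(col_max)           # [8 7 9]
print(row_min_2d.shape)  # (3, 1)
```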
numpy.ptp()
The numpy.ptp() function returns the range (maximum - minimum) of values along an axis.
import numpy as np
a = np.array([[3,7,5],[8,4,3],[2,4,9]])
print('Our array is:')
print(a)
print('\n')
# Range (max - min) over the whole array
print('Applying ptp() function:')
print(np.ptp(a))
print('\n')
# Range of each row
print('Applying ptp() function along axis 1:')
print(np.ptp(a, axis = 1))
print('\n')
# Range of each column
print('Applying ptp() function along axis 0:')
print(np.ptp(a, axis = 0))
It will produce the following output −
Our array is:
[[3 7 5]
 [8 4 3]
 [2 4 9]]
Applying ptp() function:
7
Applying ptp() function along axis 1:
[4 5 7]
Applying ptp() function along axis 0:
[6 3 6]
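The definition of the range as maximum minus minimum can be checked by hand. The sketch below recomputes the per-row result with np.max() and np.min() and compares it with np.ptp(); the names manual and builtin are illustrative.

```python
import numpy as np

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])

# Peak-to-peak along each row, computed by hand as max - min
manual = np.max(a, axis=1) - np.min(a, axis=1)
builtin = np.ptp(a, axis=1)

print(manual)     # [4 5 7]
print(builtin)    # [4 5 7]
print(np.ptp(a))  # 7
```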
numpy.percentile()
Percentile (or centile) is a measure used in statistics indicating the value below which a given percentage of observations in a group of observations fall. The function numpy.percentile() takes the following arguments.
numpy.percentile(a, q, axis)
Sr.No. | Argument & Description
1 | a: Input array
2 | q: The percentile to compute; must be between 0 and 100
3 | axis: The axis along which the percentile is to be calculated
import numpy as np
a = np.array([[30,40,70],[80,20,10],[50,90,60]])
print('Our array is:')
print(a)
print('\n')
# 50th percentile of the flattened array
print('Applying percentile() function:')
print(np.percentile(a, 50))
print('\n')
# 50th percentile of each row
print('Applying percentile() function along axis 1:')
print(np.percentile(a, 50, axis = 1))
print('\n')
# 50th percentile of each column
print('Applying percentile() function along axis 0:')
print(np.percentile(a, 50, axis = 0))
It will produce the following output −
Our array is:
[[30 40 70]
 [80 20 10]
 [50 90 60]]
Applying percentile() function:
50.0
Applying percentile() function along axis 1:
[40. 20. 60.]
Applying percentile() function along axis 0:
[50. 40. 60.]
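The q argument may also be a sequence of percentiles. The following sketch, which assumes the default (linear) interpolation, computes the quartiles of the same array and shows how the result is shaped when an axis is given as well; the variable names are illustrative.

```python
import numpy as np

a = np.array([[30, 40, 70], [80, 20, 10], [50, 90, 60]])

# q may be a sequence; the result then has one entry per requested percentile
quartiles = np.percentile(a, [25, 50, 75])
print(quartiles)  # [30. 50. 70.]

# The 50th percentile coincides with the median
print(np.percentile(a, 50) == np.median(a))  # True

# With a sequence q and an axis, the first dimension of the result indexes q
per_column = np.percentile(a, [25, 75], axis=0)
print(per_column.shape)  # (2, 3)
```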
numpy.median()
Median is defined as the value separating the higher half of a data sample from the lower half. The numpy.median() function is used as shown in the following program.
import numpy as np
a = np.array([[30,65,70],[80,95,10],[50,90,60]])
print('Our array is:')
print(a)
print('\n')
# Median of the flattened array
print('Applying median() function:')
print(np.median(a))
print('\n')
# Median of each column
print('Applying median() function along axis 0:')
print(np.median(a, axis = 0))
print('\n')
# Median of each row
print('Applying median() function along axis 1:')
print(np.median(a, axis = 1))
It will produce the following output −
Our array is:
[[30 65 70]
 [80 95 10]
 [50 90 60]]
Applying median() function:
65.0
Applying median() function along axis 0:
[50. 90. 60.]
Applying median() function along axis 1:
[65. 80. 60.]
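When the sample has an even number of elements there is no single middle value, and numpy.median() returns the mean of the two middle values. A minimal sketch with made-up sample arrays:

```python
import numpy as np

odd = np.array([10, 30, 50])
even = np.array([10, 30, 50, 70])

print(np.median(odd))   # 30.0
print(np.median(even))  # 40.0, the mean of 30 and 50

# The input does not need to be sorted
unsorted = np.array([70, 10, 50, 30])
print(np.median(unsorted))  # 40.0
```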
numpy.mean()
Arithmetic mean is the sum of elements along an axis divided by the number of elements. The numpy.mean() function returns the arithmetic mean of the elements in the array. If an axis is specified, the mean is calculated along it.
import numpy as np
a = np.array([[1,2,3],[3,4,5],[4,5,6]])
print('Our array is:')
print(a)
print('\n')
# Mean of all elements
print('Applying mean() function:')
print(np.mean(a))
print('\n')
# Mean of each column
print('Applying mean() function along axis 0:')
print(np.mean(a, axis = 0))
print('\n')
# Mean of each row
print('Applying mean() function along axis 1:')
print(np.mean(a, axis = 1))
It will produce the following output −
Our array is:
[[1 2 3]
 [3 4 5]
 [4 5 6]]
Applying mean() function:
3.6666666666666665
Applying mean() function along axis 0:
[2.66666667 3.66666667 4.66666667]
Applying mean() function along axis 1:
[2. 4. 5.]
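The definition (sum of elements divided by the number of elements) can be verified directly. The sketch below recomputes the column means and the overall mean by hand and compares them with np.mean(); the name manual is illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])

# Mean along axis 0: sum down each column, divided by the number of rows
manual = a.sum(axis=0) / a.shape[0]
print(np.allclose(manual, np.mean(a, axis=0)))  # True

# Mean of the whole array: total sum divided by the number of elements
print(np.isclose(a.sum() / a.size, np.mean(a)))  # True
```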
numpy.average()
Weighted average is an average in which each component is multiplied by a factor reflecting its importance. The numpy.average() function computes the weighted average of the elements in an array according to the respective weights given in another array. The function accepts an axis parameter. If the axis is not specified, the array is flattened.
Considering an array [1,2,3,4] and corresponding weights [4,3,2,1], the weighted average is calculated by adding the products of the corresponding elements and dividing the sum by the sum of the weights.
Weighted average = (1*4+2*3+3*2+4*1)/(4+3+2+1) = 20/10 = 2.0
import numpy as np
a = np.array([1,2,3,4])
print('Our array is:')
print(a)
print('\n')
# Without weights, average() is the same as mean()
print('Applying average() function:')
print(np.average(a))
print('\n')
wts = np.array([4,3,2,1])
print('Applying average() function again:')
print(np.average(a, weights = wts))
print('\n')
# With returned = True, average() returns a tuple (average, sum of weights)
print('Sum of weights')
print(np.average([1,2,3,4], weights = [4,3,2,1], returned = True))
It will produce the following output −
Our array is:
[1 2 3 4]
Applying average() function:
2.5
Applying average() function again:
2.0
Sum of weights
(2.0, 10.0)
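The weighted-average formula can also be evaluated by hand to confirm what np.average() returns. This is a minimal sketch; the name manual is illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
wts = np.array([4, 3, 2, 1])

# Sum of element * weight, divided by the sum of the weights
manual = np.sum(a * wts) / np.sum(wts)
print(manual)                      # 2.0
print(np.average(a, weights=wts))  # 2.0

# With equal weights the weighted average reduces to the plain mean
print(np.average(a, weights=np.ones(4)) == np.mean(a))  # True
```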
In a multi-dimensional array, the axis for computation can be specified.
import numpy as np
a = np.arange(6).reshape(3,2)
print('Our array is:')
print(a)
print('\n')
# One weight per column; the weights are applied along axis 1
wt = np.array([3,5])
print('Weighted average along axis 1:')
print(np.average(a, axis = 1, weights = wt))
print('\n')
print('Weighted average and sum of weights:')
print(np.average(a, axis = 1, weights = wt, returned = True))
It will produce the following output −
Our array is:
[[0 1]
 [2 3]
 [4 5]]
Weighted average along axis 1:
[0.625 2.625 4.625]
Weighted average and sum of weights:
(array([0.625, 2.625, 4.625]), array([8., 8., 8.]))
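To see where the per-row values come from, the same computation can be written out with broadcasting. This is only a sketch of the arithmetic, not how np.average() is implemented internally; the name manual is illustrative.

```python
import numpy as np

a = np.arange(6).reshape(3, 2)
wt = np.array([3, 5])

# wt has shape (2,), so it broadcasts across each row of a (shape (3, 2))
manual = (a * wt).sum(axis=1) / wt.sum()
print(manual)  # [0.625 2.625 4.625]

print(np.allclose(manual, np.average(a, axis=1, weights=wt)))  # True
```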
Standard Deviation
Standard deviation is the square root of the average of the squared deviations from the mean. The formula for standard deviation is as follows −
std = sqrt(mean(abs(x - x.mean())**2))
If the array is [1, 2, 3, 4], then its mean is 2.5. Hence the squared deviations are [2.25, 0.25, 0.25, 2.25]; their sum is 5, their mean is 5/4 = 1.25, and the square root of that mean, sqrt(5/4), is 1.1180339887498949.
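The worked example can be reproduced in code. The sketch below evaluates the formula directly and compares it with np.std(), which by default divides by the number of elements as the formula does; the name manual is illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4])

# Direct translation of the formula: sqrt(mean(abs(x - x.mean())**2))
manual = np.sqrt(np.mean(np.abs(x - x.mean()) ** 2))
print(manual)     # 1.118033988749895
print(np.std(x))  # 1.118033988749895
```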