A Concise R Tutorial

R - Data Types

When programming in any language, you use variables to store information. A variable is a reserved memory location that holds a value, so creating a variable reserves some space in memory.

You may want to store information of various data types, such as character, wide character, integer, floating point, double floating point, and Boolean. Based on the data type of a variable, the operating system allocates memory and decides what can be stored in the reserved memory.

In contrast to languages such as C and Java, variables in R are not declared with a data type. A variable is assigned an R object, and the data type of that object becomes the data type of the variable. There are many types of R objects; the most frequently used are:

  1. Vectors

  2. Lists

  3. Matrices

  4. Arrays

  5. Factors

  6. Data Frames
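
As a minimal sketch of the point that a variable takes its type from the object assigned to it, the same name can be rebound to objects of different classes. The variable names `x`, `class_int`, `class_chr`, and `class_lgl` are hypothetical and used only for illustration.

```r
# `x` is a hypothetical variable name used only for this illustration.
x <- 42L                    # an integer object is assigned to x
class_int <- class(x)

x <- "forty-two"            # the same name now refers to a character object
class_chr <- class(x)

x <- c(TRUE, FALSE)         # and now to a logical vector
class_lgl <- class(x)

# No declaration was needed; the class followed the assigned object each time.
print(c(class_int, class_chr, class_lgl))
```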

The simplest of these objects is the vector. Atomic vectors come in six data types, also called the six classes of vectors. The other R objects are built upon atomic vectors.

Data Type   Example                                Verify                                      Result
Logical     TRUE, FALSE                            v <- TRUE; print(class(v))                  [1] "logical"
Numeric     12.3, 5, 999                           v <- 23.5; print(class(v))                  [1] "numeric"
Integer     2L, 34L, 0L                            v <- 2L; print(class(v))                    [1] "integer"
Complex     3 + 2i                                 v <- 2+5i; print(class(v))                  [1] "complex"
Character   'a', "good", "TRUE", '23.4'            v <- "TRUE"; print(class(v))                [1] "character"
Raw         "Hello" is stored as 48 65 6c 6c 6f    v <- charToRaw("Hello"); print(class(v))    [1] "raw"

In R programming, the most basic data types are the R objects called vectors, which hold elements of the classes shown above. Note that the number of classes in R is not limited to these six. For example, we can combine several atomic vectors into an array, whose class is array.
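
The following sketch illustrates that last point under simple assumptions: two numeric vectors (hypothetical names `v1` and `v2`) are combined into a three-dimensional array, and `class()` reports the new class while `typeof()` still reports the underlying atomic type.

```r
# v1, v2 and arr are hypothetical names chosen for this example.
v1 <- c(1, 2, 3, 4)
v2 <- c(5, 6, 7, 8)

# Combine the two atomic vectors and give the result three dimensions.
arr <- array(c(v1, v2), dim = c(2, 2, 2))

print(class(v1))    # class of the atomic vector
print(class(arr))   # class of the array built from it
print(typeof(arr))  # the elements are still stored as doubles
```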

Vectors

To create a vector with more than one element, use the c() function, which combines its arguments into a single vector.

# Create a vector.
apple <- c('red','green',"yellow")
print(apple)

# Get the class of the vector.
print(class(apple))

When we execute the above code, it produces the following result:

[1] "red"    "green"  "yellow"
[1] "character"
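
A short sketch of two related behaviours may help here: a single value is already a vector of length one, and c() coerces mixed inputs to a common type because an atomic vector holds only one data type. The names `single`, `mixed`, and `second` are hypothetical.

```r
# A single value is itself a vector of length one.
single <- 5
print(length(single))

# Mixing types in c() coerces every element to one common type.
mixed <- c(1, "a", TRUE)
print(mixed)
print(class(mixed))

# Elements are accessed by position, starting at 1.
apple <- c('red', 'green', "yellow")
second <- apple[2]
print(second)
```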

Lists

A list is an R object that can contain elements of many different types, such as vectors, functions, and even other lists.

# Create a list.
list1 <- list(c(2,5,3),21.3,sin)

# Print the list.
print(list1)

When we execute the above code, it produces the following result:

[[1]]
[1] 2 5 3

[[2]]
[1] 21.3

[[3]]
function (x)  .Primitive("sin")
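
As a sketch of how the elements of such a list can be used, double brackets extract the element itself, while single brackets return a shorter list. The names `first`, `sub`, and `f` are hypothetical.

```r
list1 <- list(c(2, 5, 3), 21.3, sin)

# [[ ]] extracts the element itself; [ ] returns a list containing it.
first <- list1[[1]]   # the numeric vector 2 5 3
sub   <- list1[1]     # a list of length one

# The third element is the sin function, so it can be called directly.
f <- list1[[3]]
print(f(0))

# Each element keeps its own class.
print(sapply(list1, class))
```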

Matrices

A matrix is a two-dimensional rectangular data set. It can be created by passing a vector to the matrix() function.

# Create a matrix.
M <- matrix(c('a','a','b','c','b','a'), nrow = 2, ncol = 3, byrow = TRUE)
print(M)

When we execute the above code, it produces the following result:

     [,1] [,2] [,3]
[1,] "a"  "a"  "b"
[2,] "c"  "b"  "a"
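
To show what the byrow argument changes, this sketch builds the same matrix twice from one input vector, once filled by row and once using the default column-wise fill. The names `values`, `by_row`, and `by_col` are hypothetical.

```r
values <- c('a', 'a', 'b', 'c', 'b', 'a')

# byrow = TRUE fills the matrix one row at a time.
by_row <- matrix(values, nrow = 2, ncol = 3, byrow = TRUE)

# Without byrow, the matrix is filled one column at a time.
by_col <- matrix(values, nrow = 2, ncol = 3)

print(by_row)
print(by_col)

# Elements are addressed as [row, column].
print(by_row[2, 1])
print(by_col[2, 1])
```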

Arrays

While matrices are confined to two dimensions, arrays can have any number of dimensions. The array() function takes a dim argument that specifies the size of each dimension. In the example below we create a 3x3x2 array, that is, two 3x3 matrices; the two input values are recycled to fill it.

# Create an array.
a <- array(c('green','yellow'),dim = c(3,3,2))
print(a)

When we execute the above code, it produces the following result:

, , 1

     [,1]     [,2]     [,3]
[1,] "green"  "yellow" "green"
[2,] "yellow" "green"  "yellow"
[3,] "green"  "yellow" "green"

, , 2

     [,1]     [,2]     [,3]
[1,] "yellow" "green"  "yellow"
[2,] "green"  "yellow" "green"
[3,] "yellow" "green"  "yellow"
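
The sketch below shows one way to inspect and index such an array: each element is addressed by [row, column, layer], and leaving an index empty selects everything along that dimension. The name `first_layer` is hypothetical.

```r
a <- array(c('green', 'yellow'), dim = c(3, 3, 2))

# The dim attribute records the size of each dimension.
print(dim(a))

# The two input values were recycled to fill all 3 * 3 * 2 cells.
print(length(a))

# Leaving the row and column indices empty selects a whole 3x3 layer.
first_layer <- a[, , 1]
print(dim(first_layer))

# A single element: row 1, column 2, layer 1.
print(a[1, 2, 1])
```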

Factors

Factors are R objects created from a vector. A factor stores the vector together with the distinct values of its elements as labels (levels). The labels are always character strings, regardless of whether the input vector is numeric, character, or Boolean. Factors are useful in statistical modeling.

Factors are created using the factor() function. The nlevels() function returns the number of levels.

# Create a vector.
apple_colors <- c('green','green','yellow','red','red','red','green')

# Create a factor object.
factor_apple <- factor(apple_colors)

# Print the factor.
print(factor_apple)
print(nlevels(factor_apple))

When we execute the above code, it produces the following result:

[1] green  green  yellow red    red    red    green
Levels: green red yellow
[1] 3
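
As a sketch of what a factor stores, the example below prints the labels, the integer codes that point into those labels, and a count per level. It also checks the claim that labels are character strings even for numeric input. The name `num_factor` and its values are hypothetical.

```r
apple_colors <- c('green', 'green', 'yellow', 'red', 'red', 'red', 'green')
factor_apple <- factor(apple_colors)

# The distinct values, stored as character labels in sorted order.
print(levels(factor_apple))

# Each element is stored as an integer code that indexes the levels.
print(as.integer(factor_apple))

# Count how many elements fall in each level.
print(table(factor_apple))

# With numeric input the labels are still character strings.
num_factor <- factor(c(10, 20, 10))
print(levels(num_factor))
print(is.character(levels(num_factor)))
```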

Data Frames

Data frames are tabular data objects. Unlike in a matrix, each column of a data frame can contain a different mode of data: the first column can be numeric, the second character, and the third logical. A data frame is a list of vectors of equal length.

Data frames are created using the data.frame() function.

# Create the data frame.
BMI <- data.frame(
   gender = c("Male", "Male","Female"),
   height = c(152, 171.5, 165),
   weight = c(81,93, 78),
   Age = c(42,38,26)
)
print(BMI)

When we execute the above code, it produces the following result:

  gender height weight Age
1   Male  152.0     81  42
2   Male  171.5     93  38
3 Female  165.0     78  26
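
The sketch below illustrates the two properties described above: each column has its own class, and the data frame is itself a list of equal-length vectors. The stringsAsFactors = FALSE argument is an addition not present in the original example; it is included so that gender stays a character column on R versions before 4.0, where the default converted strings to factors. The name `over_30` is hypothetical.

```r
BMI <- data.frame(
  gender = c("Male", "Male", "Female"),
  height = c(152, 171.5, 165),
  weight = c(81, 93, 78),
  Age = c(42, 38, 26),
  stringsAsFactors = FALSE  # assumption: keep strings as character on any R version
)

# Each column keeps its own class.
print(sapply(BMI, class))

# A data frame is a list whose elements are the columns.
print(is.list(BMI))
print(dim(BMI))

# Columns are accessed with $, rows can be filtered with a logical condition.
print(BMI$height[2])
over_30 <- BMI[BMI$Age > 30, ]
print(over_30)
```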