A Concise Git Tutorial

Git - Basic Concepts

Version Control System

A Version Control System (VCS) is software that helps developers work together and maintain a complete history of their work.

A VCS provides the following functions −

  1. Allows developers to work simultaneously.

  2. Does not allow overwriting each other’s changes.

  3. Maintains a history of every version.

VCS tools fall into two types −

  1. Centralized version control system (CVCS).

  2. Distributed/Decentralized version control system (DVCS).

This chapter concentrates on distributed version control, and on Git in particular. Git is a distributed version control system.

Distributed Version Control System

A centralized version control system (CVCS) uses a central server to store all files and to enable team collaboration. Its major drawback is that the central server is a single point of failure. If the server goes down for an hour, nobody can collaborate during that hour. In the worst case, if the server's disk is corrupted and no proper backup exists, the entire history of the project is lost. This is where a distributed version control system (DVCS) comes in.

DVCS clients do not just check out the latest snapshot of the directory; they fully mirror the repository. If the server fails, the repository can be copied back from any client to restore it, because every clone is a full backup of the repository. Git does not depend on a central server, which is why many operations work offline: you can commit changes, create branches, view logs, and more without a connection. You need a network connection only to publish your changes and to fetch the latest changes from others.
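As a rough illustration of the offline operations listed above, the following sketch creates a throwaway repository with no remote configured and then commits, branches, and views the log. The file name, branch name, and identity are hypothetical and chosen only for the demonstration.

```shell
# Throwaway repository; no remote is configured, so nothing below touches the network.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity, local to this repository
git config user.email "demo@example.com"

echo "int main(void) { return 0; }" > main.c
git add main.c
git commit -q -m "Add main.c"             # commit locally
git branch experiment                     # create a branch locally
git log --oneline                         # view history locally

commit_count=$(git rev-list --count HEAD)
remote_count=$(git remote | wc -l | tr -d ' ')
echo "commits: $commit_count, remotes: $remote_count"
```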

Advantages of Git

Free and open source

Git is released under the GNU General Public License version 2 (GPLv2), an open source license, and is freely available on the internet. You can use Git to manage your projects, including proprietary ones, without paying a single penny. Because it is open source, you can also download its source code and modify it to suit your requirements.

Fast and small

Because most operations are performed locally, Git has a large speed advantage: there is no need to contact a remote server for every operation. The core of Git is written in C, which avoids the runtime overhead associated with many higher-level languages. Although Git mirrors the entire repository, the data stored on the client is small, which shows how efficiently Git compresses and stores data.

Implicit backup

Data is far less likely to be lost when multiple copies of it exist. The data on every client mirrors the repository, so any client copy can be used for recovery after a crash or disk corruption.

Security

Git uses a common cryptographic hash function, the Secure Hash Algorithm 1 (SHA-1), to name and identify objects within its database. Every file and commit is checksummed and is retrieved by its checksum at checkout. As a result, file contents, dates, commit messages, and any other data in the Git database cannot be changed without Git detecting it.
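One way to see this in practice is to change a commit's message and watch its identifier change. The sketch below is only an illustration in a throwaway repository; the file name, messages, and identity are hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Original message"
before=$(git rev-parse HEAD)              # hash of the original commit

# Rewriting the message produces a new commit object with a different hash.
git commit -q --amend -m "Edited message"
after=$(git rev-parse HEAD)

echo "before: $before"
echo "after:  $after"
```

The two identifiers differ because the commit message is part of the data that is hashed.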

No need for powerful hardware

With a CVCS, the central server must be powerful enough to serve requests from the entire team. This is not an issue for small teams, but as the team grows, the server's hardware limits can become a performance bottleneck. With a DVCS, developers do not interact with the server unless they need to push or pull changes. All the heavy lifting happens on the client side, so the server hardware can be quite modest.

Easier branching

In some CVCS tools, creating a branch copies the whole code tree into a new directory on the server. Even where that copy is implemented cheaply, creating, deleting, and merging branches all involve the server and are often complicated and time-consuming. Branch management with Git is very simple. A branch is a lightweight reference to a commit, so creating, deleting, and merging branches takes only a few seconds.
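The branch lifecycle described here can be sketched in a throwaway repository as follows. The branch name `feature`, the file name, and the identity are hypothetical; the starting branch name is read from Git rather than assumed.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"
base=$(git rev-parse --abbrev-ref HEAD)   # name of the starting branch

git checkout -q -b feature                # create and switch to a new branch
echo "feature work" >> app.txt
git commit -q -a -m "Add feature work"

git checkout -q "$base"                   # return to the starting branch
git merge -q feature                      # fast-forward merge of the feature branch
git branch -d feature                     # delete the merged branch

merged_line=$(tail -n 1 app.txt)
remaining=$(git branch --list feature)    # empty once the branch is deleted
```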

DVCS Terminology

Local Repository

Every VCS tool provides a private workspace as a working copy. Developers make changes in their private workspace, and after a commit these changes become part of the repository. Git goes one step further and gives each developer a private copy of the whole repository. Users can perform many operations on this repository, such as adding, removing, renaming, and moving files, and committing changes.

Working Directory and Staging Area or Index

The working directory is where files are checked out. In a CVCS, developers generally make modifications and commit them directly to the repository. Git uses a different strategy: it does not automatically include every modified file in a commit. Whenever you perform a commit, Git looks at the files present in the staging area. Only the changes in the staging area are committed, not all modified files.

Let us look at the basic workflow of Git.

Step 1 − You modify a file in the working directory.

Step 2 − You add the modified files to the staging area.

Step 3 − You perform a commit, which records the staged changes in your local repository. A subsequent push operation copies those commits to the remote Git repository.

[Figure: staging area]

Suppose you modified two files, "sort.c" and "search.c", and you want a separate commit for each operation. You can add one file to the staging area and commit it. After the first commit, repeat the same procedure for the other file.

# First commit: add sort.c to the staging area, then commit it
[bash]$ git add sort.c
[bash]$ git commit -m "Added sort operation"

# Second commit: add search.c to the staging area, then commit it
[bash]$ git add search.c
[bash]$ git commit -m "Added search operation"
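To check that only staged changes end up in a commit, the scenario can be reproduced in a throwaway repository. This is a sketch under assumptions: the file contents and identity are made up, and both files are committed once up front so that they count as modified afterwards.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

# Start with both files tracked.
echo "/* sort */" > sort.c
echo "/* search */" > search.c
git add sort.c search.c
git commit -q -m "Initial import"

# Modify both files, but stage and commit only sort.c.
echo "/* sort operation */" >> sort.c
echo "/* search operation */" >> search.c
git add sort.c
git commit -q -m "Added sort operation"

committed=$(git diff-tree --no-commit-id --name-only -r HEAD)  # files changed by the commit
pending=$(git status --porcelain)                               # changes still uncommitted
echo "committed: $committed"
echo "pending:$pending"
```

The commit lists only `sort.c`, while `search.c` is still reported as modified and waits for its own `git add` and `git commit`.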

Blobs

Blob stands for Binary Large Object. Each version of a file is represented by a blob. A blob holds the file data but no metadata about the file, not even its name. It is stored in binary form, and in the Git database it is named by the SHA-1 hash of its content. In Git, files are not addressed by name. Everything is content-addressed.
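Content addressing can be observed with `git hash-object`, which prints the object name Git would give to a file's content. In this sketch the file names are hypothetical; the point is that the name of the file plays no part in the result.

```shell
dir=$(mktemp -d)
cd "$dir"

printf 'hello\n' > a.txt
printf 'hello\n' > copy_of_a.txt          # same content, different file name
printf 'hello!\n' > b.txt                 # different content

id_a=$(git hash-object a.txt)
id_copy=$(git hash-object copy_of_a.txt)
id_b=$(git hash-object b.txt)

echo "a.txt:         $id_a"
echo "copy_of_a.txt: $id_copy"
echo "b.txt:         $id_b"
```

The first two identifiers are equal because the contents are identical, and the third differs.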

Trees

A tree is an object that represents a directory. It holds blobs as well as other sub-directories. A tree is stored in binary form and records, for each entry, a name together with the SHA-1 hash of the blob or sub-tree it refers to. The tree object itself is also named by its SHA-1 hash.
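A tree's entries can be listed with `git ls-tree`. The sketch below builds a small throwaway repository (file names, directory name, and identity are hypothetical) and lists the top-level tree of the latest commit.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

mkdir src
echo "readme" > README
echo "code" > src/main.c
git add README src/main.c
git commit -q -m "Add files"

# Each line shows mode, object type, object hash, and entry name.
git ls-tree HEAD

# Collect just the object types of the top-level entries.
top_types=$(git ls-tree HEAD | awk '{print $2}' | sort | tr '\n' ' ')
echo "types: $top_types"
```

The file `README` appears as a blob, and the directory `src` appears as a nested tree.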

Commits

A commit holds the state of the repository at the time it was created. A commit is also named by its SHA-1 hash. You can think of a commit object as a node in a linked list: every commit except the first has a pointer to its parent commit. From a given commit, you can walk back through the parent pointers to view the history. If a commit has multiple parents, it was created by merging two or more branches.
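The parent pointer can be inspected directly. In this sketch (throwaway repository, hypothetical file name and identity), the second commit's parent resolves to the first commit.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "one" > history.txt
git add history.txt
git commit -q -m "First"
first=$(git rev-parse HEAD)

echo "two" >> history.txt
git commit -q -a -m "Second"

parent=$(git rev-parse "HEAD^")           # follow the parent pointer of the latest commit
git cat-file -p HEAD                      # shows the tree, parent, author, committer, and message
```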

Branches

Branches are used to create another line of development. By default, Git creates one initial branch, traditionally named master (newer setups are often configured to use a different name, such as main), which plays the same role as trunk in Subversion. Usually, a branch is created to work on a new feature. Once the feature is complete, the branch is merged back into master and then deleted. Each branch is a reference that points to the latest commit on that branch, and HEAD refers to the branch that is currently checked out. Whenever you make a commit, the current branch is updated to point to the new commit.

Tags

A tag assigns a meaningful name to a specific version in the repository. Tags are very similar to branches, but the difference is that tags are meant to be immutable. In effect, a tag is like a branch that nobody intends to modify. Once a tag is created for a particular commit, it is not updated even if you create new commits. Usually, developers create tags for product releases.
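The following sketch shows that a tag stays on its commit while the branch moves on. The tag name `v1.0`, the file name, and the identity are hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "release 1" > version.txt
git add version.txt
git commit -q -m "Release 1"
git tag v1.0                              # lightweight tag on the current commit
tagged=$(git rev-parse HEAD)

echo "work in progress" >> version.txt
git commit -q -a -m "Continue development"

tag_now=$(git rev-parse v1.0)             # still the tagged commit
head_now=$(git rev-parse HEAD)            # the branch has moved to the new commit
```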

Clone

The clone operation creates a new instance of the repository. It not only checks out the working copy but also mirrors the complete repository. Users can perform many operations with this local repository. The network is involved only when repository instances are being synchronized.
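A clone can be tried without any server by cloning a local directory. In this sketch the directory names `origin_repo` and `copy`, the file name, and the identity are hypothetical.

```shell
work=$(mktemp -d)
cd "$work"

git init -q origin_repo
cd origin_repo
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "one" > data.txt
git add data.txt
git commit -q -m "First"
echo "two" >> data.txt
git commit -q -a -m "Second"
cd ..

git clone -q origin_repo copy             # mirror the whole repository, not just the latest files

src_head=$(git -C origin_repo rev-parse HEAD)
copy_head=$(git -C copy rev-parse HEAD)
copy_count=$(git -C copy rev-list --count HEAD)
echo "commits in clone: $copy_count"
```

The clone contains both commits, so its history can be browsed without contacting the original repository again.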

Pull

The pull operation copies changes from a remote repository instance to a local one. It is used to synchronize two repository instances, and is comparable to the update operation in Subversion.

Push

The push operation copies changes from a local repository instance to a remote one. It is used to store your commits permanently in the shared Git repository, and is comparable to the commit operation in Subversion.
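Push and pull can be sketched together using a local bare repository as a stand-in for the remote server. Everything named here (`server.git`, the `alice` and `bob` clones, the file name, and the identity) is hypothetical.

```shell
work=$(mktemp -d)
cd "$work"
git init -q --bare server.git             # stands in for the remote repository

# Alice clones the (still empty) repository; Git prints a warning about that.
git clone -q server.git alice 2>/dev/null
cd alice
git config user.name "Alice Demo"         # hypothetical identity
git config user.email "alice@example.com"
echo "first" > notes.txt
git add notes.txt
git commit -q -m "First note"
git push -q origin HEAD                   # publish the current branch under the same name
cd ..

git clone -q server.git bob               # Bob starts from the published state

cd alice
echo "second" >> notes.txt
git commit -q -a -m "Second note"
git push -q origin HEAD                   # publish Alice's new commit

cd ../bob
git pull -q                               # bring Alice's new commit into Bob's repository

bob_head=$(git rev-parse HEAD)
alice_head=$(git -C ../alice rev-parse HEAD)
last_line=$(tail -n 1 notes.txt)
```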

HEAD

HEAD is a pointer that normally refers to the branch you currently have checked out, and through it to the latest commit on that branch. Whenever you make a commit, the current branch, and therefore HEAD, advances to the new commit. The heads of the branches are stored in the .git/refs/heads/ directory.

[CentOS]$ ls -1 .git/refs/heads/
master

[CentOS]$ cat .git/refs/heads/master
570837e7d58fa4bccd86cb575d884502188b0c49
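The relationship between HEAD and the branch reference can also be queried through Git commands instead of reading the files directly. The sketch below uses a throwaway repository with a hypothetical file name and identity.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "content" > file.txt
git add file.txt
git commit -q -m "Initial commit"

cat .git/HEAD                             # typically "ref: refs/heads/<branch>"
head_ref=$(git symbolic-ref HEAD)         # the branch reference HEAD points to
head_commit=$(git rev-parse HEAD)         # the commit reached through HEAD
branch_commit=$(git rev-parse "$head_ref")  # the commit the branch points to
```

Both lookups resolve to the same commit, since HEAD reaches the commit through the branch reference.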

Revision

A revision represents a version of the source code. Revisions in Git are represented by commits, and these commits are identified by their SHA-1 hashes.

URL

A URL represents the location of a Git repository. The URL of each remote is stored in the repository's config file.

[tom@CentOS tom_repo]$ pwd
/home/tom/tom_repo

[tom@CentOS tom_repo]$ cat .git/config
[core]
	repositoryformatversion = 0
	filemode = true
	bare = false
	logallrefupdates = true
[remote "origin"]
	url = gituser@git.server.com:project.git
	fetch = +refs/heads/*:refs/remotes/origin/*
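The same URL can be read back with `git config` instead of opening the file. This sketch registers the example URL from the listing above in a throwaway repository; adding a remote only writes to the config file and does not contact the server.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Register the remote; this only edits .git/config.
git remote add origin gituser@git.server.com:project.git

url=$(git config --get remote.origin.url)
echo "origin URL: $url"
```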