Couchdb 简明教程

CouchDB - Introduction

数据库管理系统提供存储和检索数据的机制。数据库管理系统主要有三种类型,即 RDBMS(关系数据库管理系统)、OLAP(联机分析处理系统)和 NoSQL。

Database management system provides mechanism for storage and retrieval of data. There are three main types of database management systems namely RDBMS (Relational Database management Systems), OLAP (Online Analytical Processing Systems) and NoSQL.

RDBMS

RDBMS 表示关系数据库管理系统。RDBMS 是 SQL 和所有现代数据库系统(例如 MS SQL Server、IBM DB2、Oracle、MySQL 和 Microsoft Access)的基础。

RDBMS stands for Relational Database Management System. RDBMS is the basis for SQL, and for all modern database systems like MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access.

关系数据库管理系统 (RDBMS) 是一个数据库管理系统 (DBMS),其基于 E. F. Codd 引入的关系模型。

A Relational database management system (RDBMS) is a database management system (DBMS) that is based on the relational model as introduced by E. F. Codd.

RDBMS 中的数据存储在名为 tables 的数据库对象中。表是相关数据项的集合,并且由列和行组成。它只存储结构化数据。

The data in RDBMS is stored in database objects called tables. The table is a collection of related data entries and it consists of columns and rows. It stores only structured data.

OLAP

联机分析处理服务器 (OLAP) 以多维数据模型为基础。它允许经理和分析师通过快速、一致且交互方式访问信息来深入了解该信息。

Online Analytical Processing Server (OLAP) is based on the multidimensional data model. It allows managers and analysts to get an insight of the information through fast, consistent, and interactive access to information.

NoSQL Databases

NoSQL 数据库(有时称为非 SQL)是除关系数据库中使用的表格关系以外提供存储和检索数据的机制的数据库。这些数据库没有架构,支持简单的复制,拥有简单的 API,最终一致,并且能处理大量数据(大数据)。

A NoSQL database (sometimes called as Not Only SQL) is a database that provides a mechanism to store and retrieve data other than the tabular relations used in relational databases. These databases are schema-free, support easy replication, have simple API, eventually consistent, and can handle huge amounts of data (big data).

NoSQL 数据库的主要目标是拥有以下内容 −

The primary objective of a NoSQL database is to have the following −

  1. Simplicity of design,

  2. Horizontal scaling, and

  3. Finer control over availability.

与关系数据库相比,NoSQL 数据库使用不同的数据结构。它让一些操作在 NoSQL 中更快。给定 NoSQL 数据库的适用性取决于它必须解决的问题。这些数据库存储结构化数据和非结构化数据,如音频文件、视频文件、文档等。这些 NoSQL 数据库分为三类,如下所述。

NoSQL databases use different data structures compared to relational databases. It makes some operations faster in NoSQL. The suitability of a given NoSQL database depends on the problem it must solve. These databases store both structured data and unstructured data like audio files, video files, documents, etc. These NoSQL databases are classified into three types and they are explained below.

Key-value Store − 这些数据库旨在将数据存储在键值对中,并且这些数据库没有任何架构。在这些数据库中,每个数据值都包含一个已编制索引的键和该键的一个值。

Key-value Store − These databases are designed for storing data in key-value pairs and these databases will not have any schema. In these databases, each data value consists of an indexed key and a value for that key.

示例 − BerkeleyDB、Cassandra、DynamoDB、Riak。

Examples − BerkeleyDB, Cassandra, DynamoDB, Riak.

Column Store − 在这些数据库中,数据存储在分组为数据列的单元格中,并且这些列进一步分组到列族中。这些列族可以包含任意数量的列。

Column Store − In these databases, data is stored in cells grouped in columns of data, and these columns are further grouped into Column families. These column families can contain any number of columns.

示例 − BigTable、HBase 和 HyperTable。

Examples − BigTable, HBase, and HyperTable.

Document Store - 这是基于键值存储的基本理念开发的数据库,其中“文档”包含更多复杂的数据。此处,每个文档都会被分配一个唯一键,这个键用于检索文档。这些文档被设计用于存储、检索和管理面向文档的信息,也称为半结构化数据。

Document Store − These are the databases developed on the basic idea of key-value stores where "documents" contain more complex data. Here, each document is assigned a unique key, which is used to retrieve the document. These are designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data.

示例 − CouchDB 和 MongoDB。

Examples − CouchDB and MongoDB.

What is CouchDB?

CouchDB 是由 Apache 软件基金会开发的一个开源数据库。其重点在于易用性,涵盖网络。它是一个 NoSQL 文档存储数据库。

CouchDB is an open source database developed by Apache software foundation. The focus is on the ease of use, embracing the web. It is a NoSQL document store database.

它使用 JSON 来存储数据(文档),使用 JavaScript 作为查询语言来转换文档,使用 http 协议,通过 api 访问文档,使用 web 浏览器查询索引。它是一个多主应用,发布的时间是 2005 年,并且在 2008 年成为了一个 Apache 项目。

It uses JSON, to store data (documents), java script as its query language to transform the documents, http protocol for api to access the documents, query the indices with the web browser. It is a multi master application released in 2005 and it became an apache project in 2008.

Why CouchDB?

  1. CouchDB have an HTTP-based REST API, which helps to communicate with the database easily. And the simple structure of HTTP resources and methods (GET, PUT, DELETE) are easy to understand and use.

  2. As we store data in the flexible document-based structure, there is no need to worry about the structure of the data.

  3. Users are provided with powerful data mapping, which allows querying, combining, and filtering the information.

  4. CouchDB provides easy-to-use replication, using which you can copy, share, and synchronize the data between databases and machines.

Data Model

  1. Database is the outermost data structure/container in CouchDB.

  2. Each database is a collection of independent documents.

  3. Each document maintains its own data and self-contained schema.

  4. Document metadata contains revision information, which makes it possible to merge the differences occurred while the databases were disconnected.

  5. CouchDB implements multi version concurrency control, to avoid the need to lock the database field during writes.

Features of CouchDB:Reduce the Content

Document Storage

CouchDB 是一个 NoSQL 文档存储数据库。它提供了存储具有唯一名称的文档的功能,并且还提供了一个称为 RESTful HTTP API 的 API,用于读取和更新(添加、编辑、删除)数据库文档。

CouchDB is a document storage NoSQL database. It provides the facility of storing documents with unique names, and it also provides an API called RESTful HTTP API for reading and updating (add, edit, delete) database documents.

在 CouchDB 中,文档是数据的基本单位,它们也包括元数据。文档字段具有唯一名称,并且包含各种类型的属性值(文本、数字、布尔值、列表等);文本大小或元素数量没有固定的限制。

In CouchDB, documents are the primary unit of data and they also include metadata. Document fields are uniquely named and contain values of varying types (text, number, Boolean, lists, etc.), and there is no set limit to text size or element count.

文档更新(添加、编辑、删除)遵循原子性,即它们将被完全保存或根本不会被保存。数据库将不会有任何部分保存或编辑的文档。

Document updates (add, edit, delete) follow Atomicity, i.e., they will be saved completely or not saved at all. The database will not have any partially saved or edited documents.

Json Document Structure

{
   "field" : "value",
   "field" : "value",
   "field" : "value",
}

ACID Properties

CouchDB 包含 ACID 属性作为其一项功能。

CouchDB contains ACID properties as one of its features.

一致性 − 当 CouchDB 中的数据被提交一次时,就不允许修改或覆盖该数据。因此,CouchDB 确保数据库文件将始终处于一致的状态。

Consistency − When the data in CouchDB was once committed, then this data will not be modified or overwritten. Thus, CouchDB ensures that the database file will always be in a consistent state.

CouchDB reads 使用多版本并发控制 (MVCC) 模型,因此客户端将从读取操作的开始到结束看到数据库的一致快照。

A multi-Version Concurrency Control (MVCC) model is used by CouchDB reads, because of which the client will see a consistent snapshot of the database from the beginning to the end of the read operation.

每当更新文档时,CouchDB 便会将数据写入到磁盘中,并将更新后的数据库标头写入两个连续且相同的块中,以构成文件的前 4k,然后同步写入磁盘。写入期间的部分更新会丢弃。

Whenever a documents is updated, CouchDB flushes the data into the disk, and the updated database header is written in two consecutive and identical chunks to make up the first 4k of the file, and then synchronously flushed to disk. Partial updates during the flush will be discarded.

如果写入标头时出现故障,先前相同的标头副本将继续保留,从而保证所有先前提交数据的一致性。除了标头区域之外,在崩溃或断电后永远不需要执行一致性检查或修复。

If the failure occurred while committing the header, a surviving copy of the previous identical headers will remain, ensuring coherency of all previously committed data. Except the header area, consistency checks or fix-ups after a crash or a power failure are never necessary.

Compaction

每当数据库文件中空间浪费超过某个程度时,所有活动数据都会被复制(克隆)到一个新文件中。复制过程完全完成后,旧文件将被丢弃。所有这些都是由压缩过程完成的。数据库在压缩期间保持联机,并且允许所有更新和读取都成功完成。

Whenever the space in the database file got wasted above certain extent, all the active data will be copied (cloned) to a new file. When the copying process is entirely done, then the old file will be discarded. All this is done by compaction process. The database remains online during the compaction and all updates and reads are allowed to complete successfully.

Views

CouchDB 中的数据存储在半结构化文档中,这些文档以各个隐式结构为基础非常灵活,但它是一个用于数据存储和共享的简单文档模型。如果我们希望以多种不同的方式查看数据,我们需要一种方法来筛选、组织和报告尚未分解为表的数据。

Data in CouchDB is stored in semi-structured documents that are flexible with individual implicit structures, but it is a simple document model for data storage and sharing. If we want see our data in many different ways, we need a way to filter, organize and report on data that hasn’t been decomposed into tables.

为了解决这个问题,CouchDB 提供一个视图模型。视图是聚合同数据库中文档并对其报告的方法,并且根据需要进行构建以聚合、联接和报告数据库文档。由于视图是动态构建的,且不会影响基础文档,因此您可以拥有任意多个不同视图表示相同的同一数据。

To solve this problem, CouchDB provides a view model. Views are the method of aggregating and reporting on the documents in a database, and are built on-demand to aggregate, join and report on database documents. Because views are built dynamically and don’t affect the underlying document, you can have as many different view representations of the same data as you like.

History

  1. CouchDB was written in Erlang programming language.

  2. It was started by Damien Katz in 2005.

  3. CouchDB became an Apache project in 2008.

CouchDB 的当前版本是 1.61。

The current version of CouchDB is 1.61.