PostgreSQL Chinese Operations Guide
Chapter 27. High Availability, Load Balancing, and Replication
Database servers can work together to allow a second server to take over quickly if the primary server fails (high availability), or to allow several computers to serve the same data (load balancing). Ideally, database servers could work together seamlessly. Web servers serving static web pages can be combined quite easily by merely load-balancing web requests across multiple machines. In fact, read-only database servers can be combined relatively easily too. Unfortunately, most database servers have a read/write mix of requests, and read/write servers are much harder to combine. This is because, although read-only data needs to be placed on each server only once, a write to any server has to be propagated to all servers so that future read requests to those servers return consistent results.
This synchronization problem is the fundamental difficulty for servers working together. Because there is no single solution that eliminates the impact of the sync problem for all use cases, there are multiple solutions. Each solution addresses this problem in a different way, and minimizes its impact for a specific workload.
Some solutions deal with synchronization by allowing only one server to modify the data. Servers that can modify data are called read/write, master or primary servers. Servers that track changes in the primary are called standby or secondary servers. A standby server that cannot be connected to until it is promoted to a primary server is called a warm standby server, and one that can accept connections and serve read-only queries is called a hot standby server.
Some solutions are synchronous, meaning that a data-modifying transaction is not considered committed until all servers have committed the transaction. This guarantees that a failover will not lose any data and that all load-balanced servers will return consistent results no matter which server is queried. In contrast, asynchronous solutions allow some delay between the time of a commit and its propagation to the other servers, opening the possibility that some transactions might be lost in the switch to a backup server, and that load-balanced servers might return slightly stale results. Asynchronous communication is used when synchronous communication would be too slow.
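The difference between the two modes can be illustrated with a toy in-memory model. This is only a sketch and not how PostgreSQL implements replication: the `Server` and `Cluster` classes, the `run` helper, and the transaction names are all hypothetical, and a "server" here is just a Python list of committed values.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """A toy database server that keeps committed values in a list."""
    name: str
    committed: list = field(default_factory=list)


@dataclass
class Cluster:
    """One primary plus standbys, replicating synchronously or not."""
    primary: Server
    standbys: list
    synchronous: bool
    pending: list = field(default_factory=list)  # writes not yet replicated

    def commit(self, value):
        """Commit on the primary; replicate now (sync) or queue it (async)."""
        self.primary.committed.append(value)
        if self.synchronous:
            # The commit only completes once every standby has it too.
            for standby in self.standbys:
                standby.committed.append(value)
        else:
            # The commit completes immediately; propagation happens later.
            self.pending.append(value)

    def replicate_pending(self):
        """Deliver queued writes to all standbys (only relevant when async)."""
        for value in self.pending:
            for standby in self.standbys:
                standby.committed.append(value)
        self.pending.clear()

    def failover(self):
        """Promote the first standby and report writes lost in the switch."""
        new_primary = self.standbys[0]
        lost = [v for v in self.primary.committed
                if v not in new_primary.committed]
        return new_primary, lost


def run(synchronous):
    """Commit two transactions, then fail over before the second is shipped."""
    cluster = Cluster(Server("primary"), [Server("standby")], synchronous)
    cluster.commit("tx1")
    cluster.replicate_pending()
    cluster.commit("tx2")  # in async mode the primary fails before shipping this
    _, lost = cluster.failover()
    return lost


sync_lost = run(synchronous=True)
async_lost = run(synchronous=False)
print("lost with synchronous replication: ", sync_lost)
print("lost with asynchronous replication:", async_lost)
```

In this model the synchronous cluster loses nothing on failover, while the asynchronous cluster loses the transaction that was committed on the primary but still waiting in the queue.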
Solutions can also be categorized by their granularity. Some solutions can deal only with an entire database server, while others allow control at the per-table or per-database level.
Performance must be considered in any choice. There is usually a trade-off between functionality and performance. For example, a fully synchronous solution over a slow network might cut performance by more than half, while an asynchronous one might have a minimal performance impact.
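A rough back-of-the-envelope calculation shows why a slow network can cost more than half of the commit rate. The sketch below assumes a single session issuing commits one after another, and the latency figures are made-up illustrative values, not measurements. The function name `commits_per_second` is likewise hypothetical.

```python
def commits_per_second(local_commit_ms, network_round_trip_ms=0.0):
    """Serial commit rate for one session.

    Each commit costs the local commit time plus, for a synchronous
    setup, one network round trip to wait for the standby.
    """
    return 1000.0 / (local_commit_ms + network_round_trip_ms)


# Hypothetical figures chosen only for illustration.
LOCAL_COMMIT_MS = 2.0      # time to commit locally
SLOW_NETWORK_RTT_MS = 5.0  # round trip to the standby over a slow link

async_rate = commits_per_second(LOCAL_COMMIT_MS)
sync_rate = commits_per_second(LOCAL_COMMIT_MS, SLOW_NETWORK_RTT_MS)
slowdown = 1 - sync_rate / async_rate

print(f"asynchronous: {async_rate:.0f} commits/s")
print(f"synchronous:  {sync_rate:.0f} commits/s")
print(f"reduction:    {slowdown:.0%}")
```

Under these assumptions the synchronous rate falls below half of the asynchronous rate as soon as the round trip takes longer than the local commit itself. Real systems overlap work across many sessions, so actual numbers will differ.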
The remainder of this section outlines various failover, replication, and load balancing solutions.