ZooKeeper: A Concise Tutorial
Zookeeper - Overview
ZooKeeper is a distributed coordination service for managing a large set of hosts. Coordinating and managing a service in a distributed environment is a complicated process, and ZooKeeper addresses it with a simple architecture and API. This lets developers focus on core application logic without worrying about the distributed nature of the application.
The ZooKeeper framework was originally built at Yahoo! to make access to its applications easy and robust. Later, Apache ZooKeeper became a standard coordination service used by Hadoop, HBase, and other distributed frameworks. For example, Apache HBase uses ZooKeeper to track the status of distributed data.
Before moving on, it helps to know a little about distributed applications, so let us start with a quick overview.
Distributed Application
A distributed application runs on multiple systems in a network at the same time, and those systems coordinate among themselves to complete a particular task quickly and efficiently. A complex, time-consuming task that would take a non-distributed application (running on a single system) hours to complete can often be done in minutes by a distributed application that uses the computing power of all the systems involved.
The time to complete the task can be reduced further by configuring the distributed application to run on more systems. The group of systems on which a distributed application runs is called a Cluster, and each machine in the cluster is called a Node.
A distributed application has two parts: the Server application and the Client application. Server applications are the part that is actually distributed; they expose a common interface so that clients can connect to any server in the cluster and get the same result. Client applications are the tools used to interact with a distributed application.

Benefits of Distributed Applications
-
Reliability − Failure of one or a few systems does not cause the whole system to fail.
-
Scalability − Performance can be increased when needed by adding more machines, with only minor changes to the application's configuration and no downtime.
-
Transparency − Hides the complexity of the system and shows itself as a single entity / application.
Challenges of Distributed Applications
-
Race condition − Two or more machines try to perform a task that must be done by only one machine at any given time. For example, a shared resource should be modified by only one machine at a time.
-
Deadlock − Two or more operations waiting for each other to complete indefinitely.
-
Inconsistency − A partial failure leaves data in an inconsistent state.
What is Apache ZooKeeper Meant For?
Apache ZooKeeper is a service used by a cluster (group of nodes) to coordinate among themselves and maintain shared data with robust synchronization techniques. ZooKeeper is itself a distributed application that provides services for writing distributed applications.
The common services provided by ZooKeeper are as follows −
-
Naming service − Identifying the nodes in a cluster by name. It is similar to DNS, but for nodes.
-
Configuration management − Latest and up-to-date configuration information of the system for a joining node.
-
Cluster management − Joining / leaving of a node in a cluster and node status in real time.
-
Leader election − Electing a node as leader for coordination purposes.
-
Locking and synchronization service − Locking the data while modifying it. This mechanism helps with automatic failure recovery when connecting to other distributed applications such as Apache HBase.
-
Highly reliable data registry − Availability of data even when one or a few nodes are down.
Distributed applications offer many benefits, but they also pose a few complex, hard-to-solve challenges. The ZooKeeper framework provides mechanisms to overcome them. Race conditions and deadlocks are handled with a fail-safe synchronization approach. Data inconsistency, the other main drawback, is resolved through atomicity.
Benefits of ZooKeeper
Here are the benefits of using ZooKeeper −
-
Simple distributed coordination process
-
Synchronization − Mutual exclusion and cooperation between server processes. Apache HBase relies on this for configuration management.
-
Ordered Messages
-
Serialization − Encodes data according to specific rules, which helps your application run consistently. This approach can be used in MapReduce to coordinate the queue of running threads.
-
Reliability
-
Atomicity − A data transfer either succeeds or fails completely; no transaction is partial.
Zookeeper - Fundamentals
Before going deep into how ZooKeeper works, let us look at its fundamental concepts. This chapter covers the following topics −
-
Architecture
-
Hierarchical namespace
-
Session
-
Watches
Architecture of ZooKeeper
The following diagram depicts the “Client-Server Architecture” of ZooKeeper.

Each component of the ZooKeeper architecture is explained in the following table.
Part | Description
Client | A client, one of the nodes in our distributed application cluster, accesses information from the server. At a regular interval, every client sends a message to the server to let the server know that the client is alive. Similarly, the server sends an acknowledgement when a client connects. If there is no response from the connected server, the client automatically redirects the message to another server.
Server | A server, one of the nodes in our ZooKeeper ensemble, provides all the services to clients. It sends acknowledgements to the client to signal that the server is alive.
Ensemble | A group of ZooKeeper servers. The minimum number of nodes required to form an ensemble is 3.
Leader | The server node that performs automatic recovery if any of the connected nodes fails. Leaders are elected on service startup.
Follower | A server node that follows the leader's instructions.
Hierarchical Namespace
The following diagram depicts the tree structure of the ZooKeeper file system as it is represented in memory. A ZooKeeper node is referred to as a znode. Every znode is identified by a name, and the names along a path are separated by a slash (/).
-
In the diagram, the root znode is denoted by “/”. Under the root, there are two logical namespaces, config and workers.
-
The config namespace is used for centralized configuration management and the workers namespace is used for naming.
-
Under the config namespace, each znode can store up to 1 MB of data. This is similar to a UNIX file system, except that a parent znode can store data as well. The main purpose of this structure is to store synchronized data and describe the metadata of the znode. This structure is called the ZooKeeper Data Model.

Every znode in the ZooKeeper data model maintains a stat structure. A stat simply provides the metadata of a znode. It consists of the version number, access control list (ACL), timestamp, and data length.
-
Version number − Every znode has a version number, which increases every time the data associated with the znode changes. The version number matters when multiple ZooKeeper clients try to operate on the same znode.
-
Access Control List (ACL) − The ACL is basically an access-control mechanism for the znode. It governs all znode read and write operations.
-
Timestamp − The timestamps record when the znode was created and when it was last modified, usually in milliseconds. ZooKeeper also identifies every change to a znode by a “Transaction ID” (zxid). Each zxid is unique and zxids are ordered, so you can easily tell which of two requests happened first.
-
Data length − The data length is the total amount of data stored in a znode. You can store a maximum of 1 MB of data.
Types of Znodes
Znodes are categorized as persistent, sequential, and ephemeral.
-
Persistent znode − A persistent znode stays alive even after the client that created it disconnects. By default, all znodes are persistent unless otherwise specified.
-
Ephemeral znode − Ephemeral znodes stay active only as long as the client that created them is alive. When that client disconnects from the ZooKeeper ensemble, its ephemeral znodes are deleted automatically. For this reason, ephemeral znodes are not allowed to have children. If an ephemeral znode is deleted, the next suitable node fills its position. Ephemeral znodes play an important role in leader election.
-
Sequential znode − Sequential znodes can be either persistent or ephemeral. When a znode is created as sequential, ZooKeeper sets its path by appending a 10-digit sequence number to the original name. For example, if a znode with path /myapp is created as a sequential znode, ZooKeeper changes the path to /myapp0000000001 and sets the next sequence number to 0000000002. If two sequential znodes are created concurrently, ZooKeeper never assigns the same number to both. Sequential znodes play an important role in locking and synchronization.
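The naming rule for sequential znodes can be imitated in a few lines of plain Java. This is only a sketch: SequentialNameSketch and nextName are hypothetical names, not part of the ZooKeeper API, and the counter starts at 1 simply to match the /myapp example above.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; it is not part of the ZooKeeper API.
final class SequentialNameSketch {

    // Stand-in for the counter that the ensemble keeps on the server side.
    // Assumption: numbering starts at 1, as in the /myapp example.
    private final AtomicInteger counter = new AtomicInteger(0);

    // Returns the requested path followed by a zero-padded, 10-digit sequence number.
    String nextName(String requestedPath) {
        int sequence = counter.incrementAndGet();
        return String.format(Locale.ROOT, "%s%010d", requestedPath, sequence);
    }

    public static void main(String[] args) {
        SequentialNameSketch names = new SequentialNameSketch();
        System.out.println(names.nextName("/myapp"));
        System.out.println(names.nextName("/myapp"));
    }
}
```

Because the counter is atomic, two concurrent callers can never receive the same number, which mirrors the uniqueness guarantee described for sequential znodes.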
Sessions
Sessions are very important for the operation of ZooKeeper. Requests in a session are executed in FIFO order. Once a client connects to a server, a session is established and a session id is assigned to the client.
The client sends heartbeats at a regular interval to keep the session valid. If the ZooKeeper ensemble does not receive a heartbeat from a client for longer than the period specified at the start of the service (the session timeout), it decides that the client has died.
Session timeouts are usually expressed in milliseconds. When a session ends for any reason, the ephemeral znodes created during that session are also deleted.
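The heartbeat rule can be pictured with a small stand-alone model. This is a sketch under stated assumptions: SessionSketch is a hypothetical class, time is passed in explicitly as milliseconds, and a real ensemble tracks sessions in its own way.

```java
// Hypothetical model of session expiry; not part of the ZooKeeper API.
final class SessionSketch {

    private final long timeoutMillis;   // session timeout in milliseconds
    private long lastHeartbeatMillis;   // time of the most recent heartbeat

    SessionSketch(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastHeartbeatMillis = nowMillis;
    }

    // Records a heartbeat received from the client.
    void heartbeat(long nowMillis) {
        lastHeartbeatMillis = nowMillis;
    }

    // The session is treated as dead once no heartbeat has arrived
    // for longer than the session timeout.
    boolean isExpired(long nowMillis) {
        return nowMillis - lastHeartbeatMillis > timeoutMillis;
    }

    public static void main(String[] args) {
        SessionSketch session = new SessionSketch(5000, 0);
        session.heartbeat(4000);
        System.out.println(session.isExpired(9000));
        System.out.println(session.isExpired(9001));
    }
}
```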
Watches
Watches are a simple mechanism for a client to get notifications about changes in the ZooKeeper ensemble. A client can set a watch while reading a particular znode. The watch then sends a notification to the registered client whenever that znode changes.
A znode change is either a modification of the data associated with the znode or a change in the znode’s children. A watch is triggered only once. If a client wants another notification, it must set a new watch through another read operation. When a session expires, the client is disconnected from the server and its associated watches are removed.
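The one-time nature of a watch is easy to miss, so here is a minimal in-memory sketch of it. OneShotWatchSketch is a hypothetical class that only imitates the behaviour described above; it does not use the ZooKeeper client library, and the event name string is just a label.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for a single znode; not the ZooKeeper API.
final class OneShotWatchSketch {

    private String data;
    private final List<Consumer<String>> watchers = new ArrayList<>();

    OneShotWatchSketch(String initialData) {
        this.data = initialData;
    }

    // Reads the data and, if a watcher is given, registers it during the read.
    String getData(Consumer<String> watcher) {
        if (watcher != null) {
            watchers.add(watcher);
        }
        return data;
    }

    // Changes the data, notifies each registered watcher once, then forgets them.
    void setData(String newData) {
        data = newData;
        List<Consumer<String>> toNotify = new ArrayList<>(watchers);
        watchers.clear();
        for (Consumer<String> watcher : toNotify) {
            watcher.accept("NodeDataChanged");
        }
    }

    public static void main(String[] args) {
        OneShotWatchSketch znode = new OneShotWatchSketch("v1");
        znode.getData(event -> System.out.println("notified: " + event));
        znode.setData("v2"); // the watcher fires here
        znode.setData("v3"); // no notification: the watch was already consumed
    }
}
```

To keep receiving notifications, the client in this model has to call getData with a watcher again after each event, which is the re-registration step described above.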
Zookeeper - Workflow
Once a ZooKeeper ensemble starts, it waits for clients to connect. A client connects to one of the nodes in the ensemble, which may be the leader or a follower. Once the client is connected, the node assigns it a session ID and sends an acknowledgement. If the client does not get an acknowledgement, it simply tries to connect to another node in the ensemble. Once connected to a node, the client sends heartbeats to it at a regular interval to make sure that the connection is not lost.
-
If a client wants to read a particular znode, it sends a read request with the znode path to the node it is connected to, and that node returns the requested znode from its own database. For this reason, reads are fast in a ZooKeeper ensemble.
-
If a client wants to store data in the ZooKeeper ensemble, it sends the znode path and the data to the server. The connected server forwards the request to the leader, and the leader then reissues the write request to all the followers. If a majority of the nodes respond successfully, the write request succeeds and a successful return code is sent to the client. Otherwise, the write request fails. This strict majority of nodes is called a Quorum.
Nodes in a ZooKeeper Ensemble
Let us analyze the effect of having different numbers of nodes in the ZooKeeper ensemble.
-
If we have a single node, the ZooKeeper ensemble fails when that node fails. This is a “Single Point of Failure” and is not recommended in a production environment.
-
If we have two nodes and one node fails, we do not have a majority either, since one out of two is not a majority.
-
If we have three nodes and one node fails, we still have a majority, so three nodes is the minimum requirement. It is mandatory for a ZooKeeper ensemble to have at least three nodes in a live production environment.
-
If we have four nodes and two nodes fail, the ensemble fails again, so four nodes tolerate no more failures than three. The extra node does not serve any purpose, so it is better to add nodes in odd numbers, e.g., 3, 5, 7.
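The cases above follow from simple majority arithmetic, which can be sketched as follows. QuorumMath and its method names are hypothetical and chosen only for this illustration.

```java
// Hypothetical helper that works through the majority arithmetic; not a ZooKeeper class.
final class QuorumMath {

    // A strict majority of the ensemble: more than half of the nodes.
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // How many nodes may fail while a strict majority is still alive.
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 7; n++) {
            System.out.println(n + " nodes: quorum = " + quorumSize(n)
                    + ", tolerated failures = " + toleratedFailures(n));
        }
    }
}
```

Under this arithmetic, three and four nodes both tolerate a single failure, and five and six both tolerate two, which is why even-sized ensembles add cost without adding fault tolerance.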
We know that a write is more expensive than a read in a ZooKeeper ensemble, since all the nodes need to write the same data to their databases. So, for a balanced environment, it is better to have a small number of nodes (3, 5, or 7) than a large number.
The following diagram depicts the ZooKeeper workflow, and the subsequent table explains its different components.

Component | Description
Write | The write process is handled by the leader node. The leader forwards the write request to all the nodes and waits for their answers. If a majority of the nodes reply, the write process is complete.
Read | Reads are performed internally by the specific node the client is connected to, so there is no need to interact with the rest of the cluster.
Replicated Database | It is used to store data in ZooKeeper. Each node has its own database, and consistency ensures that every node holds the same data at all times.
Leader | The leader is the node responsible for processing write requests.
Follower | Followers receive write requests from clients and forward them to the leader node.
Request Processor | Present only in the leader node. It governs write requests from the follower nodes.
Atomic broadcasts | Responsible for broadcasting the changes from the leader node to the follower nodes.
Zookeeper - Leader Election
Let us analyze how a leader node can be elected in a ZooKeeper ensemble. Consider a cluster with N nodes. The process of leader election is as follows −
-
All the nodes create a sequential, ephemeral znode with the same path, /app/leader_election/guid_.
-
ZooKeeper ensemble will append the 10-digit sequence number to the path and the znode created will be /app/leader_election/guid_0000000001, /app/leader_election/guid_0000000002, etc.
-
At any given instant, the node that created the znode with the smallest sequence number is the leader, and all the other nodes are followers.
-
Each follower node watches the znode with the next smaller number. For example, the node which creates znode /app/leader_election/guid_0000000008 will watch the znode /app/leader_election/guid_0000000007, and the node which creates the znode /app/leader_election/guid_0000000007 will watch the znode /app/leader_election/guid_0000000006.
-
If the leader goes down, then its corresponding znode /app/leader_electionN gets deleted.
-
The next in line follower node will get the notification through watcher about the leader removal.
-
The next-in-line follower node checks whether any other znode has a smaller number than its own. If none does, it assumes the role of leader. Otherwise, it treats the node that created the znode with the smallest number as the leader.
-
Similarly, all other follower nodes elect the node which created the znode with the smallest number as leader.
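The selection rules in the steps above can be sketched without a running ensemble. LeaderElectionSketch is a hypothetical class that works on plain znode names; it assumes every name shares the same prefix and carries the zero-padded 10-digit suffix, so that ordinary string sorting matches sequence order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of the selection rules; it does not talk to ZooKeeper.
final class LeaderElectionSketch {

    // The leader is the owner of the znode with the smallest sequence number.
    // Assumption: same prefix and fixed-width suffix, so string order equals numeric order.
    static String leader(List<String> children) {
        return Collections.min(children);
    }

    // Returns the znode a node should watch: the one just before its own
    // in sequence order, or null if the node already holds the smallest number.
    static String znodeToWatch(List<String> children, String myZnode) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted);
        int index = sorted.indexOf(myZnode);
        if (index < 0) {
            throw new IllegalArgumentException("unknown znode: " + myZnode);
        }
        return index == 0 ? null : sorted.get(index - 1);
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList(
                "guid_0000000008", "guid_0000000006", "guid_0000000007");
        System.out.println("leader: " + leader(children));
        System.out.println("guid_0000000008 watches: "
                + znodeToWatch(children, "guid_0000000008"));
    }
}
```

Having each follower watch only its immediate predecessor means that a single deletion wakes up a single node instead of the whole cluster.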
Leader election is a complex process when done from scratch, but the ZooKeeper service makes it very simple. The next chapter covers installing ZooKeeper for development purposes.
Zookeeper - Installation
Before installing ZooKeeper, make sure your system runs one of the following operating systems −
-
Any Linux OS − Supports development and deployment. It is preferred for demo applications.
-
Windows OS − Supports only development.
-
Mac OS − Supports only development.
The ZooKeeper server is written in Java and runs on the JVM. You need JDK 6 or later.
Now, follow the steps below to install the ZooKeeper framework on your machine.
Step 1: Verifying Java Installation
We assume you already have a Java environment installed on your system. Verify it using the following command.
$ java -version
If Java is installed on your machine, this command prints the installed version. Otherwise, follow the simple steps below to install the latest version of Java.
Step 1.1: Download JDK
Download the latest version of the JDK from the following link. Java
The latest version (at the time of writing) is JDK 8u60, and the file is “jdk-8u60-linux-x64.tar.gz”. Download the file to your machine.
Step 1.2: Extract the files
Generally, files are downloaded to the downloads folder. Check that the file is there and extract the tar archive using the following commands.
$ cd /go/to/download/path
$ tar -zxf jdk-8u60-linux-x64.tar.gz
Step 1.3: Move to opt directory
To make Java available to all users, move the extracted Java content to the “/opt/jdk” folder.
$ su
password: (type password of root user)
$ mkdir /opt/jdk
$ mv jdk-1.8.0_60 /opt/jdk/
Step 1.4: Set path
To set the PATH and JAVA_HOME variables, add the following commands to the ~/.bashrc file.
export JAVA_HOME=/opt/jdk/jdk-1.8.0_60
export PATH=$PATH:$JAVA_HOME/bin
Now, apply the changes to the current session.
$ source ~/.bashrc
Step 2: ZooKeeper Framework Installation
Step 2.1: Download ZooKeeper
To install the ZooKeeper framework on your machine, visit the following link and download the latest version of ZooKeeper. http://zookeeper.apache.org/releases.html
As of this writing, the latest version of ZooKeeper is 3.4.6 (zookeeper-3.4.6.tar.gz).
Step 2.2: Extract the tar file
Extract the tar file using the following commands −
$ cd opt/
$ tar -zxf zookeeper-3.4.6.tar.gz
$ cd zookeeper-3.4.6
$ mkdir data
Step 2.3: Create configuration file
Open the configuration file conf/zoo.cfg with the command vi conf/zoo.cfg and set the following parameters as a starting point.
$ vi conf/zoo.cfg
tickTime = 2000
dataDir = /path/to/zookeeper/data
clientPort = 2181
initLimit = 5
syncLimit = 2
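To get a feel for what these numbers mean, the settings can be read with java.util.Properties, which accepts the same key = value layout. This is only a sketch: ZooCfgSketch is a hypothetical helper, and it assumes that tickTime is given in milliseconds and that initLimit and syncLimit are counted in ticks.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for illustration only; it is not part of ZooKeeper.
final class ZooCfgSketch {

    // Parses "key = value" lines such as the ones in conf/zoo.cfg.
    static Properties parse(String text) {
        Properties cfg = new Properties();
        try {
            cfg.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cfg;
    }

    // Assumption: initLimit is a number of ticks, each tickTime milliseconds long.
    static long initTimeoutMillis(Properties cfg) {
        return ticksToMillis(cfg, "initLimit");
    }

    // Assumption: syncLimit is likewise expressed in ticks.
    static long syncTimeoutMillis(Properties cfg) {
        return ticksToMillis(cfg, "syncLimit");
    }

    private static long ticksToMillis(Properties cfg, String key) {
        long tickTime = Long.parseLong(cfg.getProperty("tickTime").trim());
        long ticks = Long.parseLong(cfg.getProperty(key).trim());
        return ticks * tickTime;
    }

    public static void main(String[] args) {
        Properties cfg = parse(String.join("\n",
                "tickTime = 2000",
                "dataDir = /path/to/zookeeper/data",
                "clientPort = 2181",
                "initLimit = 5",
                "syncLimit = 2"));
        System.out.println("initLimit window: " + initTimeoutMillis(cfg) + " ms");
        System.out.println("syncLimit window: " + syncTimeoutMillis(cfg) + " ms");
    }
}
```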
Once the configuration file has been saved, return to the terminal. You can now start the ZooKeeper server.
Step 2.4: Start ZooKeeper server
Execute the following command −
$ bin/zkServer.sh start
After executing this command, you will get a response like the following −
JMX enabled by default
Using config: /Users/../zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
Step 2.5: Start CLI
Type the following command −
$ bin/zkCli.sh
After typing the above command, you will be connected to the ZooKeeper server and should get the following response.
Connecting to localhost:2181
................
................
................
Welcome to ZooKeeper!
................
................
WATCHER::
WatchedEvent state:SyncConnected type: None path:null
[zk: localhost:2181(CONNECTED) 0]
Zookeeper - CLI
The ZooKeeper Command Line Interface (CLI) is used to interact with the ZooKeeper ensemble for development purposes. It is useful for debugging and for experimenting with different options.
To perform ZooKeeper CLI operations, first start your ZooKeeper server (“bin/zkServer.sh start”) and then the ZooKeeper client (“bin/zkCli.sh”). Once the client starts, you can perform the following operations −
-
Create znodes
-
Get data
-
Watch znode for changes
-
Set data
-
Create children of a znode
-
List children of a znode
-
Check Status
-
Remove / Delete a znode
Now let us look at these commands one by one, with an example of each.
Create Znodes
Create a znode with the given path. The flag argument specifies whether the created znode will be ephemeral, persistent, or sequential. By default, all znodes are persistent.
-
Ephemeral znodes (flag: e) will be automatically deleted when a session expires or when the client disconnects.
-
Sequential znodes guarantee that the znode path will be unique.
-
The ZooKeeper ensemble appends a sequence number, zero-padded to 10 digits, to the znode path. For example, the znode path /myapp will be converted to /myapp0000000001, and the next one will be /myapp0000000002. If no flags are specified, the znode is considered persistent.
Output
[zk: localhost:2181(CONNECTED) 0] create /FirstZnode "Myfirstzookeeper-app"
Created /FirstZnode
To create a sequential znode, add the -s flag as shown below.
Output
[zk: localhost:2181(CONNECTED) 2] create -s /FirstZnode "second-data"
Created /FirstZnode0000000023
To create an ephemeral znode, add the -e flag as shown below.
Output
[zk: localhost:2181(CONNECTED) 2] create -e /SecondZnode "Ephemeral-data"
Created /SecondZnode
Remember that when a client connection is lost, the ephemeral znode is deleted. You can verify this by quitting the ZooKeeper CLI and then re-opening it.
Get Data
It returns the data associated with the specified znode along with its metadata. You will get information such as when the data was last modified, where it was modified, and information about the data. This command is also used to set watches that show notifications about the data.
Output
[zk: localhost:2181(CONNECTED) 1] get /FirstZnode
“Myfirstzookeeper-app”
cZxid = 0x7f
ctime = Tue Sep 29 16:15:47 IST 2015
mZxid = 0x7f
mtime = Tue Sep 29 16:15:47 IST 2015
pZxid = 0x7f
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 22
numChildren = 0
To access a sequential znode, you must enter the full path of the znode.
Watch
Watches show a notification when the data of the specified znode or of its children changes. You can set a watch only in the get command.
Output
[zk: localhost:2181(CONNECTED) 1] get /FirstZnode 1
“Myfirstzookeeper-app”
cZxid = 0x7f
ctime = Tue Sep 29 16:15:47 IST 2015
mZxid = 0x7f
mtime = Tue Sep 29 16:15:47 IST 2015
pZxid = 0x7f
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 22
numChildren = 0
The output is similar to that of a normal get command, but the client now waits in the background for znode changes.
Set Data
Set the data of the specified znode. Once you finish this set operation, you can check the data using the get CLI command.
Output
[zk: localhost:2181(CONNECTED) 1] set /SecondZnode "Data-updated"
cZxid = 0x82
ctime = Tue Sep 29 16:29:50 IST 2015
mZxid = 0x83
mtime = Tue Sep 29 16:29:50 IST 2015
pZxid = 0x82
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x15018b47db00000
dataLength = 14
numChildren = 0
If you set the watch option in the get command (as in the previous command), the output will be similar to the following −
Output
[zk: localhost:2181(CONNECTED) 1] set /FirstZnode "Mysecondzookeeper-app"
WATCHER::
WatchedEvent state:SyncConnected type:NodeDataChanged path:/FirstZnode
cZxid = 0x7f
ctime = Tue Sep 29 16:15:47 IST 2015
mZxid = 0x84
mtime = Tue Sep 29 17:14:47 IST 2015
pZxid = 0x7f
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 23
numChildren = 0
Create Children / Sub-znode
Creating children is similar to creating new znodes. The only difference is that the path of the child znode includes the parent path as well.
List Children
This command is used to list and display the children of a znode.
Check Status
Status describes the metadata of a specified znode. It contains details such as the timestamp, version number, ACL, data length, and children of the znode.
Remove a Znode
Removes a specified znode and, recursively, all its children. This happens only if such a znode exists.
Output
[zk: localhost:2181(CONNECTED) 10] rmr /FirstZnode
[zk: localhost:2181(CONNECTED) 11] get /FirstZnode
Node does not exist: /FirstZnode
The delete command (delete /path) is similar to the remove command, except that it works only on znodes with no children.
Zookeeper - API
ZooKeeper has official API bindings for Java and C. The ZooKeeper community provides unofficial APIs for most other languages (.NET, Python, etc.). Using the ZooKeeper API, an application can connect, interact, manipulate data, coordinate, and finally disconnect from a ZooKeeper ensemble.
The ZooKeeper API has a rich set of features that expose all the functionality of the ZooKeeper ensemble in a simple and safe manner. It provides both synchronous and asynchronous methods.
The ZooKeeper ensemble and the ZooKeeper API complement each other in every aspect, which benefits developers greatly. This chapter discusses the Java binding.
Basics of ZooKeeper API
An application that interacts with the ZooKeeper ensemble is referred to as a ZooKeeper Client, or simply a Client.
The znode is the core component of the ZooKeeper ensemble, and the ZooKeeper API provides a small set of methods to manipulate all the details of a znode.
A client should follow the steps below to have a clear and clean interaction with the ZooKeeper ensemble.
-
Connect to the ZooKeeper ensemble. The ZooKeeper ensemble assigns a Session ID to the client.
-
Send heartbeats to the server periodically. Otherwise, the ZooKeeper ensemble expires the Session ID and the client needs to reconnect.
-
Get / Set the znodes as long as a session ID is active.
-
Disconnect from the ZooKeeper ensemble, once all the tasks are completed. If the client is inactive for a prolonged time, then the ZooKeeper ensemble will automatically disconnect the client.
Java Binding
Let us look at the most important parts of the ZooKeeper API in this section. The central part of the ZooKeeper API is the ZooKeeper class. Its constructor provides options for connecting to the ZooKeeper ensemble, and the class has the following methods −
-
connect − connect to the ZooKeeper ensemble
-
create − create a znode
-
exists − check whether a znode exists and its information
-
getData − get data from a particular znode
-
setData − set data in a particular znode
-
getChildren − get all sub-nodes available in a particular znode
-
delete − delete a particular znode
-
close − close a connection
Connect to the ZooKeeper Ensemble
The ZooKeeper class provides connection functionality through its constructor. The signature of the constructor is as follows −
ZooKeeper(String connectionString, int sessionTimeout, Watcher watcher)
Where,
-
connectionString − ZooKeeper ensemble host.
-
sessionTimeout − session timeout in milliseconds.
-
watcher − an object implementing “Watcher” interface. The ZooKeeper ensemble returns the connection status through the watcher object.
Let us create a new helper class ZooKeeperConnection and add a method connect. The connect method creates a ZooKeeper object, connects to the ZooKeeper ensemble, and then returns the object.
Here CountDownLatch is used to make the main process wait until the client has connected to the ZooKeeper ensemble.
The ZooKeeper ensemble reports the connection status through the Watcher callback. The Watcher callback is called once the client connects to the ZooKeeper ensemble, and it calls the countDown method of the CountDownLatch, which releases the main process blocked in await.
以下是用于连接到 ZooKeeper 集群的完整代码。
Here is the complete code to connect with a ZooKeeper ensemble.
Coding: ZooKeeperConnection.java
// import java classes
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
// import zookeeper classes
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.KeeperState;
import org.apache.zookeeper.ZooKeeper;
public class ZooKeeperConnection {
    // declare zookeeper instance to access ZooKeeper ensemble
    private ZooKeeper zoo;
    // latch released by the watcher once the session is connected
    final CountDownLatch connectedSignal = new CountDownLatch(1);
    // Method to connect to the zookeeper ensemble.
    public ZooKeeper connect(String host) throws IOException, InterruptedException {
        zoo = new ZooKeeper(host, 5000, new Watcher() {
            public void process(WatchedEvent we) {
                if (we.getState() == KeeperState.SyncConnected) {
                    connectedSignal.countDown();
                }
            }
        });
        // block until the watcher reports SyncConnected
        connectedSignal.await();
        return zoo;
    }
    // Method to disconnect from zookeeper server
    public void close() throws InterruptedException {
        zoo.close();
    }
}
Save the above code. It will be used in the next section to connect to the ZooKeeper ensemble.
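The latch handshake used by connect can be tried in isolation with only the Java standard library. In the sketch below, a background thread stands in for the ZooKeeper event thread that would deliver the SyncConnected event; the class LatchHandshakeSketch and its method are hypothetical. Unlike the connect method above, which waits indefinitely, the sketch waits with a timeout so that an unreachable ensemble would not block the caller forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Stand-alone simulation of the connect handshake; no ZooKeeper classes involved.
final class LatchHandshakeSketch {

    // Returns true if the simulated "connected" event arrives within timeoutMs.
    static boolean awaitConnected(long eventDelayMs, long timeoutMs) {
        CountDownLatch connectedSignal = new CountDownLatch(1);
        Thread eventThread = new Thread(() -> {
            try {
                Thread.sleep(eventDelayMs);   // pretend the handshake takes a while
                connectedSignal.countDown();  // what the Watcher does on SyncConnected
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        eventThread.setDaemon(true);          // do not keep the JVM alive
        eventThread.start();
        try {
            // Bounded wait: false means the timeout elapsed first.
            return connectedSignal.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(awaitConnected(50, 2000));  // event arrives in time
        System.out.println(awaitConnected(2000, 50));  // timeout elapses first
    }
}
```

The same bounded await could be applied to connectedSignal in ZooKeeperConnection if a connection timeout is wanted.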
Create a Znode
The ZooKeeper class provides the create method to create a new znode in the ZooKeeper ensemble. The signature of the create method is as follows −
create(String path, byte[] data, List<ACL> acl, CreateMode createMode)
Where,
-
path − Znode path. For example, /myapp1, /myapp2, /myapp1/mydata1, /myapp2/mydata1/myanothersubdata
-
data − data to store in a specified znode path
-
acl − access control list of the node to be created. The ZooKeeper API provides the static interface ZooDefs.Ids, which holds some basic ACL lists. For example, ZooDefs.Ids.OPEN_ACL_UNSAFE is a completely open ACL that grants all permissions to anyone.
-
createMode − the type of node: persistent or ephemeral, each optionally sequential. This is an enum (CreateMode).
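For sequential create modes, ZooKeeper appends a monotonically increasing counter, formatted as ten zero-padded digits, to the requested path. The sketch below only reproduces that name format with the standard library; the class SequentialNameSketch is hypothetical, and the counter is passed in by hand here, whereas in a real ensemble it is maintained by the parent znode.

```java
// Illustration of the sequential-znode name format; not part of the ZooKeeper API.
final class SequentialNameSketch {

    // Appends a ten-digit, zero-padded counter to the requested path.
    static String sequentialName(String path, int counter) {
        return String.format("%s%010d", path, counter);
    }

    public static void main(String[] args) {
        // "/myapp1/task-" is a made-up path prefix.
        System.out.println(sequentialName("/myapp1/task-", 7));
        System.out.println(sequentialName("/myapp1/task-", 42));
    }
}
```

Because of the zero padding, sorting the generated names as strings gives the same order as sorting by counter.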
Let us create a new Java application to check the create functionality of the ZooKeeper API. Create a file ZKCreate.java. In the main method, create an object of type ZooKeeperConnection and call the connect method to connect to the ZooKeeper ensemble.
The connect method returns the ZooKeeper object zk. Now, call the create method of the zk object with a custom path and data.
The complete program code to create a znode is as follows −
Coding: ZKCreate.java
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
public class ZKCreate {
    // create static instance for zookeeper class.
    private static ZooKeeper zk;
    // create static instance for ZooKeeperConnection class.
    private static ZooKeeperConnection conn;
    // Method to create a persistent znode with an open ACL in the zookeeper ensemble
    public static void create(String path, byte[] data) throws
            KeeperException, InterruptedException {
        zk.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                CreateMode.PERSISTENT);
    }
    public static void main(String[] args) {
        // znode path
        String path = "/MyFirstZnode"; // Assign path to znode
        // data in byte array
        byte[] data = "My first zookeeper app".getBytes(); // Declare data
        try {
            conn = new ZooKeeperConnection();
            zk = conn.connect("localhost");
            create(path, data); // Create the znode with the data at the specified path
            conn.close();
        } catch (Exception e) {
            System.out.println(e.getMessage()); // Catch error message
        }
    }
}
Once the application is compiled and executed, a znode with the specified data will be created in the ZooKeeper ensemble. You can check it using the ZooKeeper CLI zkCli.sh.
cd /path/to/zookeeper
bin/zkCli.sh
>>> get /MyFirstZnode
Exists – Check the Existence of a Znode
The ZooKeeper class provides the exists method to check the existence of a znode. It returns the metadata of the znode if the specified znode exists, and null otherwise. The signature of the exists method is as follows −
exists(String path, boolean watch)
Where,
-
path − Znode path
-
watch − boolean value to specify whether to watch the specified znode or not
Let us create a new Java application to check the exists functionality of the ZooKeeper API. Create a file ZKExists.java. In the main method, create a ZooKeeper object zk using a ZooKeeperConnection object. Then, call the exists method of the zk object with a custom path. The complete listing is as follows −
Coding: ZKExists.java
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.data.Stat;
public class ZKExists {
    private static ZooKeeper zk;
    private static ZooKeeperConnection conn;
    // Method to check existence of znode and its status, if znode is available.
    public static Stat znode_exists(String path) throws
            KeeperException, InterruptedException {
        return zk.exists(path, true);
    }
    public static void main(String[] args) throws InterruptedException, KeeperException {
        String path = "/MyFirstZnode"; // Assign the path of the znode to check
        try {
            conn = new ZooKeeperConnection();
            zk = conn.connect("localhost");
            Stat stat = znode_exists(path); // stat is null if the znode does not exist
            if (stat != null) {
                System.out.println("Node exists and the node version is " +
                        stat.getVersion());
            } else {
                System.out.println("Node does not exist");
            }
        } catch (Exception e) {
            System.out.println(e.getMessage()); // Catches error messages
        }
    }
}
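Once the application is compiled and executed, you will get output like the following. The version shown depends on how many times the data of the znode has been modified; a freshly created znode reports version 0.
Node exists and the node version is 1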
getData Method
The ZooKeeper class provides the getData method to get the data attached to a specified znode and its status. The signature of the getData method is as follows −
getData(String path, Watcher watcher, Stat stat)
Where,
-
path − Znode path.
-
watcher − Callback function of type Watcher. The ZooKeeper ensemble will notify through the Watcher callback when the data of the specified znode changes. This is a one-time notification.
-
stat − Stat object that is filled in with the metadata of the znode. It may be null if the metadata is not needed.
Let us create a new Java application to understand the getData functionality of the ZooKeeper API. Create a file ZKGetData.java. In the main method, create a ZooKeeper object zk using the ZooKeeperConnection object. Then, call the getData method of the zk object with a custom path.
Here is the complete program code to get the data from a specified node −
Coding: ZKGetData.java
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.data.Stat;
public class ZKGetData {
    private static ZooKeeper zk;
    private static ZooKeeperConnection conn;
    // Method to check existence of znode and its status, if znode is available.
    public static Stat znode_exists(String path) throws
            KeeperException, InterruptedException {
        return zk.exists(path, true);
    }
    public static void main(String[] args) throws InterruptedException, KeeperException {
        String path = "/MyFirstZnode";
        // Latch that keeps main waiting until the watcher has fired
        final CountDownLatch connectedSignal = new CountDownLatch(1);
        try {
            conn = new ZooKeeperConnection();
            zk = conn.connect("localhost");
            Stat stat = znode_exists(path);
            if (stat != null) {
                // Read the data and register a one-time watcher on the znode
                byte[] b = zk.getData(path, new Watcher() {
                    public void process(WatchedEvent we) {
                        if (we.getType() == Event.EventType.None) {
                            // Connection-state event: stop waiting if the session expired
                            switch (we.getState()) {
                                case Expired:
                                    connectedSignal.countDown();
                                    break;
                                default:
                                    break;
                            }
                        } else {
                            // Znode event: re-read the data, print it, and release main
                            String path = "/MyFirstZnode";
                            try {
                                byte[] bn = zk.getData(path, false, null);
                                String data = new String(bn, "UTF-8");
                                System.out.println(data);
                                connectedSignal.countDown();
                            } catch (Exception ex) {
                                System.out.println(ex.getMessage());
                            }
                        }
                    }
                }, null);
                String data = new String(b, "UTF-8");
                System.out.println(data);
                // Wait for the watcher to be triggered
                connectedSignal.await();
            } else {
                System.out.println("Node does not exist");
            }
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}
Once the application is compiled and executed, you will get the following output −
My first zookeeper app
The application then waits for further notification from the ZooKeeper ensemble. Change the data of the specified znode using the ZooKeeper CLI zkCli.sh.
cd /path/to/zookeeper
bin/zkCli.sh
>>> set /MyFirstZnode Hello
Now, the application will print the following output and exit.
Hello
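The one-time nature of the watch can be modelled in a few lines of plain Java. The sketch below is an in-memory stand-in, not the ZooKeeper client: the class OneShotWatchSketch is hypothetical, and a Runnable plays the role of the Watcher. A watcher registered by a read fires on the next change only; a second change goes unnoticed unless the watcher is registered again.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory model of a single znode with one-time data watches (illustration only).
final class OneShotWatchSketch {
    private String data;
    private final List<Runnable> watchers = new ArrayList<>();

    OneShotWatchSketch(String initialData) {
        this.data = initialData;
    }

    // Reads the data; a non-null watcher is registered for the next change only.
    String getData(Runnable watcher) {
        if (watcher != null) {
            watchers.add(watcher);
        }
        return data;
    }

    // Changes the data and fires each registered watcher exactly once.
    void setData(String newData) {
        data = newData;
        List<Runnable> toFire = new ArrayList<>(watchers);
        watchers.clear();                 // watches are consumed when they fire
        for (Runnable watcher : toFire) {
            watcher.run();
        }
    }

    // Records what a client that never re-registers its watcher gets to see.
    static List<String> demo() {
        OneShotWatchSketch node = new OneShotWatchSketch("My first zookeeper app");
        List<String> seen = new ArrayList<>();
        node.getData(() -> seen.add(node.getData(null))); // re-read, no new watch
        node.setData("Hello");   // watcher fires and records the new data
        node.setData("World");   // no watcher left, nothing is recorded
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [Hello]
    }
}
```

This mirrors the ZKGetData program above, which prints the changed data once and then exits; a client that wants continuous updates has to pass a watcher again each time it re-reads the data.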
setData Method
The ZooKeeper class provides the setData method to modify the data attached to a specified znode. The signature of the setData method is as follows −
setData(String path, byte[] data, int version)
Where,
-
path − Znode path
-
data − data to store in a specified znode path.
-
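version − Current version of the znode. ZooKeeper updates the version number of the znode whenever the data gets changed. The update is applied only if this value matches the current version of the znode; passing -1 matches any version.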
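The version parameter acts as an optimistic concurrency check. A minimal in-memory sketch of that rule follows; the class VersionedNodeSketch is hypothetical, and the IllegalStateException is only a stand-in for the bad-version error (KeeperException.BadVersionException) that the real client raises.

```java
import java.nio.charset.StandardCharsets;

// In-memory model of the version check applied by setData (illustration only).
final class VersionedNodeSketch {
    private byte[] data;
    private int version = 0;   // a new znode starts at data version 0

    VersionedNodeSketch(byte[] initialData) {
        this.data = initialData.clone();
    }

    int getVersion() {
        return version;
    }

    // Applies the update only if expectedVersion matches, or is -1 (match any).
    // Returns the new version number.
    int setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != version) {
            throw new IllegalStateException(
                    "bad version: expected " + expectedVersion + ", actual " + version);
        }
        data = newData.clone();
        version++;
        return version;
    }

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        VersionedNodeSketch node = new VersionedNodeSketch(bytes("My first zookeeper app"));
        int seenVersion = node.getVersion();
        System.out.println(node.setData(bytes("Success"), seenVersion)); // accepted
        try {
            node.setData(bytes("Stale"), seenVersion);   // same version again: now stale
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(node.setData(bytes("Forced"), -1)); // -1 skips the check
    }
}
```

Reading the version and writing with it, as the sketch does, ensures that an update based on outdated data is rejected instead of silently overwriting a newer value.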
Let us now create a new Java application to understand the setData functionality of the ZooKeeper API. Create a file ZKSetData.java. In the main method, create a ZooKeeper object zk using the ZooKeeperConnection object. Then, call the setData method of the zk object with the specified path, new data, and version of the node.
Here is the complete program code to modify the data attached to a specified znode.
Coding: ZKSetData.java
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.KeeperException;
public class ZKSetData {
    private static ZooKeeper zk;
    private static ZooKeeperConnection conn;
    // Method to update the data in a znode. The current version is read
    // with exists() and passed to setData().
    public static void update(String path, byte[] data) throws
            KeeperException, InterruptedException {
        zk.setData(path, data, zk.exists(path, true).getVersion());
    }
    public static void main(String[] args) throws InterruptedException, KeeperException {
        String path = "/MyFirstZnode";
        byte[] data = "Success".getBytes(); // Assign data which is to be updated.
        try {
            conn = new ZooKeeperConnection();
            zk = conn.connect("localhost");
            update(path, data); // Update znode data at the specified path
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}
Once the application is compiled and executed, the data of the specified znode will be changed. You can check it using the ZooKeeper CLI zkCli.sh.
cd /path/to/zookeeper
bin/zkCli.sh
>>> get /MyFirstZnode
getChildren Method
The ZooKeeper class provides the getChildren method to get all the sub-nodes of a particular znode. The signature of the getChildren method is as follows −
getChildren(String path, Watcher watcher)
Where,
-
path − Znode path.
-
watcher − Callback function of type “Watcher”. The ZooKeeper ensemble will notify when the specified znode gets deleted or a child under the znode gets created / deleted. This is a one-time notification.
Coding: ZKGetChildren.java
import java.util.List;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.data.Stat;
public class ZKGetChildren {
    private static ZooKeeper zk;
    private static ZooKeeperConnection conn;
    // Method to check existence of znode and its status, if znode is available.
    public static Stat znode_exists(String path) throws
            KeeperException, InterruptedException {
        return zk.exists(path, true);
    }
    public static void main(String[] args) throws InterruptedException, KeeperException {
        String path = "/MyFirstZnode"; // Assign path to the znode
        try {
            conn = new ZooKeeperConnection();
            zk = conn.connect("localhost");
            Stat stat = znode_exists(path); // stat is null if the znode does not exist
            if (stat != null) {
                // getChildren method - get all the children of the znode.
                // It has two args, path and watch.
                List<String> children = zk.getChildren(path, false);
                for (int i = 0; i < children.size(); i++) {
                    System.out.println(children.get(i)); // Print the name of each child
                }
            } else {
                System.out.println("Node does not exist");
            }
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}
Before running the program, let us create two sub-nodes for /MyFirstZnode using the ZooKeeper CLI zkCli.sh.
cd /path/to/zookeeper
bin/zkCli.sh
>>> create /MyFirstZnode/myfirstsubnode Hi
>>> create /MyFirstZnode/mysecondsubnode Hi
Now, compiling and running the program will print the znodes created above.
myfirstsubnode
mysecondsubnode
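Note that getChildren returns only the names of the direct children, not their full paths and not any deeper descendants, and the order of the returned list is not guaranteed. The following sketch shows that behaviour on a plain set of paths; the class ChildrenSketch is a hypothetical in-memory model, and it sorts the names only to make its own output stable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// In-memory model of listing the direct children of a znode (illustration only).
final class ChildrenSketch {

    // Returns the names (not full paths) of the direct children of parent.
    static List<String> children(Set<String> allPaths, String parent) {
        String prefix = parent.equals("/") ? "/" : parent + "/";
        List<String> names = new ArrayList<>();
        for (String path : allPaths) {
            if (path.startsWith(prefix) && path.length() > prefix.length()) {
                String rest = path.substring(prefix.length());
                if (!rest.contains("/")) {   // skip grandchildren and deeper nodes
                    names.add(rest);
                }
            }
        }
        Collections.sort(names);             // sorted here only for stable output
        return names;
    }

    public static void main(String[] args) {
        // "deeper" is a made-up grandchild used to show that it is not listed.
        Set<String> paths = new HashSet<>(Arrays.asList(
                "/MyFirstZnode",
                "/MyFirstZnode/myfirstsubnode",
                "/MyFirstZnode/mysecondsubnode",
                "/MyFirstZnode/myfirstsubnode/deeper"));
        System.out.println(children(paths, "/MyFirstZnode"));
    }
}
```

To obtain the full path of a child, join the parent path and the returned name with a slash.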
Delete a Znode
The ZooKeeper class provides the delete method to delete a specified znode. The signature of the delete method is as follows −
delete(String path, int version)
Where,
-
path − Znode path.
-
version − Current version of the znode. Passing -1 matches any version.
Let us create a new Java application to understand the delete functionality of the ZooKeeper API. Create a file ZKDelete.java. In the main method, create a ZooKeeper object zk using the ZooKeeperConnection object. Then, call the delete method of the zk object with the specified path and version of the node. Note that delete fails with a NotEmpty error if the znode still has children, so the sub-nodes created in the previous section must be deleted before /MyFirstZnode itself.
The complete program code to delete a znode is as follows −
Coding: ZKDelete.java
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.KeeperException;
public class ZKDelete {
    private static ZooKeeper zk;
    private static ZooKeeperConnection conn;
    // Method to delete a znode. The current version is read with exists()
    // and passed to delete().
    public static void delete(String path) throws KeeperException, InterruptedException {
        zk.delete(path, zk.exists(path, true).getVersion());
    }
    public static void main(String[] args) throws InterruptedException, KeeperException {
        String path = "/MyFirstZnode"; // Assign path to the znode
        try {
            conn = new ZooKeeperConnection();
            zk = conn.connect("localhost");
            delete(path); // delete the node with the specified path
        } catch (Exception e) {
            System.out.println(e.getMessage()); // catches error messages
        }
    }
}
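Because a znode that still has children cannot be deleted, removing a whole subtree means deleting the deepest znodes first. The sketch below computes such an order for a plain collection of paths; the class DeleteOrderSketch is a hypothetical in-memory helper, and each path in the returned list would then be passed to the delete method in turn.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Computes a children-before-parent deletion order (illustration only).
final class DeleteOrderSketch {

    // Number of '/' characters, used as the depth of a path.
    static int depth(String path) {
        int count = 0;
        for (int i = 0; i < path.length(); i++) {
            if (path.charAt(i) == '/') {
                count++;
            }
        }
        return count;
    }

    // Returns root and its descendants ordered so that children precede parents.
    static List<String> deletionOrder(Collection<String> allPaths, String root) {
        List<String> subtree = new ArrayList<>();
        for (String path : allPaths) {
            if (path.equals(root) || path.startsWith(root + "/")) {
                subtree.add(path);
            }
        }
        subtree.sort((a, b) -> {
            int byDepth = Integer.compare(depth(b), depth(a)); // deeper paths first
            return byDepth != 0 ? byDepth : a.compareTo(b);
        });
        return subtree;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
                "/MyFirstZnode",
                "/MyFirstZnode/myfirstsubnode",
                "/MyFirstZnode/mysecondsubnode");
        for (String path : deletionOrder(paths, "/MyFirstZnode")) {
            System.out.println(path);
        }
    }
}
```

With the two sub-nodes created earlier, the order is both sub-nodes first and /MyFirstZnode last.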
Zookeeper - Applications
ZooKeeper provides a flexible coordination infrastructure for distributed environments. The ZooKeeper framework supports many of today's best-known industrial applications. This chapter discusses some of the most notable applications of ZooKeeper.
Yahoo!
The ZooKeeper framework was originally built at “Yahoo!”. A well-designed distributed application needs to meet requirements such as data transparency, better performance, robustness, centralized configuration, and coordination. So, they designed the ZooKeeper framework to meet these requirements.
Apache Hadoop
Apache Hadoop is the driving force behind the growth of the Big Data industry. Hadoop relies on ZooKeeper for configuration management and coordination. Let us take a scenario to understand the role of ZooKeeper in Hadoop.
Assume that a Hadoop cluster spans 100 or more commodity servers. Such a cluster needs coordination and naming services. Because the computation involves a large number of nodes, each node needs to synchronize with the others, know where to access services, and know how it should be configured. In other words, Hadoop clusters require cross-node services. ZooKeeper provides the facilities for cross-node synchronization and ensures that the tasks across Hadoop projects are serialized and synchronized.
Multiple ZooKeeper servers support large Hadoop clusters. Each client machine communicates with one of the ZooKeeper servers to retrieve and update its synchronization information. Some real-world examples are −
-
Human Genome Project − The Human Genome Project contains terabytes of data. Hadoop MapReduce framework can be used to analyze the dataset and find interesting facts for human development.
-
Healthcare − Hospitals can store, retrieve, and analyze huge sets of patient medical records, which are normally in terabytes.
Apache HBase
Apache HBase is an open source, distributed, NoSQL database used for real-time read/write access to large datasets. It runs on top of HDFS. HBase follows a master-slave architecture in which the HBase Master governs all the slaves. The slaves are referred to as Region servers.
An HBase distributed installation depends on a running ZooKeeper cluster. Apache HBase uses ZooKeeper to track the status of distributed data throughout the master and region servers with the help of centralized configuration management and distributed mutex mechanisms. Here are some of the use-cases of HBase −
-
Telecom − The telecom industry stores billions of mobile call records (around 30TB / month), and accessing these call records in real time becomes a huge task. HBase can be used to process all the records in real time, easily and efficiently.
-
Social network − Similar to telecom industry, sites like Twitter, LinkedIn, and Facebook receive huge volumes of data through the posts created by users. HBase can be used to find recent trends and other interesting facts.
Apache Solr
Apache Solr is a fast, open source search platform written in Java. It is a blazing fast, fault-tolerant distributed search engine. Built on top of Lucene, it is a high-performance, full-featured text search engine.
Solr makes extensive use of ZooKeeper features such as configuration management, leader election, node management, and locking and synchronization of data.
Solr has two distinct parts, indexing and searching. Indexing is the process of storing the data in a proper format so that it can be searched later. Solr uses ZooKeeper both for indexing the data on multiple nodes and for searching across multiple nodes. ZooKeeper contributes the following features −
-
Add / remove nodes as and when needed
-
Replication of data between nodes and subsequently minimizing data loss
-
Sharing of data between multiple nodes and subsequently searching from multiple nodes for faster search results
Some of the use-cases of Apache Solr include e-commerce, job search, etc.