ZooKeeper Getting Started Guide

http://zookeeper.apache.org/doc/trunk/zookeeperStarted.html

What is ZooKeeper?

ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications. Each time they are implemented, a lot of work goes into fixing the bugs and race conditions that are inevitable. Because of the difficulty of implementing these kinds of services, applications usually skimp on them initially, which makes them brittle in the presence of change and difficult to manage. Even when done correctly, different implementations of these services lead to management complexity when the applications are deployed.

Getting Started: Coordinating Distributed Applications with ZooKeeper

This document contains information to get you started quickly with ZooKeeper. It is aimed primarily at developers hoping to try it out, and contains simple installation instructions for a single ZooKeeper server, a few commands to verify that it is running, and a simple programming example. Finally, as a convenience, there are a few sections on more complicated installations, for example running replicated deployments and optimizing the transaction log. However, for complete instructions for commercial deployments, please refer to the ZooKeeper Administrator's Guide.

Pre-requisites

See System Requirements in the Admin guide.

Standalone Operation

To start ZooKeeper you need a configuration file. Here is a sample; create it in conf/zoo.cfg:

tickTime=2000
dataDir=/home/stu/zookeeper
clientPort=2181

This file can be called anything, but for the sake of this discussion call it conf/zoo.cfg. Change the value of dataDir to specify an existing (empty to start with) directory. Here are the meanings for each of the fields:

tickTime

the basic time unit, in milliseconds, used by ZooKeeper. It is used for heartbeats, and the minimum session timeout is twice the tickTime.

dataDir

the location to store the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database.

clientPort

the port to listen for client connections
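
To make the three fields concrete, the sketch below parses the text of the sample file and derives the minimum session timeout as twice tickTime. It is a minimal Python illustration, not part of ZooKeeper: the helper names parse_zoo_cfg and min_session_timeout_ms are hypothetical, and the parser assumes only plain key=value lines.

```python
SAMPLE_CFG = """\
tickTime=2000
dataDir=/home/stu/zookeeper
clientPort=2181
"""


def parse_zoo_cfg(text):
    """Parse simple key=value lines; blank lines and # comments are skipped."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def min_session_timeout_ms(settings):
    """Per the guide, the minimum session timeout is twice tickTime."""
    return 2 * int(settings["tickTime"])


cfg = parse_zoo_cfg(SAMPLE_CFG)
print(cfg["dataDir"], cfg["clientPort"], min_session_timeout_ms(cfg))
```

With the sample values, a tickTime of 2000 milliseconds gives a minimum session timeout of 4000 milliseconds.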

Once you have created the configuration file, you can start ZooKeeper:

bin/zkServer.sh start

ZooKeeper logs messages using log4j; more detail is available in the Logging section of the Programmer's Guide. You will see log messages coming to the console (default) and/or a log file, depending on the log4j configuration.

The steps outlined here run ZooKeeper in standalone mode. There is no replication, so if the ZooKeeper process fails, the service will go down. This is fine for most development situations, but to run ZooKeeper in replicated mode, please see Running Replicated ZooKeeper.

Managing ZooKeeper Storage

For long-running production systems, ZooKeeper storage (dataDir and logs) must be managed externally. See the section on maintenance for more details.

Connecting to ZooKeeper

Once ZooKeeper is running, you have several options for connecting to it:

  • Java: Use

    bin/zkCli.sh -server 127.0.0.1:2181

    This lets you perform simple, file-like operations.

  • C: compile cli_mt (multi-threaded) or cli_st (single-threaded) by running make cli_mt or make cli_st in the src/c subdirectory in the ZooKeeper sources. See the README contained within src/c for full details.

    You can run the program from src/c using:

    LD_LIBRARY_PATH=. cli_mt 127.0.0.1:2181

    or

    LD_LIBRARY_PATH=. cli_st 127.0.0.1:2181

    This will give you a simple shell in which to execute file-system-like operations on ZooKeeper.

Once you have connected, you should see something like:

Connecting to localhost:2181
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
Welcome to ZooKeeper!
JLine support is enabled
[zkshell: 0]
        

From the shell, type help to get a listing of commands that can be executed from the client, as in:

[zkshell: 0] help
ZooKeeper host:port cmd args
        get path [watch]
        ls path [watch]
        set path data [version]
        delquota [-n|-b] path
        quit
        printwatches on|off
        create path data acl
        stat path [watch]
        listquota path
        history
        setAcl path acl
        getAcl path
        sync path
        redo cmdno
        addauth scheme auth
        delete path [version]
        deleteall path
        setquota -n|-b val path

From here, you can try a few simple commands to get a feel for this simple command line interface. First, start by issuing the list command, as in ls, yielding:

[zkshell: 8] ls /
[zookeeper]
        

Next, create a new znode by running create /zk_test my_data. This creates a new znode and associates the string "my_data" with the node. You should see:

[zkshell: 9] create /zk_test my_data
Created /zk_test
      

Issue another ls / command to see what the directory looks like:

[zkshell: 11] ls /
[zookeeper, zk_test]

Notice that the zk_test directory has now been created.

Next, verify that the data was associated with the znode by running the get command, as in:

[zkshell: 12] get /zk_test
my_data
cZxid = 5
ctime = Fri Jun 05 13:57:06 PDT 2009
mZxid = 5
mtime = Fri Jun 05 13:57:06 PDT 2009
pZxid = 5
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0
dataLength = 7
numChildren = 0
        

We can change the data associated with zk_test by issuing the set command, as in:

[zkshell: 14] set /zk_test junk
cZxid = 5
ctime = Fri Jun 05 13:57:06 PDT 2009
mZxid = 6
mtime = Fri Jun 05 14:01:52 PDT 2009
pZxid = 5
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0
dataLength = 4
numChildren = 0
[zkshell: 15] get /zk_test
junk
cZxid = 5
ctime = Fri Jun 05 13:57:06 PDT 2009
mZxid = 6
mtime = Fri Jun 05 14:01:52 PDT 2009
pZxid = 5
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0
dataLength = 4
numChildren = 0
      

(Notice that we did a get after setting the data, and it did, indeed, change.)

Finally, let's delete the node by issuing:

[zkshell: 16] delete /zk_test
[zkshell: 17] ls /
[zookeeper]
[zkshell: 18]

That's it for now. To explore more, continue with the rest of this document and see the Programmer's Guide.

Programming to ZooKeeper

ZooKeeper has Java bindings and C bindings. They are functionally equivalent. The C bindings exist in two variants: single-threaded and multi-threaded. These differ only in how the messaging loop is done. For sample code using the different APIs, see the Programming Examples in the ZooKeeper Programmer's Guide.

Running Replicated ZooKeeper

Running ZooKeeper in standalone mode is convenient for evaluation, some development, and testing. But in production, you should run ZooKeeper in replicated mode. A replicated group of servers in the same application is called a quorum, and in replicated mode, all servers in the quorum have copies of the same configuration file. The file is similar to the one used in standalone mode, but with a few differences. Here is an example:

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888

The new entry, initLimit, is a timeout ZooKeeper uses to limit the length of time the ZooKeeper servers in the quorum have to connect to a leader. The entry syncLimit limits how far out of date a server can be from a leader.

With both of these timeouts, you specify the unit of time using tickTime. In this example, the timeout for initLimit is 5 ticks at 2000 milliseconds a tick, or 10 seconds.
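
The tick arithmetic can be written out as a short Python sketch. The function name limit_ms is hypothetical and exists only for illustration; the values are the ones from the example configuration above.

```python
def limit_ms(tick_time_ms, ticks):
    """Convert a limit expressed in ticks into milliseconds."""
    return tick_time_ms * ticks


TICK_TIME_MS = 2000  # tickTime from the example configuration
INIT_LIMIT = 5       # initLimit, in ticks
SYNC_LIMIT = 2       # syncLimit, in ticks

init_timeout_ms = limit_ms(TICK_TIME_MS, INIT_LIMIT)
sync_timeout_ms = limit_ms(TICK_TIME_MS, SYNC_LIMIT)
print(init_timeout_ms, sync_timeout_ms)  # prints "10000 4000"
```

By the same calculation, syncLimit in this example works out to 2 ticks, or 4 seconds.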

The entries of the form server.X list the servers that make up the ZooKeeper service. When a server starts up, it determines which server it is by looking for the file myid in the data directory. That file contains the server number, in ASCII.
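
As a rough illustration of the myid file, the Python sketch below writes a server number as ASCII text into a data directory and reads it back. The helper names write_myid and read_myid are hypothetical, and a temporary directory stands in for the real dataDir.

```python
import tempfile
from pathlib import Path


def write_myid(data_dir, server_id):
    """Write the server number as ASCII text to <data_dir>/myid."""
    path = Path(data_dir) / "myid"
    path.write_text(f"{server_id}\n", encoding="ascii")
    return path


def read_myid(data_dir):
    """Read the server number back from <data_dir>/myid."""
    text = (Path(data_dir) / "myid").read_text(encoding="ascii")
    return int(text.strip())


# A temporary directory stands in for the configured dataDir.
with tempfile.TemporaryDirectory() as data_dir:
    write_myid(data_dir, 1)
    myid_value = read_myid(data_dir)

print(myid_value)
```

The number written must match the X of that server's own server.X entry, so the machine listed as server.1 gets a myid containing 1.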

Finally, note the two port numbers after each server name: "2888" and "3888". Peers use the former port to connect to other peers. Such a connection is necessary so that peers can communicate, for example, to agree upon the order of updates. More specifically, a ZooKeeper server uses this port to connect followers to the leader. When a new leader arises, a follower opens a TCP connection to the leader using this port. Because the default leader election also uses TCP, we currently require another port for leader election. This is the second port in the server entry.
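
To show how one server.X entry breaks down, here is a small Python sketch that splits it into the server number, host name, peer port, and leader election port. ServerEntry and parse_server_entry are hypothetical names, and the sketch assumes only the host:port:port form shown in this guide.

```python
from typing import NamedTuple


class ServerEntry(NamedTuple):
    server_id: int      # the X in server.X
    host: str
    peer_port: int      # first port: followers connect to the leader
    election_port: int  # second port: leader election


def parse_server_entry(line):
    """Parse a line such as 'server.1=zoo1:2888:3888'."""
    key, _, value = line.strip().partition("=")
    prefix, _, server_id = key.partition(".")
    if prefix != "server" or not server_id:
        raise ValueError(f"not a server entry: {line!r}")
    host, peer_port, election_port = value.split(":")
    return ServerEntry(int(server_id), host, int(peer_port), int(election_port))


entry = parse_server_entry("server.1=zoo1:2888:3888")
print(entry.host, entry.peer_port, entry.election_port)
```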

Note

If you want to test multiple servers on a single machine, specify the servername as localhost with unique quorum and leader election ports (i.e. 2888:3888, 2889:3889, 2890:3890 in the example above) for each server.X in that server's config file. Of course, separate dataDirs and distinct clientPorts are also necessary (in the above replicated example, running on a single localhost, you would still have three config files).
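
The single-machine setup in the note can be sketched in Python by generating the text of the three config files. This is an illustration under stated assumptions: the function name localhost_ensemble_configs, the base directory /tmp/zk-demo, and the client ports 2181 to 2183 are all hypothetical choices, not values prescribed by ZooKeeper.

```python
def localhost_ensemble_configs(count=3, base_dir="/tmp/zk-demo"):
    """Return {server_id: config text} for servers sharing one machine."""
    # Every file lists the same servers, each with its own port pair:
    # 2888:3888, 2889:3889, 2890:3890, ...
    servers = [
        f"server.{i}=localhost:{2887 + i}:{3887 + i}"
        for i in range(1, count + 1)
    ]
    configs = {}
    for i in range(1, count + 1):
        lines = [
            "tickTime=2000",
            f"dataDir={base_dir}/server{i}",  # separate dataDir per server
            f"clientPort={2180 + i}",         # distinct clientPort per server
            "initLimit=5",
            "syncLimit=2",
            *servers,
        ]
        configs[i] = "\n".join(lines) + "\n"
    return configs


configs = localhost_ensemble_configs()
print(configs[2])
```

Each server's dataDir would also need its own myid file containing that server's number.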

Other Optimizations

There are a couple of other configuration parameters that can greatly increase performance:

  • To get low latencies on updates it is important to have a dedicated transaction log directory. By default, transaction logs are put in the same directory as the data snapshots and myid file. The dataLogDir parameter indicates a different directory to use for the transaction logs.
  • [tbd: what is the other config param?]
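
The dataLogDir fallback described in the first item can be expressed as a short Python sketch. The function name transaction_log_dir and the path /mnt/txlog are hypothetical; the settings dictionaries stand in for a parsed configuration file.

```python
def transaction_log_dir(settings):
    """Return dataLogDir when it is set, otherwise fall back to dataDir."""
    return settings.get("dataLogDir", settings["dataDir"])


default_cfg = {"dataDir": "/var/lib/zookeeper"}
dedicated_cfg = {"dataDir": "/var/lib/zookeeper", "dataLogDir": "/mnt/txlog"}

print(transaction_log_dir(default_cfg))    # snapshots and logs share dataDir
print(transaction_log_dir(dedicated_cfg))  # logs go to the dedicated directory
```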
Date: 2024-10-10 23:38:05
