The ZooKeeper Data Model
ZooKeeper has a hierarchical namespace, much like a distributed file system. The only difference is that each node in the namespace can have data associated with it as well as children. It is like having a file system that allows a file to also be a directory. Paths to nodes are always expressed as canonical, absolute, slash-separated paths; there are no relative references. Any Unicode character can be used in a path, subject to the following constraints:
- The null character (\u0000) cannot be part of a path name. (This causes problems with the C binding.)
- The following characters can't be used because they don't display well, or render in confusing ways: \u0001 - \u001F and \u007F - \u009F.
- The following characters are not allowed: \ud800 - \uF8FF, \uFFF0 - \uFFFF, \uXFFFE - \uXFFFF (where X is a digit 1 - E), \uF0000 - \uFFFFF.
- The "." character can be used as part of another name, but "." and ".." cannot alone be used to indicate a node along a path, because ZooKeeper doesn't use relative paths. The following would be invalid: "/a/b/./c" or "/a/b/../c".
- The token "zookeeper" is reserved.
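The constraints above can be sketched as a small validator. This is an illustrative check written from the listed rules, not ZooKeeper's own implementation: the function names are hypothetical, the control-character range is taken as \u0001 - \u001F, empty node names (a doubled or trailing slash) are rejected as non-canonical, and the reserved "zookeeper" token is not checked.

```python
def validate_path(path: str) -> None:
    """Raise ValueError if `path` breaks one of the rules listed above."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute and start with '/'")
    if path == "/":
        return  # the root path needs no further checks

    for index, char in enumerate(path):
        cp = ord(char)
        if cp == 0x0000:
            raise ValueError(f"null character at index {index}")
        if 0x0001 <= cp <= 0x001F or 0x007F <= cp <= 0x009F:
            raise ValueError(f"control character at index {index}")
        if 0xD800 <= cp <= 0xF8FF or 0xFFF0 <= cp <= 0xFFFF:
            raise ValueError(f"disallowed character at index {index}")
        # \uXFFFE - \uXFFFF for planes 1 through E
        if 0x10000 <= cp <= 0xEFFFF and (cp & 0xFFFF) >= 0xFFFE:
            raise ValueError(f"disallowed character at index {index}")
        if 0xF0000 <= cp <= 0xFFFFF:
            raise ValueError(f"disallowed character at index {index}")

    # Skip the empty string before the leading slash, then check each name.
    for name in path.split("/")[1:]:
        if name == "":
            raise ValueError("empty node name (doubled or trailing '/')")
        if name in (".", ".."):
            raise ValueError("relative path elements are not allowed")


def is_valid_path(path: str) -> bool:
    """Boolean wrapper around validate_path."""
    try:
        validate_path(path)
    except ValueError:
        return False
    return True
```

With this sketch, is_valid_path("/a/b.txt/c") is accepted because "." is only part of a longer name, while "/a/b/./c" and "/a/b/../c" are rejected.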
ZNodes
Every node in a ZooKeeper tree is referred to as a znode. Znodes maintain a stat structure that includes version numbers for data changes and ACL changes. The stat structure also has timestamps. The version number, together with the timestamp, allows ZooKeeper to validate the cache and to coordinate updates. Each time a znode's data changes, the version number increases. For instance, whenever a client retrieves data, it also receives the version of the data. When a client performs an update or a delete, it must supply the version of the data of the znode it is changing. If the version it supplies doesn't match the actual version of the data, the update will fail.
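The version check described above behaves like a compare-and-set. The following is a minimal in-memory sketch of that idea, not the ZooKeeper client API: the Znode class and BadVersionError are hypothetical names used only for illustration.

```python
class BadVersionError(Exception):
    """Raised when the supplied version does not match the stored version."""


class Znode:
    """Toy znode that holds data and a data version counter."""

    def __init__(self, data: bytes = b""):
        self.data = data
        self.version = 0  # incremented on every successful data change

    def get(self):
        # A read returns the data together with the version it belongs to.
        return self.data, self.version

    def set(self, data: bytes, expected_version: int) -> int:
        # The write succeeds only if the caller has seen the latest version.
        if expected_version != self.version:
            raise BadVersionError(
                f"expected version {expected_version}, actual {self.version}"
            )
        self.data = data
        self.version += 1
        return self.version


node = Znode(b"initial")
_, seen_by_a = node.get()   # client A reads version 0
_, seen_by_b = node.get()   # client B reads version 0 as well
node.set(b"from A", seen_by_a)  # succeeds; the version becomes 1
try:
    node.set(b"from B", seen_by_b)  # B still holds the stale version 0
except BadVersionError as error:
    print("update rejected:", error)
```

Client B has to read the znode again and retry with the new version, which is how concurrent writers avoid silently overwriting each other.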
Ephemeral Nodes
ZooKeeper also has the notion of ephemeral nodes. These znodes exist as long as the session that created the znode is active. When the session ends, the znode is deleted. Because of this behavior, ephemeral znodes are not allowed to have children.
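The lifetime rule for ephemeral znodes can be sketched with a toy in-memory tree. Everything here (the ToyTree class, its methods, and the integer session ids) is hypothetical and only models the behavior described above; a real server tracks sessions through heartbeats and timeouts.

```python
class ToyTree:
    """Toy namespace mapping each path to its owning session (None = persistent)."""

    def __init__(self):
        self.owner = {"/": None}

    def create(self, path, session=None, ephemeral=False):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.owner:
            raise KeyError(f"parent {parent} does not exist")
        if self.owner[parent] is not None:
            raise ValueError("ephemeral znodes are not allowed to have children")
        if path in self.owner:
            raise KeyError(f"{path} already exists")
        if ephemeral and session is None:
            raise ValueError("an ephemeral znode needs an owning session")
        self.owner[path] = session if ephemeral else None

    def close_session(self, session):
        # Ending a session deletes every ephemeral znode that it created.
        for path in [p for p, s in self.owner.items() if s == session]:
            del self.owner[path]


tree = ToyTree()
tree.create("/workers")                               # persistent parent
tree.create("/workers/w1", session=1, ephemeral=True)
tree.create("/workers/w2", session=2, ephemeral=True)
tree.close_session(1)                                 # removes /workers/w1 only
print(sorted(tree.owner))
```

This pattern is why ephemeral znodes are often used for presence: the znode for a worker disappears when the worker's session ends.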
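Persistent Nodes
A persistent node stays in place permanently: it exists until a client explicitly deletes it.
Sequence Nodes
A sequence node is created sequentially: when a client asks to create one, ZooKeeper automatically appends an increasing sequence number to the end of the node's path.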
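The naming of sequence nodes can be sketched as follows. The SequentialParent class is hypothetical, and the sketch assumes a counter kept per parent znode and rendered as ten zero-padded digits; treat both the padding width and the counter's scope as assumptions of this example.

```python
class SequentialParent:
    """Toy parent znode that hands out increasing sequence suffixes."""

    def __init__(self):
        self.counter = 0      # next sequence number to assign
        self.children = []    # child names in creation order

    def create_sequential(self, prefix: str) -> str:
        # Append the zero-padded counter to the name the client requested.
        name = f"{prefix}{self.counter:010d}"
        self.counter += 1
        self.children.append(name)
        return name


locks = SequentialParent()
first = locks.create_sequential("lock-")
second = locks.create_sequential("lock-")
print(first, second)
```

Because the suffix is zero-padded, sorting the child names as strings gives the same order as their creation, which is convenient for queue and lock recipes.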
Access Control List (ACL)
The data stored at each znode in a namespace is read and written atomically. Reads get all the data bytes associated with a znode and a write replaces all the data. Each node has an Access Control List (ACL) that restricts who can do what.
ZooKeeper access control using ACLs
ZooKeeper uses ACLs to control access to its znodes (the data nodes of a ZooKeeper data tree). The ACL implementation is quite similar to UNIX file access permissions: it employs permission bits to allow/disallow various operations against a node and the scope to which the bits apply. Unlike standard UNIX permissions, a ZooKeeper node is not limited by the three standard scopes for user (owner of the file), group, and world (other). ZooKeeper does not have a notion of an owner of a znode. Instead, an ACL specifies sets of ids and permissions that are associated with those ids.
Note also that an ACL pertains only to a specific znode. In particular, it does not apply to children. For example, if /app is only readable by ip:172.16.16.1 and /app/status is world readable, anyone will be able to read /app/status; ACLs are not recursive.
ZooKeeper supports pluggable authentication schemes. Ids are specified using the form scheme:id, where scheme is the authentication scheme that the id corresponds to. For example, ip:172.16.16.1 is an id for a host with the address 172.16.16.1.
When a client connects to ZooKeeper and authenticates itself, ZooKeeper associates all the ids that correspond to a client with the client's connection. These ids are checked against the ACLs of znodes when a client tries to access a node. ACLs are made up of pairs of (scheme:expression, perms). The format of the expression is specific to the scheme. For example, the pair (ip:19.22.0.0/16, READ) gives the READ permission to any clients with an IP address that starts with 19.22.
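The (ip:19.22.0.0/16, READ) example can be sketched with Python's standard ipaddress module. The function names and the tuple layout of an ACL entry are assumptions made for this illustration, and only the ip scheme is modelled.

```python
import ipaddress


def ip_expression_matches(expression: str, client_ip: str) -> bool:
    """Return True if client_ip falls inside the address or CIDR block."""
    network = ipaddress.ip_network(expression, strict=False)
    return ipaddress.ip_address(client_ip) in network


def is_permitted(acl, client_ip: str, perm: str) -> bool:
    """Check one znode's ACL; entries are (scheme, expression, perms) tuples."""
    return any(
        scheme == "ip"
        and perm in perms
        and ip_expression_matches(expression, client_ip)
        for scheme, expression, perms in acl
    )


# The ACL belongs to a single znode; children are not covered by it.
acl = [("ip", "19.22.0.0/16", {"READ"})]
print(is_permitted(acl, "19.22.4.7", "READ"))
print(is_permitted(acl, "19.23.0.1", "READ"))
```

A plain address such as 172.16.16.1 is treated as a block containing exactly one host, so the same function handles both forms of expression.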
ACL Permissions
ZooKeeper supports the following permissions:
- CREATE: you can create a child node.
- READ: you can get data from a node and list its children.
- WRITE: you can set data for a node.
- DELETE: you can delete a child node.
- ADMIN: you can set permissions.
The CREATE and DELETE permissions have been broken out of the WRITE permission for finer-grained access controls. The cases for CREATE and DELETE are the following:
- WRITE without CREATE or DELETE: you want A to be able to do a set on a ZooKeeper node, but not be able to CREATE or DELETE children.
- CREATE without DELETE: clients create requests by creating ZooKeeper nodes in a parent directory. You want all clients to be able to add, but only the request processor can delete. (This is kind of like the APPEND permission for files.)
Also, the ADMIN permission is there since ZooKeeper doesn't have a notion of file owner. In some sense the ADMIN permission designates the entity as the owner. ZooKeeper doesn't support the LOOKUP permission (execute permission bit on directories to allow you to LOOKUP even though you can't list the directory). Everyone implicitly has LOOKUP permission. This allows you to stat a node, but nothing more. (The problem is, if you want to call zoo_exists() on a node that doesn't exist, there is no permission to check.)
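The permission bits and the CREATE-without-DELETE case described above can be sketched with a bit mask. The Perm enum, the bit values, and the role names are choices made for this illustration rather than a statement of how any client library defines them.

```python
from enum import IntFlag


class Perm(IntFlag):
    """Permission bits; the numeric values are an assumption of this sketch."""
    READ = 1
    WRITE = 2
    CREATE = 4
    DELETE = 8
    ADMIN = 16


def allowed(granted: Perm, needed: Perm) -> bool:
    """True when every bit in `needed` is present in `granted`."""
    return (granted & needed) == needed


# Request-queue case: every client may add requests under the parent znode,
# but only the request processor may remove them.
CLIENT = Perm.CREATE
PROCESSOR = Perm.READ | Perm.CREATE | Perm.DELETE

print(allowed(CLIENT, Perm.CREATE))
print(allowed(CLIENT, Perm.DELETE))
print(allowed(PROCESSOR, Perm.DELETE))
```

Keeping CREATE and DELETE as separate bits is what makes the CLIENT role expressible; with a single WRITE bit, a client that could add requests could also remove them.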
ZooKeeper Stat Structure
View a znode's stat information as follows:
[zk: localhost:2181(CONNECTED) 5] stat /zk_test
cZxid = 0x600000004
ctime = Mon Mar 16 17:51:58 CST 2015
mZxid = 0x600000004
mtime = Mon Mar 16 17:51:58 CST 2015
pZxid = 0x600000004
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 6
numChildren = 0
The Stat structure for each znode in ZooKeeper is made up of the following fields:
czxid
The zxid of the change that caused this znode to be created.
mzxid
The zxid of the change that last modified this znode.
ctime
The time in milliseconds from epoch when this znode was created.
mtime
The time in milliseconds from epoch when this znode was last modified.
version
The number of changes to the data of this znode.
cversion
The number of changes to the children of this znode.
aversion
The number of changes to the ACL of this znode.
ephemeralOwner
The session id of the owner of this znode if the znode is an ephemeral node. If it is not an ephemeral node, it will be zero.
dataLength
The length of the data field of this znode.
numChildren
The number of children of this znode.
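To connect the command-line output with the field list above, the sketch below parses the stat output for /zk_test into a dictionary. The parse_stat function and the name mapping are hypothetical helpers; the mapping reflects that the shell prints cZxid, mZxid, dataVersion, and aclVersion where the field list says czxid, mzxid, version, and aversion.

```python
# Shell output names mapped to the field names used in the list above.
CLI_TO_FIELD = {
    "cZxid": "czxid",
    "mZxid": "mzxid",
    "dataVersion": "version",
    "aclVersion": "aversion",
}

SAMPLE = """\
cZxid = 0x600000004
ctime = Mon Mar 16 17:51:58 CST 2015
mZxid = 0x600000004
mtime = Mon Mar 16 17:51:58 CST 2015
pZxid = 0x600000004
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 6
numChildren = 0
"""


def parse_stat(text: str) -> dict:
    """Parse 'key = value' lines; numeric values (decimal or 0x hex) become int."""
    fields = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        key = CLI_TO_FIELD.get(key, key)
        try:
            fields[key] = int(value, 0)  # base 0 accepts both 6 and 0x600000004
        except ValueError:
            fields[key] = value          # formatted timestamps stay as text
    return fields


stat = parse_stat(SAMPLE)
is_ephemeral = stat["ephemeralOwner"] != 0  # zero means not ephemeral
print(stat["version"], stat["dataLength"], is_ephemeral)
```

Note that the shell prints ctime and mtime as formatted dates, whereas the Stat structure itself stores them as milliseconds from the epoch.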
zxid
Every change to the ZooKeeper state receives a stamp in the form of a zxid (ZooKeeper Transaction Id). This exposes the total ordering of all changes to ZooKeeper. Each change will have a unique zxid and if zxid1 is smaller than zxid2 then zxid1 happened before zxid2.
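Since zxids are totally ordered, comparing two of them as integers tells which change came first. The sketch below also splits a zxid into two halves, assuming the 64-bit layout commonly described for ZooKeeper (leader epoch in the high 32 bits, a counter in the low 32 bits); that layout is not stated in the text above, and the function names are hypothetical.

```python
def happened_before(zxid1: int, zxid2: int) -> bool:
    """A smaller zxid means the change happened earlier."""
    return zxid1 < zxid2


def split_zxid(zxid: int):
    """Split a zxid into (epoch, counter) under the assumed 32/32-bit layout."""
    epoch = zxid >> 32
    counter = zxid & 0xFFFFFFFF
    return epoch, counter


created = 0x600000004   # cZxid from the stat output for /zk_test
later = 0x600000005     # a hypothetical later change
print(happened_before(created, later))
print(split_zxid(created))
```

Under that assumption, the cZxid 0x600000004 shown for /zk_test reads as counter 4 within epoch 6.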