Last Updated : 28 Apr, 2025
ZooKeeper is a distributed, open-source coordination service for distributed applications. It exposes a simple set of primitives that applications use to implement higher-level services for synchronization, configuration maintenance, and groups and naming.
In a distributed system, multiple nodes (machines) must communicate with each other and coordinate their actions. ZooKeeper lets these nodes discover one another and agree on shared state by maintaining a hierarchical tree of data nodes called "znodes", which store small amounts of data and state information. On top of this tree, applications build coordination constructs such as locks, barriers, and queues, as well as leader election, failover, and recovery, which help keep the system resilient to failures. ZooKeeper is widely used with distributed systems such as Hadoop, Kafka, and HBase, and it has become an essential component of many distributed applications.
Why do we need it?
Distributed applications need a central place to store shared data, find one another, and coordinate their activities. Apache ZooKeeper provides this through a simple, tree-structured data model, a simple API, and a distributed protocol that keeps the data consistent and available. It is designed to be highly reliable and fault-tolerant, and it performs best on read-heavy workloads.
ZooKeeper is implemented in Java and is widely used in distributed systems, particularly in the Hadoop ecosystem. It is an Apache Software Foundation project and is released under the Apache License 2.0.
Architecture of ZooKeeper
ZooKeeper organizes its data as a tree of nodes called znodes, in a hierarchical namespace similar to a file system. Each znode can store data, can have children of its own, and has a set of permissions (an ACL) that controls access to it. The root znode sits at the top of the hierarchy, and every other znode descends from it and is addressed by a slash-separated path such as /app/config.
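The hierarchy described above can be pictured with a small in-memory model. This is only an illustrative sketch in plain Python, not the ZooKeeper API: the class names `Znode` and `ZnodeTree` and the example paths are assumptions made up for this example.

```python
class Znode:
    """Toy in-memory stand-in for a znode: holds data and named children."""

    def __init__(self, data=b""):
        self.data = data
        self.children = {}


class ZnodeTree:
    """Toy hierarchical namespace addressed by slash-separated absolute paths."""

    def __init__(self):
        self.root = Znode()

    @staticmethod
    def _split(path):
        if not path.startswith("/"):
            raise ValueError("paths must be absolute, e.g. /app/config")
        return [part for part in path.split("/") if part]

    def _lookup(self, parts):
        node = self.root
        for part in parts:
            if part not in node.children:
                raise KeyError("no such znode: /" + "/".join(parts))
            node = node.children[part]
        return node

    def create(self, path, data=b""):
        parts = self._split(path)
        if not parts:
            raise ValueError("the root znode already exists")
        parent = self._lookup(parts[:-1])  # the parent must already exist
        if parts[-1] in parent.children:
            raise KeyError("znode already exists: " + path)
        parent.children[parts[-1]] = Znode(data)

    def get(self, path):
        return self._lookup(self._split(path)).data

    def get_children(self, path):
        return sorted(self._lookup(self._split(path)).children)


tree = ZnodeTree()
tree.create("/app")
tree.create("/app/config", b"max_workers=4")
tree.create("/app/workers")
print(tree.get_children("/"))     # ['app']
print(tree.get_children("/app"))  # ['config', 'workers']
print(tree.get("/app/config"))    # b'max_workers=4'
```

As in a file system, a child can only be created under a parent that already exists, which is why `/app` is created before `/app/config`.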
Important Components in ZooKeeper
Each node in the namespace is called a znode. Like a file, a znode can store data; like a directory, it can have children. ZooKeeper provides a simple API for creating, reading, writing, and deleting znodes, and a mechanism called watches for detecting changes to the data stored in them. Each znode maintains a stat structure that includes a version number, an ACL, timestamps, and the data length.
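To make the stat structure more concrete, here is a hedged sketch of how a version number and data length could be tracked alongside a znode's data. It is a plain-Python toy, not ZooKeeper code: `Stat` and `VersionedZnode` are hypothetical names, the ACL field is left out for brevity, and treating -1 as "skip the version check" mirrors the convention of the real setData() call.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Stat:
    """Toy subset of the metadata kept for a znode (ACL omitted)."""
    version: int = 0                                  # bumped on every data change
    mtime: float = field(default_factory=time.time)   # last-modified timestamp
    data_length: int = 0                              # size of the data in bytes


class VersionedZnode:
    """Toy znode whose writes can be guarded by an expected version."""

    def __init__(self, data=b""):
        self.data = data
        self.stat = Stat(data_length=len(data))

    def set(self, data, expected_version=-1):
        """Replace the data; -1 means 'do not check the version'."""
        if expected_version != -1 and expected_version != self.stat.version:
            raise ValueError("version mismatch: znode was modified concurrently")
        self.data = data
        self.stat.version += 1
        self.stat.mtime = time.time()
        self.stat.data_length = len(data)
        return self.stat


node = VersionedZnode(b"v1")
stat = node.set(b"second", expected_version=0)
print(stat.version, stat.data_length)  # 1 6

try:
    # A writer that still believes the version is 0 is rejected.
    node.set(b"stale", expected_version=0)
except ValueError as err:
    print(err)  # version mismatch: znode was modified concurrently
```

The version check is what lets two clients update the same znode safely: a writer that read an older version finds out its view is stale instead of silently overwriting newer data.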
Types of Znodes:
Zookeeper is used to manage and coordinate the nodes in a Hadoop cluster, including the NameNode, DataNode, and ResourceManager. In a Hadoop cluster, Zookeeper helps to:
Zookeeper helps to ensure the availability and reliability of a Hadoop cluster by providing a central coordination service for the nodes in the cluster.
How Does ZooKeeper Work in Hadoop?
ZooKeeper exposes a namespace that resembles a file system, together with a simple set of APIs that let clients read and write data in it. The data lives in a tree of znodes, each of which can be thought of as a file or a directory in a traditional file system. ZooKeeper uses a consensus algorithm to ensure that all of its servers have a consistent view of the data stored in the znodes. When a client writes data to a znode, the write is replicated across the servers in the ZooKeeper ensemble and is committed once a majority of them have acknowledged it.
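ZooKeeper's consensus protocol commits a write once a majority (a quorum) of the servers in the ensemble has acknowledged it. The arithmetic behind that rule can be sketched as follows; the function names are hypothetical helpers for illustration, not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def is_committed(acks, ensemble_size):
    """A write counts as committed once a majority has acknowledged it."""
    return acks >= quorum_size(ensemble_size)


def tolerated_failures(ensemble_size):
    """Servers that can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 5, 6):
    print(n, quorum_size(n), tolerated_failures(n))
# 3 2 1
# 5 3 2
# 6 4 2

print(is_committed(acks=2, ensemble_size=5))  # False
print(is_committed(acks=3, ensemble_size=5))  # True
```

The table shows why ensembles are usually sized with an odd number of servers: going from five to six servers raises the quorum to four without tolerating any additional failure.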
One important feature of ZooKeeper is its ability to support the notion of a "watch." A watch allows a client to register for notifications when the data stored in a znode changes. This can be useful for monitoring changes to the data stored in ZooKeeper and reacting to those changes in a distributed system.
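The watch mechanism can be sketched with a small in-memory model. This is an illustrative toy in plain Python, not the ZooKeeper client API: `WatchedZnode` is a hypothetical class, and it assumes the standard behaviour in which a watch set during a read is a one-time trigger that must be registered again to receive later notifications.

```python
class WatchedZnode:
    """Toy znode that notifies one-shot watchers when its data changes."""

    def __init__(self, data=b""):
        self.data = data
        self._watchers = []

    def get(self, watch=None):
        """Return the data; optionally register a one-time watch callback."""
        if watch is not None:
            self._watchers.append(watch)
        return self.data

    def set(self, data):
        """Store new data and fire every pending watch exactly once."""
        self.data = data
        pending, self._watchers = self._watchers, []
        for callback in pending:
            callback("data changed")


events = []
node = WatchedZnode(b"v1")
node.get(watch=events.append)  # read the data and leave a watch behind
node.set(b"v2")                # fires the watch
node.set(b"v3")                # no watch is registered any more
print(events)                  # ['data changed']
print(node.get())              # b'v3'
```

Because the watch fires only once, a client that wants continuous notifications typically re-reads the znode inside its callback and registers a new watch at the same time.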
In Hadoop, ZooKeeper is used for a variety of purposes, including:
ZooKeeper is an essential component of Hadoop and plays a crucial role in coordinating the activity of its various subcomponents.
Reading and Writing in Apache ZooKeeper
ZooKeeper provides a simple and reliable interface for reading and writing data. The data is stored in a hierarchical namespace, similar to a file system, with nodes called znodes. Each znode can store data and have child znodes. ZooKeeper clients create a znode with the create() method, then read and update its data with the getData() and setData() methods, respectively. Here is an example of writing and reading data using the ZooKeeper Java API:
Java
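import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

public class ZnodeReadWrite {
    public static void main(String[] args) throws Exception {
        // Connect to the ZooKeeper ensemble (3-second session timeout, no-op watcher)
        ZooKeeper zk = new ZooKeeper("localhost:2181", 3000, event -> {});
        try {
            // Create the znode "/myZnode" with its initial data
            // (this call fails if the znode already exists)
            String path = "/myZnode";
            String data = "hello world";
            zk.create(path, data.getBytes(StandardCharsets.UTF_8),
                      Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
            // Read data from the znode "/myZnode"
            byte[] bytes = zk.getData(path, false, null);
            String readData = new String(bytes, StandardCharsets.UTF_8);
            // Prints "hello world"
            System.out.println(readData);
        } finally {
            // Close the connection to the ZooKeeper ensemble
            zk.close();
        }
    }
}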
Python3
from kazoo.client import KazooClient
# Connect to ZooKeeper
zk = KazooClient(hosts='localhost:2181')
zk.start()
# Create a node with some data
zk.ensure_path('/gfg_node')
zk.set('/gfg_node', b'some_data')
# Read the data from the node
data, stat = zk.get('/gfg_node')
print(data)
# Stop the connection to ZooKeeper
zk.stop()
Session and Watches
Session
Watches