Centos7安装zookeeper+kafka集群

Centos7安装zookeeper+kafka集群

1  Zookeeper和kafka简介

1)  ZooKeeper

是一个分布式的、分层级的文件系统,能促进客户端间的松耦合,并提供最终一致的,用于管理、协调Kafka代理,zookeeper集群中一台服务器作为Leader,其它作为Follower

2)  Apache Kafka

是分布式发布-订阅消息系统,kafka对消息保存时根据Topic进行归类,每个topic将被分成多个partition(区),每条消息在文件中的位置称为offset(偏移量),offset为一个long型数字,它是唯一标记一条消息,它唯一的标记一条消息。

一个partition中的消息只会被group中的一个consumer消费,每个group中consumer消息消费互相独立,不过一个consumer可以消费多个partitions中的消息。

kafka只能保证一个partition中的消息被某个consumer消费时,消息是顺序的。从Topic角度来说,消息仍不是有序的。

每个partition都有一个server为"leader";leader负责所有的读写操作,如果leader失效,那么将会有其他follower来接管成为新的leader(有zookeeper选举);follower只是单调的和leader跟进,同步消息即可..由此可见作为leader的server承载了全部的请求压力,因此从集群的整体考虑,有多少个partitions就意味着有多少个"leader",kafka会将"leader"均衡的分散在每个实例上,来确保整体的性能稳定.

2  服务器地址

使用3台服务器搭建集群

Server1:192.168.89.11

Server2:192.168.89.12

Server3:192.168.89.13

3  安装jdk

(jdk1.8.0_102)

# yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel

4  搭建zookeeper集群

Zookeeper集群的工作是超过半数才能对外提供服务,所以选择主机数可以是1台、3台、5台…。

3台中允许1台挂掉 ,是否可以用偶数,其实没必要。

如果有4台,那么挂掉一台还剩下三台服务器,如果再挂掉一个就不行了,记住是超过半数。

4.1 下载地址

http://mirrors.shu.edu.cn/apache/zookeeper/zookeeper-3.4.13/zookeeper-3.4.13.tar.gz

4.2 所有节点安装zookeeper

1)创建zookeeper安装目录

# mkdir -p /data/zookeeper

2)将zookeeper解压到安装目录

# tar -zxvf zookeeper-3.4.13.tar.gz -C /data/zookeeper/

3)新建保存数据的目录

# mkdir -p /data/zookeeper/zookeeper-3.4.13/data

4)新建日志目录

# mkdir -p /data/zookeeper/zookeeper-3.4.13/dataLog

5)配置环境变量并刷新

# vim /etc/profile

===================================================

export ZK_HOME=/data/zookeeper/zookeeper-3.4.13

export PATH=$PATH:$ZK_HOME/bin

===================================================

# source /etc/profile

4.3 所有节点配置zookeeper配置文件

4.3.1        各节点中配置

# cd /data/zookeeper/zookeeper-3.4.13/conf/

# cp -f zoo_sample.cfg zoo.cfg

# vim zoo.cfg

====================================================

tickTime=2000

initLimit=10

syncLimit=5

dataDir=/data/zookeeper/zookeeper-3.4.13/data/

dataLogDir=/data/zookeeper/zookeeper-3.4.13/dataLog/

clientPort=2181

server.1=192.168.89.11:2888:3888

server.2=192.168.89.12:2888:3888

server.3=192.168.89.13:2888:3888

#第一个端口是master和slave之间的通信端口,默认是2888,第二个端口是leader选举的端口,集群刚启动的时候选举或者leader挂掉之后进行新的选举的端口默认是3888

====================================================

# echo "1" > /data/zookeeper/zookeeper-3.4.13/data/myid   #server1配置,各节点不同,跟上面配置server.1的号码一样

# echo "2" > /data/zookeeper/zookeeper-3.4.13/data/myid   #server2配置,各节点不同,跟上面配置server.2的号码一样

# echo "3" > /data/zookeeper/zookeeper-3.4.13/data/myid   #server3配置,各节点不同,跟上面配置server.3的号码一样

4.3.2        启动停止zookeeper命令并设置开机启动

1)启动停止zookeeper命令

# zkServer.sh start   #启动

# zkServer.sh stop   #停止

# zkCli.sh   #连接集群

2)设置开机启动

# cd /usr/lib/systemd/system

# vim zookeeper.service

=========================================

[Unit]

Description=zookeeper server daemon

After=zookeeper.target

[Service]

Type=forking

ExecStart=/data/zookeeper/zookeeper-3.4.13/bin/zkServer.sh start

ExecReload=/data/zookeeper/zookeeper-3.4.13/bin/zkServer.sh stop && sleep 2 && /data/zookeeper/zookeeper-3.4.13/bin/zkServer.sh start

ExecStop=/data/zookeeper/zookeeper-3.4.13/bin/zkServer.sh stop

Restart=always

[Install]

WantedBy=multi-user.target

=======================================================

# systemctl start  zookeeper

# systemctl enable zookeeper

5   搭建kafka集群

5.1 下载地址

http://mirror.bit.edu.cn/apache/kafka/2.1.0/kafka_2.12-2.1.0.tgz

5.2 所有节点上搭建kafka

1)  新建kafka工作目录

# mkdir -p /data/kafka

2)  解压kafka

# tar -zxvf kafka_2.12-2.1.0.tgz -C /data/kafka/

3)  新建kafka日志目录

# mkdir -p /data/kafka/kafkalogs

4)  配置kafka配置文件

# vim /data/kafka/kafka_2.12-2.1.0/config/server.properties

=================================================

broker.id=1   #每一个broker在集群中的唯一标示,要求是正数

listeners=PLAINTEXT://192.168.89.11:9092  # 套接字服务器连接的地址

log.dirs=/data/kafka/kafkalogs/    #kafka数据的存放地址

message.max.byte=5242880     #消息体的最大大小,单位是字节

log.cleaner.enable=true #开启日志清理

log.retention.hours=72    #segment文件保留的最长时间(小时),超时将被删除,也就是说3天之前的数据将被清理掉

log.segment.bytes=1073741824  #日志文件中每个segmeng的大小(字节),默认为1G

log.retention.check.interval.ms=300000  #定期检查segment文件有没有达到1G(单位毫秒)

num.partitions=3  #每个topic的分区个数, 更多的分区允许更大的并行操作default.replication.factor=3  # 一个topic ,默认分区的replication个数 ,不得大于集群中broker的个数

delete.topic.enable=true   # 选择启用删除主题功能,默认false

replica.fetch.max.bytes=5242880  # replicas每次获取数据的最大大小

#以下三个参数设置影响消费者消费分区可以连接的kafka主机,详细请看第6点附录

offsets.topic.replication.factor=3  #Offsets topic的复制因子(备份数)

transaction.state.log.replication.factor=3  #事务主题的复制因子(设置更高以确保可用性)

transaction.state.log.min.isr=3 #覆盖事务主题的min.insync.replicas配置

zookeeper.connect=192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181

#zookeeper集群的地址,可以是多个

=================================================

5)   kafka节点默认需要的内存为1G,如果需要修改内存,可以修改kafka-server-start.sh的配置项

# vim /data/kafka/kafka_2.12-2.1.0/bin/kafka-server-start.sh

#找到KAFKA_HEAP_OPTS配置项,例如修改如下:

export KAFKA_HEAP_OPTS="-Xmx2G -Xms2G"

5.3 启动kafka并设置开机启动

1)启动kafka

# cd /data/kafka/kafka_2.12-2.1.0/

./bin/kafka-server-start.sh -daemon ./config/server.properties

启动后可以执行jps命令查看kafka是否启动,如果启动失败,可以进入logs目录,查看kafkaServer.out日志记录。

2)设置开机启动

# cd /usr/lib/systemd/system

# vim kafka.service

=========================================

[Unit]

Description=kafka server daemon

After=kafka.target

[Service]

Type=forking

ExecStart=/data/kafka/kafka_2.12-2.1.0/bin/kafka-server-start.sh -daemon /data/kafka/kafka_2.12-2.1.0/config/server.properties

ExecReload=/data/kafka/kafka_2.12-2.1.0/bin/kafka-server-stop.sh && sleep 2 && /data/kafka/kafka_2.12-2.1.0/bin/kafka-server-start.sh -daemon /

data/kafka/kafka_2.12-2.1.0/config/server.properties

ExecStop=/data/kafka/kafka_2.12-2.1.0/bin/kafka-server-stop.sh

Restart=always

[Install]

WantedBy=multi-user.target

=======================================================

# systemctl start kafka

# systemctl enable kafka

5.4 创建topic

创建3分区、3备份

# cd /data/kafka/kafka_2.12-2.1.0/

#./bin/kafka-topics.sh --create --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181 --replication-factor 3 --partitions 3 --topic SyslogTopic

5.5 常用命令

# cd /data/kafka/kafka_2.12-2.1.0/

1)  停止kafka

./bin/kafka-server-stop.sh

2)  创建topic

./bin/kafka-topics.sh --create --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181 --replication-factor 1 --partitions 1 --topic topic_name

3)  展示topic

./bin/kafka-topics.sh --list --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181

4) 查看描述topic

./bin/kafka-topics.sh --describe --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181 --topic topic_name

5)  生产者发送消息

./bin/kafka-console-producer.sh --broker-list 192.168.89.11:9092 --topic topic_name

6)  消费者消费消息

./bin/kafka-console-consumer.sh --bootstrap-server 192.168.89.11:9092,192.168.89.12:9092,192.168.89.13:9092 --topic topic_name

7)  删除topic

./bin/kafka-topics.sh --delete --topictopic_name --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181

8)  查看每分区consumer_offsets(可以连接到的消费主机)

./bin/kafka-topics.sh --describe --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181 --topic __consumer_offsets

6   附录

1、  消费者消费分区数据,kafka的负载均衡

在/data/kafka/kafka_2.12-2.1.0/config/server.properties文件中修改offsets.topic.replication.factor,这个值为kafka集群的主机数量(__consumer_offest不受server.properties中num.partitions和default.replication.factor参数的制约。相反地,它的分区数和备份因子分别由offsets.topic.num.partitions和offsets.topic.replication.factor参数决定。这两个参数的默认值分别是50和1,表示该topic有50个分区,副本因子是1。),设置正确如下图:

(执行命令:bin/kafka-topics.sh --describe --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181 --topic __consumer_offsets)

注:如果配置文件offsets.topic.replication.factor设置成1后启动了一次,再设置成3重新启动副本因子不会更改,需要以下方法:

另外一种方法设置:

1)  创建规则json

cat > increase-replication-factor.json <<EOF

{"version":1, "partitions":[

{"topic":"__consumer_offsets","partition":0,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":1,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":2,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":3,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":4,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":5,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":6,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":7,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":8,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":9,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":10,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":11,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":12,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":13,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":14,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":15,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":16,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":17,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":18,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":19,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":20,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":21,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":22,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":23,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":24,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":25,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":26,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":27,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":28,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":29,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":30,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":31,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":32,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":33,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":34,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":35,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":36,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":37,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":38,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":39,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":40,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":41,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":42,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":43,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":44,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":45,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":46,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":47,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":48,"replicas":[1,2,3]},

{"topic":"__consumer_offsets","partition":49,"replicas":[1,2,3]}]

}

EOF

2)  执行

bin/kafka-reassign-partitions.sh --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181 --reassignment-json-file increase-replication-factor.json --execute

3)  验证

bin/kafka-reassign-partitions.sh --zookeeper 192.168.89.11:2181,192.168.89.12:2181,192.168.89.13:2181 --reassignment-json-file increase-replication-factor.json --verify

原文地址:https://www.cnblogs.com/longBlogs/p/10340251.html

时间: 2024-08-21 16:24:17

Centos7安装zookeeper+kafka集群的相关文章

zookeeper+kafka集群安装之二

zookeeper+kafka集群安装之二 此为上一篇文章的续篇, kafka安装需要依赖zookeeper, 本文与上一篇文章都是真正分布式安装配置, 可以直接用于生产环境. zookeeper安装参考: http://blog.csdn.net/ubuntu64fan/article/details/26678877 首先了解几个kafka中的概念: kafka是一个消息队列服务器,服务称为broker, 消息发送者称为producer, 消息接收者称为consumer; 通常我们部署多个b

zookeeper+kafka集群安装之一

zookeeper+kafka集群安装之一 准备3台虚拟机, 系统是RHEL64服务版. 1) 每台机器配置如下: $ cat /etc/hosts ... # zookeeper hostnames: 192.168.8.182 zk1 192.168.8.183 zk2 192.168.8.184 zk3 2) 每台机器上安装jdk, zookeeper, kafka, 配置如下: $ vi /etc/profile ... # jdk, zookeeper, kafka export KA

Zookeeper+Kafka集群部署

Zookeeper+Kafka集群部署 主机规划: 10.200.3.85  Kafka+ZooKeeper 10.200.3.86  Kafka+ZooKeeper 10.200.3.87  Kafka+ZooKeeper 软件下载地址: #wget http://mirrors.hust.edu.cn/apache/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz #wget http://mirror.bit.edu.cn/apache/

zookeeper+kafka集群的安装

时效性要求很高的数据,库存,采取的是数据库+缓存双写的技术方案,也解决了双写的一致性的问题 缓存数据生产服务,监听一个消息队列,然后数据源服务(商品信息管理服务)发生了数据变更之后,就将数据变更的消息推送到消息队列中 缓存数据生产服务可以去消费到这个数据变更的消息,然后根据消息的指示提取一些参数,然后调用对应的数据源服务的接口,拉去数据,这个时候一般是从mysql库中拉去的 1.zookeeper集群搭建 zookeeper-3.4.5.tar.gz使用WinSCP拷贝到/usr/local目录

window环境搭建zookeeper,kafka集群

为了演示集群的效果,这里准备一台虚拟机(window 7),在虚拟机中搭建了单IP多节点的zookeeper集群(多IP节点的也是同理的),并且在本机(win 7)和虚拟机中都安装了kafka. 前期准备说明: 1.三台zookeeper服务器,本机安装一个作为server1,虚拟机安装两个(单IP) 2.三台kafka服务器,本机安装一个作为server1,虚拟机安装两个. 备注:当然你可以直接在虚拟机上安装三个服务器分别为server1.server2.server3 . 虚拟机和本机网络环

【kafka】安装部署kafka集群(kafka版本:kafka_2.12-2.3.0)

2.1 下载kafka并安装kafka_2.12-2.3.0.tgz tar -zxvf kafka_2.12-2.3.0.tgz 2.2 配置kafka集群 在config/server.properties中修改参数: [[email protected] kafka_2.12-2.3.0]$ cd config [[email protected] config]$ gedit server.properties 参数1:添加host.name=hadoop01 #############

zookeeper+KAFKA 集群搭建

ZooKeeper是一个分布式的,开放源码的分布式应用程序协调服务,是Google的Chubby一个开源的实现,是Hadoop和Hbase的重要组件.它是一个为分布式应用提供一致性服务的软件,提供的功能包括:配置维护.域名服务.分布式同步.集群管理等. 因为Kafka集群是把状态信息保存在Zookeeper中的,并且Kafka的动态扩容是通过Zookeeper来实现的,所以需要优先搭建Zookeerper集群,建立分布式状态管理.开始准备环境,搭建集群: zookeeper是基于Java环境开发

linux下安装zookeeper(集群版)

在linux下安装zookeeper(单机版)中已经介绍了如何在linux中搭建单机版本的zookeeper,本篇将基于上一篇的基础上继续搭建集群版的zookeeper. 在原来的基础上再准备两台虚拟机: 我的虚拟机ip分别是:192.168.174.132,192.168.174.130,192.168.174.131 对应的hostname分别是:master,slave1,slave2 hostname可自行查看和修改:http://jingyan.baidu.com/article/57

安装部署Kafka集群

kafka是一个开源的分布式消息订阅系统(消息中间件) 安装过程 1.下载kafka_2.11-0.10.1.0.gz(ps:千万不要下错了,博主就是下到了src文件上去了,kafka中的zookeeper起不起来) 2.上传至/usr/local/src 3.解压缩,并且移动到上级目录 4.进入主目录的config子目录, 5.修改server.properties配置文件 vim server.properties 内容如下: 6.保存并退出 7.主节点配置完毕,远程复制到另外两个节点 sc