Kafka Migration and Cluster Expansion

Official references:

http://kafka.apache.org/documentation.html#basic_ops_cluster_expansion

https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-6.ReassignPartitionsTool

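Overview:

Expanding a Kafka cluster usually involves two kinds of data movement:

  1. Moving an entire topic onto the newly added nodes.
  2. Moving specific partitions of a topic onto the newly added nodes.

1. Moving a topic to the new nodes

Suppose a Kafka cluster runs three brokers with broker.id 101, 102 and 103. A sudden surge in traffic requires three more brokers, with broker.id 104, 105 and 106. The goal is to move push-token-topic onto the new nodes.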

1) Create migration-push-token-topic.json with the following content:

{
  "topics": [
    {"topic": "push-token-topic"}
  ],
  "version": 1
}
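When several topics have to be moved, the same file can be generated instead of typed by hand. The following is a minimal Python sketch that assumes only the file layout shown above; the function name build_topics_to_move is made up for illustration and is not part of Kafka.

```python
import json


def build_topics_to_move(topics):
    """Build the document passed to --topics-to-move-json-file."""
    return {"topics": [{"topic": name} for name in topics], "version": 1}


doc = build_topics_to_move(["push-token-topic"])
# Redirect this output to migration-push-token-topic.json.
print(json.dumps(doc, indent=2))
```

Adding more topic names to the list produces one entry per topic in the same file.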

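2) Generate a candidate assignment:

root@localhost:$  ./bin/kafka-reassign-partitions.sh --zookeeper 192.168.2.225:2183 --topics-to-move-json-file migration-push-token-topic.json --broker-list "104,105,106" --generate

The tool first prints the current assignment; save it as a backup in case a rollback is needed. (The sample output below was captured for a topic named cluster-switch-topic on brokers 4, 5 and 8, not for push-token-topic; the format is the same.)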

Current partition replica assignment

{"version":1,"partitions":[{"topic":"cluster-switch-topic","partition":10,"replicas":[8]},{"topic":"cluster-switch-topic","partition":5,"replicas":[4]},{"topic":"cluster-switch-topic","partition":3,"replicas":[5]},{"topic":"cluster-switch-topic","partition":4,"replicas":[5]},{"topic":"cluster-switch-topic","partition":9,"replicas":[5]},{"topic":"cluster-switch-topic","partition":1,"replicas":[5]},{"topic":"cluster-switch-topic","partition":11,"replicas":[4]},{"topic":"cluster-switch-topic","partition":7,"replicas":[5]},{"topic":"cluster-switch-topic","partition":2,"replicas":[4]},{"topic":"cluster-switch-topic","partition":0,"replicas":[4]},{"topic":"cluster-switch-topic","partition":6,"replicas":[4]},{"topic":"cluster-switch-topic","partition":8,"replicas":[4]}]}

The proposed partition reassignment follows. Save it to a file, here named:

migration-topic-cluster-switch-topic.json

{"version":1,"partitions":[{"topic":"cluster-switch-topic","partition":10,"replicas":[5]},{"topic":"cluster-switch-topic","partition":5,"replicas":[4]},{"topic":"cluster-switch-topic","partition":4,"replicas":[5]},{"topic":"cluster-switch-topic","partition":3,"replicas":[4]},{"topic":"cluster-switch-topic","partition":9,"replicas":[4]},{"topic":"cluster-switch-topic","partition":1,"replicas":[4]},{"topic":"cluster-switch-topic","partition":11,"replicas":[4]},{"topic":"cluster-switch-topic","partition":7,"replicas":[4]},{"topic":"cluster-switch-topic","partition":2,"replicas":[5]},{"topic":"cluster-switch-topic","partition":0,"replicas":[5]},{"topic":"cluster-switch-topic","partition":6,"replicas":[5]},{"topic":"cluster-switch-topic","partition":8,"replicas":[5]}]}
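Before executing a plan it helps to see which partitions actually change. The sketch below compares a current assignment with a proposed one and lists only the differences. It is an illustrative helper, not a Kafka tool; the function name moved_partitions is an assumption, and the sample strings are the first two entries from the output above.

```python
import json


def moved_partitions(current_json, proposed_json):
    """Return {(topic, partition): (old_replicas, new_replicas)} for changed partitions."""
    def index(doc):
        return {(p["topic"], p["partition"]): p["replicas"]
                for p in json.loads(doc)["partitions"]}

    before, after = index(current_json), index(proposed_json)
    return {key: (before.get(key), new)
            for key, new in after.items() if before.get(key) != new}


# Sample data: partitions 10 and 5 of cluster-switch-topic from the output above.
current = ('{"version":1,"partitions":['
           '{"topic":"cluster-switch-topic","partition":10,"replicas":[8]},'
           '{"topic":"cluster-switch-topic","partition":5,"replicas":[4]}]}')
proposed = ('{"version":1,"partitions":['
            '{"topic":"cluster-switch-topic","partition":10,"replicas":[5]},'
            '{"topic":"cluster-switch-topic","partition":5,"replicas":[4]}]}')

for (topic, partition), (old, new) in sorted(moved_partitions(current, proposed).items()):
    print(f"{topic}-{partition}: {old} -> {new}")
# prints: cluster-switch-topic-10: [8] -> [5]
```

Partitions whose replica list is identical in both files are not moved, so only the listed ones generate copy traffic.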

3) Execute the reassignment:

root@localhost:$   bin/kafka-reassign-partitions.sh --zookeeper 192.168.2.225:2183 --reassignment-json-file migration-topic-cluster-switch-topic.json --execute

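--execute prints the current assignment once more before it starts the move. The tool does not write a file itself, so save that output if a rollback may be needed.

4) Check the status:

bin/kafka-reassign-partitions.sh --zookeeper 192.168.2.225:2183 --reassignment-json-file migration-topic-cluster-switch-topic.json --verify

Pass --verify the same JSON file that was passed to --execute (the official documentation names this file expand-cluster-reassignment.json).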

The command reports the migration status of each partition, similar to the following:

Reassignment of partition [push-token-topic,0] completed successfully   // move finished
Reassignment of partition [push-token-topic,1] is in progress          // data is still being moved
Reassignment of partition [push-token-topic,2] is in progress
Reassignment of partition [push-token-topic,1] completed successfully
Reassignment of partition [push-token-topic,2] completed successfully

This procedure does not interrupt the topics already running on the original cluster.
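For a topic with many partitions, the status lines can be summarized by a small script. The sketch below parses the line format shown above and reports which partitions are still pending. The regular expression is an assumption based on that sample only; other Kafka versions may word these lines differently, in which case the pattern needs adjusting.

```python
import re
from collections import Counter

# Matches lines such as:
#   Reassignment of partition [push-token-topic,0] completed successfully
LINE = re.compile(
    r"Reassignment of partition \[(?P<topic>[^,\]]+),(?P<partition>\d+)\] (?P<status>.+)"
)


def summarize(verify_output):
    """Map (topic, partition) to the last status seen in the --verify output."""
    status = {}
    for line in verify_output.splitlines():
        match = LINE.search(line)
        if match:
            key = (match["topic"], int(match["partition"]))
            status[key] = match["status"].strip()
    return status


sample = """\
Reassignment of partition [push-token-topic,0] completed successfully
Reassignment of partition [push-token-topic,1] is in progress
Reassignment of partition [push-token-topic,2] is in progress
"""

state = summarize(sample)
pending = sorted(k for k, v in state.items() if v != "completed successfully")
print(Counter(state.values()))
print("pending:", pending)
```

Running --verify repeatedly and feeding its output to summarize gives a simple way to wait until the pending list is empty.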

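2. Changing a topic's replication factor

Suppose push-token-topic was created with a single replica and, for better availability, should now have two.

1) Create replicas-update-push-token-topic.json, listing the full desired replica set for every partition. (The sample below targets the topic log.mobile_nginx and lists three replicas per partition; substitute your own topic name and broker IDs.)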

{
  "partitions": [
    {"topic": "log.mobile_nginx", "partition": 0, "replicas": [101, 102, 104]},
    {"topic": "log.mobile_nginx", "partition": 1, "replicas": [102, 103, 106]}
  ],
  "version": 1
}
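Writing the replica lists by hand gets tedious for topics with many partitions. The sketch below derives such a plan from an existing assignment. It is one possible placement strategy, not what Kafka itself does: existing replicas are kept in order (the first replica in the list is the preferred leader), and extra replicas are taken round-robin from a broker list. The function name raise_replication and the sample assignment are assumptions made for illustration.

```python
import json


def raise_replication(assignment, brokers, target_rf):
    """Return a plan in which every partition has target_rf replicas."""
    plan = []
    cursor = 0
    for part in assignment["partitions"]:
        replicas = list(part["replicas"])
        if target_rf > len(set(replicas) | set(brokers)):
            raise ValueError("not enough brokers for the requested replication factor")
        while len(replicas) < target_rf:
            candidate = brokers[cursor % len(brokers)]
            cursor += 1
            # Skip brokers that already hold a replica of this partition.
            if candidate not in replicas:
                replicas.append(candidate)
        plan.append({"topic": part["topic"],
                     "partition": part["partition"],
                     "replicas": replicas})
    return {"version": 1, "partitions": plan}


# Hypothetical current state: two single-replica partitions.
current = {"version": 1, "partitions": [
    {"topic": "push-token-topic", "partition": 0, "replicas": [101]},
    {"topic": "push-token-topic", "partition": 1, "replicas": [102]},
]}

plan = raise_replication(current, brokers=[101, 102, 103], target_rf=2)
print(json.dumps(plan))
```

With this input, partition 0 ends up on [101, 102] and partition 1 on [102, 103]; the printed JSON can be saved and passed to --reassignment-json-file.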

2) Execute:

root@localhost:$ ./bin/kafka-reassign-partitions.sh --zookeeper 192.168.2.225:2183 --reassignment-json-file replicas-update-push-token-topic.json --execute

The command prints the current partition assignment followed by the new one.

3) Verify:

bin/kafka-reassign-partitions.sh --zookeeper 192.168.2.225:2183 --reassignment-json-file replicas-update-push-token-topic.json --verify

Output:

Status of partition reassignment:
Reassignment of partition [log.mobile_nginx,0] completed successfully
Reassignment of partition [log.mobile_nginx,1] completed successfully

3. Custom partition assignment and migration

1) First, hand-craft the custom reassignment plan in a JSON file:

> cat custom-reassignment.json
{"version":1,"partitions":[{"topic":"foo1","partition":0,"replicas":[5,6]},{"topic":"foo2","partition":1,"replicas":[2,3]}]}

2) Then pass the JSON file with the --execute option to start the reassignment:

> bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file custom-reassignment.json --execute

Current partition replica assignment

{"version":1,
 "partitions":[{"topic":"foo1","partition":0,"replicas":[1,2]},
               {"topic":"foo2","partition":1,"replicas":[3,4]}]
}

Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions
{"version":1,
 "partitions":[{"topic":"foo1","partition":0,"replicas":[5,6]},
               {"topic":"foo2","partition":1,"replicas":[2,3]}]
}

3) Use the --verify option to check the status of the reassignment. Pass the same custom-reassignment.json that was used with --execute:

bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file custom-reassignment.json --verify

Status of partition reassignment:
Reassignment of partition [foo1,0] completed successfully
Reassignment of partition [foo2,1] completed successfully
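A hand-written plan is easy to get wrong, so a quick sanity check before --execute can help. The sketch below looks for a few obvious mistakes: a wrong version, a partition listed twice, an empty or duplicated replica list, and broker IDs that are not in the cluster. The function name validate_plan and the live broker lists are assumptions for illustration; the tool performs its own checks, and this does not replace them.

```python
import json


def validate_plan(plan_json, live_brokers):
    """Return a list of human-readable problems found in a reassignment plan."""
    problems = []
    plan = json.loads(plan_json)
    if plan.get("version") != 1:
        problems.append("version must be 1")
    seen = set()
    for entry in plan.get("partitions", []):
        key = (entry["topic"], entry["partition"])
        replicas = entry["replicas"]
        if key in seen:
            problems.append(f"{key}: listed more than once")
        seen.add(key)
        if not replicas:
            problems.append(f"{key}: empty replica list")
        if len(set(replicas)) != len(replicas):
            problems.append(f"{key}: duplicate broker id in {replicas}")
        unknown = sorted(set(replicas) - set(live_brokers))
        if unknown:
            problems.append(f"{key}: unknown broker ids {unknown}")
    return problems


plan = ('{"version":1,"partitions":['
        '{"topic":"foo1","partition":0,"replicas":[5,6]},'
        '{"topic":"foo2","partition":1,"replicas":[2,3]}]}')

print(validate_plan(plan, live_brokers=[1, 2, 3, 4, 5, 6]))  # prints []
print(validate_plan(plan, live_brokers=[1, 2, 3, 4]))
```

The second call reports that brokers 5 and 6 are unknown, which is the kind of mistake that would otherwise surface only after the reassignment has been submitted.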

4. Increasing a topic's partition count

a. First increase the number of partitions. For example, push-token-topic starts with 12 partitions and is grown to 15:

root@localhost:$ ./bin/kafka-topics.sh --zookeeper 192.168.2.225:2183 --alter --partitions 15 --topic push-token-topic

b. Then set the replicas for the new partitions:

root@localhost:$ ./bin/kafka-reassign-partitions.sh --zookeeper 192.168.2.225:2183 --reassignment-json-file partitions-extension-push-token-topic.json --execute

partitions-extension-push-token-topic.json contains:

{
  "partitions": [
    {"topic": "push-token-topic", "partition": 12, "replicas": [101, 102]},
    {"topic": "push-token-topic", "partition": 13, "replicas": [103, 104]},
    {"topic": "push-token-topic", "partition": 14, "replicas": [105, 106]}
  ],
  "version": 1
}
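The file above can also be generated. The sketch below walks the broker list and gives each new partition the next replication_factor brokers, which reproduces the assignment shown for partitions 12 to 14. The function name plan_new_partitions and the walking strategy are assumptions for illustration, not Kafka's own placement logic.

```python
import json


def plan_new_partitions(topic, old_count, new_count, brokers, replication_factor):
    """Assign replicas for partitions old_count .. new_count - 1."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor exceeds number of brokers")
    partitions = []
    cursor = 0
    for partition in range(old_count, new_count):
        # Take the next replication_factor brokers, wrapping around the list.
        replicas = [brokers[(cursor + i) % len(brokers)]
                    for i in range(replication_factor)]
        cursor += replication_factor
        partitions.append({"topic": topic, "partition": partition, "replicas": replicas})
    return {"partitions": partitions, "version": 1}


plan = plan_new_partitions("push-token-topic", 12, 15,
                           [101, 102, 103, 104, 105, 106], 2)
print(json.dumps(plan, indent=2))
```

For the six brokers and replication factor 2 used here, partition 12 lands on [101, 102], partition 13 on [103, 104] and partition 14 on [105, 106].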

Date: 2024-11-05 02:22:57
