Recovering unassigned shards on Elasticsearch 2.x: replica shards can be recovered by setting replicas to 0 and then setting it back

Source: https://z0z0.me/recovering-unassigned-shards-on-elasticsearch/

I came across this problem when I added a node to the Elasticsearch cluster and the new node was unable to replicate the cluster's indexes. This usually happens when there is not enough disk space, when no master is available, or when the nodes run different Elasticsearch versions. My servers had more than enough disk space and the master was available. With help from the Elasticsearch Discuss forum I found out that the new node was running a different version than the old nodes. When installing on Debian jessie I had simply run apt-get install elasticsearch, which installed the latest available version. To install a specific version of Elasticsearch you need to append ={version} to the package name.

#apt-get install elasticsearch={version}
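A version mismatch like this one can be spotted before any shards go unassigned. The following is a minimal sketch, assuming the node names and versions come from the `_cat/nodes` API with the `name` and `version` columns selected. The helper names, the node name node-1, and the version numbers in the sample are made up for illustration.

```shell
# Print each distinct Elasticsearch version found in "name version" lines,
# such as the output of:
#   curl -s 'http://localhost:9200/_cat/nodes?h=name,version'
distinct_versions() {
  awk 'NF >= 2 { print $2 }' | sort -u
}

# Succeed (exit 0) only when every node on stdin reports the same version.
versions_match() {
  [ "$(distinct_versions | wc -l | tr -d ' ')" -eq 1 ]
}

# Sample input with invented versions: two distinct versions are printed,
# which is the situation described above.
printf 'node-1 2.3.1\nnode-2 2.2.0\nnode-3 2.2.0\n' | distinct_versions
```

On Debian, `apt-cache madison elasticsearch` should list the versions available for `={version}`, and `apt-mark hold elasticsearch` can keep a later upgrade from changing the installed version.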

Having identified the cause of the unallocated shards, I downgraded Elasticsearch to the required version with the command above. After starting the node, however, the cluster was still in the red state, with unassigned shards all over the place:

#curl http://localhost:9200/_cluster/health?pretty
 {
   "cluster_name" : "z0z0",
   "status" : "red",
   "timed_out" : false,
   "number_of_nodes" : 3,
   "number_of_data_nodes" : 3,
   "active_primary_shards" : 6,
   "active_shards" : 12,
   "relocating_shards" : 0,
   "initializing_shards" : 0,
   "unassigned_shards" : 8,
   "delayed_unassigned_shards" : 0,
   "number_of_pending_tasks" : 0,
   "number_of_in_flight_fetch" : 0,
   "task_max_waiting_in_queue_millis" : 0,
   "active_shards_percent_as_number" : 60.0
 }

#curl http://localhost:9200/_cat/shards
site-id      4 p UNASSIGNED
site-id      4 r UNASSIGNED
site-id      1 p UNASSIGNED
site-id      1 r UNASSIGNED
site-id      3 p STARTED    0 159b 10.0.0.6 node-2
site-id      3 r STARTED    0 159b 10.0.0.7 node-3
site-id      2 r STARTED    0 159b 10.0.0.6 node-2
site-id      2 p STARTED    0 159b 10.0.0.7 node-3
site-id      0 r STARTED    0 159b 10.0.0.6 node-2
site-id      0 p STARTED    0 159b 10.0.0.7 node-3
subscription 4 p UNASSIGNED
subscription 4 r UNASSIGNED
subscription 1 p UNASSIGNED
subscription 1 r UNASSIGNED
subscription 3 p STARTED    0 159b 10.0.0.6 node-2
subscription 3 r STARTED    0 159b 10.0.0.7 node-3
subscription 2 r STARTED    0 159b 10.0.0.6 node-2
subscription 2 p STARTED    0 159b 10.0.0.7 node-3
subscription 0 p STARTED    0 159b 10.0.0.6 node-2
subscription 0 r STARTED    0 159b 10.0.0.7 node-3
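With more than a handful of indexes, the unassigned entries are easier to read when filtered out of the listing. A small sketch, assuming the default `_cat/shards` column order shown above (index, shard, primary/replica flag, state); the function name is only illustrative.

```shell
# Read `_cat/shards` output on stdin and print "index shard prirep"
# for every shard whose state column is UNASSIGNED.
unassigned_shards() {
  awk '$4 == "UNASSIGNED" { print $1, $2, $3 }'
}

# Against a live node this would be used as:
#   curl -s http://localhost:9200/_cat/shards | unassigned_shards
# Here it runs on three lines copied from the listing above.
printf 'site-id 4 p UNASSIGNED\nsite-id 3 p STARTED 0 159b 10.0.0.6 node-2\nsubscription 1 r UNASSIGNED\n' | unassigned_shards
```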

At this point I was pretty desperate: whatever I tried either did nothing or ended in all kinds of failures. So I set number_of_replicas to 0 by running the following query:

#curl -XPUT http://localhost:9200/_settings?pretty -d '
{
  "index" : {
    "number_of_replicas" : 0
  }
}'

I then stopped the nodes one by one until only one live node remained.
At this point I decided to try rerouting the unassigned shards; if that did not work, I would simply rebuild my cluster from scratch. So I ran the following:

#curl -XPOST -d '
{
  "commands" : [ {
    "allocate" : {
      "index" : "site-id",
      "shard" : 1,
      "node" : "node-3",
      "allow_primary" : true
    }
  } ]
}' http://localhost:9200/_cluster/reroute?pretty

I saw the rerouted shard go to initializing and then started, so I ran the same command for the rest of the unassigned shards.
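Repeating the reroute by hand for every shard gets tedious, so the request body can be generated from the shard listing instead. This is only a sketch of that idea, not what was actually run here: the function name is hypothetical, it assumes the default `_cat/shards` column order, and it puts all allocate commands into a single request rather than one request per shard. Note that allow_primary can lose data when it forces an empty primary, so it only makes sense once nothing else has worked.

```shell
# Read `_cat/shards` output on stdin and print one reroute request body that
# allocates every UNASSIGNED primary shard to the node named in $1.
# Replica lines are skipped (replicas were set to 0 before this step).
reroute_body() {
  awk -v node="$1" '
    $4 == "UNASSIGNED" && $3 == "p" {
      cmds = cmds sep sprintf("{\"allocate\":{\"index\":\"%s\",\"shard\":%d,\"node\":\"%s\",\"allow_primary\":true}}", $1, $2, node)
      sep = ","
    }
    END { printf "{\"commands\":[%s]}\n", cmds }
  '
}

# Possible use against a live node (curl reads the body from stdin with -d @-):
#   curl -s http://localhost:9200/_cat/shards | reroute_body node-3 |
#     curl -XPOST 'http://localhost:9200/_cluster/reroute?pretty' -d @-
```

Printing the body first and reading it before sending it is a cheap safeguard, since the command is destructive for any shard that still has data elsewhere.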
Running curl http://localhost:9200/_cluster/health?pretty confirmed that I was on the right track to fixing the cluster.

#curl http://localhost:9200/_cluster/health?pretty
{
  "cluster_name" : "z0z0",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 10,
  "active_shards" : 20,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
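A health response like the one above can also be checked from a script, which is handy when restarting nodes one at a time. A minimal sketch, assuming the response is piped in as text; the function name is illustrative, and a real JSON parser would be more robust than this sed pattern.

```shell
# Extract the "status" value (green, yellow or red) from a
# `_cluster/health` JSON response read on stdin.
cluster_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Possible use against a live node:
#   curl -s http://localhost:9200/_cluster/health | cluster_status
# The health API also accepts wait_for_status and timeout parameters, e.g.
#   curl -s 'http://localhost:9200/_cluster/health?wait_for_status=green&timeout=50s'
# which blocks until the status is reached or the timeout expires.
```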

So the cluster was green again, but it was running on only one node. It was time to bring the other nodes back up one by one. When all the nodes were up I set number_of_replicas to 1 by running the following:

#curl -XPUT http://localhost:9200/_settings -d '
{
  "index" : {
    "number_of_replicas" : 1
  }
}'
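Since the replica count is changed twice in this procedure (to 0, then back to 1), the settings body can come from one small helper. A sketch only; the function name is hypothetical and the request itself is the same `_settings` call used above.

```shell
# Print the index settings body that sets number_of_replicas to the count in $1.
replicas_body() {
  printf '{"index":{"number_of_replicas":%d}}\n' "$1"
}

# Possible use against a live node:
#   replicas_body 0 | curl -XPUT http://localhost:9200/_settings -d @-   # drop replicas
#   replicas_body 1 | curl -XPUT http://localhost:9200/_settings -d @-   # restore them
replicas_body 1
```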

So my Elasticsearch cluster is back to running on 3 nodes and is still green. After a lot of googling and wasted time I decided to write this article so that anyone who comes across this issue has a working example of how to fix it.

Time: 2024-10-14 12:34:11
