

Partitioning: how to split data among multiple Redis instances.


Partitioning is the process of splitting your data into multiple Redis instances, so that every instance will only contain a subset of your keys. The first part of this document will introduce you to the concept of partitioning, the second part will show you the alternatives for Redis partitioning.


Why partitioning is useful


Partitioning in Redis serves two main goals:


  • It allows for much larger databases, using the sum of the memory of many computers. Without partitioning you are limited to the amount of memory a single computer can support.
  • It allows scaling computational power across multiple cores and multiple computers, and network bandwidth across multiple computers and network adapters.

Partitioning basics


There are different partitioning criteria. Imagine we have four Redis instances R0, R1, R2, R3, and many keys representing users, like user:1, user:2, and so forth. We can find different ways to select the instance in which a given key is stored. In other words, there are different systems to map a given key to a given Redis server.


One of the simplest ways to perform partitioning is called range partitioning, and is accomplished by mapping ranges of objects to specific Redis instances. For example, I could say that users from ID 0 to ID 10000 will go into instance R0, while users from ID 10001 to ID 20000 will go into instance R1, and so forth.


This system works and is actually used in practice. However, it has the disadvantage that it requires a table mapping ranges to instances. This table needs to be managed, and we need a table for every kind of object we have. With Redis this is usually not a good idea.
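
To make the role of that table concrete, here is a minimal Python sketch of range partitioning. The upper bounds for R2 and R3 and the function name are assumptions added for illustration; only the R0 and R1 ranges come from the example above.

```python
import bisect

# Hypothetical range table for "user" objects. UPPER_BOUNDS[i] is the highest
# user ID stored on INSTANCES[i]. The R2 and R3 bounds are made up.
UPPER_BOUNDS = [10000, 20000, 30000, 40000]
INSTANCES = ["R0", "R1", "R2", "R3"]


def instance_for_user(user_id: int) -> str:
    """Return the instance whose ID range contains user_id."""
    if user_id < 0 or user_id > UPPER_BOUNDS[-1]:
        raise ValueError(f"no instance configured for user ID {user_id}")
    # bisect_left finds the first range whose upper bound is >= user_id.
    return INSTANCES[bisect.bisect_left(UPPER_BOUNDS, user_id)]


if __name__ == "__main__":
    for uid in (0, 10000, 10001, 35000):
        print(f"user:{uid} -> {instance_for_user(uid)}")
```

Note that a second kind of object (say, orders) would need its own pair of lists, which is exactly the management burden described above.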

An alternative to range partitioning is hash partitioning. This scheme works with any key, with no need for a key in the form object_name:<id>, and is as simple as this:


  • Take the key name and use a hash function to turn it into a number. For instance I could use the crc32 hash function. So if the key is foobar I do crc32(foobar), which will output something like 93024922.
  • I use a modulo operation with this number in order to turn it into a number between 0 and 3, so that I can map this number to one of the four Redis instances I have. So 93024922 modulo 4 equals 2, so I know my key foobar should be stored into the R2 instance. Note: the modulo operation is just the remainder of the division, and it is implemented by the % operator in many programming languages.
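
The two steps above can be sketched in a few lines of Python. This is only an illustration: the instance names are the ones used in this document, the function name is made up, and the number a real crc32 call returns for foobar will differ from the "something like" value quoted above.

```python
import zlib

INSTANCES = ["R0", "R1", "R2", "R3"]


def instance_for_key(key: str) -> str:
    """Map any key name to one of the four instances."""
    # Step 1: turn the key name into a number (unsigned 32-bit CRC).
    hashed = zlib.crc32(key.encode("utf-8"))
    # Step 2: reduce that number to an index between 0 and 3.
    return INSTANCES[hashed % len(INSTANCES)]


if __name__ == "__main__":
    print(instance_for_key("foobar"))
    # The arithmetic from the text: 93024922 modulo 4 is 2, hence R2.
    print(INSTANCES[93024922 % 4])
```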

There are many other ways to perform partitioning, but with these two examples you should get the idea. One advanced form of hash partitioning is called consistent hashing and is implemented by a few Redis clients and proxies.
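
As a rough idea of how consistent hashing behaves, here is a toy hash ring in Python. It is a sketch under assumptions (MD5 as the hash, 100 virtual points per node, the class name HashRing), not the algorithm of any particular client or proxy.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring. Each node owns several points on the ring."""

    def __init__(self, nodes, vnodes=100):
        self._nodes = set(nodes)
        self._vnodes = vnodes  # points per node, an arbitrary choice
        self._rebuild()

    @staticmethod
    def _point(value: str) -> int:
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def _rebuild(self):
        entries = sorted(
            (self._point(f"{node}#{i}"), node)
            for node in self._nodes
            for i in range(self._vnodes)
        )
        self._points = [point for point, _ in entries]
        self._owners = [node for _, node in entries]

    def add(self, node):
        self._nodes.add(node)
        self._rebuild()

    def remove(self, node):
        self._nodes.discard(node)
        self._rebuild()

    def node_for(self, key: str) -> str:
        if not self._points:
            raise LookupError("the ring has no nodes")
        # The key belongs to the first point after its hash, wrapping around.
        idx = bisect.bisect_right(self._points, self._point(key))
        return self._owners[idx % len(self._points)]
```

The property that matters is that removing a node only moves the keys that lived on that node; every other key keeps its instance, unlike with the plain modulo scheme.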


Different implementations of partitioning


Partitioning can be the responsibility of different parts of a software stack.


  • Client side partitioning means that the clients directly select the right node where to write or read a given key. Many Redis clients implement client side partitioning.
  • Proxy assisted partitioning means that our clients send requests to a proxy that is able to speak the Redis protocol, instead of sending requests directly to the right Redis instance. The proxy will make sure to forward our request to the right Redis instance according to the configured partitioning schema, and will send the replies back to the client. The Redis and Memcached proxy Twemproxy implements proxy assisted partitioning.
  • Query routing means that you can send your query to a random instance, and the instance will make sure to forward your query to the right node. Redis Cluster implements a hybrid form of query routing, with the help of the client (the request is not directly forwarded from one Redis instance to another, but the client gets redirected to the right node).
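
The hybrid form of query routing can be illustrated with a small in-memory simulation. Everything here is a toy: the classes, the 16-slot layout, the crc32-based slot function and the ("MOVED", node) tuple are assumptions for the sketch and do not reproduce the real Redis Cluster protocol.

```python
import zlib
from dataclasses import dataclass, field

NUM_SLOTS = 16  # toy slot count, chosen only to keep the example small


def slot_for(key: str) -> int:
    return zlib.crc32(key.encode("utf-8")) % NUM_SLOTS


@dataclass
class ToyNode:
    name: str
    slots: range
    data: dict = field(default_factory=dict)


class ToyCluster:
    def __init__(self, nodes):
        self.nodes = {node.name: node for node in nodes}

    def owner_of(self, key: str) -> str:
        slot = slot_for(key)
        return next(n.name for n in self.nodes.values() if slot in n.slots)

    def ask(self, node_name, op, key, value=None):
        """Node-side logic: serve the key if owned, otherwise redirect."""
        owner = self.owner_of(key)
        if owner != node_name:
            return ("MOVED", owner)
        store = self.nodes[node_name].data
        if op == "SET":
            store[key] = value
            return ("OK", None)
        return ("OK", store.get(key))


def client_call(cluster, first_node, op, key, value=None):
    """Client-side logic: try any node, then follow one redirection."""
    status, payload = cluster.ask(first_node, op, key, value)
    if status == "MOVED":
        status, payload = cluster.ask(payload, op, key, value)
    return payload
```

The node never forwards the request itself. It only tells the client where to go, and the client repeats the query against that node.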

Disadvantages of partitioning

Some features of Redis don't play very well with partitioning:


  • Operations involving multiple keys are usually not supported. For instance you can't perform the intersection between two sets if they are stored in keys that are mapped to different Redis instances (actually there are ways to do this, but not directly).
  • Redis transactions involving multiple keys cannot be used.
  • The partitioning granularity is the key, so it is not possible to shard a dataset with a single huge key like a very big sorted set.
  • When partitioning is used, data handling is more complex, for instance you have to handle multiple RDB / AOF files, and to make a backup of your data you need to aggregate the persistence files from multiple instances and hosts.
  • Adding and removing capacity can be complex. For instance Redis Cluster plans to support mostly transparent rebalancing of data with the ability to add and remove nodes at runtime, but other systems like client side partitioning and proxies don't support this feature. However a technique called Presharding helps in this regard.
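
As an illustration of the first point, one possible indirect way to intersect two sets that live on different instances is to fetch both and intersect them in the client. In this sketch two Python dicts stand in for two Redis instances, and the helper names merely echo the corresponding Redis commands; it is an assumption about how one might work around the limitation, not a recommended pattern.

```python
import zlib

# Two in-memory dicts stand in for two Redis instances (hypothetical setup).
instances = [{}, {}]


def instance_for(key: str) -> dict:
    return instances[zlib.crc32(key.encode("utf-8")) % len(instances)]


def sadd(key: str, *members) -> None:
    instance_for(key).setdefault(key, set()).update(members)


def smembers(key: str) -> set:
    return set(instance_for(key).get(key, set()))


def client_side_sinter(key_a: str, key_b: str) -> set:
    # One read per key, possibly against different instances; the
    # intersection itself is computed in the client.
    return smembers(key_a) & smembers(key_b)


if __name__ == "__main__":
    sadd("friends:alice", "bob", "carol", "dave")
    sadd("friends:erin", "carol", "dave", "frank")
    print(client_side_sinter("friends:alice", "friends:erin"))
```

The price is that both sets travel over the network in full, and the two reads are not atomic with respect to each other.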

Data store or cache?

Partitioning when using Redis as a data store or as a cache is conceptually the same, however there is a huge difference. When Redis is used as a data store, you need to be sure that a given key always maps to the same instance. When Redis is used as a cache, it is not a big problem if a given node is unavailable and we start using a different node, altering the key-instance map as we wish to improve the availability of the system (that is, the ability of the system to reply to our queries).

Consistent hashing implementations are often able to switch to other nodes if the preferred node for a given key is not available. Similarly if you add a new node, part of the new keys will start to be stored on the new node.

The main concept here is the following:

  • If Redis is used as a cache, scaling up and down using consistent hashing is easy.
  • If Redis is used as a store, we need to keep the map between keys and nodes fixed, and a fixed number of nodes. Otherwise we need a system that is able to rebalance keys between nodes when we add or remove nodes, and currently only Redis Cluster is able to do this, but Redis Cluster is not production ready.
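
To see why the number of nodes has to stay fixed for a store, the sketch below counts how many keys change instance when a plain modulo scheme grows from four to five nodes. The key names and function names are made up for the example.

```python
import zlib


def node_index(key: str, node_count: int) -> int:
    return zlib.crc32(key.encode("utf-8")) % node_count


def fraction_remapped(keys, old_count: int, new_count: int) -> float:
    """Share of keys whose instance changes when the node count changes."""
    moved = sum(
        1 for key in keys
        if node_index(key, old_count) != node_index(key, new_count)
    )
    return moved / len(keys)


if __name__ == "__main__":
    keys = [f"user:{i}" for i in range(10000)]
    print(f"remapped: {fraction_remapped(keys, 4, 5):.0%}")
```

With a well-distributed hash, a key keeps its instance only when its hash gives the same remainder modulo 4 and modulo 5, which happens for about one key in five. For a cache this just means a burst of misses; for a store it means most of the data is suddenly looked up in the wrong place.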



Presharding


We learned that a problem with partitioning is that, unless we are using Redis as a cache, adding and removing nodes can be tricky, and it is much simpler to use a fixed keys-instances map.


However, data storage needs may vary over time. Today I can live with 10 Redis nodes (instances), but tomorrow I may need 50 nodes.


Since Redis has an extremely small footprint and is lightweight (a spare instance uses 1 MB of memory), a simple approach to this problem is to start with a lot of instances from the beginning. Even if you start with just one server, you can decide to live in a distributed world from your first day, and run multiple Redis instances on your single server, using partitioning.


And you can select this number of instances to be quite big from the start. For example, 32 or 64 instances could do the trick for most users, and will provide enough room for growth.


In this way, as your data storage needs increase and you need more Redis servers, all you have to do is move instances from one server to another. Once you add the first additional server, you will need to move half of the Redis instances from the first server to the second, and so forth.
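
Here is a small Python sketch of the idea. The key-to-instance step never changes; only the table saying where each instance runs is edited when a server is added. The host addresses, the port numbering (6379 plus the instance number) and the function names are assumptions for the example.

```python
import zlib

NUM_INSTANCES = 64  # chosen once and kept for the lifetime of the dataset


def initial_layout(host: str) -> dict:
    """All instances start on one server; instance i listens on 6379 + i."""
    return {i: (host, 6379 + i) for i in range(NUM_INSTANCES)}


def move_instances(layout: dict, instance_ids, new_host: str) -> dict:
    """Return a new layout with the given instances relocated to new_host."""
    updated = dict(layout)
    for i in instance_ids:
        _, port = updated[i]
        updated[i] = (new_host, port)
    return updated


def instance_id_for(key: str) -> int:
    # Depends only on the key and NUM_INSTANCES, never on the servers.
    return zlib.crc32(key.encode("utf-8")) % NUM_INSTANCES


def address_for(key: str, layout: dict) -> tuple:
    return layout[instance_id_for(key)]


if __name__ == "__main__":
    layout = initial_layout("10.0.0.1")
    # A second server arrives: move half of the instances to it.
    layout = move_instances(layout, range(32, 64), "10.0.0.2")
    print(address_for("user:1", layout))
```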


Using Redis replication you will likely be able to do the move with minimal or no downtime for your users:

  • Start empty instances in your new server.
  • Move data by configuring these new instances as slaves of your source instances.
  • Stop your clients.
  • Update the configuration of the moved instances with the new server IP address.
  • Send the SLAVEOF NO ONE command to the slaves in the new server.
  • Restart your clients with the new updated configuration.
  • Finally shut down the no longer used instances in the old server.
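
The server-side part of this procedure can be written down as a list of redis-cli invocations. The helper below only builds the command strings; it does not run anything. The host names and ports are hypothetical, and it assumes each instance keeps the same port on the new server. Stopping, reconfiguring and restarting the clients happens outside of it.

```python
def migration_commands(old_host: str, new_host: str, ports) -> dict:
    """Build redis-cli commands for moving the instances on the given ports."""
    ports = list(ports)
    return {
        # Make each new, empty instance a slave of its source instance.
        "replicate": [
            f"redis-cli -h {new_host} -p {p} SLAVEOF {old_host} {p}"
            for p in ports
        ],
        # Inspect the replication state before stopping the clients.
        "verify": [
            f"redis-cli -h {new_host} -p {p} INFO replication" for p in ports
        ],
        # After the clients are stopped: turn the slaves into masters.
        "promote": [
            f"redis-cli -h {new_host} -p {p} SLAVEOF NO ONE" for p in ports
        ],
        # After the clients are restarted: retire the old instances.
        "shutdown_old": [
            f"redis-cli -h {old_host} -p {p} SHUTDOWN" for p in ports
        ],
    }


if __name__ == "__main__":
    plan = migration_commands("10.0.0.1", "10.0.0.2", range(6411, 6413))
    for phase, commands in plan.items():
        print(f"# {phase}")
        for command in commands:
            print(command)
```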

Implementations of Redis partitioning

So far we covered Redis partitioning in theory, but what about practice? What system should you use?


Redis Cluster


Unfortunately Redis Cluster is currently not production ready, however you can get more information about it by reading the specification or checking the partial implementation in the unstable branch of the Redis GitHub repository.


Once Redis Cluster is available, and if a Redis Cluster compliant client is available for your language, Redis Cluster will be the de facto standard for Redis partitioning.


Redis Cluster is a mix between query routing and client side partitioning.



Twemproxy

Twemproxy is a proxy developed at Twitter for the Memcached ASCII and the Redis protocol. It is single threaded, it is written in C, and is extremely fast. It is open source software released under the terms of the Apache 2.0 license.

Twemproxy supports automatic partitioning among multiple Redis instances, with optional node ejection if a node is not available (this will change the keys-instances map, so you should use this feature only if you are using Redis as a cache).


It is not a single point of failure since you can start multiple proxies and instruct your clients to connect to the first that accepts the connection.
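
On the client side, "connect to the first that accepts the connection" can be as simple as the following sketch. The function name is made up, and the list of proxy addresses is whatever your deployment uses; nothing here is specific to Twemproxy.

```python
import socket


def connect_to_first_available(addresses, timeout: float = 1.0) -> socket.socket:
    """Try each (host, port) in order and return the first open connection."""
    last_error = None
    for host, port in addresses:
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except OSError as exc:
            # Refused, unreachable or timed out: remember it, try the next.
            last_error = exc
    raise ConnectionError(f"no proxy accepted the connection: {last_error}")
```

A caller would pass something like a list of two or three proxy addresses and then speak the Redis protocol over the returned socket. In practice a Redis client library would do the protocol part, with the same try-in-order logic applied to its connection settings.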


Basically Twemproxy is an intermediate layer between clients and Redis instances that will reliably handle partitioning for us with minimal additional complexity. Currently it is the suggested way to handle partitioning with Redis.


You can read more about Twemproxy in this antirez blog post.


Clients supporting consistent hashing


An alternative to Twemproxy is to use a client that implements client side partitioning via consistent hashing or other similar algorithms. There are multiple Redis clients with support for consistent hashing, notably Redis-rb and Predis.

Please check the full list of Redis clients to see whether there is a mature client with a consistent hashing implementation for your language.



Time: 2024-10-07 16:28:27


