Sharding & IDs at Instagram, Flickr ID generation

Instagram: http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram

Flickr: http://code.flickr.net/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/

Flickr uses ticket servers backed by a database: IDs come from an auto-increment column, relying on the database's ACID guarantees.

Instagram:

IDs should be sortable by time (chronologically ordered);

Sharding: several thousand "logical" shards are mapped onto far fewer physical shards, which makes it much easier to move logical shards out to new physical shards later.
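The logical-to-physical mapping can be sketched in a few lines of Python. This is only an illustration: the shard count, the host names, and the choice of `user_id % NUM_LOGICAL_SHARDS` as the routing rule are assumptions made for the example, not details taken from the notes above.

```python
# Hypothetical numbers: 2000 logical shards spread over 4 physical hosts.
NUM_LOGICAL_SHARDS = 2000

# Lookup table: logical shard -> physical host (host names are made up).
shard_map = {s: f"db{s % 4}" for s in range(NUM_LOGICAL_SHARDS)}


def logical_shard_for(user_id: int) -> int:
    """Pick the logical shard for a user; this never changes for a given user."""
    return user_id % NUM_LOGICAL_SHARDS


def host_for(user_id: int) -> str:
    """Resolve the physical host that currently holds the user's logical shard."""
    return shard_map[logical_shard_for(user_id)]


def move_shard(logical_shard: int, new_host: str) -> None:
    """Repoint one logical shard to another host (copying the data is out of scope)."""
    shard_map[logical_shard] = new_host
```

Because rows are keyed by logical shard, moving a shard only changes one entry in the lookup table; nothing has to be re-bucketed.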

ID generation: a PL/pgSQL stored procedure, next_id(), generates each ID inside the database (transactionally). An ID consists of 41 bits for the current time in milliseconds (counted from a custom epoch), 13 bits for the logical shard ID, and 10 bits for an auto-incrementing sequence modulo 1024, so at most 1024 IDs can be generated per shard per millisecond;

Because the ID also encodes the shard, it's easy to find the shard for any primary key ID.

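CREATE OR REPLACE FUNCTION insta5.next_id(OUT result bigint) AS $$
DECLARE
    our_epoch bigint := 1314220021721;  -- custom epoch, in milliseconds
    seq_id bigint;
    now_millis bigint;
    shard_id int := 5;                  -- logical shard ID (5 for schema insta5)
BEGIN
    -- Low 10 bits: per-shard sequence, wrapped to 0..1023
    SELECT nextval('insta5.table_id_seq') % 1024 INTO seq_id;

    -- High 41 bits: milliseconds elapsed since our_epoch
    SELECT FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 1000) INTO now_millis;
    result := (now_millis - our_epoch) << 23;
    -- Middle 13 bits: logical shard ID
    result := result | (shard_id << 10);
    result := result | (seq_id);
END;
$$ LANGUAGE PLPGSQL;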

CREATE TABLE insta5.our_table (
    "id" bigint NOT NULL DEFAULT insta5.next_id(),
    ...rest of table schema...
);
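To make the bit layout concrete, here is a minimal Python sketch that packs and unpacks an ID using the same 41/13/10 split and the same epoch as next_id(). The helper names `make_id` and `split_id` are made up for this example; in the setup described above, IDs are generated inside the database, not in application code.

```python
OUR_EPOCH_MS = 1314220021721  # same custom epoch as next_id()
SHARD_BITS = 13
SEQ_BITS = 10


def make_id(now_ms: int, shard_id: int, seq: int) -> int:
    """Pack (time, shard, sequence) into one 64-bit integer."""
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id does not fit in 13 bits")
    result = (now_ms - OUR_EPOCH_MS) << (SHARD_BITS + SEQ_BITS)
    result |= shard_id << SEQ_BITS
    result |= seq % (1 << SEQ_BITS)  # wrap the sequence to 0..1023
    return result


def split_id(id_: int) -> tuple[int, int, int]:
    """Recover (time in ms, shard ID, wrapped sequence) from an ID."""
    seq = id_ & ((1 << SEQ_BITS) - 1)
    shard_id = (id_ >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
    now_ms = (id_ >> (SHARD_BITS + SEQ_BITS)) + OUR_EPOCH_MS
    return now_ms, shard_id, seq
```

Since the timestamp occupies the most significant bits, comparing two IDs as plain integers orders them by creation time first, and the shard can be read back with a shift and a mask.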

Flickr:

/* The Tickets64 schema looks like: */

CREATE TABLE `Tickets64` (
  `id` bigint(20) unsigned NOT NULL auto_increment,
  `stub` char(1) NOT NULL default '',
  PRIMARY KEY  (`id`),
  UNIQUE KEY `stub` (`stub`)
) ENGINE=MyISAM;

SELECT * from Tickets64 returns a single row that looks something like:
+-------------------+------+
| id                | stub |
+-------------------+------+
| 72157623227190423 |    a |
+-------------------+------+
/* When I need a new globally unique 64-bit ID I issue the following SQL: */

REPLACE INTO Tickets64 (stub) VALUES ('a');
SELECT LAST_INSERT_ID();
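The REPLACE INTO trick can be tried locally with a small Python sketch. Note the assumptions: it uses an in-memory SQLite database as a stand-in for MySQL, SQLite's AUTOINCREMENT in place of MySQL's auto_increment, and `cursor.lastrowid` in place of LAST_INSERT_ID(). The helper name `next_ticket` is made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL ticket server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Tickets64 ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " stub TEXT NOT NULL DEFAULT '' UNIQUE)"
)


def next_ticket(conn: sqlite3.Connection) -> int:
    """Replace the single 'a' row and return the freshly assigned id."""
    # The UNIQUE constraint on stub makes REPLACE delete the old row and
    # insert a new one, which receives the next auto-increment value.
    cur = conn.execute("REPLACE INTO Tickets64 (stub) VALUES ('a')")
    conn.commit()
    return cur.lastrowid
```

Each call yields a larger ID than the previous one, while the table itself never grows beyond a single row.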

To avoid making the ticket server a single point of failure, run two of them, with one issuing odd IDs and the other even IDs:

TicketServer1:
auto-increment-increment = 2
auto-increment-offset = 1

TicketServer2:
auto-increment-increment = 2
auto-increment-offset = 2
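The effect of these two settings can be simulated with a short Python sketch. Everything here is hypothetical scaffolding: the `TicketServer` class is an in-memory stand-in for a MySQL instance, and the round-robin with failover in `next_ticket` is one plausible way for a client to use the pair, written under the assumption that a failed server simply raises a connection error.

```python
import itertools


class TicketServer:
    """In-memory stand-in for one MySQL ticket server."""

    def __init__(self, offset: int, increment: int) -> None:
        self.next_value = offset    # auto-increment-offset
        self.increment = increment  # auto-increment-increment
        self.up = True

    def next_id(self) -> int:
        if not self.up:
            raise ConnectionError("ticket server down")
        value = self.next_value
        self.next_value += self.increment
        return value


# TicketServer1 issues 1, 3, 5, ...; TicketServer2 issues 2, 4, 6, ...
servers = [TicketServer(offset=1, increment=2), TicketServer(offset=2, increment=2)]
_round_robin = itertools.cycle(servers)


def next_ticket() -> int:
    """Try the servers in round-robin order, skipping any that are down."""
    for _ in range(len(servers)):
        server = next(_round_robin)
        try:
            return server.next_id()
        except ConnectionError:
            continue
    raise RuntimeError("no ticket server available")
```

The two sequences never overlap, so IDs stay unique even if one server is down for a while; the trade-off is that IDs are no longer strictly ordered across servers, and one sequence may run ahead of the other.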

Presentation on this topic:

http://media.postgresql.org/sfpug/instagram_sfpug.pdf 

Time: 2024-10-23 20:58:48
