What is a partition key?

DynamoDB supports two types of primary keys:

  • Partition key: A simple primary key, composed of one attribute known as the partition key. Attributes in DynamoDB are similar in many ways to fields or columns in other database systems.
  • Partition key and sort key: Referred to as a composite primary key, this type of key is composed of two attributes. The first attribute is the partition key, and the second attribute is the sort key. Following is an example.
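As a rough sketch, the two key types can be written as key schemas. The code below builds them as plain Python dictionaries in the shape used by the `KeySchema` parameter of DynamoDB's CreateTable API, where `HASH` marks the partition key and `RANGE` marks the sort key. The attribute names `CustomerId` and `OrderDate` and both helper functions are hypothetical, and no AWS call is made.

```python
# KeySchema entries in the shape used by DynamoDB's CreateTable API.
# The helper functions and attribute names are hypothetical examples.

def simple_key_schema(partition_key):
    """Simple primary key: a single partition key attribute."""
    return [{"AttributeName": partition_key, "KeyType": "HASH"}]


def composite_key_schema(partition_key, sort_key):
    """Composite primary key: partition key first, then sort key."""
    return [
        {"AttributeName": partition_key, "KeyType": "HASH"},
        {"AttributeName": sort_key, "KeyType": "RANGE"},
    ]


simple = simple_key_schema("CustomerId")
composite = composite_key_schema("CustomerId", "OrderDate")
print(simple)
print(composite)
```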

Why do I need a partition key?

DynamoDB stores data as groups of attributes, known as items. Items are similar to rows or records in other database systems. DynamoDB stores and retrieves each item based on the primary key value, which must be unique. Items are distributed across 10-GB storage units, called partitions (physical storage internal to DynamoDB). Each table has one or more partitions, as shown in the following illustration. For more information, see Partitions and Data Distribution in the DynamoDB Developer Guide.

DynamoDB uses the partition key’s value as an input to an internal hash function. The output from the hash function determines the partition in which the item is stored. Each item’s location is determined by the hash value of its partition key.
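The idea of hashing a partition key to pick a partition can be illustrated with a small sketch. DynamoDB's actual hash function and its mapping from hash values to partitions are internal and not described here, so the MD5 digest, the modulo step, and the function name `partition_for` are assumptions chosen only for illustration.

```python
import hashlib


def partition_for(partition_key: str, num_partitions: int) -> int:
    """Map a partition key value to a partition index.

    Illustrative only: MD5 and modulo are stand-ins for DynamoDB's
    internal (non-public) hash function and partition mapping.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_partitions


# The same key value always lands on the same partition.
for key in ["user-1", "user-2", "user-3", "user-1"]:
    print(key, "->", partition_for(key, 4))
```

Because the partition depends only on the key value, repeated requests for one key value always go to the same partition, which is why the choice of partition key matters for load distribution.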

All items with the same partition key are stored together and, in tables with a composite primary key, are ordered by the sort key value. DynamoDB splits partitions by sort key if the collection size grows bigger than 10 GB.
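A minimal in-memory sketch of this layout follows. It groups items by partition key and keeps each group sorted by sort key. The class `ItemCollections`, its methods, and the sample keys are hypothetical; this models the ordering behavior only and says nothing about how DynamoDB implements storage or partition splits.

```python
import bisect
from collections import defaultdict


class ItemCollections:
    """Toy model: items grouped by partition key, ordered by sort key."""

    def __init__(self):
        # partition key -> list of (sort_key, item), kept sorted by sort_key
        self._collections = defaultdict(list)

    def put(self, partition_key, sort_key, item):
        entries = self._collections[partition_key]
        sort_keys = [k for k, _ in entries]
        idx = bisect.bisect_left(sort_keys, sort_key)
        if idx < len(entries) and entries[idx][0] == sort_key:
            entries[idx] = (sort_key, item)  # same primary key: overwrite
        else:
            entries.insert(idx, (sort_key, item))

    def query(self, partition_key):
        """Return the items for one partition key in sort key order."""
        return [item for _, item in self._collections.get(partition_key, [])]


store = ItemCollections()
store.put("customer-1", "2024-03-01", {"total": 30})
store.put("customer-1", "2024-01-15", {"total": 10})
store.put("customer-2", "2024-02-01", {"total": 20})
print(store.query("customer-1"))  # [{'total': 10}, {'total': 30}]
```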

Partition keys and request throttling

DynamoDB evenly distributes provisioned throughput, measured in read capacity units (RCUs) and write capacity units (WCUs), among partitions and automatically supports your access patterns using the throughput you have provisioned. However, if your access pattern exceeds 3000 RCU or 1000 WCU for a single partition key value, your requests might be throttled with a ProvisionedThroughputExceededException error.
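The per-key limits can be checked with some back-of-the-envelope arithmetic. The sketch below assumes the standard capacity unit definitions: one RCU covers one strongly consistent read per second of an item up to 4 KB (an eventually consistent read costs half), and one WCU covers one write per second of an item up to 1 KB. The function names and the sample workload are hypothetical, and 1 KB is taken as 1024 bytes.

```python
import math

MAX_RCU_PER_KEY = 3000  # per-partition-key read limit quoted above
MAX_WCU_PER_KEY = 1000  # per-partition-key write limit quoted above


def read_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate RCUs per second: 4 KB per unit, halved if eventually consistent."""
    units_per_read = math.ceil(item_size_bytes / 4096)
    if not strongly_consistent:
        units_per_read = units_per_read / 2
    return units_per_read * reads_per_second


def write_units(item_size_bytes, writes_per_second):
    """Estimate WCUs per second: 1 KB per unit."""
    return math.ceil(item_size_bytes / 1024) * writes_per_second


def may_be_throttled(item_size_bytes, reads_per_second, writes_per_second):
    """True if the estimated load on one key value exceeds either limit."""
    return (
        read_units(item_size_bytes, reads_per_second) > MAX_RCU_PER_KEY
        or write_units(item_size_bytes, writes_per_second) > MAX_WCU_PER_KEY
    )


# Hypothetical workload: a 2 KB item read 1000 times and written 600 times per second.
print(read_units(2048, 1000))             # 1000
print(write_units(2048, 600))             # 1200
print(may_be_throttled(2048, 1000, 600))  # True
```

In this example the reads stay under the limit, but the 2 KB writes cost two WCUs each, so 600 writes per second to one key value already exceed 1000 WCU.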

Reading or writing above the limit can be caused by these issues:

  • Uneven distribution of data due to the wrong choice of partition key
  • Frequent access of the same key in a partition (the most popular item, also known as a hot key)
  • A request rate greater than the provisioned throughput
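One way to spot the first two causes is to look at how requests are spread across key values. The sketch below counts accesses per key in a sample of requests and flags any key that receives more than a chosen share of the traffic. The function names, the sample data, and the 50% threshold are all hypothetical; a real analysis would use whatever request metrics are available for the table.

```python
from collections import Counter


def key_access_share(accessed_keys):
    """Return each key's fraction of total accesses, most popular first."""
    counts = Counter(accessed_keys)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.most_common()}


def hot_keys(accessed_keys, threshold=0.5):
    """Return keys whose share of accesses exceeds the (hypothetical) threshold."""
    shares = key_access_share(accessed_keys)
    return [key for key, share in shares.items() if share > threshold]


# Hypothetical sample: one item receives 8 of 10 requests.
sample = ["item-A"] * 8 + ["item-B", "item-C"]
print(key_access_share(sample))
print(hot_keys(sample))  # ['item-A']
```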

To avoid request throttling, design your DynamoDB table with the right partition key to meet your access requirements and provide even distribution of data.

Original article: https://www.cnblogs.com/cloudrivers/p/11619356.html

Time: 2024-10-24 13:40:09
