Partitioned Table Query Performance

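The efficient query performance of a partitioned table rests on two mechanisms: partition elimination and partition parallelism. Partition elimination means that when a T-SQL query runs, SQL Server does not access every partition of the table. Instead it uses the predicate on the partition column to rule out partitions, and reads only the partitions that can satisfy the filter. Partition parallelism means that queries can run against different partitions concurrently. Because a query on a partitioned table covers a smaller range of data with a higher degree of concurrency, it can achieve better performance.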

I. Partition Elimination

A key benefit of table partitioning is partition elimination, whereby the query processor can eliminate inapplicable partitions of a table or index from a query plan. The more fine-grained the query filter is on the partition column, the more partitions can be eliminated and the more efficient the query can become. Partition elimination can occur with table partitions, index-aligned partitions, and partition-aligned indexed views in both SQL Server 2008 and SQL Server 2005. Partition elimination can also occur with nonaligned indexes.

There are two main ways to achieve partition elimination:

1. Filtering the partition column

SQL Server 2008 adds a new partition-aware seek operation as the mechanism for partition elimination. It is based on a hidden computed column created internally to represent the ID of a table or index partition for a specific row. In effect, a new computed column, a partition ID, is added to the beginning of the clustered index of the partitioned table.

Partition elimination is most useful when a large set of table partitions are eliminated, because partition elimination helps reduce the amount of work needed to satisfy the query. So the key point is to filter on as small a number of partitions as possible based on filtering the partition column.
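
The filtering described above can be sketched in T-SQL as follows. This is a minimal illustration, not taken from the source: the names pf_OrderYear, ps_OrderYear and dbo.Orders, the yearly boundary values, and the use of the PRIMARY filegroup are all assumptions chosen for the example.

```sql
-- Hypothetical partition function: yearly ranges on a date column.
-- RANGE RIGHT means each boundary value belongs to the partition on its right.
CREATE PARTITION FUNCTION pf_OrderYear (date)
AS RANGE RIGHT FOR VALUES ('2022-01-01', '2023-01-01', '2024-01-01');
GO

-- Hypothetical scheme: all four partitions on PRIMARY, to keep the sketch short.
CREATE PARTITION SCHEME ps_OrderYear
AS PARTITION pf_OrderYear ALL TO ([PRIMARY]);
GO

-- Hypothetical table partitioned on OrderDate.
CREATE TABLE dbo.Orders
(
    OrderID   int   NOT NULL,
    OrderDate date  NOT NULL,
    Amount    money NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderDate, OrderID)
) ON ps_OrderYear (OrderDate);
GO

-- $PARTITION reports the partition number a value maps to.
-- With the boundaries above, 2023-06-15 falls in partition 3.
SELECT $PARTITION.pf_OrderYear(CAST('2023-06-15' AS date)) AS PartitionNumber;

-- The predicate is on the partition column, so the optimizer can
-- restrict the seek to the single partition that covers 2023.
SELECT OrderID, OrderDate, Amount
FROM dbo.Orders
WHERE OrderDate >= '2023-01-01'
  AND OrderDate <  '2024-01-01';
```

To confirm that elimination took place, inspect the actual execution plan: on SQL Server 2008 and later the seek or scan operator on a partitioned object reports how many partitions were accessed.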

2. Join collocation

If you query two tables that are partitioned with compatible partition functions, you may be able to join the two tables on their partition column. In that case, SQL Server may be able to pair up the partitions from each table and join them at the partition level. This is called 'join collocation', implying compatibility between the partitions of the two tables that is leveraged in the query.
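
As a rough sketch of a join that is eligible for collocation, the two tables below share one partition function and are joined on the partition column. All object names (pf_SalesYear, ps_SalesYear, dbo.SalesHeader, dbo.SalesDetail) and boundary values are hypothetical, and whether the optimizer actually chooses a collocated join depends on the plan it builds.

```sql
-- One hypothetical partition function and scheme shared by both tables.
CREATE PARTITION FUNCTION pf_SalesYear (date)
AS RANGE RIGHT FOR VALUES ('2022-01-01', '2023-01-01', '2024-01-01');
GO

CREATE PARTITION SCHEME ps_SalesYear
AS PARTITION pf_SalesYear ALL TO ([PRIMARY]);
GO

CREATE TABLE dbo.SalesHeader
(
    SaleDate   date NOT NULL,
    SaleID     int  NOT NULL,
    CustomerID int  NOT NULL,
    CONSTRAINT PK_SalesHeader PRIMARY KEY CLUSTERED (SaleDate, SaleID)
) ON ps_SalesYear (SaleDate);

CREATE TABLE dbo.SalesDetail
(
    SaleDate   date NOT NULL,
    SaleID     int  NOT NULL,
    LineNumber int  NOT NULL,
    Quantity   int  NOT NULL,
    CONSTRAINT PK_SalesDetail PRIMARY KEY CLUSTERED (SaleDate, SaleID, LineNumber)
) ON ps_SalesYear (SaleDate);
GO

-- The join includes the partition column (SaleDate) of both tables,
-- so matching rows can only live in the same partition number on each side.
SELECT h.SaleID, h.CustomerID, SUM(d.Quantity) AS TotalQuantity
FROM dbo.SalesHeader AS h
JOIN dbo.SalesDetail AS d
  ON d.SaleDate = h.SaleDate
 AND d.SaleID   = h.SaleID
WHERE h.SaleDate >= '2023-01-01'
  AND h.SaleDate <  '2024-01-01'
GROUP BY h.SaleID, h.CustomerID;
```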

II. Partitioned Table Parallelism
Parallel operations are a key benefit of table partitioning. Indexes can be rebuilt in parallel, and the query processor can take advantage of multiple partitions to access a table in parallel.
If a partitioned table is sufficiently large and at least two CPU cores are available to SQL Server, a parallel execution strategy is used across the partitions that the query filter resolves to. Generally speaking, SQL Server attempts to balance the number of threads assigned to various partitions. The max degree of parallelism setting (which you set by using the sp_configure stored procedure or the MAXDOP query hint) determines the available thread count.
However, if the filter of the query specifically calls out ranges that determine a subset of the partitions so that partition elimination can occur, the number of partitions accessed will accordingly be fewer.
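
The two ways of setting the max degree of parallelism mentioned above might look like this. The value 4 and the objects pf_EventYear, ps_EventYear and dbo.EventLog are illustrative assumptions, not recommendations from the source.

```sql
-- Server-wide setting. This affects every query on the instance,
-- so treat the value 4 purely as an example.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;
RECONFIGURE;
GO

-- Hypothetical partitioned table used only to give the hint a target.
CREATE PARTITION FUNCTION pf_EventYear (date)
AS RANGE RIGHT FOR VALUES ('2022-01-01', '2023-01-01', '2024-01-01');
GO
CREATE PARTITION SCHEME ps_EventYear
AS PARTITION pf_EventYear ALL TO ([PRIMARY]);
GO
CREATE TABLE dbo.EventLog
(
    EventDate date NOT NULL,
    EventID   int  NOT NULL,
    CONSTRAINT PK_EventLog PRIMARY KEY CLUSTERED (EventDate, EventID)
) ON ps_EventYear (EventDate);
GO

-- Per-query override: the MAXDOP hint caps the threads for this statement only.
-- The filter spans several partitions, so a parallel plan may spread threads across them.
SELECT EventDate, COUNT(*) AS EventCount
FROM dbo.EventLog
WHERE EventDate >= '2022-01-01'
GROUP BY EventDate
OPTION (MAXDOP 4);
```

The query hint is usually the safer place to experiment, because it leaves the instance-level setting untouched.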

SQL Server 2005 was optimized for queries filtered to one partition. For such queries, more than one available thread could be assigned to scan the partition in parallel. However, for a filtered query touching more than one partition, only one thread could be assigned to any given partition. In SQL Server 2008, if the number of available threads is greater than the partitions accessed by a filtered query, more than one thread can be assigned to access data in each partition. This can improve performance in filtered queries that access more than one partition.

III. Designing Partitions to Improve Query Performance

Partitioning a table or index may improve query performance, based on the types of queries you frequently run and on your hardware configuration.

Partitioning for Join Queries

If you frequently run queries that involve an equi-join between two or more partitioned tables, their partitioning columns should be the same as the columns on which the tables are joined. Additionally, the tables, or their indexes, should be collocated. This means that they either use the same named partition function, or they use different ones that are essentially the same, in that they:

Have the same number of parameters that are used for partitioning, and the corresponding parameters are the same data types.
Define the same number of partitions.
Define the same boundary values for partitions.

In this way, the SQL Server query optimizer can process the join faster, because the partitions themselves can be joined. If a query joins two tables that are not collocated or are not partitioned on the join field, the presence of partitions may actually slow down query processing instead of accelerating it.
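
One way to check whether two differently named partition functions meet the three conditions above is to compare them through the catalog views. In this sketch the function names pf_A and pf_B are placeholders for your own partition functions.

```sql
-- Lists parameter type, partition count, range direction and boundary values
-- for two partition functions side by side. pf_A and pf_B are placeholder names.
SELECT pf.name                    AS PartitionFunction,
       t.name                     AS ParameterType,
       pf.fanout                  AS PartitionCount,
       pf.boundary_value_on_right AS IsRangeRight,
       prv.boundary_id            AS BoundaryId,
       prv.value                  AS BoundaryValue
FROM sys.partition_functions AS pf
JOIN sys.partition_parameters AS pp
  ON pp.function_id = pf.function_id
JOIN sys.types AS t
  ON t.user_type_id = pp.user_type_id
LEFT JOIN sys.partition_range_values AS prv
  ON prv.function_id  = pf.function_id
 AND prv.parameter_id = pp.parameter_id
WHERE pf.name IN (N'pf_A', N'pf_B')
ORDER BY prv.boundary_id, pf.name;
```

If the two functions show the same parameter type, the same partition count, and the same boundary value for every boundary_id, they satisfy the conditions listed above. Comparing IsRangeRight as well is an extra precaution added here, since it decides which partition a boundary value itself belongs to.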

Taking Advantage of Multiple Disk Drives

It may be tempting to map your partitions to filegroups, each accessing a different physical disk drive, in order to improve I/O performance. When SQL Server performs data sorting for I/O operations, it sorts the data first by partition. Under this scenario, SQL Server accesses one drive at a time, and this might reduce performance. A better solution in terms of performance is to stripe the data files of your partitions across more than one disk by setting up a RAID. In this way, although SQL Server still sorts data by partition, it can access all the drives of each partition at the same time. This configuration can be designed regardless of whether all partitions are in one filegroup or multiple filegroups. For more information about how SQL Server works with different RAID levels, see RAID Levels and SQL Server.

Controlling Lock Escalation Behavior

Partitioning tables can improve performance by enabling lock escalation to a single partition instead of a whole table. To reduce lock contention by allowing lock escalation to the partition, use the LOCK_ESCALATION option of the ALTER TABLE statement.
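
A minimal sketch of the LOCK_ESCALATION option follows. The objects pf_AuditYear, ps_AuditYear and dbo.AuditTrail are hypothetical and exist only to give the ALTER TABLE statement a partitioned target.

```sql
-- Hypothetical partitioned table.
CREATE PARTITION FUNCTION pf_AuditYear (date)
AS RANGE RIGHT FOR VALUES ('2022-01-01', '2023-01-01', '2024-01-01');
GO
CREATE PARTITION SCHEME ps_AuditYear
AS PARTITION pf_AuditYear ALL TO ([PRIMARY]);
GO
CREATE TABLE dbo.AuditTrail
(
    AuditDate date NOT NULL,
    AuditID   int  NOT NULL,
    CONSTRAINT PK_AuditTrail PRIMARY KEY CLUSTERED (AuditDate, AuditID)
) ON ps_AuditYear (AuditDate);
GO

-- AUTO lets locks on a partitioned table escalate to the partition level
-- instead of the whole table. The default setting is TABLE.
ALTER TABLE dbo.AuditTrail SET (LOCK_ESCALATION = AUTO);

-- Verify the current setting.
SELECT name, lock_escalation_desc
FROM sys.tables
WHERE name = N'AuditTrail';
```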

References:

https://msdn.microsoft.com/en-us/library/dd578580.aspx

https://msdn.microsoft.com/en-us/library/ms177411(v=sql.105).aspx

Date: 2024-07-31 14:31:50
