Index Fragmentation Report in SQL Server 2005 and 2008

Problem
Indexes can speed up queries several fold, but they come with overhead. They consume additional disk space and must be maintained whenever data in the table is inserted, updated, or deleted. Those same data modification operations (INSERT, UPDATE, and DELETE statements) can also fragment an index, scattering its pages across the database. A fragmented index forces SQL Server to perform unnecessary reads and to jump between non-adjacent pages, so query performance against a heavily fragmented table can be very poor. This article explains fragmentation and presents queries for determining the level of fragmentation.

Solution
When indexes are first built, little or no fragmentation should exist. Over time, as data is inserted, updated, and deleted, fragmentation levels on the underlying indexes may begin to rise. Let's see how that happens.

When a page of data fills to 100 percent and more data must be added to it, a page split occurs. To make room for the incoming data, SQL Server moves roughly half of the rows from the full page to a new page. The new page is allocated wherever free space is available, which is usually not physically next to the original page. Therefore, instead of reading straight from one page to the next, SQL Server has to jump to a page somewhere else in the database to find the next page it needs. This is called index fragmentation.

There are basically two types of fragmentation:

  • External fragmentation - External (a.k.a. logical) fragmentation occurs when the leaf pages of an index are not in logical order, in other words when the logical ordering of the index does not match its physical ordering. This causes SQL Server to perform extra work to return ordered results. For the most part, external fragmentation is not a big deal for specific searches that return very few records or for queries whose result sets do not need to be ordered.
  • Internal fragmentation - Internal fragmentation occurs when there is too much free space in the index pages. Some free space is usually desirable, especially when the index is created or rebuilt. You can specify the Fill Factor setting when the index is created or rebuilt to indicate how full, as a percentage, the index pages should be at that point. If the index pages contain too much free space, queries take longer (because more pages must be read to retrieve the same data) and the index grows larger than necessary. If no free space is available in the index pages, data changes (primarily inserts) cause page splits as discussed above, which also require additional system resources.
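Both kinds of fragmentation can be reproduced on a scratch table. The following is a minimal sketch rather than part of the original scripts: the table dbo.FragDemo, its columns, its constraint names, and the row count are hypothetical choices made for illustration. It measures the result with sys.dm_db_index_physical_stats, the function this article covers. Random NEWID() keys force inserts into the middle of the clustered index, which causes page splits.

```sql
-- Illustrative sketch only: dbo.FragDemo and all of its names are hypothetical.
USE tempdb;
GO
CREATE TABLE dbo.FragDemo
(
   Id UNIQUEIDENTIFIER NOT NULL CONSTRAINT DF_FragDemo_Id DEFAULT NEWID(),
   Payload CHAR(500) NOT NULL CONSTRAINT DF_FragDemo_Payload DEFAULT 'x',
   CONSTRAINT PK_FragDemo PRIMARY KEY CLUSTERED (Id)
);
GO
-- Random GUID keys arrive in no particular order, so rows land on pages
-- that are already full and SQL Server has to split those pages.
SET NOCOUNT ON;
DECLARE @i INT;
SET @i = 0;
WHILE @i < 20000
BEGIN
   INSERT INTO dbo.FragDemo DEFAULT VALUES;
   SET @i = @i + 1;
END
GO
-- Leaf level (index_level = 0) after the inserts: expect noticeable external
-- fragmentation and pages that are only partly full.
SELECT page_count, avg_fragmentation_in_percent, avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(N'tempdb'), OBJECT_ID(N'dbo.FragDemo'), NULL, NULL, 'DETAILED')
WHERE index_level = 0;
GO
-- Rebuild the index, leaving about 20 percent free space on each leaf page
-- (Fill Factor 80), then measure again.
ALTER INDEX PK_FragDemo ON dbo.FragDemo REBUILD WITH (FILLFACTOR = 80);
GO
SELECT page_count, avg_fragmentation_in_percent, avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(N'tempdb'), OBJECT_ID(N'dbo.FragDemo'), NULL, NULL, 'DETAILED')
WHERE index_level = 0;
GO
DROP TABLE dbo.FragDemo;
GO
```

Comparing the two result sets shows the trade-off described above: the rebuild removes the external fragmentation, while the Fill Factor deliberately leaves some internal free space to absorb later inserts.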

As we have seen, heavily fragmented indexes can degrade query performance significantly and make the applications that use them respond slowly. So the question is how to identify fragmentation. For that purpose SQL Server 2005 and 2008 provide a dynamic management function (DMF), sys.dm_db_index_physical_stats, which reports the index fragmentation level. The function accepts parameters such as the database, table, and index for which you want to find fragmentation. Several options control the level of detail returned; some of them appear in the examples below.

The sys.dm_db_index_physical_stats function returns size and fragmentation information, in tabular form, for the data and indexes of the specified table or view.

Input Parameter - Description
database_id - The default is 0 (NULL, 0, and DEFAULT are equivalent values in this context), which returns information for all databases in the instance of SQL Server. To get information about a specific database, specify its database ID from sys.databases. If you specify NULL for database_id, you must also specify NULL for object_id, index_id, and partition_number.
object_id - The default is 0 (NULL, 0, and DEFAULT are equivalent values in this context), which returns information for all tables and views in the specified database. To get information about a particular object, specify its object_id. If you specify NULL for object_id, you must also specify NULL for index_id and partition_number.
index_id - The default is -1 (NULL, -1, and DEFAULT are equivalent values in this context), which returns information for all indexes of a base table or view. If you specify NULL for index_id, you must also specify NULL for partition_number.
partition_number - The default is 0 (NULL, 0, and DEFAULT are equivalent values in this context), which returns information for all partitions of the owning object. partition_number is 1-based. A nonpartitioned index or heap has partition_number set to 1.
mode - Specifies the scan level that is used to obtain statistics. Valid inputs are DEFAULT, NULL, LIMITED, SAMPLED, or DETAILED. The default (NULL) is LIMITED.

  • LIMITED - It is the fastest mode and scans the smallest number of pages. For an index, only the parent-level pages of the B-tree (that is, the pages above the leaf level) are scanned. In SQL Server 2008, only the associated PFS and IAM pages of a heap are examined; the data pages of the heap are not scanned. In SQL Server 2005, all pages of a heap are scanned in LIMITED mode.
  • SAMPLED - It returns statistics based on a 1 percent sample of all the pages in the index or heap. If the index or heap has fewer than 10,000 pages, DETAILED mode is used instead of SAMPLED.
  • DETAILED - It scans all pages and returns all statistics.
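To make the parameters and scan modes concrete, here is a short sketch that runs the function against a single table in LIMITED mode, then in SAMPLED mode, and finally for one index only. It assumes the AdventureWorks sample database and its Production.BillOfMaterials table are available. In LIMITED mode, columns that require reading the leaf pages, such as avg_page_space_used_in_percent, come back as NULL.

```sql
-- Sketch: assumes the AdventureWorks sample database is installed.
USE AdventureWorks;
GO
DECLARE @db_id SMALLINT;
DECLARE @object_id INT;
SET @db_id = DB_ID();  -- ID of the current database
SET @object_id = OBJECT_ID(N'Production.BillOfMaterials');
-- Note: if the table does not exist, @object_id is NULL and the function
-- reports on every object in the database instead.

-- LIMITED (the default mode): fastest, but avg_page_space_used_in_percent is NULL.
SELECT index_id, partition_number, page_count,
   avg_fragmentation_in_percent, avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(@db_id, @object_id, NULL, NULL, 'LIMITED');

-- SAMPLED: reads a sample of the pages, so page density is populated as well.
SELECT index_id, partition_number, page_count,
   avg_fragmentation_in_percent, avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(@db_id, @object_id, NULL, NULL, 'SAMPLED');

-- index_id = 1 restricts the output to the clustered index of the table.
SELECT index_id, partition_number, page_count, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(@db_id, @object_id, 1, NULL, 'LIMITED');
GO
```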

Note

  • The sys.dm_db_index_physical_stats dynamic management function replaces the DBCC SHOWCONTIG statement. It requires only an Intent-Shared (IS) table lock, whereas DBCC SHOWCONTIG required a Shared lock. Its algorithm for calculating fragmentation is also more precise than that of DBCC SHOWCONTIG, so it gives a more accurate result.
  • For an index, one row is returned for each level of the B-tree in each partition. This is why the output can contain two or more rows for a single index; the index_depth column gives the number of index levels. For a heap, one row is returned for the IN_ROW_DATA allocation unit of each partition. For large object (LOB) data, one row is returned for the LOB_DATA allocation unit of each partition. If row-overflow data exists in the table, one row is returned for the ROW_OVERFLOW_DATA allocation unit in each partition.
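The row-per-level behaviour described in the note is easiest to see by selecting the level and allocation unit columns directly. The sketch below assumes the AdventureWorks sample database and its Production.BillOfMaterials table; the non-leaf levels of an index are only reported in DETAILED mode, and index_level 0 is the leaf level.

```sql
-- Sketch: assumes the AdventureWorks sample database is installed.
USE AdventureWorks;
GO
SELECT SI.name AS [IndexName],
   IPS.partition_number,
   IPS.alloc_unit_type_desc,  -- IN_ROW_DATA, LOB_DATA or ROW_OVERFLOW_DATA
   IPS.index_depth,           -- number of levels in the index
   IPS.index_level,           -- 0 = leaf level; the highest value is the root
   IPS.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'Production.BillOfMaterials'), NULL, NULL, 'DETAILED') IPS
   JOIN sys.indexes SI ON IPS.object_id = SI.object_id AND IPS.index_id = SI.index_id
ORDER BY IPS.index_id, IPS.partition_number, IPS.alloc_unit_type_desc, IPS.index_level DESC;
GO
```

An index whose index_depth is 2 should appear as two rows here, one for the root and one for the leaf level.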

Example

Let's see an example. The first script below reports the fragmentation level of a given database, covering all user tables in the database and all indexes on them. The second script reports the fragmentation level of a particular object in the given database. The columns returned by sys.dm_db_index_physical_stats and their meanings are described in the table that follows the scripts.


Script : Index Fragmentation Report Script

--To find the fragmentation level of a given database
--This query returns DETAILED information
--CAUTION: It may take a very long time, depending on the number of tables in the database
USE AdventureWorks
GO
SELECT OBJECT_NAME(IPS.object_id) AS [TableName],
   SI.name AS [IndexName],
   IPS.index_type_desc,
   IPS.avg_fragmentation_in_percent,
   IPS.avg_fragment_size_in_pages,
   IPS.avg_page_space_used_in_percent,
   IPS.record_count,
   IPS.ghost_record_count,
   IPS.fragment_count
FROM sys.dm_db_index_physical_stats(DB_ID(N'AdventureWorks'), NULL, NULL, NULL, 'DETAILED') IPS
   JOIN sys.tables ST WITH (NOLOCK) ON IPS.object_id = ST.object_id
   JOIN sys.indexes SI WITH (NOLOCK) ON IPS.object_id = SI.object_id AND IPS.index_id = SI.index_id
WHERE ST.is_ms_shipped = 0
--Sort by table name, then by average fragment size (columns 1 and 5 of the select list)
ORDER BY [TableName], IPS.avg_fragment_size_in_pages
GO

--To find the fragmentation level of a given table in a given database
--This query returns DETAILED information
DECLARE @db_id SMALLINT;
DECLARE @object_id INT;
SET @db_id = DB_ID(N'AdventureWorks');
--OBJECT_ID is resolved in the current database, so run this from AdventureWorks
SET @object_id = OBJECT_ID(N'Production.BillOfMaterials');
IF @object_id IS NULL
BEGIN
   PRINT N'Invalid object';
END
ELSE
BEGIN
   SELECT IPS.index_type_desc,
      IPS.avg_fragmentation_in_percent,
      IPS.avg_fragment_size_in_pages,
      IPS.avg_page_space_used_in_percent,
      IPS.record_count,
      IPS.ghost_record_count,
      IPS.fragment_count
   FROM sys.dm_db_index_physical_stats(@db_id, @object_id, NULL, NULL, 'DETAILED') AS IPS;
END
GO

Returned Column - Description
avg_fragmentation_in_percent - Indicates the amount of external fragmentation for the given objects. The lower the number the better: as it approaches 100%, more of the pages in the index are out of order. For heaps, this value is the percentage of extent fragmentation rather than external fragmentation.
avg_page_space_used_in_percent - Indicates how dense the pages in the index are, i.e. how full each page is on average (internal fragmentation). In terms of fragmentation and read performance, the higher the number the better. To achieve optimal disk space use, this value should be close to 100% for an index that will not have many random inserts. However, an index that has many random inserts and very full pages will suffer more page splits, which causes more fragmentation. Therefore, to reduce page splits on such an index, the value should be less than 100 percent.
fragment_count - A fragment is made up of physically consecutive leaf pages in the same file for an allocation unit. An index has at least one fragment, and at most as many fragments as there are pages in its leaf level. The fewer the fragments, the more of the data is stored consecutively.
avg_fragment_size_in_pages - The average number of pages in one fragment of the leaf level. Larger fragments mean that less disk I/O is required to read the same number of pages. Therefore, the larger the avg_fragment_size_in_pages value, the better the range scan performance.
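
Putting these columns to use, the sketch below narrows the report to the leaf level of indexes that are both large enough to matter and noticeably fragmented. It assumes the AdventureWorks sample database; the variable names and the thresholds (10 percent fragmentation, 100 pages) are arbitrary values chosen for illustration, not recommendations from this article, so adjust them for your own workload.

```sql
-- Sketch: assumes the AdventureWorks sample database is installed.
-- @min_fragmentation and @min_pages are hypothetical, illustrative thresholds.
USE AdventureWorks;
GO
DECLARE @min_fragmentation FLOAT;
DECLARE @min_pages BIGINT;
SET @min_fragmentation = 10.0;
SET @min_pages = 100;

SELECT OBJECT_NAME(IPS.object_id) AS [TableName],
   SI.name AS [IndexName],  -- NULL for a heap
   IPS.index_type_desc,
   IPS.page_count,
   IPS.avg_fragmentation_in_percent,
   IPS.avg_page_space_used_in_percent,
   IPS.fragment_count,
   IPS.avg_fragment_size_in_pages
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') IPS
   JOIN sys.indexes SI ON IPS.object_id = SI.object_id AND IPS.index_id = SI.index_id
WHERE IPS.index_level = 0                        -- leaf level only
   AND IPS.alloc_unit_type_desc = 'IN_ROW_DATA'  -- skip LOB and row-overflow units
   AND IPS.page_count >= @min_pages
   AND IPS.avg_fragmentation_in_percent >= @min_fragmentation
ORDER BY IPS.avg_fragmentation_in_percent DESC;
GO
```

SAMPLED mode is used here so that avg_page_space_used_in_percent is populated without the cost of a full DETAILED scan of every index.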


Next Steps

http://www.mssqltips.com/sqlservertip/1708/index-fragmentation-report-in-sql-server-2005-and-2008/

Time: 2024-10-13 06:55:51
