Original article: http://www.codeproject.com/Tips/1023621/SQL-Performance-Improvement-Techniques
This article describes several options for improving database performance.
1) Re-Write Query:
If a query takes too long to execute, the first step is to rewrite it. Analyze the query thoroughly to identify the root cause. The following guidelines can improve the performance of a query.
- Avoid * in SELECT and specify the column names, especially when joining multiple tables.
- Avoid repeated logic, unnecessary subqueries, and unnecessary JOINs.
- In some cases EXISTS performs better than a JOIN.
- Use UNION ALL instead of UNION when duplicate rows are impossible or acceptable, since UNION does extra work to remove duplicates.
- Use EXISTS instead of IN where appropriate.
- Use the WITH clause (Oracle) or Common Table Expressions (SQL Server).
- The order or position of the columns in the WHERE clause can affect performance on some engines; check that the query uses the proper index.
- In SQL Server, consider SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED or WITH (NOLOCK) where reading uncommitted (dirty) data is acceptable.
- Use hints if necessary. There are table hints, query hints, and plan hints.
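A few of the rewrites above can be tried out in a small sketch. It uses Python's built-in sqlite3 module as a stand-in engine (the article itself targets Oracle and SQL Server), and the `customers` and `orders` tables are hypothetical names invented for this illustration. It only shows how the results differ; actual timings depend on the engine and the data.

```python
import sqlite3

# Hypothetical schema, held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 75.0), (12, 2, 20.0);
""")

# A JOIN returns one row per matching order, so Ann shows up twice.
join_rows = conn.execute("""
    SELECT c.name FROM customers c
    JOIN orders o ON o.customer_id = c.id
    ORDER BY c.name
""").fetchall()

# EXISTS only asks whether a match exists, so each customer appears once.
exists_rows = conn.execute("""
    SELECT c.name FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
    ORDER BY c.name
""").fetchall()

# UNION removes duplicate rows; UNION ALL keeps every row and skips that step.
union_rows = conn.execute(
    "SELECT customer_id FROM orders UNION SELECT customer_id FROM orders"
).fetchall()
union_all_rows = conn.execute(
    "SELECT customer_id FROM orders UNION ALL SELECT customer_id FROM orders"
).fetchall()

# A common table expression names the aggregate once instead of repeating it.
cte_rows = conn.execute("""
    WITH totals AS (
        SELECT customer_id, SUM(amount) AS total
        FROM orders
        GROUP BY customer_id
    )
    SELECT c.name, t.total
    FROM customers c
    JOIN totals t ON t.customer_id = c.id
    ORDER BY c.name
""").fetchall()
```

Because UNION ALL keeps duplicates, it is a safe replacement for UNION only when the inputs cannot overlap or when duplicates do not matter to the caller.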
2) Create a Highly Selective Index:
An index speeds up searches and queries by helping the database retrieve data quickly. The following are a few guidelines for creating indexes.
- Create an index when the table is large and queries frequently select less than 10% of its rows.
- Do not create an index on low-cardinality columns; an index is also not required for small tables.
- Index the columns that are frequently used in the WHERE clause and the columns used to join multiple tables.
- The order or position of a column in an index also plays a vital role. In general, put the column expected to be used most often first in the index.
- Limit the number of indexes on a table. More indexes mean more overhead, because every index must be updated on each DML operation.
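As a rough illustration of these guidelines, the sketch below creates a composite index and compares the query plan before and after. It assumes SQLite (through Python's sqlite3 module) as a stand-in engine; the `orders` table, its columns, and the index name are hypothetical, and the plan text differs between engines and versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT, status TEXT)"
)
# 5000 synthetic rows spread over 500 customers (hypothetical data).
conn.executemany(
    "INSERT INTO orders (customer_id, order_date, status) VALUES (?, ?, ?)",
    [
        (i % 500, f"2024-01-{i % 28 + 1:02d}", "open" if i % 2 else "closed")
        for i in range(5000)
    ],
)

def plan(sql):
    """Return SQLite's query plan for `sql` as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan text

query = (
    "SELECT id, status FROM orders "
    "WHERE customer_id = 42 AND order_date >= '2024-01-15'"
)

plan_before = plan(query)  # no usable index yet: full table scan

# The equality column goes first and the range column second,
# so one index serves both predicates.
conn.execute(
    "CREATE INDEX idx_orders_customer_date ON orders (customer_id, order_date)"
)

plan_after = plan(query)  # the planner can now search the index
```

Printing `plan_before` and `plan_after` shows whether the engine switched from scanning the table to searching the index, which is a quick check to run before and after adding any index.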
3) Limit the Number of Columns and Rows:
In some cases the application does not use all the columns and rows fetched from the database. Pull only the required columns and the required rows.
For example, a SELECT * query may pull more than 100 columns even though the application uses only a few of them. Likewise, if an application displays data page by page, it is better to retrieve only the records for the current page instead of retrieving all the records.
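A minimal sketch of both ideas, assuming SQLite via Python's sqlite3 module and a hypothetical `products` table: the query names only the two columns the page displays and fetches one page at a time with LIMIT and OFFSET. Other engines spell paging differently (for example OFFSET ... FETCH in SQL Server), so treat the SQL as illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, name TEXT, price REAL, description TEXT)"
)
# 100 hypothetical rows, each carrying a wide description column.
conn.executemany(
    "INSERT INTO products (name, price, description) VALUES (?, ?, ?)",
    [(f"item-{i}", i * 1.5, "long text " * 50) for i in range(1, 101)],
)

def fetch_page(conn, page, page_size=10):
    """Return only the columns the listing needs, for one page of results."""
    offset = (page - 1) * page_size
    return conn.execute(
        # Named columns instead of SELECT *, so the wide description
        # column is never read into the application.
        "SELECT id, name FROM products ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()

page_3 = fetch_page(conn, 3)  # rows 21 through 30
```

The ORDER BY is what makes the pages stable; without it the engine is free to return rows in any order from one call to the next.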
4) Temporary tables:
Temporary tables should be used only when there is a strong reason. If a long-running query is used in many places in a procedure or function, it is better to store its results in a temporary table and reuse them. Once the work is complete, drop the table explicitly to free the memory; do not wait for it to be dropped automatically when the connection ends. Creating an index on a temporary table helps when you deal with very large tables. A temporary table can also be used as an alternative to cursors.
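The lifecycle described above (materialize once, index, reuse, drop) might look like the following sketch. It assumes SQLite via Python's sqlite3 module; the `orders` table and the `tmp_customer_totals` name are hypothetical, and the aggregate stands in for a genuinely long-running query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO orders VALUES
        (1, 1, 50.0), (2, 1, 75.0), (3, 2, 20.0), (4, 3, 200.0);
""")

# Run the expensive query once and keep its result in a temporary table.
conn.execute("""
    CREATE TEMP TABLE tmp_customer_totals AS
    SELECT customer_id, SUM(amount) AS total
    FROM orders
    GROUP BY customer_id
""")
# An index pays off when the temporary table itself is large.
conn.execute("CREATE INDEX idx_tmp_totals ON tmp_customer_totals (customer_id)")

# Reuse the stored result in several places without recomputing it.
top_customer = conn.execute(
    "SELECT customer_id FROM tmp_customer_totals ORDER BY total DESC LIMIT 1"
).fetchone()
big_spenders = conn.execute(
    "SELECT COUNT(*) FROM tmp_customer_totals WHERE total > 100"
).fetchone()[0]

# Drop the table explicitly instead of waiting for the connection to close.
conn.execute("DROP TABLE tmp_customer_totals")
remaining = conn.execute(
    "SELECT name FROM sqlite_temp_master WHERE name = 'tmp_customer_totals'"
).fetchall()
```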
5) Pre-Stage data:
Some applications perform large imports from the database, selecting rows based on information stored in a flat file. When the import is processed in batches of 1000 or 2000 records, the same SELECT with JOINs on large tables is executed once per batch, so the same operation runs many times. This can be improved by loading the flat file data into a stage table and then performing the SELECT with JOINs only once, based on the data in the stage table. This can significantly improve performance.
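A sketch of the staging approach, under stated assumptions: SQLite via Python's sqlite3 module stands in for the real database, an in-memory string stands in for the flat file, and the `accounts`, `balances`, and `stage_accounts` tables are hypothetical. The point is that the join runs once against the stage table rather than once per batch of keys.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (account_no TEXT PRIMARY KEY, owner TEXT);
    CREATE TABLE balances (account_no TEXT, balance REAL);
    INSERT INTO accounts VALUES ('A1', 'Ann'), ('A2', 'Bob'), ('A3', 'Cid');
    INSERT INTO balances VALUES ('A1', 10.0), ('A2', 20.0), ('A3', 30.0);
    CREATE TABLE stage_accounts (account_no TEXT PRIMARY KEY);
""")

# Stand-in for the flat file that drives the selection.
flat_file = io.StringIO("account_no\nA1\nA3\n")

# Step 1: bulk-load the flat file keys into the stage table.
keys = [(row["account_no"],) for row in csv.DictReader(flat_file)]
conn.executemany("INSERT INTO stage_accounts (account_no) VALUES (?)", keys)

# Step 2: run the SELECT with JOINs once, driven by the stage table.
matched = conn.execute("""
    SELECT a.account_no, a.owner, b.balance
    FROM stage_accounts s
    JOIN accounts a ON a.account_no = s.account_no
    JOIN balances b ON b.account_no = a.account_no
    ORDER BY a.account_no
""").fetchall()
```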
6) Indexed/Materialized Views:
This technique is very helpful when a search spans multiple large tables and various columns, since such a search takes a long time to complete. It can be improved by creating a single indexed or materialized view that loads and consolidates the key column data into one or two columns. The search is then performed on one or two columns of a large view instead of on various columns across multiple large tables. For materialized views, an index must be created explicitly on the key columns to get better performance.
7) Index Optimization:
Over time the data size keeps increasing, and the index size grows with it. The index becomes more fragmented and the database engine performs unnecessary data reads, so heavy index fragmentation slows down performance. There are two options to reduce index fragmentation.
- Rebuild: A rebuild drops the existing index and creates a new index from the current data in the columns. It takes more server resources.
- Reorganize: A reorganize is more lightweight and defragments the index in place. The existing index is kept and its leaf pages are reordered. It is better to reorganize periodically than to rebuild.
8) Index Statistics:
Statistics enable the database engine to choose an efficient execution plan for a query. Index statistics record the distribution of the values in an index column, i.e. the cardinality of the different column values. The database engine uses this information to determine the execution plan for processing a query. Statistics need regular updates as the distribution of the values changes.
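To make the idea concrete, the sketch below gathers statistics and reads them back. It assumes SQLite via Python's sqlite3 module, where the `ANALYZE` command stores per-index statistics in the `sqlite_stat1` table; SQL Server and Oracle have their own mechanisms (`UPDATE STATISTICS` and `DBMS_STATS` respectively). The `tickets` table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT, owner_id INTEGER)"
)
# 1000 hypothetical rows: 2 distinct statuses, 250 distinct owners.
conn.executemany(
    "INSERT INTO tickets (status, owner_id) VALUES (?, ?)",
    [("open" if i % 2 else "closed", i % 250) for i in range(1000)],
)
conn.execute("CREATE INDEX idx_tickets_status ON tickets (status)")
conn.execute("CREATE INDEX idx_tickets_owner ON tickets (owner_id)")

# Gather statistics for every index in the database.
conn.execute("ANALYZE")

# Each stat string starts with the row count, followed by the average
# number of rows that share one value of the leading index column.
stats = dict(
    conn.execute(
        "SELECT idx, stat FROM sqlite_stat1 WHERE tbl = 'tickets'"
    ).fetchall()
)
rows_per_status = int(stats["idx_tickets_status"].split()[1])
rows_per_owner = int(stats["idx_tickets_owner"].split()[1])
```

A large rows-per-value figure, as for `status` here, marks a low-cardinality column whose index filters poorly; the planner reads the same numbers when it chooses between indexes. Re-running `ANALYZE` after large data changes keeps them current.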
9) Archive key tables:
As you already know, the data and index sizes keep increasing day by day. When an application has been running for many years, index optimization alone may not be enough to improve performance, because each key table might hold billions of records with several indexes on it. At that point it is time to archive the key tables. This solution is suitable only if the application does not use the very old records. Create a new archive table with the same structure as the key table, then move all the old records into the archive table.
It is better to rebuild or reorganize the indexes on all the key tables once the old records have been moved to the archive table, as this frees a lot of the space occupied by the indexes. This activity can be performed periodically (once or twice a year) during off hours.
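The archiving steps can be sketched as follows, assuming SQLite via Python's sqlite3 module; the `orders` and `orders_archive` tables, the cutoff date, and the `archive_before` helper are hypothetical. The copy and the delete run in one transaction so that a failure cannot leave rows in both tables or in neither, and SQLite's `REINDEX` stands in for the rebuild step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL);
    CREATE INDEX idx_orders_date ON orders (order_date);
    -- Archive table with the same structure as the key table.
    CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, '2015-03-01', 10.0), (2, '2016-07-15', 20.0), (3, '2024-01-10', 30.0);
""")

def archive_before(conn, cutoff):
    """Move rows older than `cutoff` into the archive; return how many moved."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO orders_archive SELECT * FROM orders WHERE order_date < ?",
            (cutoff,),
        )
        moved = conn.execute(
            "DELETE FROM orders WHERE order_date < ?", (cutoff,)
        ).rowcount
    return moved

moved = archive_before(conn, "2020-01-01")

# Rebuild the index on the key table once the old rows are gone.
conn.execute("REINDEX idx_orders_date")

live_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
archived_count = conn.execute("SELECT COUNT(*) FROM orders_archive").fetchone()[0]
```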
Enjoy faster SQL!!
Please remember to evaluate each situation individually to see which method works best.
License
This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)