OPTIMIZATION OF QUERIES/ TABLES/ STORED PROCEDURES IN SQL SERVER
SOME MORE TIPS FOR OPTIMIZATION OF QUERIES/ TABLES/ STORED PROCEDURES IN SQL SERVER
a. Table should have minimum of one clustered index and appropriate number of non clustered indexes, which should be created on columns of table based on query which is running following the priority order as WHERE clause, then JOIN clause, then ORDER BY clause and finally the SELECT clause.
b. Avoid using Triggers if possible; incorporate the logic of trigger in a stored procedure.
          1. When creating a stored procedure, we can specify WITH RECOMPILE option in it. But such stored procedure will never benefit from the cached                  execution plan as each time it is executed; it forces the cached execution plan to invalidate or flush and create a new plan based upon the                            parameters passed, if any, to it. I do not see any such big benefits of this option. But one may find this useful when such stored procedure will return             results or execute    only selective part of the stored procedure body based upon supplied input parameters. For example, statements within If-block             or Select-Case block based upon input parameters.
         2.This option is good when the table properties are changed, and this table is in use of many other stored procedures. Instead of recompiling each and              every such depending stored procedure, a simple sp_recompilewill do enough, and with no server restart!
3. Situations like when we mix DDL and DML statements together inside stored procedure may also cause stored procedure recompilation.
4.We should always create a stored procedure with a full naming convention. Schema name should always prefix the stored procedure name. Schema name will help SQL Server name resolution easily when it is called. This helps SQL Server in which schema to query the stored procedure.
5.Tables should have proper indexes and should be compiled time to time as indexes may be weird off after some time due to huge data insertion or deletion.
6.Include SET NOCOUNT ON statement: With every SELECT and DML statement, the SQL server returns a message that indicates the number of affected rows by that statement. This information is mostly helpful in debugging the code, but it is useless after that. By setting SET NOCOUNT ON, we can disable the feature of returning this extra information. For stored procedures that contain several statements or contain Transact-SQL loops, setting SET NOCOUNT to ON can provide a significant performance boost because network traffic is greatly reduced.
7.Try to avoid using SQL Server cursors whenever possible: Cursor uses a lot of resources for overhead processing to maintain current record position in a recordset and this decreases the performance. If we need to process records one-by-one in a loop, then we should use the WHILE clause. Wherever possible, we should replace the cursor-based approach with SET-based approach. Because the SQL Server engine is designed and optimized to perform SET-based operation very fast. Again, please note cursor is also a kind of WHILE Loop.
8.Keep the Transaction as short as possible: The length of transaction affects blocking and deadlocking. Exclusive lock is not released until the end of transaction. In higher isolation level, the shared locks are also aged with transaction. Therefore, lengthy transaction means locks for longer time and locks for longer time turns into blocking. In some cases, blocking also converts into deadlocks. So, for faster execution and less blocking, the transaction should be kept as short as possible.
9.Use TRY-Catch for error handling: Prior to SQL server 2005 version code for error handling, there was a big portion of actual code because an error check statement was written after every t-sql statement. More code always consumes more resources and time. In SQL Server 2005, a new simple way is introduced for the same purpose.
c. Table should have a primary key.
d. Try to use constraints for selection instead of using triggers, whenever possible. Constraints are efficient than triggers enhance performance. So, you should use constraints instead of triggers, whenever possible.
e. Try to use table variables instead of temporary tables as table variables require less locking resources as well as less logging resources than the temporary tables, so table variables should be used whenever possible.
f. Avoid the use of views or replace views with original tables.
g. Try to avoid the use of DISTINCT clause, where ever possible. As the DISTINCT clause will result in performance degradation, we should use this clause only when it is necessary or unavoidable.
h. Try to add SET NOCOUNT ON statement into your stored procedures as it stops the message indicating the number of rows affected by a SQL statement. It also reduces network traffic, because our client will not receive any message indicating the number of rows affected by a SQL statement.
i. Try to use TOP keyword or the SET ROWCOUNT statement in the select statements, if we need to return only the first n rows. This can improve performance of our queries, as the smaller result set will be returned. It can also reduce the traffic between the server and the clients.
j. Try to use user-defined functions to keep the encapsulated code for reuse in future. The user-defined functions (UDFs) contain one or more SQL statements that can be used to encapsulate code for reuse. Using UDFs can also reduce the network traffic.
k. If possible move the logic of UDF to SP as well.
l. If you need to delete all rows of a table, try to use TRUNCATE TABLE command instead of DELETE command. Using the TRUNCATE TABLE is a faster way to delete all rows of a tables, because it removes all rows from a table without logging each row delete.
m. Remove any unnecessary joins from tables.
n. If there is cursor used in query, see if there is any other way to avoid the usage of this (either by SELECT … INTO or INSERT … INTO, etc). Try to avoid using cursors whenever possible. As SQL Server cursors can result in some performance degradation as compared to select statements. Try to use correlated sub-queries or derived tables for row-by-row operations on tables.
o. When writing a sub-query (a SELECT statement within the WHERE or HAVING clause of another SQL statement):
1. Try to use a correlated (refers to at least one value from the outer query) sub-query when the return is relatively small and/or other criteria are efficient i.e. if the tables within the sub-query have efficient indexes.
2. Try to use a non-correlated (does not refer to the outer query) sub-query when dealing with large tables from which you expect a large return (many rows) and/or if the tables within the sub-query do not have efficient indexes.
3. Ensure that multiple subqueries are in the most efficient order.
4.Remember that rewriting a sub-query as a join can sometimes increase efficiency.
p. Use char/varchar columns data type, instead of nchar/nvarchar data type if we do not need to store Unicode data. The char/varchar data value uses only one byte to store one character; whereas the nchar/nvarchar value uses two bytes to store one character, so the char/varchar columns use two times less space to store data as compared to nchar/nvarchar data columns.
q. Try to use stored procedures instead of heavy queries as they can reduce network traffic, because our client will send to server only stored procedure name (along with some parameters) instead of heavy and lengthy queries text. Stored procedures can be used to enhance security. For example, we can give different users, different set of permissions to execute the stored procedure to work with the restricted set of the columns and data.
r. We should try to return an integer value from a RETURN statement instead of returning an integer value as a part of a record set. As the RETURN statement exits from a stored procedure unconditionally, so the statements following the RETURN statement are not executed. The RETURN statement is generally used for error checking, but we can also use this statement to return an integer value for any other reason. Using RETURN statement can improve performance as SQL Server will not create a record set.
s. Try to drop indexes that are not being used. Because each index takes up disk space and slow the DML operations, we should drop indexes that are not used. We can use Index Wizard to identify indexes that are not being used in our SQL queries.
t. We should try to create indexes on columns that have integer values rather than character values. Because the integer values have less size than the size of characters values, so we can reduce the number of index pages which are being used to store the index keys. This finally reduces the number of reads required to read the index and enhances the overall index performance.
u. If we need to join several tables very frequently, then we should consider creating index on the joined columns which can significantly improve performance of the queries against the joined tables.
v. Try to avoid any operations on the fields, where ever possible. Some operations will prevent the use of index on a field even if it exists—for example, ltrim(rtrim(FieldColumnName)) as such operations will degrade the performance. For example, instead of using the condition cast(DateColumn as varchar(20)) = @dateStr, we should try to convert @dateStr to an expression of datetime type and then compare it to DateColumn value.
CONCLUSION Query optimization has a very big impact on the performance of a DBMS and it continuously evolves with new, more sophisticated optimization strategies. Query optimization is a common task performed by database administrators and application designers in order to tune the overall performance of the database system. Even if you have a powerful infrastructure, the performance can be significantly degraded by inefficient queries. So, we should try to follow the general tips as mentioned above to get a better performance of queries. Optimization can be achieved with some efforts if we make it a general practice to follow the rules. The techniques described in this paper allow basic optimization of queries, tables, indexes and stored procedures for performance gains. The main focus was on query optimizations.
