The SQL Dude!

On-Disk & In-Memory Pages that teach you the SQL way of life!

Posts Tagged ‘performance’

Seek & Evaluate Unused Indexes in your SQL database

Posted by Sudarshan Narasimhan on June 16, 2016


This post is Part 3 of my series of posts on Index Maintenance for a DBA. So here goes.

As a regular part of your DBA work, it is very important to keep evaluating the need for indexes in your database environment. Any DBA worth their salt will look to “maintain” indexes by keeping fragmentation low, and most of you probably have maintenance tasks/scripts to take care of this. An often overlooked aspect of index maintenance is also considering whether some indexes are overkill, i.e. unnecessary, and potentially dropping them if they are not really utilized. This is especially important when dealing with 3rd party vendor databases, where the DBA does not have full control over which indexes come in and which go out.

I like to think of the database as my “house” and take care of it the same way I would my own home. You always consider who and what comes into your house, and the same care needs to be applied to the databases you maintain. As gatekeeper of the database, even if you don’t have complete control over what comes in, it is good to be aware and have the stats handy so that you can work with your application teams or vendors to keep the house tidy.

If you link this data back to my previous posts, Part 1 – Seek & Destroy Duplicate Indexes in your SQL Database and Part 2 – Seek & Understand Top Indexes by Size in your SQL database, you will hopefully be on your way to freeing up some space and realizing storage and performance gains.

Here is a handy pair of scripts that I use to:

1) List the indexes on each table in the database that are completely unused.

This script lists those indexes that not even a single read operation has used. Keep in mind that write operations still have to update these indexes as data is inserted or updated, adding to the overall time of those writes. When no one is using an index for reads, it is pure system overhead. Note that the counters in sys.dm_db_index_usage_stats are reset when the SQL Server instance restarts, so evaluate these numbers over a representative period of uptime.

2) List the indexes on each table in the database that have more writes than reads.

The maintenance cost of these indexes (from insert, update and delete operations on the underlying table) outweighs the benefit they provide to read operations. These indexes need to be investigated to see if they are still required, and a cost/benefit analysis needs to be performed on them.
Note: I’m not advocating that you remove these indexes, just that you carefully weigh the benefit against the cost before making any decision.
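If, after such an analysis, you decide an index really is a candidate for removal, a gentler first step is to disable it rather than drop it. A disabled nonclustered index stops incurring write and storage overhead but keeps its definition in metadata, so it can be rebuilt later if a workload turns out to need it. A minimal sketch is below; the names dbo.MyTable and IX_MyTable_Col1 are purely illustrative.

--Disable a candidate index instead of dropping it outright
ALTER INDEX [IX_MyTable_Col1] ON [dbo].[MyTable] DISABLE;
--If it turns out to be needed again, rebuild it to re-enable it
--ALTER INDEX [IX_MyTable_Col1] ON [dbo].[MyTable] REBUILD;
GO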



Script #1 – TOTALLY UNUSED INDEXES

USE [DBNAME] /* Replace with your Database Name */
GO
--TOTALLY UNUSED INDEXES (no seeks, scans or lookups recorded since the last instance restart)
SELECT DB_NAME(s.database_id) as [DB Name], OBJECT_NAME(s.[object_id]) AS [Table Name], i.name AS [Index Name], i.index_id,
    i.is_disabled, i.is_hypothetical, i.has_filter, i.fill_factor, i.is_unique,
    s.user_updates AS [Total Writes],
    (s.user_seeks + s.user_scans + s.user_lookups) AS [Total Reads],
    s.user_updates - (s.user_seeks + s.user_scans + s.user_lookups) AS [Difference],
    (partstats.used_page_count / 128.0) AS [IndexSizeinMB] -- used_page_count is in 8 KB pages; dividing by 128 converts to MB
FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
INNER JOIN sys.indexes AS i WITH (NOLOCK)
    ON s.[object_id] = i.[object_id]
    AND s.index_id = i.index_id
    AND s.database_id = DB_ID()
INNER JOIN sys.dm_db_partition_stats AS partstats
    ON i.object_id = partstats.object_id AND i.index_id = partstats.index_id
WHERE OBJECTPROPERTY(s.[object_id],'IsUserTable') = 1
    AND user_updates > (user_seeks + user_scans + user_lookups)
    AND (s.user_lookups=0 AND s.user_scans=0 AND s.user_seeks=0)
    AND i.index_id > 1 -- exclude heaps (index_id 0) and clustered indexes (index_id 1)
ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC OPTION (RECOMPILE);
GO

Script #1 Result
Here is a sample output from my test system showing all unused indexes on the MSDB database.

[Image: UnusedIndexes (sample output of Script #1)]

Script #2 – INDEXES WITH MORE WRITES THAN READS

USE [DBNAME] /* Replace with your Database Name */
GO
--INDEXES WITH WRITES > READS
SELECT DB_NAME(s.database_id) as [DB Name], OBJECT_NAME(s.[object_id]) AS [Table Name], i.name AS [Index Name], i.index_id,
    i.is_disabled, i.is_hypothetical, i.has_filter, i.fill_factor, i.is_unique,
    s.user_updates AS [Total Writes],
    (s.user_seeks + s.user_scans + s.user_lookups) AS [Total Reads],
    s.user_updates - (s.user_seeks + s.user_scans + s.user_lookups) AS [Difference],
    (partstats.used_page_count / 128.0) AS [IndexSizeinMB]
FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
INNER JOIN sys.indexes AS i WITH (NOLOCK)
    ON s.[object_id] = i.[object_id]
    AND s.index_id = i.index_id
    AND s.database_id = DB_ID()
INNER JOIN sys.dm_db_partition_stats AS partstats
    ON i.object_id = partstats.object_id AND i.index_id = partstats.index_id
WHERE OBJECTPROPERTY(s.[object_id],'IsUserTable') = 1
    AND (s.user_lookups<>0 OR s.user_scans<>0 OR s.user_seeks<>0)
    AND s.user_updates > (s.user_seeks + s.user_scans + s.user_lookups)
    AND i.index_id > 1
ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC OPTION (RECOMPILE);
GO

Script #2 Result
Here is a sample output from my test system showing all indexes having more Writes compared to Reads on the MSDB database.

[Image: WritesMoreThanReadsIndexes (sample output of Script #2)]

Well, that’s all folks.

-Sudarshan (TheSQLDude)

Posted in DB Maintenance, Indexing, T-SQL | 1 Comment »

Data Collector (MDW) slow purge issue also affects SQL Server 2008

Posted by Sudarshan Narasimhan on June 5, 2012


I had posted on this topic before on my previous blog. SQL Server 2008 is also affected by the slow purge procedure core.sp_purge_data, as well as by deadlocks in the purge procedure. This issue was fixed in SQL Server 2008 R2 SP1; for SQL Server 2008, the fix to the purge procedure is included in Service Pack 3. Those of you using MDW to collect performance data should apply SP3 as soon as possible. This will greatly reduce the purge procedure run time and your jobs will run faster.

KB 2584903 has been updated to include SQL Server 2008 as an affected product. I’ve posted this on the Connect bug as well, here –> https://connect.microsoft.com/SQLServer/feedback/details/657148/management-data-warehouse-mdw-purge-job-proc-performance

Just applying SP1 for R2 or SP3 for SQL 2008 will not be enough to get the MDW purge procedures updated. Once the service pack has been successfully installed, you will have to run through the Configure Management Data Warehouse Wizard from SSMS to get the purge procedures updated with the new T-SQL code.
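Once the instance is patched, a quick check like the sketch below confirms the build and service pack level (SERVERPROPERTY is the standard built-in function; nothing here is specific to MDW):

--Verify the service pack level after patching (SP3 for SQL Server 2008, SP1 for SQL Server 2008 R2)
SELECT SERVERPROPERTY('ProductVersion') AS [ProductVersion],
       SERVERPROPERTY('ProductLevel') AS [ServicePackLevel],
       SERVERPROPERTY('Edition') AS [Edition];
GO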


Posted in Data Collector | Leave a Comment »

IO Cost in an execution plan – what does it actually mean?

Posted by Sudarshan Narasimhan on May 11, 2012


I/O is probably one of the slowest of the resources. The information below gives a quick idea of what the “Estimated I/O Cost” value in an execution plan represents for the common access operators. I’ll post in detail some other day on how to decipher a SQL Server query execution plan; a small script after the list below shows how to look up the raw page and level counts these estimates are based on.

Thanks to Niraj Mehta for putting together this content.

Table Scan

  • The total number of data pages in the table

Clustered Index Scan

  • The number of levels in the index plus the number of data pages to scan (data pages = #rows / #rows per page)

Non-Clustered Index Seek on a Heap (Bookmark Lookup)

  • The number of levels in the index plus the number of leaf pages to read for qualifying rows plus the number of qualifying rows (1 I/O for each row on the heap)

Non-Clustered Index (I1) Seek on a Clustered index (I2) (Bookmark Lookup)

  • The number of levels in the I1 index plus the number of leaf pages to read for qualifying rows plus the number of qualifying rows times the cost of searching for a clustered index (I2) key

Covering Non-Clustered index

  • The number of levels in the index plus the number of leaf index pages to read for qualifying rows (#qualifying rows / # rows per leaf page).
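To relate these formulas to real numbers, here is a small sketch that pulls the raw inputs the estimates are built from: index depth (number of levels), leaf/data page counts and row counts per index. The table name dbo.MyTable is purely illustrative.

--Raw inputs behind the I/O cost estimates: levels, pages and rows per index
SELECT i.name AS [Index Name], i.index_id,
    INDEXPROPERTY(i.[object_id], i.name, 'IndexDepth') AS [Index Levels],
    ps.in_row_data_page_count AS [Data/Leaf Pages],
    ps.row_count AS [Total Rows]
FROM sys.indexes AS i
INNER JOIN sys.dm_db_partition_stats AS ps
    ON ps.[object_id] = i.[object_id] AND ps.index_id = i.index_id
WHERE i.[object_id] = OBJECT_ID('dbo.MyTable');
GO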

Posted in Performance | Leave a Comment »

Shedding light on SQL Server Statistics and Auto Update Statistics

Posted by Sudarshan Narasimhan on May 4, 2012


SQL Server collects statistical information about indexes and column data in the database. These statistics play a very important role in coming up with an efficient query plan. Statistical information is used by the Query Optimizer when it decides between a Seek and a Scan, or between using Index A and Index B, etc. So it’s important as a DBA to pay careful attention to statistics and make sure they are updated and maintained well.

Just like Chandler Bing’s job is “statistical analysis and data reconfiguration”, it’s a good DBA’s job to take care of maintaining the statistics. :)


SQL Server 2005 statistics features allow you to:

  • Implicitly create and update statistics with the default sampling rate (in the SELECT, INSERT, DELETE, and UPDATE commands, use of a column in a query condition such as a WHERE or JOIN clause causes statistics to be created or updated if necessary when automatic statistics creation and update is enabled)
  • Manually create and update statistics with any desired sampling rate, and drop statistics (CREATE STATISTICS, UPDATE STATISTICS, DROP STATISTICS, CREATE INDEX, DROP INDEX)
  • Manually create statistics in bulk for all columns of all tables in a database (sp_createstats)
  • Manually update all existing statistics in the database (sp_updatestats)
  • List statistics objects that exist for a table or database (sp_helpstats, catalog views sys.stats, sys.stats_columns)
  • Display descriptive information about statistics objects (DBCC SHOW_STATISTICS)
  • Enable and disable automatic creation and update of statistics database-wide or for a specific table or statistics object (ALTER DATABASE options AUTO_CREATE_STATISTICS and AUTO_UPDATE_STATISTICS, sp_autostats, and the NORECOMPUTE option on CREATE STATISTICS and UPDATE STATISTICS)
  • Enable and disable asynchronous automatic update of statistics (ALTER DATABASE option AUTO_UPDATE_STATISTICS_ASYNC)
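
As a quick illustration of a few of the commands listed above, here is a minimal sketch; the table dbo.Orders, the column OrderDate and the statistics name st_Orders_OrderDate are purely illustrative.

--Manually create a statistics object with a chosen sampling rate
CREATE STATISTICS st_Orders_OrderDate ON dbo.Orders (OrderDate) WITH SAMPLE 50 PERCENT;
GO
--Manually update it with a full scan of the table
UPDATE STATISTICS dbo.Orders (st_Orders_OrderDate) WITH FULLSCAN;
GO
--Display its header, density vector and histogram
DBCC SHOW_STATISTICS ('dbo.Orders', st_Orders_OrderDate);
GO
--List the statistics on the table and their auto-update setting
EXEC sp_autostats 'dbo.Orders';
GO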

Behaviour change from SQL 2005 onwards

In SQL Server 2000, the decision to auto update statistics was based on the number of row changes. From SQL Server 2005 onwards, changes are tracked at the column level, so automatic updates can be avoided on columns which have not changed enough to warrant a statistics update. This is a behaviour change between SQL 2000 and SQL 2005: SQL Server 2005 and later determine whether to update statistics based on the colmodctr (column modification counter) values.

A statistics object is considered out of date in the following cases:

1. The table size has gone from 0 to > 0 rows.
2. The number of rows in the table when the statistics were gathered was 500 or less, and the colmodctr of the leading column of the statistics object has changed by more than 500 since then.
3. The table had more than 500 rows when the statistics were gathered, and the colmodctr of the leading column of the statistics object has changed by more than 500 + 20% of the number of rows in the table when the statistics were gathered.
4. If the statistics object is defined on a temporary table, it is considered out of date as above, except that there is an additional threshold: recomputation is also triggered after the insertion of 6 rows.
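
For example, for a regular table that had 100,000 rows when its statistics were last gathered, the colmodctr of the leading column must exceed 500 + 20% of 100,000 = 20,500 modifications before that statistics object is considered out of date.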


Q: How is the COLMODCTR maintained by SQL Server?
Ans: The colmodctr values that SQL Server keeps track of are continually modified as the data in the table changes. Depending on the amount of INSERT/UPDATE/DELETE/TRUNCATE/bulk operations occurring on the table, this counter value keeps changing. The table below describes the effect of each operation on the colmodctr value.

[Table image: effect of each operation on the colmodctr value]
Reference: http://technet.microsoft.com/en-us/library/cc293623.aspx

 

Controlling STATISTICS Behaviour

You have options like AUTO_UPDATE_STATISTICS, AUTO_CREATE_STATISTICS and AUTO_UPDATE_STATISTICS_ASYNC, which are all database-level options, so you need to configure them for each database. You can check whether your user databases have these options enabled by using sp_helpdb and looking at the status column, or by using a query like the one below.

select name as [DB_NAME], is_auto_create_stats_on, is_auto_update_stats_on, is_auto_update_stats_async_on 
from sys.databases
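
If any of these options turn out to be off, they can be switched on per database. A minimal sketch (replace YOUR_DB_NAME with your database name):

ALTER DATABASE [YOUR_DB_NAME] SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE [YOUR_DB_NAME] SET AUTO_UPDATE_STATISTICS ON;
--Optional: let auto updates run in the background instead of blocking the query that triggered them
ALTER DATABASE [YOUR_DB_NAME] SET AUTO_UPDATE_STATISTICS_ASYNC ON;
GO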

How to find out which indexes or statistics need to be updated?

You can use the following query on any SQL 2005+ instance to find out the percentage of rows modified and, based on this, decide whether any indexes need to be rebuilt or the statistics on those indexes need to be updated.

-- Note: this query uses the legacy sysindexes/sysobjects compatibility views; on SQL 2005+ the rowmodctr value is derived from the column-level counters and is approximate
select
schemas.name as table_schema,
tbls.name as Object_name,
i.id as Object_id,
i.name as index_name,
i.indid as index_id,
i.rowmodctr as modifiedRows,
(select max(rowcnt) from sysindexes i2 where i.id = i2.id and i2.indid < 2) as rowcnt,
convert(DECIMAL(18,8), convert(DECIMAL(18,8),i.rowmodctr) / convert(DECIMAL(18,8),(select max(rowcnt) from sysindexes i2 where i.id = i2.id and i2.indid < 2))) as ModifiedPercent,
stats_date( i.id, i.indid ) as lastStatsUpdateTime
from sysindexes i
inner join sysobjects tbls on i.id = tbls.id
inner join sysusers schemas on tbls.uid = schemas.uid
inner join information_schema.tables tl
on tbls.name = tl.table_name
and schemas.name = tl.table_schema
and tl.table_type='BASE TABLE'
where 0 < i.indid and i.indid < 255
and table_schema <> 'sys'
and i.rowmodctr <> 0
and i.status not in (8388704,8388672)
and (select max(rowcnt) from sysindexes i2 where i.id = i2.id and i2.indid < 2) > 0
order by modifiedRows desc

Updating Statistics on all the tables in any given database

I often get asked how you can update statistics on all the tables in a database. You can use a script like the one below to achieve this.

Disclaimer: Do not run this unless you know its implications on a production server. Updating statistics on all tables will use CPU resources and, depending on the size of the tables, will take its own time.

use <YOUR_DB_NAME>
go
-- Build a numbered work list of all user tables (schema + table name)
select identity(int,1,1) as rownum, table_schema, table_name into table_count from information_schema.tables where table_type='BASE TABLE'
declare @count int,@stmt nvarchar(600),@maxcount int,@tblname nvarchar(520)
set @count=1
select @maxcount=count(*) from table_count
while @count < @maxcount+1
begin 
      -- Schema-qualify and quote the name so non-dbo schemas and special characters are handled
      select @tblname=quotename(table_schema)+'.'+quotename(table_name) from table_count where rownum=@count
      set @stmt = 'UPDATE STATISTICS '+@tblname+' WITH FULLSCAN'
      PRINT ('Updating statistics for table :'+@tblname)
      EXEC(@stmt)
      PRINT ('Finished updating statistics for table :'+@tblname)
      print ''
      set @count=@count+1
      set @stmt=''
End
drop table table_count

Some tips on Statistics

1. Table variables do not have statistics at all.

Table variables are meant for operations on a small number of rows, a few thousand rows at most. For anything bigger, this is a good scenario to think about temporary tables (#tbl) instead, because unlike table variables, temp tables can have indexes created on them, which means they can have statistics.
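
A minimal sketch of the difference; the names are purely illustrative, and the syntax shown is the SQL 2005-2012 behaviour (before inline index definitions were added for table variables).

--A temp table can have an index created on it, and that index carries statistics
CREATE TABLE #orders_work (order_id INT NOT NULL, ship_city NVARCHAR(60));
CREATE INDEX ix_orders_work_city ON #orders_work (ship_city);
GO
--A table variable cannot have additional indexes created on it;
--only constraints declared inline (PRIMARY KEY/UNIQUE) are allowed, and it carries no statistics
DECLARE @orders_work TABLE (order_id INT NOT NULL, ship_city NVARCHAR(60));
GO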

2. Multi-statement Table-Valued Functions (TVFs) also do not have any statistics

So if you have complex query logic implemented in a multi-statement function in SQL Server, think again! The function does not carry any statistical information, so the SQL optimizer must guess the size of the results returned. The reason is that a multi-statement TVF returns a table variable as its output, and table variables do not have statistics.

3. You can find out from the execution plan (e.g. via SET STATISTICS PROFILE) whether any missing statistics would help a particular query

When you enable SET STATISTICS PROFILE ON and execute a query or batch, the execution plan rows are returned along with the results. In this output, look for the column called “Warnings”. If, during the course of compiling the plan, the SQL Server optimizer felt that statistics on some column would have helped the query, it displays a warning such as “NO STATS” in that column. If you see any such warning, consider creating column statistics or indexes on the particular object in that row.
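
A minimal sketch of how to check this; dbo.Orders and ShipCity are hypothetical names.

SET STATISTICS PROFILE ON;
GO
--Run the query you want to investigate; the plan rows come back along with the results
SELECT OrderID, ShipCity FROM dbo.Orders WHERE ShipCity = N'London';
GO
SET STATISTICS PROFILE OFF;
GO
--In the plan output, check the [Warnings] column for entries such as NO STATS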

4. Avoid creating indexes on very frequently updated columns, as the statistics will also have to keep up with the volume of data modifications.

5. Viewing Statistics

You can use DBCC SHOW_STATISTICS (‘table name’, ‘index or statistics name’) to view the statistics on any given index or column-level statistics object, along with the histogram. The catalog view sys.stats stores information about each statistics object available in a particular database.

Any statistics object whose name is prefixed with _WA_Sys_ is an auto-created statistic, which means SQL Server itself created it. User-created statistics will have the name you gave them, and index-based statistics will have the index name, e.g. PK_TBL1.
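
A minimal sketch that lists the auto-created (_WA_Sys_) statistics in the current database, so you can tell them apart from user-created or index-based statistics:

SELECT OBJECT_NAME(s.[object_id]) AS [Table Name], s.name AS [Statistics Name],
    s.auto_created, s.user_created
FROM sys.stats AS s
WHERE OBJECTPROPERTY(s.[object_id], 'IsUserTable') = 1
    AND s.auto_created = 1
ORDER BY [Table Name], [Statistics Name];
GO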

Posted in Performance | Leave a Comment »