The SQL Dude!

On-Disk & In-Memory Pages that teach you the SQL way of life!

Azure Automation – How to get your 1st Runbook working properly…

Posted by Sudarshan Narasimhan on September 9, 2016


Been playing around with Azure Automation this past month and had a requirement to Up-scale & Down-scale Azure SQL Database tiers in an automated manner. I’m talking Azure SQL DB (DBaaS) and not SQL Server on Azure (IaaS). To get this automation, we were using the Azure Runbook Automation PowerShell workflow.

Basically, we needed to up-scale the database tier from Premium P1 to P2 in the morning and down-scale it back down to P1 in the night. To save costs you know 😉
There is a lot of information on the WWW showing different ways one can write a PS automation workflow. This post is to show you just 1 way that I was able to get it to work.

  1. Login to your Azure Subscription and go to the Resource Group that contains the SQL Database you need to work with.
  2. Create a new Automation Account
  3. Since I was trying to automate just our Cloud stuff in Azure (SQL DB), I ended up choosing the authentication method for the Automation account as “Azure Run As Account”
    Note: you can also use an Azure AD Account depending on your organization’s architecture
  4. Once the automation account has been setup, open up it’s blade in the Azure Portal and got to Runbooks. You will notice that there are already 3-4 tutorial runbooks present there by default.
  5. Add a Runbook -> Create a new Runbook. Give it some meaningful name for e.g. ResizeSQLDatabase
  6. For the runbook type, I chose PowerShell workflow. (You can use the Graphical workflow too).
  7. You should now see a new PS workflow window with the workflow name that you specified. Just like a regular PS workflow, you can define input parameters in the param () block.

param
(
[parameter(Mandatory=$true)]
[string]$ResourceGroupName
)

The below step is important. For Azure automation to work, it needs to use some form of credentials to perform your automation work. In my case up-scale & down-scale database tiers on a schedule. Copy & paste the below code section right after the param block in the workflow


# Get the connection "AzureRunAsConnection"
try
{
$connectionName = "AzureRunAsConnection"
$servicePrincipalConnection=Get-AutomationConnection -Name $connectionName

    "Logging in to Azure..."
    Add-AzureRmAccount `
        -ServicePrincipal `
        -TenantId $servicePrincipalConnection.TenantId `
        -ApplicationId $servicePrincipalConnection.ApplicationId `
        -CertificateThumbprint $servicePrincipalConnection.CertificateThumbprint
}
catch
{
    if (!$servicePrincipalConnection)
    {
        $ErrorMessage = "Connection $connectionName not found."
        throw $ErrorMessage
    } 
    else
    {
        Write-Error -Message $_.Exception
        throw $_.Exception
    }
}

What the above code is doing is basically using the built-in automation account called “AzureRunAsConnection” and it’s credentials to run the PS workflow.

Now, coming to the actual SQL DB scaling, you can use the below cmdlet to do that
Set-AzureRMSqlDatabase -DatabaseName $DatabaseName -ResourceGroupName $ResourceGroupName -ServerName $ServerName -Edition $NewEdition -RequestedServiceObjectiveName $NewPricingTier

Each of the $variables in the above line are input parameters to my runbook. $NewPricingTier is the one that controls the SQL database tier & DTU’s
You can find more information about the parameters values here -> https://azure.microsoft.com/en-us/documentation/articles/sql-database-service-tiers/?rnd=1

Once done, save the runbook and create job schedules for the new runbook with the appropriate input parameters and let Azure Automation do the work for you! 😀

Just a quick post to help anyone out there with their first Azure Automation runbook.

Posted in Azure, Code Samples, PowerShell | Tagged: , , , , | Leave a Comment »

Scan, Find & Understand Missing Indexes in your SQL Database

Posted by Sudarshan Narasimhan on June 17, 2016


This post is Part 4 of my series of posts on Index Maintenance for a DBA. So here goes.

Enough has been said & talked about regarding missing indexes in SQL Server. Before you read any further, it is very important to understand that “missing indexes” that your colleagues & friends talk about are really missing index “recommendations“. The keyword is the recommendations part. Take the information in this post as a whole and don’t go crazy creating every missing index recommended when you run the query given below. Refer to my earlier posts in the indexing series (duplicate, unused indexes) to know why it is a bad idea to go “index-crazy”. You need to carefully evaluate the benefits of having an index vs the cost of maintaining it.

Some Background

SQL Server provides exposes index recommendations via DMV’s (sys.dm_db_missing_index_detailssys.dm_db_missing_index_groups and sys.dm_db_missing_index_group_stats). These dynamic views contain information collected by the SQL Server Query optimizer engine as your DB workloads and queries execute on the database engine. Since this is “dynamic” information, it is only as relevant as your last SQL Service restart. So this is something that needs to get into your blood as a DBA to check for periodically and is NOT a one-off activity.

 

Things to Consider & Understand before implementing Missing Index Recommendations

Your boss asks you if this index thingy will improve the performance of the database. The favourite DBA response “It Depends”  😀

MrBeanItDepends

 

 

 

 

 

 

 

 

 

 

Well, it really does! I’ll explain why… Like any other recommendations you need to take into account additional information along with this data to come to a conclusion. Consider for example the following:

  1. Has your DB workload changed since the time you collected the missing index recommendations? (i.e. Has there been an application release or DB schema upgrade since)
  2. Are there similar existing indexes on the table where it’s recommending you create another index? (Maybe you just need to add a few additional columns to the existing index as included columns to “cover” it…)
  3. Is the query batch or workload very frequently run, enough that the impact is very high if you create this index? (Refer to the last_user_seek and last_user_scan columns to find out when a particular query last used these columns for seek/scan operations)
  4. Do you have enough space in the DB/disk to afford yet another index? Especially important, if it’s a large table. (DBA 101 question, I know!)
  5. If you create this index and it’s rarely used, you are increase the maintenance overhead and also slowly down WRITE operations on the table.

 

IT DEPENDS, really it does!

 

Missing Index – Improvement Measure Calculation Logic

To help answer some of these questions, I calculate an arbitrary number to determine the “usefulness” of a new index, called improvement measure. The higher this number, the more useful it is to create it. The improvement measure is calculated based on 4 factors

  1. Average cost of the user queries that could be reduced by the new index. This is the current cost of the query.
  2. Average impact on the query if this index were created. i.e. The query cost would on average drop by this percentage value.
  3. Number of Seek operations that this index could have been used by user queries.
  4. Number of Scan operations that this index could have been used by user queries.

 

Now that you understand what missing indexes are, factors to consider and the logic behind the recommendations, go ahead and run the below query on your databases. The output provides you the “CREATE INDEX” T-SQL statement that you can use to create the index, if you see fit after considering all factors mentioned so far. This saves you time having to write the T-SQL yourself. The CREATE INDEX statement will include the index key columns in order, suggest included columns where applicable.

 



Script – Find Missing Indexes Recommendations in SQL Server

USE [msdb]  /*Replace with your Database Name */
GO
SELECT TOP 50 
    GETDATE() AS [RunTime],
    DB_NAME(mid.database_id) AS [DBNAME], 
    OBJECT_NAME(mid.[object_id]) AS [ObjectName], mid.[object_id] AS [ObjectID],
    CONVERT (decimal (28,1), migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans)) AS [Improvement_Measure],
    'CREATE INDEX missing_index_' + CONVERT (varchar, mig.index_group_handle) + '_' + CONVERT (varchar, mid.index_handle) 
    + ' ON ' + mid.statement 
    + ' (' + ISNULL (mid.equality_columns,'') 
    + CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ',' ELSE '' END + ISNULL (mid.inequality_columns, '')
    + ')' 
    + ISNULL (' INCLUDE (' + mid.included_columns + ')', '') AS [CREATE_INDEX_Statement],
    migs.user_seeks, migs.user_scans, migs.last_user_seek, migs.last_user_scan, migs.avg_total_user_cost, migs.avg_user_impact, migs.avg_system_impact,
    mig.index_group_handle, mid.index_handle
FROM sys.dm_db_missing_index_groups mig
    INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle
    INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle
WHERE CONVERT (decimal (28,1), migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans)) > 10
    AND mid.database_id = DB_ID()
ORDER BY [Improvement_Measure] DESC
GO

 

Script Result
Here is a sample output from my test system showing all missing indexes recommended on the MSDB database.

MissingIndexes

 

 

 

Test, Test, Test!

Did I mention that you must test these index recommendations on your Test system? If you know your application workload/query, you must test to gauge the impact of having this index.

Keep in mind, your testing must not be to run just a single batch/query, as 95% of the time this will yield a positive result. But, you must include other sample workloads along with your specific query. Why? Because, indexes are structures on the table not specific to just your query. Other queries can use it too and SQL has to keep the index up to date when data changes happen to the table. Always test any index changes with as close to a real world production like workload as possible. You don’t want any surprises on Production now, do you?  🙂

Well SQL folks, this concludes my indexing series of posts for DBA’s. Leave your comments down below if you find this useful and would like more material on these lines.

 

-Sudarshan (TheSQLDude)

Posted in DB Maintenance, Indexing, Performance, T-SQL | Tagged: , , , , , , | 1 Comment »

Seek & Evaluate Unused Indexes in your SQL database

Posted by Sudarshan Narasimhan on June 16, 2016


This post is Part 3 of my series of posts on Index Maintenance for a DBA. So here goes.

As a regular part of your DBA work, it is very important to keep evaluating the need for indexes in your database environment. Any DBA worth his salt, will always look to “maintain” the indexes by ensuring the fragmentation is kept low. Most of you probably have maintenance tasks/scripts to take care of this. An often under-looked aspect in index maintenance is to also consider whether some indexes are an over-kill a.k.a unnecessary and potentially drop them if it’s not really utilized. This is especially important when dealing with 3rd party vendor databases, where the DBA does not have full control on what comes in and what goes out in terms of index changes.

I like to think of the database as my “house” and take care of it the same way one would do to your own home. You always consider who/what comes in to your house. The same care needs to be applied to the databases you maintain. As gatekeepers to the database, even if you don’t have complete control on what comes in, it is good to be aware & have the stats handy so that you can work with your application teams or vendors, to keep the house tidy.

If you link this data back to my previous posts Part 1 – Seek & Destroy Duplicate Indexes in your SQL Database and Part 2 – Seek & Understand Top Indexes by Size in your SQL database , you will be on your way to hopefully free up some space and realize space cost and performance gains.

Here is a handy script that I use to:-

1) Lists the indexes on each table in the database, that are completely unused.

This scripts lists those indexes where not even a single Read operation has used these indexes. Keep in mind, write operations still need to update this index as data is being updated/inserted contributing to overall time of the Write operations. When no-one is using the index for Read it becomes a system overhead.

2) Lists the indexes on each table in the database, that are having more Writes compared to Reads

The maintenance cost of having these indexes from insert, update, or delete operations on the underlying table are more compared to the benefits for Read operations. These indexes needs to be investigated to see if they are still required. A cost benefit analysis on these indexes needs to be performed.
Note: I’m not advocating that you remove these indexes, but just that you carefully consider the benefits vs the cost, before making any decision.



Script #1 – TOTALLY UNUSED INDEXES

USE [DBNAME] /* Replace with your Database Name */
GO
--TOTALLY UN-USED INDEXES
SELECT DB_NAME(s.database_id) as [DB Name], OBJECT_NAME(s.[object_id]) AS [Table Name], i.name AS [Index Name], i.index_id,
    i.is_disabled, i.is_hypothetical, i.has_filter, i.fill_factor, i.is_unique,
    s.user_updates AS [Total Writes],
    (s.user_seeks + s.user_scans + s.user_lookups) AS [Total Reads],
    s.user_updates - (s.user_seeks + s.user_scans + s.user_lookups) AS [Difference],
    (partstats.used_page_count / 128.0) AS [IndexSizeinMB]
FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
INNER JOIN sys.indexes AS i WITH (NOLOCK)
    ON s.[object_id] = i.[object_id]
    AND s.index_id = i.index_id
    AND s.database_id = DB_ID()
INNER JOIN sys.dm_db_partition_stats AS partstats
    ON i.object_id = partstats.object_id AND i.index_id = partstats.index_id
WHERE OBJECTPROPERTY(s.[object_id],'IsUserTable') = 1
    AND user_updates > (user_seeks + user_scans + user_lookups)
    AND (s.user_lookups=0 AND s.user_scans=0 AND s.user_seeks=0)
    AND i.index_id > 1
ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC OPTION (RECOMPILE);
GO

Script #1 Result
Here is a sample output from my test system showing all unused indexes on the MSDB database.

UnusedIndexes

 

 

 

 

 

 



Script #2 – INDEXES WITH MORE WRITES THAN READS

USE [DBNAME] /* Replace with your Database Name */
GO
--INDEXES WITH WRITES > READS
SELECT DB_NAME(s.database_id) as [DB Name], OBJECT_NAME(s.[object_id]) AS [Table Name], i.name AS [Index Name], i.index_id,
    i.is_disabled, i.is_hypothetical, i.has_filter, i.fill_factor, i.is_unique,
    s.user_updates AS [Total Writes],
    (s.user_seeks + s.user_scans + s.user_lookups) AS [Total Reads],
    s.user_updates - (s.user_seeks + s.user_scans + s.user_lookups) AS [Difference],
    (partstats.used_page_count / 128.0) AS [IndexSizeinMB]
FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
INNER JOIN sys.indexes AS i WITH (NOLOCK)
    ON s.[object_id] = i.[object_id]
    AND s.index_id = i.index_id
    AND s.database_id = DB_ID()
INNER JOIN sys.dm_db_partition_stats AS partstats
    ON i.object_id = partstats.object_id AND i.index_id = partstats.index_id
WHERE OBJECTPROPERTY(s.[object_id],'IsUserTable') = 1
    AND (s.user_lookups<>0 OR s.user_scans<>0 OR s.user_seeks<>0)
    AND s.user_updates > (s.user_seeks + s.user_scans + s.user_lookups)
    AND i.index_id > 1
ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC OPTION (RECOMPILE);
GO

Script #2 Result
Here is a sample output from my test system showing all indexes having more Writes compared to Reads on the MSDB database.

WritesMoreThanReadsIndexes

 

 

 

 

 

Well, that’s all folks.

-Sudarshan (TheSQLDude)

Posted in DB Maintenance, Indexing, T-SQL | Tagged: , , , , , , | 1 Comment »

Checklist for SQL Server Database Audit & Security Hardening

Posted by Sudarshan Narasimhan on June 8, 2016


TheAuditIsComing
I’ve been involved in a lot of IT security audits, many times due to legislative requirements. A few good examples of these are PCI, SOX, HIPAA, GLBA audits. Each of these US legislative acts have a set of defined security standards & configurations that organizations have to adhere to.  Audits for these systems happen both internally and externally to ensure the company is compliant to these standards and any remediation’s are acted upon.

Being a DBA, you’re usually involved in these IT audits. The scope for these audits should be very well defined per your internal IT security policies. As a database administrator, you are responsible for the secure management of the company’s data, vis-à-vis compliance data. The systems in scope for these databases are usually ones (but not limited to) that contain the following data:

  1. Payment Card Data or database systems used for Payment processing (Online payment web systems, retail POS etc)
  2. Financial Data (like Bank, Insurance, Stocks etc)
  3. Health Information (like medical, patient records etc)
  4. Personally Identifiable Information (PII) data (like client/member data including DOB, Tax Number, SSN etc)

While each of these compliance regulations are very-well documented (especially PCI DSS), you must follow certain best practices or standards for any DBMS system, irrespective of the requirements. In today’s world, data leaks are becoming increasing prevalent and within the DB world, it is very essential to adhere to strict & enforceable data security practices.

All that said, coming to SQL Server security good practices, I’ve listed what I feel are the bare-minimum security checks that need to be performed. You can add stuff to this, but in my view these are essential and non-negotiable settings that you must check as a DBA. I’ll probably share some scripts & code to monitor these setting centrally in an automated way, in another post. This post is to list down and share these settings so that you are aware of the various things to consider when looking at SQL Server Security Hardening & Audits.

 

CHECKLIST: Security Audit of a SQL Server Database Instance

 

Instance Level:-
1. Server Role Membership
2. Server Permissions e.g. CONTROL SERVER, ALTER ANY DATABASE
3. Public Role Permissions (harden this as much as possible on user database, leaving aside system databases)
4. BUILTIN Admins present as a login?
5. Default Port changed?
6. Orphaned AD Users & Groups must be removed
7. [NT AUTHORITY\ANONYMOUS LOGON] must be removed.

Database Level:-
1. Database Role Membership (Elevated)
2. Database Role Membership (Standard)
3. Database Permissions e.g. DELETE, ALTER, SCHEMA
4. Guest User Disabled?
5. DBO User ownership (you can read my previous post on DBO ownership HERE (point #2) to understand why this is important)
6. No Orphaned Users should exist

Windows OS Level:-
1. Local Administrators Group Members
2. Local Security Policy Privileges (following this MSDN  doc to grant only those privileges required to run SQL Services to the SQL service accounts)
3. Guest User enabled?
4. Windows Firewall configured to allow connections via a specific Port Only (I hate systems setup to allow all incoming connections to sqlservr.exe. Duh!)
5. SQL Browser Service Disabled?

Configuration:-
1. XP_CMDSHELL Disabled?
2. Password Complexity & Lockout for SQL Logins (SQL logins inherit the account & password policies of the server on which the instance is running. Work with your IT admins to implement a stronger policy using GPO if required. Otherwise make sure CHECK_POLICY setting is turned ON for all SQL logins [including SA])
3. SA password rotated frequently (if you can disable the SA account, even better)
4. Cross database ownership chaining Disabled?

Patch Management
1. Are all the security bulletin patches for known vulnerabilities applied on your SQL Server instance. (Refer to https://technet.microsoft.com/en-us/security/bulletins.aspx and filter on your specific version & patch level to get the list of known security patches available).
2. Are all known Windows security patches applied (Depending on your company, this might be handled by another team, but it’s still a DB Server, so better be prepared if someone else isn’t doing their job properly 😉 )

Compliance:-
1. Access Control mechanisms in place for Sensitive/PII Data?
2. Access Control mechanisms in place for Regulatory Data?
3. SQL Server Audit enabled to track permissions/role/users changes to compliance data?

Penetration Tests
1. SA Login checks. Is anyone trying to login by brute-force using the SA account?
2. Main App login. Is anyone trying to login by brute-force using the application login (especially if it’s a SQL login)

 

I was almost about to add managed service accounts (MSA/GMSA) to this list, but considering SQL 2016 just released a week ago and most of you are probably running SQL 2012/2014 with either Clustering/AlwaysON, I’m not going to make it a must-do yet. (Damn you Microsoft, you promised GMSA support for SQL 2014 but didn’t follow through 😦 )

That’s all for now. I’ll continue this in another post with some sample code to help you through all these items.

Until then, stay safe and may the force be with you.

-TheSQLDude (Sudarshan)

Posted in Auditing, General Thoughts, Security | Tagged: , , , , , , | Leave a Comment »

Seek & Understand Top Indexes by Size in your SQL database

Posted by Sudarshan Narasimhan on June 3, 2016


This post is Part 2 of my series of posts on Index Maintenance for a DBA. So here goes.

Often, when you are performing capacity planning or upgrades or working with 3rd party vendor on database schema upgrades, you need to look into the existing indexes on the database. One of the key things you need to consider is disk space. Aside from the actual “data” in the tables, indexes also occupy space depending on the type of index. Here is a handy script that I use to quickly understand how many indexes are present for each table and the size of each index, in any given SQL database.

If you link this data back to my post “Seek & Destroy Duplicate Indexes in your SQL Database“, you will be on your way to hopefully free up some space and realize space cost savings.

This query only reports on user objects (object_id>99) and filters out HEAP tables. There are some exceptions to the filter on user objects (like service broker tables, CDC tables etc) which will still be considered as “user objects” though it supports a system functionality.

USE [msdb]  /*Replace with your Database Name */
GO
SELECT 
    DB_NAME() DatabaseName,
    OBJECT_SCHEMA_NAME(i.OBJECT_ID) SchemaName,
    OBJECT_NAME(i.OBJECT_ID ) TableName,
    i.name IndexName,
    i.type_desc as [IndexType],
    i.is_primary_key as [PrimaryKeyIndex],
    i.is_unique as [UniqueIndex],
    i.is_disabled as [Disabled],
    SUM(s.used_page_count) / 128.0 IndexSizeinMB,
    CAST((SUM(s.used_page_count)/128.0)/1024.0 as decimal(10,6))  IndexSizeinGB
FROM sys.indexes AS i
INNER JOIN sys.dm_db_partition_stats AS S
    ON i.OBJECT_ID = S.OBJECT_ID AND I.index_id = S.index_id
WHERE i.object_id>99 and i.type_desc  'HEAP'
GROUP BY i.OBJECT_ID, i.name, s.used_page_count, i.type_desc, i.is_primary_key, i.is_unique, i.is_disabled
ORDER BY s.used_page_count DESC
GO

 

Script Result
Here is a sample output from my test system showing all the indexes on MSDB database.

Index Size

 

Stay tuned for more index related scripts in this series.

-Sudarshan (TheSQLDude)

Posted in DB Maintenance, Indexing, T-SQL | Tagged: , , , | 1 Comment »

Seek & Destroy Duplicate Indexes in your SQL Database

Posted by Sudarshan Narasimhan on May 20, 2016


This post is Part 1 of my series of upcomingposts on Index Maintenance for a DBA. So here goes.

There are all sorts of indexing tuning & optimizations that can be done on a SQL database. One of the no-brainer things a DBA must do is “seek & destroy” (pun intended) duplicate indexes. Duplicate indexes are indexes exactly similar in structure, including key columns and their base tables. These indexes serve no useful purpose and in fact contribute to slowness for WRITE database operations as the WRITER (INSERT/UPDATE/DELETE) must do more work to update the index as well as this directly contributes to the slowness of a write operation. If AlwaysON or other HA, then it has to do this on all the replicas as well, adding additional unnecessary network overhead.

You’d be surprised at the results. Given this script a run on any of your databases especially Vendor application databases. If you do see any duplicates, time to talk to your vendor or DB developer.

Here is query that I use to identify duplicate indexes. This query considers 2 or more indexes to be Exact Duplicates when all the key columns & included columns (if any) match another index on the same table.

USE [tempdb]  /* Change to your DB Name */
GO
DECLARE @i int=1, @count int=0, @tblname nvarchar(250), @indxname nvarchar(250), @cmd nvarchar(4000)
DECLARE @DuplicateIndexes TABLE(id int IDENTITY(1,1), table_name nvarchar(250), index_name nvarchar(250), dup_index_name nvarchar(250))
DECLARE @IndexInfo TABLE(id int IDENTITY(1,1), table_name nvarchar(250), index_name nvarchar(250), index_description nvarchar(250),index_keys nvarchar(250), included_columns nvarchar(250) null)

-- exact duplicates
;WITH indexcols AS
(
    select object_id as id, index_id as indid, name,
    (select case keyno when 0 then NULL else colid end as [data()]
    from sys.sysindexkeys as k
    where k.id = i.object_id
    and k.indid = i.index_id
    order by keyno, colid
    for xml path('')) as cols,
    (select case keyno when 0 then colid else NULL end as [data()]
    from sys.sysindexkeys as k
    where k.id = i.object_id
    and k.indid = i.index_id
    order by colid
    for xml path('')) as inc
    from sys.indexes as i
),
DuplicateIndexes AS
(
    SELECT
    object_schema_name(c1.id) + '.' + object_name(c1.id) as [TableName],
    c1.name as [IndexName],
    c2.name as [DuplicateIndexName]
    FROM indexcols as c1
    join indexcols as c2
    ON c1.id = c2.id
    and c1.indid < c2.indid
    and c1.cols = c2.cols
    and c1.inc = c2.inc
)
INSERT INTO @DuplicateIndexes
SELECT * FROM DuplicateIndexes

SELECT @count=count(*) FROM @DuplicateIndexes
WHILE (@i <= @count)
BEGIN
    SELECT @tblname = [table_name], @indxname = [index_name] FROM @DuplicateIndexes WHERE [id] = @i
    DECLARE @tmptbl TABLE (index_name nvarchar(250), index_description nvarchar(250),index_keys nvarchar(250))
    DELETE FROM @tmptbl
    SET @cmd = 'EXEC sp_helpindex ''' +@tblname+ ''''

    INSERT INTO @tmptbl (index_name, index_description, index_keys)
        EXEC (@cmd)

    INSERT INTO @IndexInfo (table_name, index_name, index_description, index_keys)
    SELECT @tblname, index_name, index_description, index_keys FROM @tmptbl

    DECLARE @incld_cols sysname
    SET @cmd = '
    SELECT @tmpstr=b.IncludedColumns
    FROM (
    SELECT LEFT(a.indx_cols_included, LEN(a.indx_cols_included)-1) as [IncludedColumns]
    FROM (
    SELECT STUFF (( SELECT C.Name + '','' AS [text()] 
    FROM sys.indexes I 
     INNER JOIN sys.index_columns IC 
      ON  I.object_id = IC.object_id AND I.index_id = IC.index_id 
     INNER JOIN sys.columns C 
      ON IC.object_id = C.object_id and IC.column_id = C.column_id 
    WHERE IC.object_id=object_id('''+@tblname+''') and I.name = '''+@indxname+''' and IC.is_included_column=1
    FOR XML PATH('''')
    ), 1, 0, '''') AS indx_cols_included) as a) as b'
    
    EXEC sp_executesql @cmd, N'@tmpstr sysname OUTPUT', @incld_cols OUTPUT
    UPDATE @IndexInfo SET included_columns = @incld_cols WHERE [id]=@i

SET @i = @i + 1
END

--DUPLICATE INDEX INFORMATION SUMMARY
SELECT DB_NAME() AS [DB Name], d.table_name as [TableName], d.index_name as [IndexName], d.dup_index_name as [DuplicateIndexName], i.index_keys as [IndexKeyColumns], i.included_columns as [IncludedColumns]
FROM @DuplicateIndexes d
JOIN @IndexInfo i
    ON d.table_name = i.table_name
    AND d.index_name = i.index_name
GO

 

Script Result
Here is a sample output from my test system where I’ve created 2 indexes having the same structure
DuplicateIndexes

 

 

 

Note: The original source script was from HERE. I’ve modified & added to this to include index key column information as well as included columns to present a more clear output on the duplicate indexes.

 

Stay tuned for more index related scripts in this series.

-Sudarshan (TheSQLDude)

Posted in DB Maintenance, Indexing, Query Optimization, T-SQL | 4 Comments »

SQL Server 2016 – General Availability Date Announced

Posted by Sudarshan Narasimhan on May 3, 2016


Extra Extra – Read all about it.

Exciting news as Microsoft just announced the general availability date for SQL Server 2016 – June 1st, 2016

You can read the announcement and more on the Data Platform Insider Blog
https://blogs.technet.microsoft.com/dataplatforminsider/2016/05/02/get-ready-sql-server-2016-coming-on-june-1st/

 

SQL 2016 comes in Enterprise, Standard, Express and Developer Editions. You can compare the editions using the datasheet published here
http://download.microsoft.com/download/8/A/2/8A2BC8C5-BBA0-4A9C-90BC-AC957D3454D9/SQL_Server_2016_Editions_datasheet.pdf?wt.mc_id=WW_CE_DM_OO_BLOG_NONE

 

Stay tuned for more as the release date approaches. For those looking to move away from Oracle Databases, if you have/get Software Assurance then the SQL 2016 Editions costs are free. You can read more about this offer here,
https://www.microsoft.com/en-us/server-cloud/sql-license-migration.aspx?wt.mc_id=WW_CE_DM_OO_BLOG_NONE

-TheSQLDude

Posted in SQL News | Leave a Comment »

Quick SQL Server Info – Script

Posted by Sudarshan Narasimhan on April 19, 2016


Here is a handy script that I used to quickly get high-level information about any SQL Server Instance. It provides basic information like machine name, number of CPU’s, memory, patch level, HA configuration and the user databases hosted on this instance.

/*SCRIPT START*/
DECLARE @cpu int, @memory decimal(2), @dbcount int, @dbs varchar(2000)
SELECT @cpu=cpu_count FROM sys.dm_os_sys_info
SELECT @memory= CONVERT(decimal(2),ROUND([total_physical_memory_kb]/1024.0/1024.0,1)) FROM [sys].[dm_os_sys_memory]
SELECT @dbcount=COUNT(*) FROM sys.databases where database_id>4
SELECT @dbs = (SELECT STUFF(
    (
    SELECT ', ' + DB.name
    FROM sys.databases AS DB
    WHERE DB.database_id>4
    FOR XML PATH('')), 1, 2, '') AS Databases)
SELECT 
@@SERVERNAME as [SQLServerName]
,SERVERPROPERTY('ComputerNamePhysicalNetBIOS') as [MachineName]
,[CPU] = @cpu
,[Memory (GB)] = @memory
,[Num.Databases] = @dbcount
,[Version] = 
    CASE WHEN CONVERT(varchar(50), SERVERPROPERTY('ProductVersion')) LIKE '9%' THEN 'SQL 2005'
        WHEN CONVERT(varchar(50), SERVERPROPERTY('ProductVersion')) LIKE '10.0%' THEN 'SQL 2008'
        WHEN CONVERT(varchar(50), SERVERPROPERTY('ProductVersion')) LIKE '10.5%' THEN 'SQL 2008R2'
        WHEN CONVERT(varchar(50), SERVERPROPERTY('ProductVersion')) LIKE '11%' THEN 'SQL 2012'
        WHEN CONVERT(varchar(50), SERVERPROPERTY('ProductVersion')) LIKE '12%' THEN 'SQL 2014'
        WHEN CONVERT(varchar(50), SERVERPROPERTY('ProductVersion')) LIKE '14%' THEN 'SQL 2016'
        WHEN CONVERT(varchar(50), SERVERPROPERTY('ProductVersion')) LIKE '15%' THEN 'SQL vNext'
    ELSE 'UNKNOWN' END
,SERVERPROPERTY('ProductLevel') as [ServicePackLevel]
,ISNULL(SERVERPROPERTY('ProductUpdateLevel'),'N/A') as [UpdateLevel]
,ISNULL(SERVERPROPERTY('ProductUpdateReference'),'N/A') as [UpdateKBNumber]
,SERVERPROPERTY('Edition') as [Edition]
,SERVERPROPERTY('IsClustered') as [Clustered]
,SERVERPROPERTY('IsHadrEnabled') as [isAlwaysON]
,[UserDatabasesHosted] = @dbs
GO
/*SCRIPT END*/

Posted in T-SQL | Tagged: , , | Leave a Comment »

theSQLDude – Back from Down Under..

Posted by Sudarshan Narasimhan on February 22, 2016


Greetings Fellow SQL Dudes,

It’s been a really long (long) time since I last posted here. After a long hiatus, I have tons of new material, scripts, scenarios & solutions lined up to share with the world.

Expect to see new material to start flowing through this site, with a Page Life Expectancy of close to 100% 😀

TheSQLDude is back!

Stay tuned for more SQL’lly stuff from the land Down Under…

-TheSQLDude

Posted in Dude Musings | 2 Comments »

A scenario on Query Tuning–Included Columns are a good thing!

Posted by Sudarshan Narasimhan on December 20, 2012


Well, this post has been due for quite some time. I had opened up a survey/opinion poll on https://thesqldude.com/2012/07/10/user-opinion-poll-what-topics-would-you-like-to-see-posted-on-thesqldude-weblog/ sometime back and as expected most of you out there want to see more on performance tuning and query optimization.

image

  Figure 1: Opinion Poll Results

Without much ado and by popular demand here goes a simple scenario on query tuning.

Tuning


One of my engineers came to me with the following scenario…

Problem Statement

We have this Update query that is taking much longer to run compared to before. No schema changes have been made though its quite possible data changes have been happening. The update query goes like this,

update msgboard set mb_status =10, mb_changed_by ='SATCOM'  
from msgboard 
where mb_type = 210 

Table Schema

I was easily able to reproduce this issue on my instance, here is how my table setup looks.

/****** Object:  Table [dbo].[msgboard]    Script Date: 12/20/2012 01:52:07 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[msgboard](
    [mb_id] [int] NOT NULL,
    [mb_type] [int] NOT NULL,
    [mb_status] [int] NOT NULL,
    [mb_changed_by] [varchar](50) NULL,
    [mb_changed_dt]  AS (getdate())
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
CREATE UNIQUE CLUSTERED INDEX [pk_msgboard] ON [dbo].[msgboard] 
(
    [mb_id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 100) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_msgboard_type] ON [dbo].[msgboard] 
(
    [mb_type] ASC,
    [mb_status] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

I populated this table with some sample data of 1000 rows using the following script.

declare @counter int
set @counter=0
declare @mbid int
set @mbid=1
declare @mbtype int
set @mbtype=200
declare @mbstatus int
set @mbstatus=10
while(@counter<1000)
begin
INSERT INTO [dbo].[msgboard]
           ([mb_id]
           ,[mb_type]
           ,[mb_status]
           ,[mb_changed_by])
     VALUES
           (@mbid,@mbtype,@mbstatus,'THESQLDUDE')
set @mbid+=1
set @counter+=1
end
GO
INSERT INTO [test].[dbo].[msgboard]
           ([mb_id]
           ,[mb_type]
           ,[mb_status]
           ,[mb_changed_by])
     VALUES
           (1001,210,10,'MANWITHNONAME')

So, I have 2 indexes on this table:-

1) A unique Clustered Index pk_msgboard on the mb_id column

2) A composite non-clustered index IX_msgboard_type on the columns mb_type and mb_status (in this order).

Query Execution Details

Here is what I observed when I ran the following update statement.

update msgboard set mb_status =10, mb_changed_by ='SATCOM'  
from msgboard 
where mb_type = 210 

To actual execution plan looks like this,

image

Total Cost: 0.03124485

As you can see, this query is using the Clustered Index pk_msgboard and its performing a Scan. At this point, two questions should pop into your head

1. Why is the query using the clustered index?

2. Why is it performing a Scan operation?

Answer to question #1

Let’s look at the update query carefully. The query is updating the columns mb_status and mb_changed_by, and there is a filter on column mb_type. You might think, well, I have an index IX_msgboard_type on the mb_type column, why is SQL Server not using this non-clustered index?

To prove I was smarter than SQL Server optimizer, I forced it to use the non-clustered index by adding an index hint. The query looks like this,

update msgboard set mb_status =10, mb_changed_by ='SATCOM'  
from msgboard WITH(INDEX(IX_msgboard_type))
where mb_type = 210 

image

Total Cost: 0.03651106

  • Notice that the Cost for the query with index hint is higher than the one without! If the table had more rows, we would have seen a bigger difference in cost Smile.
  • Additionally, a Table Spool operator is now part of the execution plan. The table spool operator caches the row to be updated (In tempdb) and then feeds it to the Clustered Index update operator.
  • This proves that SQL Server optimizer was actually choosing the cheapest and best plan it could come up with.

The reason SQL Server picked the clustered index is because the query is updating the columns mb_status =10 and mb_changed_by. This column mb_status is not the leading column in the NC index IX_msgboard_type, but the 2nd column. Why does this matter? It does because, the statistics for this NC index would be built on the leading column which is mb_type. Here is how the histogram looks.

dbcc show_statistics ('dbo.msgboard','IX_msgboard_type')

image

To avoid a bookmark lookup (Key Lookup operator from SQL 2005 onwards), SQL Server decided to use the clustered index since it covers all the columns in the table. If SQL Server were to use the NC index, it would be do a lookup or cache the results (ahem, Table Spool operator!) and then find the matching row in the clustered index to actually update the 2 columns. This is an additional cost, and this is why SQL decided not to go with this plan.

Coming to the 2nd question of why a Scan? It might surprise some of you, but a Seek does not imply good performance always. My favourite answer – “It depends”. In this case, the table only had 1001 rows and of which only 1 row (mb_id=1001) qualified for the update. When using the CI Scan, SQL Server was applying the filter on mb_type=210 and that returned 1 row back. A Clustered Index Seek in this case would not make any positive difference in query execution time.

OBJECT:([test].[dbo].[msgboard].[pk_msgboard]), WHERE:([test].[dbo].[msgboard].[mb_type]=(210)) ORDERED

To test this, I added the FORCESEEK table hint to this update query and as expected the query did not run faster.

update msgboard set mb_status =10, mb_changed_by ='SATCOM'  
from msgboard WITH (FORCESEEK) where mb_type = 210 

In fact it has the same cost as the query with index hint –> 0.03651106
 

Solution

Instead of trying to second guess the query optimizer and trying to tweak the execution plan it is producing, this issue needs a different approach –> back to basics. Based on the query design, if the column being updated mb_status was “covered” then it would help. In this case that was also also take care of, since IX_msgboard_type also includes the column mb_status, though as a non-key column. This is an important thing and this was why the “included columns” feature was introduced.

So, I modified the index IX_msgboard_type to remove the column mb_status. Instead, I added the column back as an included column

DROP INDEX [IX_msgboard_type] ON [dbo].[msgboard] WITH ( ONLINE = OFF )
GO
CREATE NONCLUSTERED INDEX [IX_msgboard_type] ON [dbo].[msgboard] 
(
    [mb_type] ASC
)
INCLUDE ( [mb_status]) WITH (STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

The biggest advantage of having an included column in a non-clustered index is that, the query optimizer can locate all the column values within the index. The base table or clustered index data is not accessed resulting in fewer disk I/O operations and hence faster query execution. Also, the index size is lesser since non-key columns are only added to the leaf level. Many advantages indeed.

Here is how the update query plan looked after added mb_status as an included column.

image

My oh my, this plan looks so simple, clean and yeah faster. The actual cost is now 0.0132843, which is a big improvement over the original plan of 0.03651106

So we have achieved our original goals of

a) Getting SQL Server to use the Non-clustered index IX_msgboard_type, instead of the clustered index.

b) Getting an Seek operation instead of a Scan.

c) Query cost is cheaper.



So, hope everyone understood how beneficial included columns are on non-clustered indexes. For more information on guidelines and restriction with included columns refer this MSDN article.

That’s all for now folks. Keep tuned to this site for more.

-TheSQLDude (Sudarshan)

Posted in Performance, Query Optimization | Tagged: , , , , , , , , , , | 6 Comments »