The SQL Dude!

On-Disk & In-Memory Pages that teach you the SQL way of life!

Archive for the ‘Startup’ Category

Guide to Troubleshooting SQL Server Service startup issues

Posted by Sudarshan Narasimhan on August 9, 2011


I did post on this same topic on MSDN SQLServerFAQ and I am re-posting this here for easy reference. This is a handy guide for every SQL DBA, as they would have come across this issue at one point or another in their DBA life. Read on…


This is one of those challenging situations when customers call us at times. I can understand the gravity of the situation – your boss is behind your back, your business is down, your application is down, your website cannot accept orders or your phone is ringing off the hook with calls from the helpdesk. Don’t worry; I will guide you through some basic steps & actions that you can take to make some sense amidst all this madness.

clip_image002[4] Warning:

This blog is only a basic guide and by no means covers all possible troubleshooting techniques and meant only as a starting point.

 

Nothing in life is to be feared. It is only to be understood. – Marie Curie

clip_image004[4]

Startup Options
Have you ever called SQL CSS for a startup issue and seen the engineer use some cool trace flags or parameters to get SQL started. Well, it’s all documented (well, most of it that’s worth knowing)

Using the SQL Server Service Startup Options
http://msdn.microsoft.com/en-us/library/ms190737.aspx

The most common ones by far that I use on SQL startup issues are:-

S.no

Parameter

Description

1.

-c

Start as a console application, not as a service.

2.

-m

Tries to start the SQL service in single user mode, i.e. only a single user can connect. This single user connection can be either  a sysadmin or a regular user connection

3.

-f

Tries to start the SQL service in Minimal configuration mode. This implicitly puts SQL Server in single-user mode and this also means only the system databases master, model, tempdb & mssqlsystemresource are recovered and started.

4.

-T XXXX

Tries to start the SQL Server should be started with the specified trace flag which follows after –T. Again this is case sensitive.

5.

-g

Specifies the number of megabytes (MB) of memory that SQL Server leaves available for memory allocations within the SQL Server process, but outside the SQL Server buffer pool. The default value for this is 256MB.

6.

-m”ClientApp Name”

You can limit the connections to the specified client application. For example, -m"SQLCMD" limits connections to a single connection and that connection must identify itself as the SQLCMD client program. You can use this option when you are starting SQL Server in single-user mode and an unknown client application is taking the only available connection. Cool option J

7.

-k  123

Limits the number of checkpoint I/O requests per second to the value specified e.g. 123 MB/sec. Refer http://support.microsoft.com/kb/929240 for more info.

 

Trace Flags
For more information some common trace flags see here
http://msdn.microsoft.com/en-us/library/ms188396.aspx. Few of the common ones that come handy during startup issues are:

1.       3608 Starts SQL and recovers only the master database.

2.       4010 Allows only shared memory connections to the SQL Server. Meaning, you will only be able to connect from the server machine itself. Client connections over TCP/IP or named pipes will not happen.

3.       4606 Disables password policy check during server startup.

4.       4022 Skips launching automatic stored procedures when SQL Server starts. This is equivalent to setting "scan for startup procs" configuration option to 0.

 

Here are the steps to start SQL Server from command prompt:-

1.      Right-click on the SQL Server service name in services.msc

 

clip_image006[4]

2.  Copy the path that you see here.

3.  Open a Command Prompt window and navigate to the path from step2.

E:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Binn>sqlservr.exe -c -m -sSQL2005

 

 

clip_image008[8] Note:

If you are using Windows Server 2008 or above, you need to open an elevated command prompt by launching cmd.exe with “Run as Administrator” option.

A common mistake that people sometime make is to try and start SQL using sqlservr.exe and you get this error,

clip_image010[4]

Text: “Your SQL Server installation is either corrupt or has been tampered with. Please uninstall then re-run setup to correct this problem”

Okay, you see this message and you are thinking oh no, there goes my dinner plans!
Timeout!
The above message can be misleading at times. You will get the above error if you try to start SQL server from command prompt for a named instance and do not specify an instance name. Without the -s switch the default instance will try to start, but using the sqlservr.exe file from another instance binn folder. You will also get this error if you specify an invalid parameter to sqlservr.exe. So, the installation is not corrupt after all (Dinner plans back on?)

Here’s how to start a named instance,
sqlservr.exe -c -s INSTNAME

Here’s how to start a default instance,
sqlservr.exe -c

Here’s how to start a named instance recovering only master database
sqlservr.exe -c -sMATRIX –T3608

clip_image008[9] Note:

Only specify the instance name and not the servername\InstanceName. E.g. If your machine name is YODA and your instance is called MATRIX, you only specify MATRIX. Also the “-s” is case sensitive. It does not matter if there is/isn’t a blank space after the “-s” parameter.


Some Common Scenarios for Startup Failures

Scenario 1

Windows could not start the SQL Server (MSSQLSERVER) on Local Computer. For more information, review the System Event Log.

OR

“The service failed to respond in a timely fashion”.

OR

“The service failed to start”.

1.      Locate the SQL Errorlog as that is the 1st place to look if a SQL service is not starting.

2.   This can be done by looking up the -e startup parameter value for the SQL Server instance in question.


clip_image012[4]

 

3.  Verify that the files pointed to by -d (master data file location),  -e (Errorlog location) and  -l (master log file location) actually exist and have the correct file paths and that the SQL service startup account has proper permissions. Alternately you can also get this from the registry by looking at the values SQLArg0, SQLArg1 and SQLArg2 in the following locations.

SQL 2000
Default: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSSQLServer\MSSQLServer\Parameters
Named: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\INSTANCENAME\MSSQLServer\Parameters

SQL 2005

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\Parameters

SQL 2008/R2
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQLServer\Parameters

 

4.  Once you have located the errorlog open it in notepad and look for any errors. An example would be like this,

2010-11-20 07:50:58.250 Server Error: 17113, Severity: 16, State: 1.
2010-11-20 07:50:58.250 Server Error 2(The system cannot find the file specified.) occurred while opening file ‘C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\master.mdf’ to obtain configuration information at startup. An invalid startup option might have caused the error. Verify your startup options, and correct or remove them if necessary.

The above message clearly indicates that the master.mdf was not found in the location it was expecting it to be found.

 

Scenario 2

You can also look into the Windows Application log for any events related to SQL Server.

initerrlog: Could not open error log file ‘C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\LOG1\ERRORLOG’. Operating system error = 3(error not found).

The above error indicates that SQL could not find the errorlog file in the above location. What does this mean? This means that since SQL could not find the errorlog it has no place to log messages, hence it did not start.

When you are faced with this error, then if you look into the errorlog location, you will notice log files whose date/time stamp is not current. Do not make the mistake of opening these files and troubleshooting based on messages that are not for the current startup failure. This is where looking into the application log gives you the current entries.

I hope cleared up any confusion you might have had regarding this rule, the warning status and the simple logic behind the rule. As always stay tuned for more SQL tips…

Scenario 3

2011-04-16 07:52:49.32 spid5s      Error: 17204, Severity: 16, State: 1.
2011-04-16 07:52:49.32 spid5s      FCB::Open failed: Could not open file C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\mastlog.ldf for file number 2.  OS error: 32(error not found).

OS Error 32 translates to “The process cannot access the file because it is being used by another process.” How did I know this? Simple. You can find the text for any windows error by running this from a command prompt à net helpmsg XXX, where XXX is the error number

In the above scenario, some other process/application has a lock on master.mdf and this isn’t allowing SQL to take access, hence the service fails to start. To find out which process it is you can use either Process Explorer or Handle.exe from sysinternals. The usual culprits are Anti-Virus or Anti-Malware services. Once you identify the process, after stopping that, attempt to start SQL again. If it fails, verify from the errorlog that the error code is still 32 and has not changed. Most people miss this trick and assume the error is the same after making some changes. Don’t take it for granted, since the error might have changed but the service still doesn’t start. This indicates that the changes done have been successful.

 

Some Common Causes for SQL Service startup failures

1.                   Access denied on the folders for the SQL Server Service Account, specifically on the DATA folder which contains the SQL Server system databases.

2.                   Due to some other processes like anti-virus holding a lock on the SQL database files.

3.                   Insufficient permissions granted to the SQL Server Service account.

clip_image008[10] Important:

Always make SQL service account changes via the SQL Configuration manager since only this tool sets the proper permissions or ACL’s to the new service account. Services.msc does not set ACL’s.

 

4.                   Server side protocols for SQL not set correctly or disabled.

5.                   Critical system database files not found due to either accidental deletion of files or disk failures.

6.                   System database files having inconsistencies preventing SQL Server startup.

7.                   Password of the service account was changed but not updated on the server that has the SQL instance installed.

8.                   Startup parameters have incorrect file path locations.

9.                   The SQL Server service is disabled in Service Control Manager.



clip_image014[4] Critical:

If you are not a SQL expert and have followed the steps above but still stuck, my personal advice would be to call SQL Support rather than doing something you will regret later. Better safe, than sorry, eh?

Additional Resources

Error Messages of SQL Server 2005 Start Up Failure
http://blogs.msdn.com/b/sql_protocols/archive/2006/04/28/585835.aspx

An error occurred during encryption
http://blogs.msdn.com/b/blakhani/archive/2009/11/24/sql-server-2005-express-setup-failure-with-error-an-error-occurred-during-encryption.aspx

 

Posted in Startup | Tagged: , , , , , , | Leave a Comment »

SQL Server service does not start after enabling SSL encryption

Posted by Sudarshan Narasimhan on August 3, 2011


I recently had a customer who came up to me with a SQL Service start-up issue. This issue started after they had enabled SSL encryption using a personal certificate. Now, before I begin let me confirm a couple of things :-

  1. The correct SSL certificate has been chosen in SQL Configuration Manager.
  2. The ForceEncryption flag is set to Yes.

There are some forum posts on this issue and I don’t entirely agree with the solutions proposed there, because they did not actually solve the real problem but worked around it without understanding how/why the problem occurred, and how to get a complete solution.

http://social.msdn.microsoft.com/Forums/en-US/sqlexpress/thread/b308d9c8-6aba-4b3b-9b32-f919816d1be2/

http://social.msdn.microsoft.com/Forums/en-US/sqldatabaseengine/thread/63c15cb5-0d12-4ea8-bf84-e3ea8a42a866/

I am going to be bold and say this; please do not follow the above posts since they are workarounds. Most people in enterprise environments cannot run SQL Service under the Local System account, which is what the above posts are asking you do to, yikes!!!

As always, the first place to check when a SQL Service does not start is the SQL ERRORLOG (yes, ERRORLOG and NOT the EVENTLOGS – I said 1st place).

For a complete step-by-step guide to approach SQL Service start-up issues, you can read the blog post I wrote on MSDN SQL Server FAQ –>
HERE

Here is what I found in the ERRORLOG,

2011-06-08 12:35:30.56 Server Unable to load user-specified certificate [Cert Hash(sha1) “XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX“]. The server will not accept a connection. You should verify that the certificate is correctly installed. See “Configuring Certificate for Use by SSL” in Books Online.

2011-06-08 12:35:30.59 Server Error: 17182, Severity: 16, State: 1.
2010-06-08 12:35:30.59 Server TDSSNIClient initialization failed with error 0x80092004, status code 0x80. Reason: Unable to initialize SSL support. Cannot find object or property.

2011-06-08 12:35:30.62 Server Error: 17182, Severity: 16, State: 1.
2011-06-08 12:35:30.62 Server TDSSNIClient initialization failed with error 0x80092004, status code 0x1. Reason: Initialization failed with an infrastructure error. Check for previous errors. Cannot find object or property.

So, it appears the certificate chosen for SSL encryption has some issues. Here are the things I checked next:-
1. The certificate itself is valid and has not expired (using certmgr.msc).
2. The certificate hash matched that of the certificate associated with the instance but it wasn’t loading.

The certificate appears to be fine. Well, what next?

The way SQL Server loads the certificate happens through the credentials of the SQL Server service account defined for this instance. So, I started to check what the SQLSvc account was and whether it had permissions to access the certificate. At this point I cam across this forum post for a similar issue

http://social.msdn.microsoft.com/Forums/en/sqlexpress/thread/db68b3bf-33a8-42d6-8dac-f7229a2f9cde

The solution listed there was to ensure that the SQL service account had a valid user profile on the server machine. This profile can sometimes be missing if the service account was changed to a new account, which doesn’t yet have a user profile created. You can verify the user profile on any machine by going to
My Computer –> Properties –> Advanced –> Settings (under User Profiles section).

image

Solution #1 (as applicable)

Identify the SQL service account and log on to the server machine that is running SQL Server, with the service account. During logon, Windows will create a new user profile for the service account.

Unfortunately, this wasn’t the scenario in my case. Another thing I have to caution people is regarding service account change. When changing the SQL Svc account ALWAYS ALWAYS use the SQL Server Configuration Manager, and NEVER do it through services.msc. Here are some reasons why:-

1. Permissions & ACL’s of SQL Server registry keys are done only by SSCM
2. SQL Server User Groups are updated with the new service account only by SSCM.
3. WMI namespace permissions are only set through SSCM.

So, if you changed the service account through services.msc, it is possible that the new service account does not have permissions to access the certificate. The certificate hash is stored in the following registry key. If the new service account cannot access it, then you “can” get the error above, since SQL cannot load the certificate due to insufficient privileges.

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10.SQL2008\MSSQLServer\SuperSocketNetLib\Certificate

I started to check if this was happening in my case and found that the SQL Service account was not even present in the local group SQLServerMSSQLUser$[Computer_Name]$[Instance_Name]. This confirms that the service account change wasn’t done through SQL Configuration Manager.

This means the SQL Service account does not have permissions to open the private key for the certificate. i.e. SQL Server failed to load this specific certificate due to insufficient permissions. Once we give the service account permission to read the private key, the SQL Server should start up without a problem.

Here is what I did to resolve it,

Solution #2

1. Identify the SQL service account (could be a domain account or any of the built-in accounts like LocalSystem/LocalService/NetworkService).

2. If the service account was set properly using SQL Configuration Manager, then it should have been added to the SQL Security Groups. Look for this on the local machine groups. e.g. SQLServerMSSQLUser$[Computer_Name]$[Instance_Name]

3. Open the Microsoft Management Console (MMC) by click Start -> Run, entering mmc and pressing Enter.

4. Add the Certificates snap-in by clicking File -> Add/Remove Snap-in… and double clicking the Certificates item.

Note:
Select computer account and Local computer in the two pages on the wizard that appears.
Click OK.

5. Expand Certificates (Local Computer) -> Personal -> Certificates and find the SSL certificate you imported or created.

6. Right click on the imported certificate (the one you selected in the SQL Server Configuration Manager) and click All Tasks -> Manage Private Keys.

Note:
You will only see this option for SSL certificates imported from a certifying authority. E.g. you will not see this for EFS, RDP certificates

7. Click the Add button under the Group or user names list box.

8. Enter the SQL service security group that you identified in step #2 and click OK.

You can also directly add the service account itself here, but for any future changes you need to repeat these steps to add that individual account. Instead its better to add the security group, since the service account will always be present here if account change was done using SSCM.

Now, when I started the SQL service you can double-check that the certificate is being loaded and used by looking for this entry in the errorlog. In my case, it started fine and SSL encryption was working.

Open-mouthed smile

2011-06-08 12:48:47.88 Server The certificate [Cert Hash(sha1) “XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX“] was successfully loaded for encryption.


Problem Solved!

Now, to check if the connection is actually encrypted you can use Netmon or Wireshark to check the network packet. Refer to the screenshot posted in this blog on how to do that. An easier way for DBA’s to verify this is to use the DMV sys.dm_exec_connections, look at the encrypt_option column. This DMV also tells you if the connection is using Kerberos/NTLM/SQL Authentication -> auth_scheme column has this value.

Hope this saves you some time.

Posted in Startup | Tagged: , , , , , , , | 17 Comments »