Friday, September 5, 2008

“Cannot Generate SSPI Context” error message, more comments for SQL Server

The “Cannot generate SSPI context” issue is described in general by http://support.microsoft.com/?id=811889. In this post I will discuss one daunting case of the “Cannot generate SSPI context” error message that appears when a connection to SQL Server fails. In most related cases, customers report this issue as "I can connect to my local SQL Server, but once I connect to my network, I can't connect to my local SQL Server". The issue is usually reported against MSDE and SQL Express, but it can actually happen with any SKU of SQL Server that supports NT integrated authentication, including SQL Server 2000 and SQL Server 2005. The error messages for the failed connection discussed here are:




[SNAC] “[SQL Native Client]SQL Network Interfaces: The Local Security Authority cannot be contacted.[SQL Native Client]Cannot generate SSPI context”
[MDAC] “Cannot generate SSPI context”
[.Net1.0/2.0] “Failed System.Data.SqlClient.SqlException: Cannot generate SSPI context”



It can happen when all of the following are true:
(1) The machine hosting SQL Server is connected to a network, such as a home network or a dial-up connection, but is disconnected from its domain.
(2) The OS of the hosting machine is Windows XP or Windows 2000, not Windows 2003.
(3) The connection is to a local SQL Server.
(4) The connection configuration causes the network library to choose the TCP/IP provider.


A scenario that meets all of (1), (2) and (3) looks like an extreme corner case. In reality, however, it happens quite often if the hosting machine is a laptop computer. One solution, of course, is to avoid condition (1) by connecting to your corporate domain through VPN or by disconnecting from the network completely. The reason why these work is subtle, and I’ll discuss it later. From the user’s perspective, however, connecting over VPN or disconnecting from the network might in many cases prevent you from accessing some valuable resources, so I want to first discuss solutions that do not depend on (1).



In most cases, users do not explicitly request TCP/IP as the connection provider; connection strings of the form “.\<instance>”, “(local)\<instance>” or “<servername>\<instancename>” are examples. Users might wonder why the network library chooses the TCP/IP provider instead of the Shared Memory provider when the connection string is not prefixed with “tcp” and the server is local. The simple answer is that this can happen if the TCP/IP provider is ahead of the other providers in the client protocol order list, and/or the local server is not listening on Shared Memory and Named Pipes. As described above, only the TCP/IP provider has the issue; hence, configuring the network library not to choose TCP/IP is a solution. To do that, first, on the server side, make sure your server is listening for Shared Memory and/or Named Pipe connection requests. Then, on the client side, do one of the following, whichever you feel most comfortable with: change the protocol order list so that Shared Memory and/or Named Pipes are ahead of TCP/IP; prefix your connection strings with “lpc” or “np” to force Shared Memory or Named Pipes; or use an alias that selects Named Pipes. Note that certain SKUs of SQL Server have Named Pipe connections turned off by default.
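As a rough illustration of the prefix option, the following Python sketch builds a data source string that forces a given provider. The helper name `force_protocol` is hypothetical, the instance name `SQLEXPRESS` is only an example, and the colon after the prefix is assumed from the usual `lpc:` / `np:` / `tcp:` syntax. It only manipulates strings and does not open a connection.

```python
# Hypothetical helper (not part of any SQL Server client library):
# prepend a protocol prefix to a SQL Server data source string.
KNOWN_PREFIXES = ("lpc", "np", "tcp")  # Shared Memory, Named Pipes, TCP/IP


def force_protocol(data_source: str, protocol: str) -> str:
    """Return data_source with the given prefix, replacing any existing one."""
    if protocol not in KNOWN_PREFIXES:
        raise ValueError(f"unknown protocol prefix: {protocol!r}")
    head, sep, rest = data_source.partition(":")
    # Strip an existing prefix such as "tcp:" before adding the new one.
    if sep and head.lower() in KNOWN_PREFIXES:
        data_source = rest
    return f"{protocol}:{data_source}"


if __name__ == "__main__":
    # Example instance name; substitute your own.
    print(force_protocol(r".\SQLEXPRESS", "lpc"))
```

The resulting string would go into the server or data source part of your connection string; the server must still be listening on the chosen protocol.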



In the very rare case that you really need a TCP/IP connection, the option is to use the TCP/IP loopback address, i.e. “127.0.0.1”, as your <servername>. For example, if your connection string has the form “<servername>\<instancename>” and is not prefixed with “tcp”, you can, without modifying the connection string, configure an alias with the alias name <servername>\<instancename>, the protocol TCP/IP, and the server “127.0.0.1\<instancename>” or “127.0.0.1,<port>”. Remember that the “Cannot generate SSPI context” problem described in this post only happens when connecting to a local server; thus, “127.0.0.1” is applicable. If the connection string is prefixed with “tcp”, then you do need to modify your connection string to specify “127.0.0.1” as <servername>.
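For the “tcp”-prefixed case, the rewrite of <servername> can be sketched in Python as below. The function name `to_loopback` and the sample names `MYLAPTOP` and `SQLEXPRESS` are made up for illustration, and the sketch assumes the server name ends at the first backslash (instance) or comma (port). It should only be applied to a data source that really points at the local machine.

```python
# Hypothetical helper: replace the server part of a data source with the
# TCP/IP loopback address, keeping any "tcp:" prefix, instance, or port.
LOOPBACK = "127.0.0.1"


def to_loopback(data_source: str) -> str:
    prefix = ""
    if data_source.lower().startswith("tcp:"):
        prefix, data_source = data_source[:4], data_source[4:]
    # The server name ends at the first backslash (instance) or comma (port).
    cut = len(data_source)
    for sep in ("\\", ","):
        pos = data_source.find(sep)
        if pos != -1:
            cut = min(cut, pos)
    return prefix + LOOPBACK + data_source[cut:]


if __name__ == "__main__":
    print(to_loopback(r"tcp:MYLAPTOP\SQLEXPRESS"))
    print(to_loopback("tcp:MYLAPTOP,1433"))
```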



If the workarounds described above do not fit your needs, we would like to hear more from you.



The reason we didn’t fix this subtle issue is that the limitation is rooted in the behavior of an integrated authentication module (SPNEGO) in Windows XP and Windows 2000, namely whether to fall back to NTLM if the KDC is not available when the target SPN points to the local machine. The KDC is normally part of your domain controller. In this specific case, SPNEGO chooses not to fall back, hence the connection fails. This is not a security issue, though. Readers might wonder why avoiding the TCP/IP provider solves the problem when the cause is a behavior of SPNEGO in Windows. Without going too deep, the simple answer is that only the TCP/IP provider, with the exception of loopback connections, uses SPNEGO, while the other providers use NTLM. Be aware that only the TCP/IP provider can provide the benefits of Kerberos authentication, as discussed in http://blogs.msdn.com/sql_protocols/archive/2005/10/12/479871.aspx
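The provider-to-package rule stated above can be written down as a small Python model. This is only a simplified restatement of the paragraph for a local server on Windows XP or Windows 2000, not a call into SSPI; the function names and the `kdc_reachable` flag are assumptions made for the sketch.

```python
# Simplified model of the rule described in the post; it does not call SSPI.
LOOPBACK_ADDRESSES = {"127.0.0.1"}


def auth_package(provider: str, server: str) -> str:
    """Package used for a connection, per the post's description."""
    if provider == "tcp" and server not in LOOPBACK_ADDRESSES:
        return "SPNEGO"  # negotiates Kerberos, which needs the KDC
    return "NTLM"  # Shared Memory, Named Pipes, and TCP/IP loopback


def local_login_succeeds(provider: str, server: str, kdc_reachable: bool) -> bool:
    """Assumed outcome for a local server on Windows XP/2000."""
    if auth_package(provider, server) == "SPNEGO":
        # SPNEGO does not fall back to NTLM here, so the KDC must be reachable.
        return kdc_reachable
    return True


if __name__ == "__main__":
    # "MYLAPTOP" is a made-up machine name.
    print(local_login_succeeds("tcp", "MYLAPTOP", kdc_reachable=False))
    print(local_login_succeeds("tcp", "127.0.0.1", kdc_reachable=False))
```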



Back to the questions we left open earlier: the reason that disconnecting from the network (no network media) works is that, in that case, the local <servername> is resolved to “127.0.0.1” by the Windows network layer and NTLM is used directly. When connected over VPN, the SPNEGO issue goes away because the KDC is accessible.



From the error message reported by SNAC ODBC/OLEDB, you can differentiate the issue described in this post from another case of “Cannot generate SSPI context”, in which the root cause is that, in Active Directory, the Service Principal Name (SPN) of SQL Server is registered for a domain account different from the one SQL Server is actually running under. The error message for the other case is “[SQL Native Client]SQL Network Interfaces: The target principal name is incorrect.[SQL Native Client]Cannot generate SSPI context”.
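A minimal Python sketch of telling the two cases apart from the SNAC error text might look like this. The function name and the returned labels are invented for illustration; the marker strings are the ones quoted in this post, and MDAC or .NET messages that carry no extra detail end up as "unknown".

```python
# Hypothetical classifier for SNAC error text; labels are illustrative only.
SSPI_MARKER = "cannot generate sspi context"
LSA_MARKER = "the local security authority cannot be contacted"
SPN_MARKER = "the target principal name is incorrect"


def classify_sspi_error(message: str) -> str:
    text = message.lower()
    if SSPI_MARKER not in text:
        return "not-sspi"
    if LSA_MARKER in text:
        return "kdc-unreachable"  # the case discussed in this post
    if SPN_MARKER in text:
        return "spn-mismatch"  # SPN registered for a different account
    return "unknown"  # e.g. MDAC or .NET text without details


if __name__ == "__main__":
    sample = (
        "[SQL Native Client]SQL Network Interfaces: The target principal "
        "name is incorrect.[SQL Native Client]Cannot generate SSPI context"
    )
    print(classify_sspi_error(sample))
```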

6 comments:

  1. SSPI is protocol/interface which is used by SQL Server/Windows OS for the Windows Authentication in Client Server networking

  2. John sums up telepresence from a network perspective, "Telepresence is an interactive real- time application, which means it is delay sensitive, loss sensitive and jitter sensitive. This sounds familiar: it is just like VoIP, with the one difference being that it has huge bandwidth requirements." It's that last part that makes things more difficult. No form of QoS can allocate bandwidth that doesn't exist and it doesn't have provisions to force the application to downscale the experience based on realtime metrics. ...

  3. It is not a software, it is authentication protocol

  4. our client has reported this issue with our IE based system. our system connects to the local sql server express. i've been trying to reproduce this error but could not. but i could reproduce the error if i disable the share memory protocol, leaving Named Pipes and TCP/IP enabled. has someone got an explanation on this, please?

  5. I disagree with your comments, I just don't believe all the "truths" are researched correctly. I did enjoy reading it, so keep at it!
