Using Integrated Windows Authentication over a Google Cloud load balancer
Modern web applications typically use OAuth or OpenID Connect to authenticate users, but older intranet applications often still rely on Integrated Windows Authentication (IWA) to deliver a single sign-on experience for users.
The authentication protocol used for IWA is not always clear: IWA can use NTLM, Kerberos, or both, depending on how the application is configured. This ambiguity can become a problem when we try to migrate the application to Google Cloud and want to deploy it behind a load balancer. Choosing a load balancer that isn’t compatible with the specific IWA configuration can cause IWA to become unreliable or break, sometimes in subtle ways, and it can even expose us to session hijacking threats.
Let’s take a closer look at how IWA works, and how load balancers can affect its operation.
Protocol negotiation
IWA lets clients and servers choose the protocol they want to use. This process is typically referred to as negotiation and the basic idea is:
- The server tells the client which authentication protocols it supports.
- The client looks at the list and picks its favorite protocol.
There are actually two ways in which web servers like IIS support negotiation:
- Using multiple authentication schemes: we can configure IIS to offer more than one authentication scheme. When the server returns an HTTP 401 response to a client, it then includes multiple WWW-Authenticate headers. For example:

      HTTP/1.1 401 Unauthorized
      Content-Type: text/html
      WWW-Authenticate: NTLM
      WWW-Authenticate: Negotiate
      Date: …
      Content-Length: …
      Connection: keep-alive

      <!DOCTYPE html …

  To a client, this signals that the server supports both NTLM and Negotiate, and that the client is free to choose. In IIS, this works by enabling multiple providers.
- Using the Negotiate authentication scheme: we can configure IIS to use the Negotiate or Nego2 authentication scheme. This causes clients to negotiate a protocol using the SPNEGO protocol. In IIS, this works by enabling the Negotiate provider.
There is no dedicated authentication scheme for Kerberos. For Kerberos, we must use the Negotiate provider. But we can constrain Negotiate so that it only offers clients a single option: to use Kerberos.
In IIS, we do that by enabling the Negotiate:Kerberos provider.
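To check which schemes a server actually offers, we can look at the WWW-Authenticate headers of its 401 response. The following is a minimal sketch in Python; the function name `offered_schemes` is hypothetical, and the sketch assumes that each header value carries a single challenge, as in the example response above.

```python
def offered_schemes(www_authenticate_values):
    """Return the scheme names found in a list of WWW-Authenticate header values.

    Assumption: one challenge per header value (for example "NTLM" or
    "Negotiate <token>"), as in the example response above.
    """
    schemes = []
    for value in www_authenticate_values:
        parts = value.strip().split(None, 1)
        if parts:
            # The scheme is the first whitespace-delimited word of the challenge.
            schemes.append(parts[0])
    return schemes


if __name__ == "__main__":
    # Header values taken from the example 401 response.
    print(offered_schemes(["NTLM", "Negotiate"]))
```

If the list contains NTLM or an unconstrained Negotiate, clients might end up using NTLM, which matters for the load balancer choice discussed in the following sections.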
NTLM
NTLM doesn’t have a great reputation, and that’s mostly for good reason. But dropping NTLM support can be difficult in scenarios such as the following:
- We have to support clients that don’t support Kerberos, or aren’t configured to allow Kerberos. Such clients might include legacy applications or embedded devices.
- We have to support clients that don’t have network connectivity to an Active Directory domain controller and are therefore unable to obtain Kerberos tickets.
- The server doesn’t have a DNS name, or we have to support clients that access the server in ways that prevent them from determining the server’s service principal name (SPN).
- Users authenticate using local Windows accounts instead of Active Directory user accounts.
NTLM is a challenge-response protocol and it takes a client at least 2-3 HTTP requests to complete an authentication handshake and receive the response payload from the server. These requests have to be exchanged over the same HTTP connection, after which the connection is considered authenticated.
An example NTLM authentication handshake proceeds as follows:
- The client sends an unauthenticated request to the server.
- The server responds with an HTTP 401 status, indicating that it supports the NTLM authentication scheme.
- The client prompts the user to enter credentials or uses the user’s existing credentials to retry the request, adding an NTLM NEGOTIATE_MESSAGE in the Authorization header.
- The server uses the data from the Authorization header, constructs an authentication challenge, and returns a CHALLENGE_MESSAGE in the WWW-Authenticate header.
- The client processes the challenge and retries the request, adding an NTLM AUTHENTICATE_MESSAGE in the Authorization header.
- The server verifies the data from the Authorization header and, if successful, processes the request.
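When debugging a handshake like this from a network trace, it helps to know which of the three NTLM messages a header carries. The following sketch decodes the Base64 token and reads the message type field that follows the `NTLMSSP` signature. The function name `ntlm_message_type` is hypothetical, and the sketch only inspects the first 12 bytes of the token; it doesn’t validate the rest of the message.

```python
import base64
import struct

NTLM_SIGNATURE = b"NTLMSSP\x00"
MESSAGE_TYPES = {
    1: "NEGOTIATE_MESSAGE",
    2: "CHALLENGE_MESSAGE",
    3: "AUTHENTICATE_MESSAGE",
}


def ntlm_message_type(header_value):
    """Return the NTLM message name carried in an Authorization or
    WWW-Authenticate header value, or None if there is no NTLM token."""
    scheme, _, token = header_value.strip().partition(" ")
    if scheme.upper() not in ("NTLM", "NEGOTIATE") or not token.strip():
        return None
    try:
        raw = base64.b64decode(token.strip(), validate=True)
    except ValueError:
        return None  # not valid Base64
    if len(raw) < 12 or not raw.startswith(NTLM_SIGNATURE):
        return None
    # The message type is a little-endian 32-bit integer after the signature.
    (message_type,) = struct.unpack_from("<I", raw, 8)
    return MESSAGE_TYPES.get(message_type)


if __name__ == "__main__":
    # Build a truncated, illustrative token instead of using a real one.
    token = base64.b64encode(NTLM_SIGNATURE + struct.pack("<I", 1)).decode()
    print(ntlm_message_type("NTLM " + token))
```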
NTLM authenticates HTTP connections, not individual HTTP requests. This is crucial because it’s at odds with the fact that HTTP is supposed to be a stateless protocol. And it becomes a problem once we place an HTTP (layer 7) load balancer between the client and server:
HTTP load balancers like Google Cloud’s Application Load Balancer take advantage of HTTP’s assumed statelessness and handle each request individually – regardless of whether the requests arrived on the same TCP connection or on different connections.
For example, given three HTTP requests, a load balancer might choose to send two of them to one server, and one to another.
Application Load Balancers let us disable this behavior by configuring session affinity. But for NTLM, that’s not enough: Application Load Balancers, like most other HTTP load balancers, also implement connection pooling to reduce the number of connections to backend servers. Even if a client’s requests are all sent to the same backend server, they might arrive on different (pooled) TCP connections.
For NTLM, the likely result is that when we try to use it over an Application Load Balancer, it sometimes works and sometimes doesn’t, which can be frustrating to debug.
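The effect of connection pooling can be illustrated with a toy simulation. This is a sketch under simplifying assumptions, not a model of how any particular load balancer schedules requests: the `Backend` class and `run_handshake` function are hypothetical, the backend tracks handshake state per connection ID, and the pool hands out connections in strict round-robin order.

```python
import itertools


class Backend:
    """Toy backend that tracks NTLM handshake progress per TCP connection."""

    def __init__(self):
        self.state = {}  # connection id -> "challenged" or "authenticated"

    def handle(self, conn_id, message):
        state = self.state.get(conn_id)
        if state == "authenticated":
            return 200
        if message == "NEGOTIATE":
            self.state[conn_id] = "challenged"
            return 401  # would carry the CHALLENGE_MESSAGE
        if message == "AUTHENTICATE" and state == "challenged":
            self.state[conn_id] = "authenticated"
            return 200
        # Unauthenticated request, or AUTHENTICATE on a connection that
        # never saw the matching NEGOTIATE.
        return 401


def run_handshake(pool_size):
    """Send the three handshake requests over a round-robin connection pool."""
    backend = Backend()
    pool = itertools.cycle(range(pool_size))
    return [
        backend.handle(next(pool), message)
        for message in (None, "NEGOTIATE", "AUTHENTICATE")
    ]


if __name__ == "__main__":
    print(run_handshake(pool_size=1))  # [401, 401, 200]
    print(run_handshake(pool_size=2))  # [401, 401, 401]
```

With a single connection, the handshake completes. With two pooled connections, the AUTHENTICATE_MESSAGE arrives on a connection that never saw the NEGOTIATE_MESSAGE, so the backend rejects it. Real pools aren’t strictly round-robin, which is why failures tend to be intermittent rather than consistent.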
If our application requires NTLM, the only solution is to not use an Application Load Balancer, and to use a network load balancer instead:
Load balancer | Compatible with NTLM |
---|---|
Application Load Balancers | No |
Passthrough Network Load Balancers | Yes |
Proxy Network Load Balancers | Yes |
Kerberos
From a security perspective, there are many reasons to prefer Kerberos over NTLM. However, for Kerberos to work reliably, clients and servers must meet certain prerequisites, including the following:
- Servers must be joined to an Active Directory domain.
- Servers must have DNS names that let clients determine a server’s SPN. Alternatively, client and servers must be configured to use IP address-based SPNs.
- All servers behind a load balancer must use a common Active Directory service account for Kerberos authentication, and the service account must have a SPN that matches the DNS name of the load balancer.
- Clients must have network connectivity to the domain controller so that they can obtain Kerberos tickets.
- Web browsers must be configured to allow IWA for the domain name of the load balancer. For Chrome, we can enable IWA by setting a policy.
- Users must use Active Directory accounts to authenticate, as opposed to local Windows user accounts.
To let clients authenticate by using Kerberos, we must use the Negotiate authentication scheme. The client and server then use the SPNEGO protocol to negotiate an authentication mechanism to use. This negotiation works by exchanging negotiation tokens in the Authorization and WWW-Authenticate headers:
- The client sends an unauthenticated request to the server.
- The server responds with an HTTP 401 status, indicating that it supports the Negotiate authentication scheme.
- The client retries the request, adding a negTokenInit token in the Authorization header. This token contains the list of authentication mechanisms supported by the client, such as Kerberos, NTLM, and other mechanisms.
  To avoid an extra round trip, the client selects its preferred authentication mechanism and includes data for this mechanism in the negTokenInit token. This is called the optimistic mechanism token and in the case of Kerberos, it includes a Kerberos KRB_AP_REQ message with a service ticket.
- If the client uses Kerberos for its optimistic mechanism token and the server allows Kerberos, then the server verifies the service ticket. If successful, the server authenticates and processes the request and returns a negTokenResp token in the WWW-Authenticate header with a KRB_AP_REP message.
  In other cases, such as when the client uses NTLM for its optimistic mechanism token, additional round trips might be required to complete the authentication handshake. The server then responds with an HTTP 401 status and a negTokenResp token that contains additional information for the client to resume the authentication handshake.
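To see which kind of token a Negotiate header carries, we can inspect the first bytes of the decoded token. The following is a rough sketch, not a full ASN.1 parser. The function name `classify_negotiate_token` and its return labels are hypothetical. The sketch relies on three observations: an initial SPNEGO token is wrapped in GSS-API framing (tag 0x60) followed by the SPNEGO OID 1.3.6.1.5.5.2, a negTokenResp starts with the context tag 0xa1, and some clients send a raw NTLM message in a Negotiate header.

```python
import base64

SPNEGO_OID = bytes.fromhex("06062b0601050502")  # DER encoding of 1.3.6.1.5.5.2
NTLM_SIGNATURE = b"NTLMSSP\x00"


def classify_negotiate_token(header_value):
    """Roughly classify the token in a 'Negotiate <base64>' header value."""
    scheme, _, token = header_value.strip().partition(" ")
    if scheme.lower() != "negotiate" or not token.strip():
        return "no token"
    try:
        raw = base64.b64decode(token.strip(), validate=True)
    except ValueError:
        return "unknown"
    if raw.startswith(NTLM_SIGNATURE):
        return "raw NTLM"
    # GSS-API framing: 0x60, a DER length (1 to 5 bytes), then the mechanism OID.
    if raw[:1] == b"\x60" and SPNEGO_OID in raw[:16]:
        return "negTokenInit"
    if raw[:1] == b"\xa1":
        return "negTokenResp"
    return "unknown"


if __name__ == "__main__":
    # Truncated, illustrative token: GSS-API header plus the SPNEGO OID only.
    sample = base64.b64encode(b"\x60\x0a" + SPNEGO_OID + b"\xa0\x00").decode()
    print(classify_negotiate_token("Negotiate " + sample))
```

A "raw NTLM" result, or a negTokenInit whose optimistic mechanism token is NTLM, indicates that the handshake needs additional round trips on the same connection.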
We can configure the server so that the only possible negotiation outcome is that the client and server use Kerberos. The authentication handshake is then stateless and doesn’t require any additional round trips, making such a configuration compatible with connection pooling and all load balancer types:
Load balancer | Compatible with Kerberos |
---|---|
Application Load Balancers | Yes |
Passthrough Network Load Balancers | Yes |
Proxy Network Load Balancers | Yes |
In contrast, if we configure the server to allow a fallback to other protocols, we must assume that the authentication handshake is stateful as in the case of NTLM, and we must not use an Application Load Balancer.
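The rules from the NTLM and Kerberos sections can be summarized as a small decision helper. This is an illustrative sketch: the function name `compatible_load_balancers` is hypothetical, the provider names are those used in this article, and any provider other than Negotiate:Kerberos is conservatively treated as requiring a stateful handshake.

```python
NETWORK_LBS = [
    "Passthrough Network Load Balancer",
    "Proxy Network Load Balancer",
]
ALL_LBS = ["Application Load Balancer"] + NETWORK_LBS


def compatible_load_balancers(iis_providers):
    """Return the load balancer types that are safe for the enabled IIS providers."""
    providers = {p.strip().lower() for p in iis_providers if p.strip()}
    if not providers:
        raise ValueError("at least one authentication provider is required")
    if providers == {"negotiate:kerberos"}:
        # Kerberos-only: the handshake is stateless, so pooling is harmless.
        return list(ALL_LBS)
    # NTLM, or Negotiate with a possible fallback: assume a stateful handshake.
    return list(NETWORK_LBS)


if __name__ == "__main__":
    print(compatible_load_balancers(["Negotiate:Kerberos"]))
    print(compatible_load_balancers(["Negotiate", "NTLM"]))
```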
Implementation patterns
There are two patterns for using a Google Cloud load balancer for web applications that use Integrated Windows Authentication, and they each have their advantages and disadvantages.
1. Using a Network Load Balancer to support NTLM and Negotiate
In this pattern, we use a (Passthrough or Proxy) Network Load Balancer to ensure compatibility with NTLM and allow clients to choose between NTLM and Kerberos. We can configure IIS to support both NTLM and Kerberos by doing one of the following:
- Enable the Negotiate provider. This approach allows clients to choose between Kerberos and NTLM, but only works for clients that support the Negotiate authentication scheme.
- Enable the NTLM and Negotiate providers. This approach provides a second way for clients to fall back to NTLM and works for clients that don’t support the Negotiate authentication scheme.
Advantages of this pattern:
- Deploying this pattern doesn’t require any changes to client configuration.
- If clients don’t meet the requirements for using Kerberos, they can fall back to using NTLM.
Disadvantages of this pattern:
- To use TLS, we must terminate TLS on the backend server or use an external SSL proxy load balancer.
- We can’t take advantage of Cloud Armor and other features that are only available for Application Load Balancers.
- Load might be distributed unevenly across backend servers because the load balancing operates on the basis of TCP connections, not individual HTTP requests.
- The additional HTTP requests required to perform NTLM handshakes can have a negative impact on performance if clients frequently open new connections.
- We’re exposed to the security shortcomings of NTLM.
2. Using an Application Load Balancer and enforcing Kerberos
In this pattern, we use an Application Load Balancer and force clients to use Kerberos. We can configure IIS to only support Kerberos and reject NTLM by doing all of the following:
- Enable the Negotiate:Kerberos provider. Unlike the Negotiate provider, Negotiate:Kerberos doesn’t permit NTLM.
- If necessary, adjust the token binding settings so that they allow TLS termination on the load balancer.
Advantages of this pattern:
- We can use the Application Load Balancer to terminate TLS.
- We can take advantage of Cloud Armor and other features that are only available for Application Load Balancers.
- Kerberos requires fewer HTTP requests, which can result in better overall performance.
- We’re not exposed to the security shortcomings of NTLM.
Disadvantages of this pattern:
- We must configure clients and servers to meet the requirements for using Kerberos, resulting in a more complex configuration.
- Clients that don’t support Kerberos can’t access the application. Web browsers that support Kerberos but aren’t configured to allow its use might present the user with an endless loop of login prompts.
- Kubernetes Engine currently doesn’t support Application Load Balancers for Windows Server node pools.
- If we accidentally misconfigure the server to allow NTLM in addition to Kerberos, IWA might break and we might expose users to session hijacking threats.