Understanding Lock Upgrading Deadlocks in Go: A Common Concurrency Pitfall

When working with concurrent Go programs, deadlocks can be a subtle and frustrating issue to debug. One common scenario that can lead to deadlocks is attempting to upgrade a read lock to a write lock while still holding the read lock. Let’s dive into this problem and explore how to solve it.

The Problem: Lock Upgrading

Consider this seemingly innocent code that manages an admin token with concurrent access:

func AdminToken() string {
    adminTokenMu.RLock()
    defer adminTokenMu.RUnlock()

    token := adminToken

    // If token isn't in memory, try loading from SQLite
    if token == "" {
        cachedToken := persist.ReadAdminToken(getBackend())
        if cachedToken != "" {
            adminTokenMu.Lock()
            adminToken = cachedToken
            adminTokenMu.Unlock()
            return cachedToken
        }
    }

    return adminToken
}

At first glance, this code might look reasonable. It:

Acquires a read lock to safely read the token
If the token is empty, loads it from SQLite
Acquires a write lock to update the shared token variable
Returns the token

However, this code contains a subtle but serious issue: it attempts to acquire a write lock while still holding a read lock on the same mutex.

Why This Causes a Deadlock

The deadlock occurs due to a circular dependency:

The goroutine holds a read lock
It attempts to acquire a write lock
The write lock waits for all read locks to be released
The read lock won’t be released until the function returns (due to defer)
The function won’t return until the write lock is acquired

This creates a situation where the goroutine is waiting for itself to release the read lock before it can acquire the write lock. Classic deadlock!

The Solution: Proper Lock Management

Here’s how to fix this issue:

func AdminToken() string {
    adminTokenMu.RLock()
    token := adminToken
    adminTokenMu.RUnlock()

    // If token isn't in memory, acquire write lock and try loading from SQLite
    if token == "" {
        adminTokenMu.Lock()
        // Need to check again after acquiring lock in case another goroutine updated it
        if adminToken == "" {
            cachedToken := persist.ReadAdminToken(getBackend())
            if cachedToken != "" {
                adminToken = cachedToken
                token = cachedToken
            }
        } else {
            token = adminToken
        }
        adminTokenMu.Unlock()
    }

    return token
}

The fixed version includes several important improvements:

Release Before Upgrade: We release the read lock before attempting to acquire the write lock.
Double-Check Pattern: We check the token again after acquiring the write lock. This is necessary because another goroutine might have updated the token between our read lock release and write lock acquisition.
Local Variable: We use a local variable token to store the result, which helps avoid race conditions and makes the code’s intent clearer.
Fine-grained Lock Control: We’ve removed the defer statement and instead manage our locks explicitly, which gives us more precise control over when locks are released.

Best Practices for Lock Management

When working with mutexes in Go, keep these principles in mind:

Never try to upgrade a read lock to a write lock while holding the read lock
Keep locked sections as small as possible
Be wary of defer when you need fine-grained lock control
Use local variables to store results that need to survive lock releases
Consider using the double-checked locking pattern when lazily initializing shared data

Conclusion

Lock upgrading is a common source of deadlocks in concurrent programs. By understanding how mutexes work and following proper lock management practices, you can avoid these issues and write more reliable concurrent code. Remember: when in doubt, release your locks before trying to acquire new ones, and always strive to keep your locked sections as small and simple as possible.