How cgo silently disables Go's deadlock detector
I recently ran into a Go test timeout that turned out to be a deadlock. That was surprising, because Go has a built-in deadlock detector. The simplest possible deadlock is a goroutine that locks the same mutex twice. Running the following code exits with "fatal error: all goroutines are asleep - deadlock!", as expected.
```go
package main

import "sync"

func main() {
	var mu sync.Mutex
	mu.Lock()
	mu.Lock() // blocks forever on the same goroutine
}
```
What happens if we add an import of net/http?
```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

func main() {
	fmt.Println(http.StatusOK)
	var mu sync.Mutex
	mu.Lock()
	mu.Lock()
}
```
go run main.go now hangs forever, but CGO_ENABLED=0 go run main.go exits with the deadlock panic. The only difference is the net/http import: since CGO_ENABLED defaults to 1, importing net/http (which reaches the cgo-based DNS resolver through the net package) produces a cgo-linked binary, and that effectively disables Go's deadlock detector. In this post we'll look at why.
How the deadlock detector works
The Go deadlock detector is not a goroutine scanner. It's an OS thread count check. It lives in runtime.checkdead and runs whenever an M parks itself. Two conditions must both hold for the panic to fire:
- No M is actively running work
- There is at least one blocked user goroutine
The first condition is this formula from proc.go:
1run := mcount() - sched.nmidle - sched.nmidlelocked - sched.nmsys
2// run0 is 0 when cgo is used.
3if run > run0 {
4 return
5}
Look at each term:
- mcount(): total number of OS threads (Ms) the runtime knows about.
- sched.nmidle: Ms parked in the idle pool waiting for a goroutine.
- sched.nmidlelocked: Ms locked to a goroutine via runtime.LockOSThread and currently blocked.
- sched.nmsys: system Ms like sysmon and the template thread.
If run is positive, the runtime assumes some M is still doing work and returns without declaring a deadlock.
The second condition scans all goroutines but skips system ones via isSystemGoroutine. GC workers, finalizers, and cleanup goroutines are present in every Go binary but don't count. So they're not what's saving the cgo binary from the panic. The difference has to be in the thread math.
The extra Ms created by cgo are the problem. When a Go binary has cgo enabled, the runtime reserves spare Ms to handle callbacks from C code (see needm, extraM). These Ms are sleeping, but they are not in the nmidle list. They're in a separate extraM list. mcount() includes them, none of the subtracted terms do. Result: run stays above zero, checkdead returns early, deadlock goes undetected.
Confirming it with dlv
Build and attach to the hanging process:
```shell
go build -o mu-cgo .
./mu-cgo &
dlv attach $(pidof mu-cgo)
```
threads lists every OS thread the process owns, and goroutines lists every goroutine. In the cgo build there are 7 OS threads and 8 goroutines, and every thread is parked in runtime.futex:
```
(dlv) threads
* Thread 12900 at 0x489543 .../runtime/sys_linux_amd64.s:570 runtime.futex
  Thread 12901 at 0x489543 .../runtime/sys_linux_amd64.s:570 runtime.futex
  Thread 12902 at 0x489543 .../runtime/sys_linux_amd64.s:570 runtime.futex
  Thread 12903 at 0x489543 .../runtime/sys_linux_amd64.s:570 runtime.futex
  Thread 12904 at 0x489543 .../runtime/sys_linux_amd64.s:570 runtime.futex
  Thread 12905 at 0x489543 .../runtime/sys_linux_amd64.s:570 runtime.futex
  Thread 12906 at 0x489543 .../runtime/sys_linux_amd64.s:570 runtime.futex

(dlv) goroutines
  Goroutine 1 - main.main [unknown wait reason 22]
  Goroutine 2 - runtime.gopark [unknown wait reason 11]
  Goroutine 3 - runtime.gopark [unknown wait reason 8]
  Goroutine 4 - runtime.gopark [unknown wait reason 9]
  Goroutine 5 - runtime.gopark [unknown wait reason 10]
  Goroutine 6 - runtime.gopark [unknown wait reason 46]
  Goroutine 17 - runtime.goexit
  Goroutine 18 - runtime.gopark [unknown wait reason 12]
[8 goroutines]
```
Switch to goroutine 1 and walk its stack:
```
(dlv) goroutine 1
Switched from 0 to 1 (thread 12900)
(dlv) bt
 0  0x000000000048142e in runtime.gopark
    at .../runtime/proc.go:463
 1  0x00000000004620d2 in runtime.goparkunlock
    at .../runtime/proc.go:468
 2  0x00000000004620d2 in runtime.semacquire1
    at .../runtime/sema.go:192
 3  0x0000000000482465 in internal/sync.runtime_SemacquireMutex
    at .../runtime/sema.go:95
 4  0x000000000048cefd in internal/sync.(*Mutex).lockSlow
    at .../internal/sync/mutex.go:149
 5  0x00000000005274dd in internal/sync.(*Mutex).Lock
    at .../internal/sync/mutex.go:70
 6  0x00000000005274dd in sync.(*Mutex).Lock
    at .../sync/mutex.go:46
 7  0x00000000005274dd in main.main
    at ./main.go:13
 8  0x000000000044df75 in runtime.main
    at .../runtime/proc.go:290
 9  0x0000000000487a01 in runtime.goexit
    at .../runtime/asm_amd64.s:1771
```
The deadlock is right there. main.main at frame 7 called sync.(*Mutex).Lock (line 13 of main.go), which went into lockSlow and parked the goroutine in semacquire1. This is the second mu.Lock(); the first one held the mutex already and nothing on this goroutine (or any other) will ever release it. The runtime has all the evidence it needs, but checkdead never gets past the run > 0 guard.
Counting the threads
Goroutine 1 is the only blocked user goroutine. The rest are runtime internals that isSystemGoroutine filters out of checkdead. To see exactly how the thread math works, run with GODEBUG=schedtrace=1000,scheddetail=1 ./mu-cgo:
```
SCHED 1001ms: gomaxprocs=16 ... threads=7 idlethreads=4 nmidlelocked=0 ...
  M6: p=nil curg=nil locks=0 spinning=false blocked=true lockedg=nil
  M5: p=nil curg=nil locks=0 spinning=false blocked=true lockedg=nil
  M4: p=nil curg=nil locks=0 spinning=false blocked=true lockedg=nil
  M3: p=nil curg=nil locks=0 spinning=false blocked=true lockedg=nil
  M2: p=nil curg=nil locks=2 spinning=false blocked=false lockedg=nil
  M1: p=nil curg=17 locks=0 spinning=false blocked=false lockedg=17
  M0: p=nil curg=nil locks=0 spinning=false blocked=true lockedg=nil
  G17: status=11() m=1 lockedm=1
  G1: status=4(sync.Mutex.Lock) m=nil lockedm=nil
```
Five Ms show blocked=true (M0, M3, M4, M5, M6), all parked in futex. M1 is the template thread, locked to system goroutine G17; it increments sched.nmsys when it starts. M2 has no P, no curg, and is not blocked: the classic sysmon profile, and sysmon also increments nmsys. So nmsys = 2.
idlethreads=4 in the header is literally sched.nmidle, so nmidle = 4. That accounts for only four of the five blocked Ms. The fifth is the extra M the runtime keeps for C callbacks. It's stored in a separate extraM list, not the idle pool, so it doesn't bump nmidle and isn't treated as a system M. It is still counted by mcount() via allm.
Plug the numbers into the detector formula:
```
run = mcount - nmidle - nmidlelocked - nmsys
    = 7 - 4 - 0 - 2
    = 1
```
Exactly one uncounted M: the extra M for cgo. run > 0, checkdead returns, no panic.
Now run the same trace on the nocgo binary. It shows threads=5 idlethreads=4 and no template thread. startTemplateThread is only called from two places: inside the iscgo branch of runtime.main, and inside LockOSThread. Neither runs in the nocgo build. The scheddetail dump confirms it: five Ms, no goroutine locked to any M, no template thread goroutine. So nmsys = 1 (just sysmon), nmidle = 4, and:
```
run = 5 - 4 - 0 - 1 = 0
```
Zero. checkdead proceeds to the goroutine scan, finds goroutine 1 blocked on sync.Mutex.Lock, and fires the panic immediately:
```
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [sync.Mutex.Lock]:
internal/sync.runtime_SemacquireMutex(...)
	.../runtime/sema.go:95 +0x25
internal/sync.(*Mutex).lockSlow(...)
	.../internal/sync/mutex.go:149 +0x15d
sync.(*Mutex).Lock(...)
	.../sync/mutex.go:46
main.main()
	.../main.go:13 +0x9d
```
Same code, one flag, different outcome. The deadlock was always there. The cgo threads were hiding it from the detector.
Conclusion
The Go deadlock detector only fires when every OS thread is either idle or a system thread, and there are blocked user goroutines. Linking cgo, even without actually calling into C, adds two Ms: the template thread (counted in nmsys) and the extra M for C callbacks (counted by nothing in checkdead's formula). That one uncounted M keeps run > 0 forever.
In practice this means most production Go binaries silently swallow deadlocks. Anything that imports net/http directly or indirectly (SQL drivers, cloud SDKs, gRPC, OpenTelemetry exporters) pulls in cgo on Linux by default. When a Go process hangs with no output, reach for SIGQUIT, GODEBUG=schedtrace, and dlv attach rather than waiting for a panic that isn't coming. If your binary doesn't actually need C at runtime, build with CGO_ENABLED=0: that removes both the template thread and the extra M, and the detector starts working again.