Avoid using 2D map for transition table in Go
This post is part 1 of a series of learnings from nilaway. nilaway is a static analysis tool that detects potential nil panics in Go code. It does report false positives, but it's far from naive. One limitation is that nilaway is flow-sensitive (it understands if x != nil) but not value-correlation-sensitive (it doesn't understand "if y == SomeConst, then x is non-nil"). From my experience, most false positives point to code smells or code that is fragile to maintain. A change in one place breaks another function several jumps away.
Consider a simple state machine, say a container manager. It has a handful of statuses and a manager that drives it from its current (known) status toward a desired one:
1type Status int
2
3const (
4 StatusNone Status = iota
5 StatusRunning
6 StatusStopped
7 StatusFailed
8)
9
10type Manager struct {
11 known, desired Status
12 transitionFunc map[Status]map[Status]func()
13}
The transitionFunc field is a 2D map: outer key is the current status, inner key is the desired status, value is the function to call. The constructor pre-populates it:
1func newManager() *Manager {
2 m := &Manager{}
3 m.transitionFunc = make(map[Status]map[Status]func())
4
5 // pre-populate every outer key so inner map is never nil
6 for _, s := range allStatuses {
7 m.transitionFunc[s] = make(map[Status]func())
8 }
9
10 // assign real transitions
11 m.transitionFunc[StatusNone][StatusRunning] = m.start
12 m.transitionFunc[StatusRunning][StatusStopped] = m.stop
13 m.transitionFunc[StatusStopped][StatusRunning] = m.restart
14
15 // fill remaining slots with a fallback
16 for _, known := range allStatuses {
17 for _, desired := range allStatuses {
18 if m.transitionFunc[known][desired] == nil {
19 m.transitionFunc[known][desired] = func() {
20 log.Printf("invalid transition: %v -> %v", known, desired)
21 }
22 }
23 }
24 }
25 return m
26}
The main loop dispatches like this:
1if m.known != m.desired {
2 m.transitionFunc[m.known][m.desired]()
3}
This pattern looks safe: every [known][desired] slot is filled before the loop runs, so the double map lookup can never return nil. But nilaway reports a potential nil dereference at every call site.
1status_manager.go:100:2: error: Potential nil panic detected. Observed nil flow from source to dereference point:
2 - foo.go:308:2: deep read from field `transitionFunc` lacking guarding; written to at an index
Why nilaway complains
nilaway sees m.transitionFunc[m.known] as a map read at an arbitrary key and cannot prove the result is non-nil, even though the for-loop pre-populates every outer key. The loop covers all enum values, but nilaway doesn't model that invariant. That's the flow-sensitivity limitation. So nilaway flags m.transitionFunc[m.known][m.desired]() as a potential nil call.
The false positive points to a real smell
The nilaway complaint is technically a false positive: the code is safe at runtime. But the fact that nilaway can't reason about it is worth paying attention to.
- The initialization is verbose: two loops, one to pre-allocate, one to fill fallbacks.
- Adding or removing a transition means touching the init function, not the place where the logic lives.
Replacing the map field and init method with a single transition method is more direct and simpler:
1func (m *Manager) transition(known, desired Status) {
2 switch {
3 case known == StatusNone && desired == StatusRunning:
4 m.start()
5 case known == StatusRunning && desired == StatusStopped:
6 m.stop()
7 case known == StatusStopped && desired == StatusRunning:
8 m.restart()
9 case known == desired:
10 // already there, nothing to do
11 default:
12 log.Printf("invalid transition: %v -> %v", known, desired)
13 }
14}
The dispatch becomes:
1if m.known != m.desired {
2 m.transition(m.known, m.desired)
3}
nilaway is happy because there's nothing to be nil. Some benefits of dropping the 2D map:
- Direct is better than indirect without benefit. The indirection added by the map,
map[known][desired] = func() { foo() }, adds no value overcase (known, desired): foo(). - No struct field needed for the 2D map.
- No pre-population with nested for loops.