Avoid using 2D map for transition table in Go

This post is part 1 of a series of learnings from nilaway. nilaway is a static analysis tool that detects potential nil panics in Go code. It does report false positives, but it's far from naive. One limitation is that nilaway is flow-sensitive (it understands if x != nil) but not value-correlation-sensitive (it doesn't understand "if y == SomeConst, then x is non-nil"). From my experience, most false positives point to code smells or code that is fragile to maintain. A change in one place breaks another function several jumps away.

Consider a simple state machine, say a container manager. It has a handful of statuses and a manager that drives it from its current (known) status toward a desired one:

 1type Status int
 2
 3const (
 4    StatusNone Status = iota
 5    StatusRunning
 6    StatusStopped
 7    StatusFailed
 8)
 9
10type Manager struct {
11    known, desired Status
12    transitionFunc map[Status]map[Status]func()
13}

The transitionFunc field is a 2D map: outer key is the current status, inner key is the desired status, value is the function to call. The constructor pre-populates it:

 1func newManager() *Manager {
 2    m := &Manager{}
 3    m.transitionFunc = make(map[Status]map[Status]func())
 4
 5    // pre-populate every outer key so inner map is never nil
 6    for _, s := range allStatuses {
 7        m.transitionFunc[s] = make(map[Status]func())
 8    }
 9
10    // assign real transitions
11    m.transitionFunc[StatusNone][StatusRunning] = m.start
12    m.transitionFunc[StatusRunning][StatusStopped] = m.stop
13    m.transitionFunc[StatusStopped][StatusRunning] = m.restart
14
15    // fill remaining slots with a fallback
16    for _, known := range allStatuses {
17        for _, desired := range allStatuses {
18            if m.transitionFunc[known][desired] == nil {
19                m.transitionFunc[known][desired] = func() {
20                    log.Printf("invalid transition: %v -> %v", known, desired)
21                }
22            }
23        }
24    }
25    return m
26}

The main loop dispatches like this:

1if m.known != m.desired {
2    m.transitionFunc[m.known][m.desired]()
3}

This pattern looks safe: every [known][desired] slot is filled before the loop runs, so the double map lookup can never return nil. But nilaway reports a potential nil dereference at every call site.

1status_manager.go:100:2: error: Potential nil panic detected. Observed nil flow from source to dereference point: 
2	- foo.go:308:2: deep read from field `transitionFunc` lacking guarding; written to at an index

Why nilaway complains

nilaway sees m.transitionFunc[m.known] as a map read at an arbitrary key and cannot prove the result is non-nil, even though the for-loop pre-populates every outer key. The loop covers all enum values, but nilaway doesn't model that invariant. That's the flow-sensitivity limitation. So nilaway flags m.transitionFunc[m.known][m.desired]() as a potential nil call.

The false positive points to a real smell

The nilaway complaint is technically a false positive: the code is safe at runtime. But the fact that nilaway can't reason about it is worth paying attention to.

  • The initialization is verbose: two loops, one to pre-allocate, one to fill fallbacks.
  • Adding or removing a transition means touching the init function, not the place where the logic lives.

Replacing the map field and init method with a single transition method is more direct and simpler:

 1func (m *Manager) transition(known, desired Status) {
 2    switch {
 3    case known == StatusNone && desired == StatusRunning:
 4        m.start()
 5    case known == StatusRunning && desired == StatusStopped:
 6        m.stop()
 7    case known == StatusStopped && desired == StatusRunning:
 8        m.restart()
 9    case known == desired:
10        // already there, nothing to do
11    default:
12        log.Printf("invalid transition: %v -> %v", known, desired)
13    }
14}

The dispatch becomes:

1if m.known != m.desired {
2    m.transition(m.known, m.desired)
3}

nilaway is happy because there's nothing to be nil. Some benefits of dropping the 2D map:

  1. Direct is better than indirect without benefit. The indirection added by the map, map[known][desired] = func() { foo() }, adds no value over case (known, desired): foo().
  2. No struct field needed for the 2D map.
  3. No pre-population with nested for loops.