A deep dive into "context canceled" errors on Go web servers

At iheartjane, we use a Go web server to serve ad requests. After some time in production, we noticed a lot of “context canceled” error logs. The following screenshot of a CloudWatch Logs Insights query shows the frequency of “context canceled” errors. It left us puzzled about the underlying causes of these context cancels. Should we worry about them? If so, how should we reduce them?

What is “context canceled” in Go?

Before we talk about “context canceled”, we need to understand “context” first. Context is a type that carries deadlines, cancellation signals, and other request-scoped values across API boundaries and between processes. The context package joined the standard library in Go 1.7 and has been widely used since. For example, the standard database/sql package provides the following method:

func (c *Conn) QueryContext(ctx context.Context, query string, args ...any) (*Rows, error)

When a context is canceled, it indicates that the requester is no longer interested in the result, and any code scoped by the context should be aborted. Here is an example.

// The code outputs: QueryRow failed: context canceled

func main() {
    conn, err := pgx.Connect(context.Background(), "postgresql://postgres@localhost:15432/example")
    if err != nil {
        fmt.Printf("Connect failed: %v\n", err)
        return
    }
    defer conn.Close(context.Background())

    ctx, cancel := context.WithCancel(context.Background())
    cancel() // Cancel the context immediately, before the query runs.
    if err := conn.QueryRow(ctx, "SELECT 1;").Scan(); err != nil {
        fmt.Printf("QueryRow failed: %v\n", err)
        return
    }
}
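The same behavior can be seen without a database. The following is a minimal sketch that uses only the standard library; `slowOperation`, `describe`, `runCanceled`, and `runWithTimeout` are hypothetical names made up for illustration. It shows how code that receives a context typically stops early, and how `errors.Is` can tell a cancel apart from an expired deadline, even when a library wraps the error.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowOperation stands in for any context-aware call (a query, an RPC).
// It returns ctx.Err() as soon as the context is done.
func slowOperation(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// describe reports how the operation ended. errors.Is also matches
// errors that wrap context.Canceled or context.DeadlineExceeded.
func describe(err error) string {
	switch {
	case err == nil:
		return "completed"
	case errors.Is(err, context.Canceled):
		return "canceled by caller"
	case errors.Is(err, context.DeadlineExceeded):
		return "deadline exceeded"
	default:
		return "failed: " + err.Error()
	}
}

// runCanceled cancels the context after cancelAfter while the operation
// still needs work to finish.
func runCanceled(work, cancelAfter time.Duration) string {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	time.AfterFunc(cancelAfter, cancel)
	return describe(slowOperation(ctx, work))
}

// runWithTimeout gives the operation a deadline instead of an explicit cancel.
func runWithTimeout(work, timeout time.Duration) string {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return describe(slowOperation(ctx, work))
}

func main() {
	fmt.Println(runCanceled(time.Second, 10*time.Millisecond))
	fmt.Println(runWithTimeout(time.Second, 10*time.Millisecond))
	fmt.Println(runWithTimeout(10*time.Millisecond, time.Second))
}
```

Checking with `errors.Is` rather than comparing error strings is usually the safer choice, because drivers and HTTP clients often wrap the underlying context error.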

Hypothesis

We suspected that the browser cancels ad requests when the shopper navigates away from the page that issued the request. To verify the hypothesis, we designed the following experiment.

  1. Record network activity using chrome://net-export/.

  2. A Go web server listens on ports 8080 and 8443. This allows us to verify both the HTTP/1.1 and HTTP/2 protocols. There are two endpoints on each port.

    • /ping. Blocks for 5 seconds and then returns 200 OK with the message “pong”.
    • /hello. Immediately returns 200 OK with the message “world”.
  3. A static web page that has a button that calls /ping and a hyperlink to /hello. See the full page here.
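For readers who want to picture the server side of the experiment, here is a minimal sketch of such a server. It is not the code linked above. The handler names, the `serveOnce` helper, and the certificate files `cert.pem` and `key.pem` are assumptions made for illustration. It relies on two documented behaviors of net/http: the request context is canceled when the client goes away, and HTTP/2 is negotiated automatically when serving over TLS.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"time"
)

// pingHandler blocks for delay before answering "pong", unless the
// request context is canceled first (for example, the client went away).
func pingHandler(delay time.Duration) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(delay):
			fmt.Fprint(w, "pong")
		case <-r.Context().Done():
			// r.Proto shows whether the request arrived over HTTP/1.1 or HTTP/2.
			log.Printf("%s %s %s: %v", r.Proto, r.Method, r.URL.Path, r.Context().Err())
		}
	}
}

// helloHandler answers immediately.
func helloHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "world")
}

// newMux wires both endpoints; delay is how long /ping blocks.
func newMux(delay time.Duration) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/ping", pingHandler(delay))
	mux.HandleFunc("/hello", helloHandler)
	return mux
}

// serveOnce runs a single in-memory GET against the handler and returns
// the status code and body; handy for checking handlers without a port.
func serveOnce(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	mux := newMux(5 * time.Second)

	// Port 8080: plain HTTP, so browsers speak HTTP/1.1.
	go func() { log.Fatal(http.ListenAndServe(":8080", mux)) }()

	// Port 8443: TLS, where Go negotiates HTTP/2 automatically.
	// cert.pem and key.pem are assumed to exist (e.g. a self-signed pair).
	log.Fatal(http.ListenAndServeTLS(":8443", "cert.pem", "key.pem", mux))
}
```

With a server like this, a canceled /ping request shows up as a log line ending in "context canceled", prefixed with the protocol that carried it.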

Reproduce with HTTP/1.1 and HTTP/2

Download the code and follow the steps below.

  1. Start recording network activity in Chrome.
  2. Start the Go server and load the static page in Chrome.
  3. Click the button and then the link, in that order, waiting 1 second between the two clicks.
  4. Verify the Go web server output.
  5. Stop recording network activity and load the recorded file into the viewer.

     [Screenshots of the recorded network activity: HTTP/1.1 and HTTP/2]

What did we learn, and what can we do about context cancels?

  1. When a user closes a page or navigates to a different page, the browser cancels the requests that originated from the current page.
  2. HTTP protocols (HTTP/1.1, HTTP/2, HTTP/3) follow a client-server model. Until the server responds, the connection between the client and the server stays active, and the client can cancel in-flight requests.
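The two observations above can also be reproduced without a browser. The sketch below is an assumption-laden stand-in for the experiment: a Go client plays the role of the browser and cancels its own request, and `cancelInFlight` is a hypothetical helper name. It uses an in-process HTTP/1.1 test server from net/http/httptest and reports whether each side observed `context.Canceled`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// cancelInFlight starts a test server whose handler waits on the request
// context, sends one request, and cancels it after cancelAfter.
// It reports whether the client and the server each saw context.Canceled.
func cancelInFlight(cancelAfter time.Duration) (clientCanceled, serverCanceled bool) {
	serverErr := make(chan error, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-r.Context().Done():
			// The client went away before we answered.
			serverErr <- r.Context().Err()
		case <-time.After(5 * time.Second):
			serverErr <- nil
			fmt.Fprint(w, "pong")
		}
	}))
	defer srv.Close()

	// The client cancels its own request, like a browser leaving the page.
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	time.AfterFunc(cancelAfter, cancel)

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, srv.URL, nil)
	if err != nil {
		return false, false
	}
	resp, err := http.DefaultClient.Do(req)
	if err == nil {
		resp.Body.Close()
	}
	clientCanceled = errors.Is(err, context.Canceled)

	// Give the server a moment to notice that the connection was dropped.
	select {
	case e := <-serverErr:
		serverCanceled = errors.Is(e, context.Canceled)
	case <-time.After(2 * time.Second):
	}
	return clientCanceled, serverCanceled
}

func main() {
	client, server := cancelInFlight(100 * time.Millisecond)
	fmt.Printf("client saw cancel: %v, server saw cancel: %v\n", client, server)
}
```

On HTTP/1.1 the cancel reaches the server because the client drops the connection; on HTTP/2 a client can reset the single stream instead, and net/http surfaces both through the request context.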

Context cancellation makes client-server communication efficient. We cannot change how browsers behave when a user navigates away from a page, and servers should respect cancels. So, are those context cancels always harmless?

I don't think so. Not only are context cancel errors noisy, but a high rate of context cancels may also indicate that improvements are needed. Let’s look at an example from the advertising domain. When a shopper clicks an ad and the ad takes the user to a different page, we need to make sure the click-tracking calls are not canceled. One approach is to let the Go web server respond to the browser immediately and process the request outside the request-response path.
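One way to move click tracking off the request-response path is a small in-process queue. The following is a sketch under stated assumptions, not our production code: the `tracker` type, the `/click?ad=...` URL shape, the `record` callback, and the `demo` helper are all hypothetical. It also assumes Go 1.21 or later for `context.WithoutCancel`, which keeps request-scoped values but detaches the work from the request's cancellation.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// clickEvent is a hypothetical payload for one tracked ad click.
type clickEvent struct {
	ctx  context.Context // request values, detached from cancellation
	adID string
}

// tracker queues click events and records them on a background goroutine.
type tracker struct {
	queue  chan clickEvent
	wg     sync.WaitGroup
	record func(ctx context.Context, adID string) error // hypothetical sink, e.g. a database write
}

func newTracker(size int, record func(context.Context, string) error) *tracker {
	t := &tracker{queue: make(chan clickEvent, size), record: record}
	t.wg.Add(1)
	go func() {
		defer t.wg.Done()
		for ev := range t.queue {
			// Bound the work with its own timeout instead of the request's lifetime.
			ctx, cancel := context.WithTimeout(ev.ctx, 5*time.Second)
			if err := t.record(ctx, ev.adID); err != nil {
				log.Printf("record click %q: %v", ev.adID, err)
			}
			cancel()
		}
	}()
	return t
}

// ServeHTTP enqueues the click and answers right away, so a navigation
// that cancels the request cannot abort the tracking work.
func (t *tracker) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	ev := clickEvent{ctx: context.WithoutCancel(r.Context()), adID: r.URL.Query().Get("ad")}
	select {
	case t.queue <- ev:
		w.WriteHeader(http.StatusAccepted)
	default:
		// Queue full: shed load rather than block the request.
		http.Error(w, "busy", http.StatusServiceUnavailable)
	}
}

// shutdown stops accepting events and waits for queued ones to be recorded.
func (t *tracker) shutdown() {
	close(t.queue)
	t.wg.Wait()
}

// demo sends one click whose request context is already canceled and
// returns the HTTP status plus the ad ID that the worker recorded.
func demo(adID string) (int, string) {
	var recorded string
	t := newTracker(16, func(ctx context.Context, id string) error {
		if err := ctx.Err(); err != nil {
			return err // would fire if the request's cancel leaked into the worker
		}
		recorded = id
		return nil
	})
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate a browser that has already navigated away
	req := httptest.NewRequest(http.MethodGet, "/click?ad="+adID, nil).WithContext(ctx)
	rec := httptest.NewRecorder()
	t.ServeHTTP(rec, req)
	t.shutdown() // wait for the worker before reading recorded
	return rec.Code, recorded
}

func main() {
	status, recorded := demo("ad-42")
	fmt.Println(status, recorded)
}
```

An in-memory queue loses events if the process exits before the worker drains it, so a durable queue may be a better fit when every click must be counted. The request still has to reach the server before the browser navigates away; this sketch only protects the work that happens after arrival.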