Engineering
  • How cgo silently disables Go's deadlock detector

    May 4, 2026 · 7 min read · Go

    I recently ran into a Go test timeout that turned out to be a deadlock. That was surprising because Go has a built-in deadlock detector. The simplest possible deadlock is a goroutine that locks the same mutex twice. Running the following code exits with "fatal error: all goroutines are asleep - deadlock!", as expected. …
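
The post's own code is cut off in this excerpt. As a minimal sketch of the same situation, the program below locks a mutex and then uses `TryLock` (the non-blocking form of `Lock`, available since Go 1.18) to show that a second acquisition from the same goroutine cannot succeed. The function name `secondLockWouldBlock` is hypothetical, and `TryLock` is used here only so the example terminates instead of hanging.

```go
package main

import (
	"fmt"
	"sync"
)

// secondLockWouldBlock (hypothetical name) locks a mutex and reports
// whether a second acquisition from the same goroutine would block.
func secondLockWouldBlock() bool {
	var mu sync.Mutex
	mu.Lock()
	// sync.Mutex is not reentrant: while mu is held, TryLock fails,
	// even when the caller is the goroutine that holds the lock.
	return !mu.TryLock()
}

func main() {
	fmt.Println("second Lock would block:", secondLockWouldBlock())
	// Swapping TryLock for a plain mu.Lock() would park the only
	// goroutine forever, which is the double-lock deadlock the
	// excerpt describes.
}
```

Running it prints `second Lock would block: true`.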


  • Replacing socat with systemd-socket-proxyd

    Mar 30, 2026 · 3 min read · Linux, Systemd, Bottlerocket

    Bottlerocket v11.0.0 dropped the socat package from bottlerocket-core-kit (#742). If you had a service that relied on socat to bridge a Unix Domain Socket (UDS) to a TCP port, you need a replacement. This post shows how to use systemd-socket-proxyd instead.

    Why socat was used

    soci-snapshotter exposes its metrics …


  • You probably want to disable cgo: Go's stdlib has pure-Go implementations

    Mar 25, 2026 · 3 min read

    CGO_ENABLED defaults to 1. That means a standard go build produces a binary that links against C libraries (e.g., glibc) at runtime. For many parts of the Go standard library, there is a C-backed implementation and a pure-Go implementation. CGO_ENABLED selects which one gets compiled in. Pure-Go alternatives also exist …


  • Who killed my service: collecting kernel kill logs with OTEL

    Mar 10, 2026 · 6 min read · Linux

    We run a container platform. For privacy and security reasons, we do not collect kernel logs because customer workloads use the same kernel as the host and kernel messages can contain sensitive customer data, such as command-line arguments surfaced in audit logs. However, we recently hit a blind spot: foo.service was …


  • Avoid using 2D map for transition table in Go

    Feb 26, 2026 · 4 min read · Go

    This post is part 1 of a series of learnings from nilaway. nilaway is a static analysis tool that detects potential nil panics in Go code. It does report false positives, but it's far from naive. One limitation is that nilaway is flow-sensitive (it understands if x != nil) but not value-correlation-sensitive (it …
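
The excerpt stops before the post's own solution. One common alternative to a nested `map[State]map[Event]State`, shown here only as an assumption about what the title's advice could look like, is a single map keyed by a composite struct, so a lookup never passes through an inner map that may be nil. `State`, `Event`, `key`, `transitions`, and `next` are hypothetical names, and the sketch makes no claim about what nilaway reports for either form.

```go
package main

import "fmt"

// State and Event are hypothetical types for a small state machine.
type State int
type Event int

const (
	Idle State = iota
	Running
	Stopped
)

const (
	Start Event = iota
	Stop
)

// key combines both lookup dimensions into one comparable map key.
type key struct {
	from State
	on   Event
}

// transitions is a flat table: one lookup, one "ok" check.
var transitions = map[key]State{
	{Idle, Start}:   Running,
	{Running, Stop}: Stopped,
}

// next returns the target state and whether the transition is defined.
func next(from State, on Event) (State, bool) {
	to, ok := transitions[key{from, on}]
	return to, ok
}

func main() {
	to, ok := next(Idle, Start)
	fmt.Println(to == Running, ok) // prints: true true

	_, ok = next(Stopped, Start)
	fmt.Println(ok) // prints: false
}
```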


  • cached-imds-client: cache static IMDS responses to avoid linklocal_allowance_exceeded on EC2

    Feb 15, 2026 · 1 min read · AWS, Go

    I recently encountered IMDS (Instance Metadata Service) request failures with this error:

        "caller": "actor/actor.go:101",
        "error": "operation error ec2imds: GetMetadata, failed to get rate limit token, retry quota exceeded, x available, y requested"

    The root cause: the aws-sdk-go-v2 IMDS …
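
The title's idea, caching responses that never change so repeated lookups do not reach the metadata service, can be sketched with a small memoizing wrapper. This is not the cached-imds-client API: `fetchFunc`, `cachedFetcher`, and `newCachedFetcher` are hypothetical names, and the upstream call is a stand-in function rather than a real SDK client.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchFunc stands in for a real metadata call (hypothetical).
type fetchFunc func(path string) (string, error)

// cachedFetcher memoizes successful responses by path.
type cachedFetcher struct {
	mu    sync.Mutex
	cache map[string]string
	fetch fetchFunc
}

func newCachedFetcher(fetch fetchFunc) *cachedFetcher {
	return &cachedFetcher{cache: make(map[string]string), fetch: fetch}
}

// Get returns the cached value for path, calling upstream only on a miss.
// Errors are not cached, so a failed lookup is retried on the next call.
func (c *cachedFetcher) Get(path string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.cache[path]; ok {
		return v, nil
	}
	v, err := c.fetch(path)
	if err != nil {
		return "", err
	}
	c.cache[path] = v
	return v, nil
}

func main() {
	calls := 0
	c := newCachedFetcher(func(path string) (string, error) {
		calls++
		return "value-for-" + path, nil
	})
	for i := 0; i < 3; i++ {
		v, _ := c.Get("instance-id")
		fmt.Println(v)
	}
	fmt.Println("upstream calls:", calls) // prints: upstream calls: 1
}
```

Holding the mutex during the upstream call serializes misses, which keeps the sketch short; it is only appropriate for values that are static for the lifetime of the instance.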


  • Using WaitGroup to Track Work Items, Not Workers: A Multi-threaded BFS Example

    Feb 15, 2026 · 5 min read · Go, Concurrency

    WaitGroup and channels are two powerful primitives in Go for synchronizing goroutines. A common pattern uses a WaitGroup to wait for goroutines to complete:

        wg.Add(1)
        go func() {
            defer wg.Done()
            for {
                select {
                case <-done:
                    return
                case task := <-tasks:
                    handle(task)
                }
            }
        }()
        wg.Wait()

    In this …
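
The alternative named in the title, counting work items instead of workers, can be sketched as a small concurrent BFS. This is an illustration under stated assumptions rather than the post's own code: `bfsReachable` and the adjacency-list `graph` are hypothetical, and `workers` is assumed to be at least 1.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// bfsReachable (hypothetical) returns, in sorted order, every node
// reachable from start. It assumes workers >= 1.
func bfsReachable(graph map[int][]int, start, workers int) []int {
	// Each node is queued at most once, so 1 + (number of edges) slots
	// guarantee that a send below never blocks.
	edges := 0
	for _, neighbors := range graph {
		edges += len(neighbors)
	}
	tasks := make(chan int, edges+1)

	var (
		mu      sync.Mutex
		visited = map[int]bool{start: true}
		wg      sync.WaitGroup // counts outstanding work items, not workers
	)

	wg.Add(1) // the start node is the first work item
	tasks <- start

	for i := 0; i < workers; i++ {
		go func() {
			for node := range tasks {
				for _, next := range graph[node] {
					mu.Lock()
					seen := visited[next]
					visited[next] = true
					mu.Unlock()
					if !seen {
						wg.Add(1) // register the new item before queuing it
						tasks <- next
					}
				}
				wg.Done() // this item is fully processed
			}
		}()
	}

	wg.Wait()    // zero means no item is queued or in flight
	close(tasks) // only now is it safe to let the workers exit

	result := make([]int, 0, len(visited))
	for node := range visited {
		result = append(result, node)
	}
	sort.Ints(result)
	return result
}

func main() {
	graph := map[int][]int{1: {2, 3}, 2: {4}, 3: {4}, 4: {1}, 5: {6}}
	fmt.Println(bfsReachable(graph, 1, 3)) // prints [1 2 3 4]
}
```

Because every `wg.Add(1)` inside a worker happens while that worker's current item is still counted, the counter cannot reach zero until the whole frontier is exhausted, so `wg.Wait()` doubles as the termination signal.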


  • Simplify device path on boot with udev

    Feb 2, 2026 · 4 min read · Linux, Bottlerocket

    While prototyping Bottlerocket, I discovered it doesn't recognize additional EBS volumes specified through Block device mappings on Xen. For example, launching the same AMI on t2.medium (Xen) and t3.medium (Nitro) with "DeviceName=/dev/xvdcz": On Nitro, the device appears at /dev/nvme1n1 and …


  • Use KillMode=process with caution: restart loop could deplete resources

    Dec 12, 2025 · 4 min read · Linux, systemd

    I recently debugged a resource leak where a systemd service kept restarting while leaving a process behind after each restart. The root cause isn't particularly interesting: a backward-incompatible third-party dependency upgrade. But the debugging process and lessons learned are. Thousands of zombie processes from a …


  • Spawning a New Process for Socket-Activated Daemons is Error-Prone

    Dec 10, 2025 · 4 min read · Linux, systemd, Container

    I recently debugged a mysterious latency issue: after migrating a systemd service from path activation to socket activation, there was a consistent ~1 second delay before the service became available. The culprit was a bad practice: starting the daemon program as a new process under socket activation. Let's dive into the details. Starting …



Peng Zhang

Software Engineer


Copyright 2022-  PENG ZHANG. All Rights Reserved
