SIGTERM-ing a SIGSTOP-ed process

Problem

One of the projects I’m working on is responsible for coordinating and running multiple processes. It’s a simple real-time web app that allows starting, stopping, pausing and terminating processes.

Since I’m developing and testing on OSX while the service runs on Linux, I came across a surprising difference in how those platforms handle particular signals.

Description

A common scenario for running a process looks like this:

  1. Run a process to do some work
  2. Pause it (with SIGSTOP)
  3. Terminate paused process (with SIGTERM)

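I’ve observed 2 different behaviours: on OSX the paused process terminates right away, while on Linux it does not terminate and waiting for it times out.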

Here’s the Go app to illustrate the problem:

package main //sigtest.go

import (
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

func main() {
	// NOTE: err handling omitted for brevity

	// 1. start a process to do some work
	cmd := exec.Command("bash", "-c", "sleep 10000")
	cmd.Start()

	<-time.After(100 * time.Millisecond)
	// 2. "pause" the process
	cmd.Process.Signal(syscall.SIGSTOP)

	<-time.After(100 * time.Millisecond)
	// 3. terminate the process
	cmd.Process.Signal(syscall.SIGTERM)

	var (
		errc = make(chan error)

		slow    = time.After(2000 * time.Millisecond)
		timeout = time.After(5000 * time.Millisecond)
	)

	// wait for the process to terminate
	go func() { errc <- cmd.Wait() }()

retry:
	select {
	case err := <-errc:
		fmt.Println(err)
		fmt.Println("Done")
	case <-slow:
		fmt.Println("Taking longer than it should...")
		slow = nil
		goto retry
	case <-timeout:
		fmt.Println("Timeout")
	}
}

Build the program for both platforms:

GOOS=darwin GOARCH=amd64 go build -o sigtest_darwin sigtest.go
GOOS=linux GOARCH=amd64 go build -o sigtest_linux sigtest.go

Running on Linux produces:

$ ./sigtest_linux
Taking longer than it should...
Timeout

Works as expected on OSX:

$ ./sigtest_darwin
signal: terminated
Done

I’ve not found official docs or an explanation yet, but my assumption is that SIGTERM is a signal that must be handled by the process itself, and on Linux a SIGSTOP-ed process is unable to do so until it is resumed.

Workarounds

  1. Send SIGCONT right before SIGTERM to a SIGSTOP-ed process on Linux.
  2. Send SIGKILL instead of SIGTERM, although this removes the opportunity for the process to shut down gracefully.

PS: Let me know if you have more info on this.

Thanks!
