SIGTERM-ing a SIGSTOP-ed process

Problem

One of the projects I’m working on is responsible for coordinating and running multiple processes. It’s a simple real-time web app that allows starting, stopping, pausing and terminating processes.

Since I’m developing and testing on OSX while the service runs on Linux, I came across a surprising difference in how those platforms handle particular signals.

Description

A common scenario for running a process looks like this:

  1. Run a process to do some work
  2. Pause it (with SIGSTOP)
  3. Terminate the paused process (with SIGTERM)

I’ve observed two different behaviours, depending on the platform.

Here’s the Go app to illustrate the problem:

package main //sigtest.go

import (
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

func main() {
	// NOTE: err handling omitted for brevity

	// 1. start a process to do some work
	cmd := exec.Command("bash", "-c", "sleep 10000")
	cmd.Start()

	<-time.After(100 * time.Millisecond)
	// 2. "pause" the process
	cmd.Process.Signal(syscall.SIGSTOP)

	<-time.After(100 * time.Millisecond)
	// 3. terminate the process
	cmd.Process.Signal(syscall.SIGTERM)

	var (
		errc = make(chan error)

		slow    = time.After(2000 * time.Millisecond)
		timeout = time.After(5000 * time.Millisecond)
	)

	// wait for the process to terminate
	go func() { errc <- cmd.Wait() }()

retry:
	select {
	case err := <-errc:
		fmt.Println(err)
		fmt.Println("Done")
	case <-slow:
		fmt.Println("Taking longer than it should...")
		slow = nil
		goto retry
	case <-timeout:
		fmt.Println("Timeout")
	}
}

Build the program for both platforms:

GOOS=darwin GOARCH=amd64 go build -o sigtest_darwin sigtest.go
GOOS=linux GOARCH=amd64 go build -o sigtest_linux sigtest.go

Running on Linux produces:

$ ./sigtest_linux
Taking longer than it should...
Timeout

Works as expected on OSX:

$ ./sigtest_darwin
signal: terminated
Done

I’ve not found the official docs/explanation yet, but my assumption is that SIGTERM is a signal that must be handled by the process itself, and the process is unable to do so while it is SIGSTOP-ed on Linux: the signal stays pending until the process is resumed with SIGCONT.

Workarounds

  1. send SIGCONT right before SIGTERM to a SIGSTOP-ed process on Linux
  2. send SIGKILL instead of SIGTERM, but this removes the opportunity for the process to shut down gracefully

PS

Thanks for reading! Let me know if you have more info on this.
