linux-kernel - [ptrace, rt] erratic behaviour in PTRACE

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181119194619.GA32121@zipoli.concurrent-rt.com>
Date:   Mon, 19 Nov 2018 19:46:23 +0000
From:   Joe Korty <Joe.Korty@...current-rt.com>
To:     "julia@...com" <julia@...com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "bigeasy@...utronix.de" <bigeasy@...utronix.de>,
        "rostedt@...dmis.org" <rostedt@...dmis.org>,
        "oleg@...hat.com" <oleg@...hat.com>,
        "linux-rt-users@...r.kernel.org" <linux-rt-users@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Joe Korty <Joe.Korty@...current-rt.com>
Subject: [ptrace, rt] erratic behaviour in PTRACE_SINGLESET on 4.13-rt and
 later.

Hi Julia & the RT team,

The following program might make a good addition to the rt
test suite.  It tests the reliability of PTRACE_SINGLESTEP.
It does by default 10,000 ssteps against a simple,
spinner tracee.  Also by default, it spins off ten of these
tracer/tracee pairs, all of which are to run concurrently.

Starting with 4.13-rt, this test occasionally encounters a
sstep whose waitpid returns a WIFSIGNALED (signal SIGTRAP)
rather than a WIFSTOPPED.  This usually happens after
thousands of ssteps have executed.  Having multiple
tracer/tracee pairs running dramatically increases the
chances of failure.

The is what the test output looks like for a good run:

  #forks: 10
  #steps: 10000
  
  forktest#0/22872: STARTING
  forktest#7/22879: STARTING
  forktest#8/22880: STARTING
  forktest#6/22878: STARTING
  forktest#5/22877: STARTING
  forktest#3/22875: STARTING
  forktest#4/22876: STARTING
  forktest#9/22882: STARTING
  forktest#2/22874: STARTING
  forktest#1/22873: STARTING
  forktest#0/22872: EXITING, no error
  forktest#8/22880: EXITING, no error
  forktest#3/22875: EXITING, no error
  forktest#7/22879: EXITING, no error
  forktest#6/22878: EXITING, no error
  forktest#5/22877: EXITING, no error
  forktest#2/22874: EXITING, no error
  forktest#4/22876: EXITING, no error
  forktest#9/22882: EXITING, no error
  forktest#1/22873: EXITING, no error
  All tests PASSED.

This is what the test output looks like for a failing run:

  #forks: 10
  #steps: 10000
  
  forktest#0/22906: STARTING
  forktest#1/22907: STARTING
  forktest#2/22909: STARTING
  forktest#3/22911: STARTING
  forktest#4/22912: STARTING
  forktest#5/22915: STARTING
  forktest#6/22916: STARTING
  forktest#7/22919: STARTING
  forktest#8/22920: STARTING
  forktest#9/22923: STARTING
  forktest#2/22909: EXITING, ERROR: wait on PTRACE_SINGLESTEP #9: wanted STATE_STOPPED, saw STATE_SIGNALED instead (and saw signo 5 too)
  forktest#5/22915: EXITING, no error
  forktest#3/22911: EXITING, ERROR: wait on PTRACE_SINGLESTEP #7953: wanted STATE_STOPPED, saw STATE_SIGNALED instead (and saw signo 5 too)
  forktest#0/22906: EXITING, ERROR: wait on PTRACE_SINGLESTEP #5072: wanted STATE_STOPPED, saw STATE_SIGNALED instead (and saw signo 5 too)
  forktest#9/22923: EXITING, ERROR: wait on PTRACE_SINGLESTEP #7992: wanted STATE_STOPPED, saw STATE_SIGNALED instead (and saw signo 5 too)
  forktest#4/22912: EXITING, ERROR: wait on PTRACE_SINGLESTEP #9923: wanted STATE_STOPPED, saw STATE_SIGNALED instead (and saw signo 5 too)
  forktest#1/22907: EXITING, ERROR: wait on PTRACE_SINGLESTEP #7723: wanted STATE_STOPPED, saw STATE_SIGNALED instead (and saw signo 5 too)
  forktest#7/22919: EXITING, ERROR: wait on PTRACE_SINGLESTEP #5036: wanted STATE_STOPPED, saw STATE_SIGNALED instead (and saw signo 5 too)
  forktest#8/22920: EXITING, ERROR: wait on PTRACE_SINGLESTEP #4943: wanted STATE_STOPPED, saw STATE_SIGNALED instead (and saw signo 5 too)
  forktest#6/22916: EXITING, no error
  One or more tests FAILED.

Here are the observations from my testing so far:

  - It has never failed when confined to a single cpu.
  - It has never failed on !rt kernels.
  - It has never failed on any kernel prior to 4.13.
  - More failures than what chance would dictate happen
    near the end of a test run (ie, if a test of 10,000 
    steps is run, the failure will be at the 9,xxx'th step,
    if 100,000 steps are run, the failure will be at
    the 9x,xxx'th step).

The above results are from kernels 4.{4,9,11,13,14,19}-rt
and some !rt's of these as well.

I have yet to see or hear of this bug, if it is a bug,
giving anyone a problem in a debug session.  It might not
even be a bug but merely expected behaviour. And of course
there is the possibility of a misuse of ptrace/waitpid in
my test program. Its API, after all, is rather convoluted.

Regards,
Joe




/*
 * Have a tracer do a bunch of PTRACE_SINGLESTEPs against
 * a tracee as fast as possible.  Create several of these
 * tracer/tracee pairs and see if they can be made to
 * interfere with each other.
 *
 * Usage:
 *   ssdd nforks niters
 * Where:
 *   nforks - number of tracer/tracee pairs to fork off.
 *            default 10.
 *   niters - number of PTRACE_SINGLESTEP iterations to
 *            do before declaring success, for each tracer/
 *            tracee pair set up.  Default 10,000.
 *
 * The tracee waits on each PTRACE_SINGLESTEP with a waitpid(2)
 * and checks that waitpid's return values for correctness.
 */
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <unistd.h>
#include <string.h>
#include <signal.h>
#include <errno.h>

#include <sys/types.h>
#include <sys/wait.h>
#include <sys/ptrace.h>

/* do_wait return values */
#define STATE_EXITED	1
#define STATE_STOPPED	2
#define STATE_SIGNALED	3
#define STATE_UNKNOWN	4
#define STATE_ECHILD	5
#define STATE_EXITED_TSIG	6	/* exited with termination signal */
#define STATE_EXITED_ERRSTAT	7	/* exited with non-zero status */

char *state_name[] = {
	[STATE_EXITED] = "STATE_EXITED",
	[STATE_STOPPED] = "STATE_STOPPED",
	[STATE_SIGNALED] = "STATE_SIGNALED",
	[STATE_UNKNOWN] = "STATE_UNKNOWN",
	[STATE_ECHILD] = "STATE_ECHILD",
	[STATE_EXITED_TSIG] = "STATE_EXITED_TSIG",
	[STATE_EXITED_ERRSTAT] = "STATE_EXITED_ERRSTAT"
};

const char *get_state_name(int state)
{
	if (state < STATE_EXITED || state > STATE_EXITED_ERRSTAT)
		return "?";
	return state_name[state];
}

#define unused __attribute__((unused))

static int got_sigchld;

static int do_wait(pid_t *wait_pid, int *ret_sig)
{
	int status, child_status;

	*ret_sig = -1; /* initially mark 'nothing returned' */

	while (1) {
		status = waitpid(-1, &child_status, WUNTRACED | __WALL);
		if (status == -1) {
			if (errno == EINTR)
				continue;
			if (errno == ECHILD) {
				*wait_pid = (pid_t)0;
				return STATE_ECHILD;
			}
			printf("do_wait/%d: EXITING, ERROR: "
			       "waitpid() returned errno %d\n",
			       getpid(), errno);
			exit(1);
		}
		break;
	}
	*wait_pid = (pid_t)status;

	if (WIFEXITED(child_status)) {
		if (WIFSIGNALED(child_status))
			return STATE_EXITED_TSIG;
		if (WEXITSTATUS(child_status))
			return STATE_EXITED_ERRSTAT;
		return STATE_EXITED;
	}
	if (WIFSTOPPED(child_status)) {
		*ret_sig = WSTOPSIG(child_status);
		return STATE_STOPPED;
	}
	if (WIFSIGNALED(child_status)) {
		*ret_sig = WTERMSIG(child_status);
		return STATE_SIGNALED;
	}
	return STATE_UNKNOWN;
}

int check_sigchld(void)
{
	int i;
	/*
	 * The signal is asynchronous so give it some
	 * time to arrive.
	 */
	for (i = 0; i < 10 && !got_sigchld; i++)
		usleep(1000); /* 10 msecs */
	for (i = 0; i < 10 && !got_sigchld; i++)
		usleep(2000); /* 20 + 10 = 30 msecs */
	for (i = 0; i < 10 && !got_sigchld; i++)
		usleep(4000); /* 40 + 30 = 70 msecs */
	for (i = 0; i < 10 && !got_sigchld; i++)
		usleep(8000); /* 80 + 40 = 150 msecs */
	for (i = 0; i < 10 && !got_sigchld; i++)
		usleep(16000); /* 160 + 150 = 310 msecs */
	for (i = 0; i < 10 && !got_sigchld; i++)
		usleep(32000); /* 320 + 310 = 630 msecs */

	return got_sigchld;
}

pid_t parent;
int nforks = 10;
int nsteps = 10000;

static void sigchld(int sig, unused siginfo_t * info, unused void *arg)
{
	got_sigchld = 1;
}

static void child_process(void)
{
	unused volatile int i;

	/* wait for ptrace attach */
	usleep(100000);
	while (1)
		i = 0;
}

static int forktests(int testid)
{
	int i, status, ret_sig;
	long pstatus;
	pid_t child, wait_pid;
	struct sigaction act, oact;

	parent = getpid();
	printf("forktest#%d/%d: STARTING\n", testid, parent);

	child = fork();
	if (child == -1) {
		printf("forktest#%d/%d: EXITING, ERROR: "
		       "fork returned errno %d\n", testid, parent, errno);
		exit(1);
	}
	if (!child)
		child_process();

	act.sa_sigaction = sigchld;
	sigemptyset(&act.sa_mask);
	act.sa_flags = SA_SIGINFO;
	status = sigaction(SIGCHLD, &act, &oact);
	if (status) {
		printf("forktest#%d/%d: EXITING, ERROR: "
		       "sigaction returned %d, errno %d\n",
		       testid, parent, status, errno);
		exit(1);
	}

	/* give both our child and parent time to set things up */
	usleep(125000);

	/*
	 * Attach to the child.
	 */
	pstatus = ptrace(PTRACE_ATTACH, child, NULL, NULL);
	if (pstatus == ~0l) {
		printf("forktest#%d/%d: EXITING, ERROR: "
		       "attach failed.  errno %d\n",
		       testid, getpid(), errno);
		exit(1);
	}

	/*
	 * The attach should cause the child to receive a signal.
	 */
	status = do_wait(&wait_pid, &ret_sig);
	if (wait_pid != child) {
		printf("forktest#%d/%d: EXITING, ERROR: "
		       "attach: Unexpected wait pid %d\n",
		       testid, getpid(), wait_pid);
		exit(1);
	}
	if (status != STATE_STOPPED) {
		printf("forktest#%d/%d: EXITING, ERROR: "
		       "attach: wait on PTRACE_ATTACH returned %d "
		       "[%s, wanted STATE_STOPPED], signo %d\n",
		       testid, getpid(), status, get_state_name(status),
		       ret_sig);
		exit(1);
	}
	else if (!check_sigchld()) {
		printf("forktest#%d/%d: EXITING, ERROR: "
		       "wait on PTRACE_ATTACH saw a SIGCHLD count of %d, should be 1\n",
		       testid, getpid(), got_sigchld);
		exit(1);
	}
	got_sigchld = 0;


	/*
	 * Generate 'nsteps' PTRACE_SINGLESTEPs, make sure they all actually step
	 * the tracee.
	 */
	for (i = 0; i < nsteps; i++) {
		pstatus = ptrace(PTRACE_SINGLESTEP, child, NULL, NULL);

		if (pstatus) {
			printf("forktest#%d/%d: EXITING, ERROR: "
			       "PTRACE_SINGLESTEP #%d: returned status %ld, "
			       "errno %d, signo %d\n",
			       testid, getpid(), i, pstatus, errno, ret_sig);
			exit(1);
		}

		status = do_wait(&wait_pid, &ret_sig);
		if (wait_pid != child) {
			printf("forktest#%d/%d: EXITING, ERROR: "
			       "wait on PTRACE_SINGLESTEP #%d: returned wrong pid %d, "
			       "expected %d\n",
			       testid, getpid(), i, wait_pid, child);
			exit(1);
		}
		if (status != STATE_STOPPED) {
			printf("forktest#%d/%d: EXITING, ERROR: "
			       "wait on PTRACE_SINGLESTEP #%d: wanted STATE_STOPPED, "
			       "saw %s instead (and saw signo %d too)\n",
			       testid, getpid(), i,
			       get_state_name(status), ret_sig);
			exit(1);
		}
		if (ret_sig != SIGTRAP) {
			printf("forktest#%d/%d: EXITING, ERROR: "
			       "wait on PTRACE_SINGLESTEP #%d: returned signal %d, "
			       "wanted SIGTRAP\n",
			       testid, getpid(), i, ret_sig);
			exit(1);
		}
		if (!check_sigchld()) {
			printf("forktest#%d/%d: EXITING, ERROR: "
			       "wait on PTRACE_SINGLESTEP #%d: no SIGCHLD seen "
			       "(signal count == 0), signo %d\n",
			       testid, getpid(), i, ret_sig);
			exit(1);
		}
		got_sigchld = 0;
	}

	/*
	 * We are all done, kill the tracee and wait for it to die.  We test
	 * the killing results as much as the above attach and sstep results,
	 * though failure here really isn't the point of this test.
	 */
	status = kill(child, SIGKILL);

	if (status) {
		printf("forktest#%d/%d: EXITING, ERROR: "
		       "kill of child %d returned %d\n",
		       testid, getpid(), child, errno);
		exit(1);
	}

	status = do_wait(&wait_pid, &ret_sig);
	if (wait_pid != child) {
		printf("forktest#%d/%d: EXITING, ERROR: "
		       "kill: Unexpected wait pid %d\n",
		       testid, getpid(), wait_pid);
		exit(1);
	}
	if (status != STATE_SIGNALED) {
		printf("forktest#%d/%d: EXITING, ERROR: "
		       "kill: Unexpected child, do_wait status %d "
		       "[%s, wanted STATE_SIGNALED]\n",
		       testid, getpid(), status, get_state_name(status));
		exit(1);
	}
	if (ret_sig != SIGKILL) {
		printf("forktest#%d/%d: EXITING, ERROR: "
		       "kill: Unexpected child signal %d\n",
		       testid, getpid(), ret_sig);
		exit(1);
	}

	printf("forktest#%d/%d: EXITING, no error\n", testid, parent);
	exit(0);
}

int main(int argc, char **argv)
{
	int i, ret_sig, status;
	pid_t child = 0, wait_pid;
	int error = 0;

	setbuf(stdout, NULL);

	argc--, argv++;
	if (argc) {
		nforks = atoi(*argv);
		argc--, argv++;
		if (argc)
			nsteps = atoi(*argv);
	}
	printf("#forks: %d\n", nforks);
	printf("#steps: %d\n", nsteps);
	printf("\n");

	for (i = 0; i < nforks; i++) {
		child = fork();
		if (child == -1) {
			printf("main: fork returned errno %d\n", errno);
			exit(1);
		}
		if (!child)
			forktests(i);
	}

	for (i = 0; i < nforks; i++) {
		status = do_wait(&wait_pid, &ret_sig);
		if (status != STATE_EXITED) {
			if (0) printf("main/%d: ERROR: "
			       "forktest#%d unexpected do_wait status %d "
			       "[%s, wanted STATE_EXITED]\n",
			       getpid(), wait_pid, status,
			       get_state_name(status));
			error = 1;
		}
	}

	printf("%s.\n", error ?
		"One or more tests FAILED" :
		"All tests PASSED");
	exit(error);
}