linux-kernel - Re: ptrace() hangs on attempt to seize/attach stopped & frozen task

Message-ID: <20151116184516.GJ18894@mtj.duckdns.org>
Date:	Mon, 16 Nov 2015 13:45:16 -0500
From:	Tejun Heo <tj@...nel.org>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Jan Kratochvil <jan.kratochvil@...hat.com>,
	Pedro Alves <palves@...hat.com>,
	Andrey Ryabinin <aryabinin@...tuozzo.com>,
	Roland McGrath <roland@...k.frob.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: ptrace() hangs on attempt to seize/attach stopped & frozen task

Hello, Oleg.

Sorry about the delay.

On Tue, Nov 10, 2015 at 09:20:17PM +0100, Oleg Nesterov wrote:
> > We simply need to reimplement cgroup freezer so that its userland
> > visible state is well defined (most likely jobctl stop).  Right now,
> > it's allowing userland to trigger "stuck somewhere in the kernel"
> > condition, so interactions with frozen tasks are naturally broken.
> 
> I agree, the freezer is not perfect, and it needs changes.
> 
> Still I think this needs a fix in ptrace code. At least we should not
> wait in TASK_UNINTERRUPTIBLE state.
> 
> And perhaps we can simply remove this logic? I forgot why we hide this
> STOPPED -> RUNNING -> TRACED transition from the attaching thread. But a
> vague feeling tells me that we discussed this before and perhaps it was me
> who suggested avoiding the user-visible change when you introduced this
> transition...

Heh, it was too long ago for me to remember much. :)

> Anyway, now I do not understand why we want to hide it. Let's consider
> the following "test-case",
> 
> 	void test(int pid)
> 	{
> 		kill(pid, SIGSTOP);
> 		waitpid(pid, NULL, WSTOPPED);
> 
> 		ptrace(PTRACE_ATTACH-OR-PTRACE_SEIZE, pid, 0,0);
> 
> 		assert(ptrace(PTRACE_DETACH, pid, 0,0) == 0);
> 	}
> 
> Yes, it will fail if we remove JOBCTL_TRAPPING. But it can equally fail
> if SIGCONT comes before ATTACH, so perhaps we do not really care?
> 
> Jan, Pedro, do you think the patch below can break gdb somehow? With this
> patch you can never assume that waitpid(WNOHANG) or ptrace(WHATEVER) will
> succeed right after PTRACE_ATTACH/PTRACE_SEIZE, even if you know that the
> tracee was TASK_STOPPED before attach.
> 
> Tejun, do you see any reason to keep JOBCTL_TRAPPING?

Hmmm... It's nasty tho.  We're breaking a guaranteed userland behavior
to mask a deficiency (IMHO it's an outright bug) in a different
subsystem.  The problem here is that cgroup-frozen threads become
un-runnable on a running system, and it doesn't make sense to me to
work around that from all the affected places rather than fixing it at
the source, especially if that involves breaking a known supported
userland behavior.  This is no different from frozen processes
failing to respond to SIGKILL.  I'd be a lot more comfortable stating
that the cgroup freezer is currently broken rather than diddling with
subtle ptrace semantics.
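
For reference, the quoted test case can be fleshed out into something
compilable.  This is only a minimal sketch: the helper names
attach_then_detach() and run_case(), the forked pause() child, and the
return-code convention are assumptions made for illustration and are
not part of the original test.  WUNTRACED is used in place of WSTOPPED
since that is the flag waitpid() documents.  On a kernel that keeps
JOBCTL_TRAPPING, the detach is expected to succeed right after the
attach; no cgroup freezer is involved here.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * The quoted test case, parameterized on the attach request.
 * Hypothetical return convention (not from the original mail):
 *   0  detach succeeded immediately after attach
 *   1  the attach itself was refused with EPERM (e.g. sandbox policy)
 *  -1  anything else, including a failed detach
 */
int attach_then_detach(pid_t pid, int request)
{
	int status;

	assert(request == PTRACE_ATTACH || request == PTRACE_SEIZE);

	if (kill(pid, SIGSTOP) != 0)
		return -1;

	/* Wait until the group stop has been reported to the parent. */
	if (waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status))
		return -1;

	if (ptrace(request, pid, NULL, NULL) != 0)
		return errno == EPERM ? 1 : -1;

	/*
	 * Deliberately no waitpid() between attach and detach: this is
	 * the userland-visible behavior under discussion.
	 */
	return ptrace(PTRACE_DETACH, pid, NULL, NULL) == 0 ? 0 : -1;
}

/* Illustrative harness: fork an idle child, run the test, clean up. */
int run_case(int request)
{
	pid_t pid = fork();
	int rc;

	if (pid < 0)
		return -1;
	if (pid == 0) {
		for (;;)
			pause();	/* child only waits for signals */
	}

	rc = attach_then_detach(pid, request);

	/* SIGKILL terminates the child even if it is still stopped. */
	kill(pid, SIGKILL);
	waitpid(pid, NULL, 0);
	return rc;
}
```

Calling run_case(PTRACE_ATTACH) and run_case(PTRACE_SEIZE) covers the
two variants written as PTRACE_ATTACH-OR-PTRACE_SEIZE in the quoted
code.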

Thanks.

-- 
tejun