Message-ID: <AANLkTik1+XVrkwTouERoxF2AWYyv-UCqie_Wq3OLjoXg@mail.gmail.com>
Date: Mon, 28 Feb 2011 14:16:48 +0100
From: Denys Vlasenko <vda.linux@...glemail.com>
To: Tejun Heo <tj@...nel.org>
Cc: Oleg Nesterov <oleg@...hat.com>,
Roland McGrath <roland@...hat.com>, jan.kratochvil@...hat.com,
linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
akpm@...ux-foundation.org
Subject: Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
On Mon, Feb 28, 2011 at 1:56 PM, Tejun Heo <tj@...nel.org> wrote:
>> * group-stop state is currently not preserved across ptrace-stop.
>> This makes, in particular, ^Z and SIGSTOP inoperative for straced
>> programs. Everyone agrees this needs to be fixed.
>> (There is a small bug of not notifying real parent about the group-stop,
>> I don't want to go there since it is also non-contentious - everybody
>> is in agreement this also should be fixed in "obvious" way).
>
> Yeap, we do agree on this one, unfortunately not on how yet.
>
>> * HOWEVER, this behavior _is_ indeed used by gdb to run small fragments
>> of tracee even if it's stopped. Jan's example:
>> # gdb -p applicationpid
>> (gdb) print getpid()
>> (gdb) print show_me_your_internal_debug_dump()
>> (gdb) continue
>> gdb people want to preserve this feature.
>> How gdb implements this? I assume it does this by modifying IP,
>> setting a breakpoint on return address, and issues PTRACE_CONT(0).
>> Currently it works because of "group-stop is ignored under ptrace" bug.
>
> I don't think it works because of "group-stop is ignored under ptrace"
> bug.
How so?
Imagine the following: the tracee is stopped (two cases: it was stopped
before we attached to it, or it was stopped by SIGSTOP during the debug session),
and we are running on a hypothetical kernel which preserves group-stop.
At this point, the gdb user does this:
(gdb) print getpid()
gdb modifies IP, sets a breakpoint on the return address, and issues PTRACE_CONT(0).
The kernel has to put the tracee back into group-stop, right?
Because if it doesn't, if it lets the tracee run, then the kernel is
still broken. For example,
stracing a program and sending SIGSTOP to it won't work: the sequence
of events will be
    got SIGSTOP because SIGSTOP was delivered
    PTRACE_SYSCALL(SIGSTOP) - "inject it"
    got SIGSTOP because tracee is in group-stop now
    PTRACE_SYSCALL(SIGSTOP) - equivalent to PTRACE_SYSCALL(0)
        because we aren't in signal delivery ptrace-stop
and tracee continues.
That's why I think gdb's "print getpid()" today depends on the bug.
If we simply fix the bug (by making PTRACE_CONT/SYSCALL(0)
re-enter group-stop), then "print getpid()" will stop working
for stopped tracees.
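
To make the two cases concrete, here is a toy model in C. It is only a sketch of the argument above, not kernel code: the state names, the preserve_group_stop flag and every function name are made up for illustration, and real ptrace has many more states than this. The flag selects between today's behavior (false) and the hypothetical kernel where PTRACE_CONT/SYSCALL(0) re-enters group-stop (true).

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy states of a traced task as seen by the tracer (illustrative only). */
enum tracee_state { RUNNING, SIGNAL_DELIVERY_STOP, GROUP_STOP };

/* SIGSTOP is about to be delivered; under ptrace the tracer first sees
 * a signal-delivery-stop and decides what to do with the signal. */
static void deliver_sigstop(enum tracee_state *t)
{
    assert(t != NULL);
    *t = SIGNAL_DELIVERY_STOP;
}

/* Models PTRACE_CONT(sig) / PTRACE_SYSCALL(sig).
 * preserve_group_stop == false: today's behavior as described above.
 * preserve_group_stop == true:  hypothetical kernel that re-enters
 *                               group-stop on resume. */
static void ptrace_resume(enum tracee_state *t, int sig,
                          bool preserve_group_stop)
{
    assert(t != NULL);
    switch (*t) {
    case SIGNAL_DELIVERY_STOP:
        /* Only in this stop is sig actually injected. */
        *t = (sig == SIGSTOP) ? GROUP_STOP : RUNNING;
        break;
    case GROUP_STOP:
        /* Not a signal-delivery-stop: sig is ignored, acts like (0). */
        *t = preserve_group_stop ? GROUP_STOP : RUNNING;
        break;
    case RUNNING:
        break;
    }
}

/* strace case: SIGSTOP arrives, the tracer injects it, then passes the
 * reported SIGSTOP again. Returns true if the tracee ends up stopped,
 * which is what the user who pressed ^Z expects. */
static bool strace_tracee_stays_stopped(bool preserve_group_stop)
{
    enum tracee_state t = RUNNING;
    deliver_sigstop(&t);
    ptrace_resume(&t, SIGSTOP, preserve_group_stop); /* "inject it" */
    ptrace_resume(&t, SIGSTOP, preserve_group_stop); /* same as (0) */
    return t == GROUP_STOP;
}

/* gdb case: the tracee is already in group-stop and gdb issues
 * PTRACE_CONT(0) to run "print getpid()". Returns true if it runs. */
static bool gdb_inferior_call_runs(bool preserve_group_stop)
{
    enum tracee_state t = GROUP_STOP;
    ptrace_resume(&t, 0, preserve_group_stop);
    return t == RUNNING;
}
```

In this model, with preserve_group_stop == false the strace case ends with the tracee running and the gdb call runs; with preserve_group_stop == true both results flip. In the model, one flag cannot satisfy both users, which is the conflict I am describing.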
> IMO, it's because ptrace is inherently per-task not
> per-task-group, which I think is the right way to do it.
Yes, it is, and I don't propose to change that.
However, I don't see how that is relevant to the examples
I just described.
> Yeah, agreed and as I said multiple times I think this is by design
> and actually the better and more useful behavior, albeit slightly less
> intuitive.
As I described, the current behavior breaks stracing of programs
which get SIGSTOPed or SIGTSTP'ed (^Z).
Which is pretty lame - ^Z is not exactly a rare use case.
--
vda