Message-ID: <BANLkTin1q-r=b3ZVbT3vdJ59OEddrRWfsQ@mail.gmail.com>
Date:	Thu, 2 Jun 2011 17:33:33 +0200
From:	Denys Vlasenko <vda.linux@...glemail.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Oleg Nesterov <oleg@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>, indan@....nu,
	bdonlan@...il.com, linux-kernel@...r.kernel.org,
	jan.kratochvil@...hat.com, akpm@...ux-foundation.org
Subject: thread leader death under strace (was Re: [PATCH 03/10] ptrace:
 implement PTRACE_SEIZE)

On Thu, Jun 2, 2011 at 7:01 AM, Tejun Heo <tj@...nel.org> wrote:
> Maybe I misunderstood the problem but wasn't the problem about not
> being able to wait for the exit of a leader thread and detach it?  We
> have reliable (sans exec but that's a different story) exit
> notification with EVENT_EXIT which even reports the exit_code, so I
> don't see what the problem is.  What am I missing?

The problem is that, right now, it seems that if the tracer doesn't catch
EVENT_EXIT and detach the tracee as soon as it sees it, really weird things
happen.

Case 1: the tracer traces the thread leader. An untraced thread execs.
The tracer sees EVENT_EXIT, PTRACE_CONTs the tracee, and...
never sees WIFEXITED; waitpid just blocks forever!

Case 2: discovered when I started experimenting with current kernel
behavior. No execve involved. I just run two tracees, make the
leader exit, then make the other tracee signal itself with a fatal signal
(SIGUSR1).
The tracer sees the leader's exit, but never sees the other tracee's signal!

Please see attached program.
The output on my F15 machine:

4816: thread leader
4816: status:0003057f WIFSTOPPED sig:5 (TRAP) event:CLONE eventdata:0x12d1
4817: status:0000137f WIFSTOPPED sig:19 (STOP) event:none eventdata:0x0
EXITING <=== leader will exit now. tracer sees it:
4816: status:0006057f WIFSTOPPED sig:5 (TRAP) event:EXIT eventdata:0x7700
DYING <=== other thread SIGUSR1's itself. tracer *doesn't see anything*
alarm timed out <=== tracer dies on alarm(1).
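
As a reading aid for the status words in these logs, here is a minimal sketch of how such a value splits into stop signal and ptrace event on Linux. The struct and function names are made up for this note; the event numbers 3 and 6 are the kernel's values for PTRACE_EVENT_CLONE and PTRACE_EVENT_EXIT, repeated locally so the sketch needs no ptrace header.

```c
#include <sys/wait.h>

/* Local copies of the kernel's event numbers, under made-up names. */
enum { EV_NONE = 0, EV_CLONE = 3, EV_EXIT = 6 };

struct decoded_status {
    int stopped;   /* WIFSTOPPED(status) */
    int signaled;  /* WIFSIGNALED(status) */
    int sig;       /* stop signal or terminating signal, else 0 */
    int event;     /* ptrace event from bits 16 and up, 0 if none */
};

/* Split a waitpid() status the way the log lines above present it. */
struct decoded_status decode_status(int status)
{
    struct decoded_status d = {0, 0, 0, EV_NONE};

    if (WIFSTOPPED(status)) {
        d.stopped = 1;
        d.sig = WSTOPSIG(status);
        d.event = (int)((unsigned)status >> 16);
    } else if (WIFSIGNALED(status)) {
        d.signaled = 1;
        d.sig = WTERMSIG(status);
    }
    return d;
}
```

For example, 0x0006057f decodes to a stop with signal 5 (SIGTRAP) and event 6 (EXIT), while 0x0000137f is a plain stop with signal 19 and no event.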

Now, if I send a non-fatal signal first, by uncommenting these lines:
//      usleep(100*1000);
//      VERBOSE("WINCH\n");
//      raise(SIGWINCH);

Output:

4834: thread leader
4834: status:0003057f WIFSTOPPED sig:5 (TRAP) event:CLONE eventdata:0x12e3
4835: status:0000137f WIFSTOPPED sig:19 (STOP) event:none eventdata:0x0
EXITING <=== leader will exit now. tracer sees it:
4834: status:0006057f WIFSTOPPED sig:5 (TRAP) event:EXIT eventdata:0x7700
WINCH <=== other thread WINCH's itself. tracer *doesn't see anything*
DYING <=== other thread SIGUSR1's itself. tracer sees it:
4835: status:00000a7f WIFSTOPPED sig:10 (USR1) event:none eventdata:0x0
      tracer sees other thread about to die from signal 10:
4835: status:0006057f WIFSTOPPED sig:5 (TRAP) event:EXIT eventdata:0xa
      tracer sees other thread die from signal 10:
4835: status:0000000a WIFSIGNALED sig:10 (USR1)
      tracer sees leader die from signal 10:
4834: status:0000000a WIFSIGNALED sig:10 (USR1)
      tracer is informed "no more tracees":
waitpid returned -1
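
A note on the eventdata values at EVENT_EXIT: assuming eventdata is what PTRACE_GETEVENTMSG returned, it holds the tracee's pending exit status in the usual wait-status encoding. A minimal sketch of decoding it follows; the struct and function names are hypothetical.

```c
#include <sys/wait.h>

struct exit_info {
    int by_signal;  /* 1 if the tracee is dying from a signal */
    int value;      /* signal number, or exit code otherwise */
};

/* Interpret the PTRACE_EVENT_EXIT message as a wait status. */
struct exit_info decode_exit_eventmsg(unsigned long msg)
{
    int status = (int)msg;
    struct exit_info e;

    if (WIFSIGNALED(status)) {
        e.by_signal = 1;
        e.value = WTERMSIG(status);
    } else {
        e.by_signal = 0;
        e.value = WEXITSTATUS(status);
    }
    return e;
}
```

With this reading, 0x7700 is the leader's exit code 0x77, and 0xa is death by signal 10 (SIGUSR1).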

This is just plain broken, right?
(1) It should have worked in the first case too.
(2) Where is the SIGWINCH notification? Why didn't ptrace report it?

Re (2): if I disable the leader's exit by removing these lines:
      VERBOSE("EXITING\n");
      syscall(__NR_exit, 0x77);

then I do see SIGWINCH notification:

4891: thread leader
4891: status:0003057f WIFSTOPPED sig:5 (TRAP) event:CLONE eventdata:0x131c
4892: status:0000137f WIFSTOPPED sig:19 (STOP) event:none eventdata:0x0
WINCH
4891: status:00001c7f WIFSTOPPED sig:28 ((null)) event:none eventdata:0x131c
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DYING
4891: status:00000a7f WIFSTOPPED sig:10 (USR1) event:none eventdata:0x131c
4892: status:0006057f WIFSTOPPED sig:5 (TRAP) event:EXIT eventdata:0xa
4891: status:0006057f WIFSTOPPED sig:5 (TRAP) event:EXIT eventdata:0xa
4892: status:0000000a WIFSIGNALED sig:10 (USR1)
4891: status:0000000a WIFSIGNALED sig:10 (USR1)
waitpid returned -1

Looks like the leader's exit throws a wrench into the ptrace machinery.

Now I understand why strace DETACHs on exit! :)
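
The detach-on-exit workaround could be sketched like this. It is not strace's code, only an illustration of the policy: the enum and both function names are hypothetical, and a real tracer would also need to swallow the initial SIGSTOP of a newly attached thread instead of re-injecting it.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

#ifndef PTRACE_EVENT_EXIT
#define PTRACE_EVENT_EXIT 6   /* value from the Linux kernel headers */
#endif

enum tracer_action { ACT_REAPED, ACT_DETACH, ACT_CONT, ACT_CONT_SIG };

/* Decide what to do with a status returned by waitpid() for a tracee. */
enum tracer_action choose_action(int status)
{
    int event;

    if (!WIFSTOPPED(status))
        return ACT_REAPED;      /* WIFEXITED or WIFSIGNALED: nothing to resume */
    event = (int)((unsigned)status >> 16);
    if (event == PTRACE_EVENT_EXIT)
        return ACT_DETACH;      /* let go of the tracee before it dies */
    if (event != 0)
        return ACT_CONT;        /* other event stop: resume without a signal */
    return ACT_CONT_SIG;        /* signal stop: resume and deliver the signal */
}

/* Carry out the chosen action; returns the ptrace() result, or 0. */
long apply_action(pid_t pid, int status)
{
    switch (choose_action(status)) {
    case ACT_DETACH:
        return ptrace(PTRACE_DETACH, pid, NULL, NULL);
    case ACT_CONT:
        return ptrace(PTRACE_CONT, pid, NULL, NULL);
    case ACT_CONT_SIG:
        return ptrace(PTRACE_CONT, pid, NULL, (void *)(long)WSTOPSIG(status));
    case ACT_REAPED:
        break;
    }
    return 0;
}
```

Keeping the decision separate from the ptrace calls makes the policy easy to check against the status words in the logs above: 0x0006057f maps to a detach, 0x00000a7f to a resume that passes SIGUSR1 on.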

-- 
vda

View attachment "thread_leader_exit_with_TRACEEXIT_1.c" of type "text/x-csrc" (7429 bytes)
