lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 09 Apr 2020 12:03:55 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Waiman Long <longman@...hat.com>, Ingo Molnar <mingo@...nel.org>,
        Will Deacon <will@...nel.org>,
        Bernd Edlinger <bernd.edlinger@...mail.de>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Alexey Gladkov <gladkov.alexey@...il.com>,
        Oleg Nesterov <oleg@...hat.com>
Subject: Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1


Adding Oleg to the conversation if for no other reason that he can see
it is happening.

Oleg has had a test case where this can happen for years and nothing
has come out as an obvious proper fix for this deadlock issue.

Linus Torvalds <torvalds@...ux-foundation.org> writes:

> On Thu, Apr 9, 2020 at 9:15 AM Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
>>
>> may_ptrace_stop() is supposed to stop the blocking exactly so that it
>> doesn't deadlock.
>>
>> I wonder why that doesn't work..
>>
>> [ Goes and look ]
>>
>> Oh. I see.
>>
>> That ptrace_may_stop() only ever considered core-dumping, not execve().
>>
>> But if _that_ is the reason for the deadlock, then it's trivially fixed.
>
> So maybe may_ptrace_stop() should just do something like this
> (ENTIRELY UNTESTED):
>
>         struct task_struct *me = current, *parent = me->parent;
>
>         if (!likely(me->ptrace))
>                 return false;
>
>         /* If the parent is exiting or core-dumping, it's not
> listening to our signals */
>         if (parent->signal->flags & (SIGNAL_GROUP_EXIT | SIGNAL_GROUP_COREDUMP))
>                 return false;
>
>         /* if the parent is going through a execve(), it's not listening */
>         if (parent->signal->group_exit_task)
>                 return false;
>
>         return true;
>
> instead of the fairly ad-hoc tests for core-dumping.
>
> The above is hand-wavy - I didn't think a lot about locking.
> may_ptrace_stop() is already called under the tasklist_lock, so the
> parent won't change, but maybe it should take the signal lock?
>
> So the above very much is *not* meant to be a "do it like this", more
> of a "this direction, maybe"?
>
> The existing code is definitely broken. It special-cases core-dumping
> probably simply because that's the only case people had realized, and
> not thought of the execve() thing.


I don't see how there can be a complete solution in may_ptrace_stop.

a) We must stop in PTRACE_EVENT_EXIT during exec or userspace *breaks*.

   Those are the defined semantics and I believe it is something
   as common as strace that depends on them.

b) Even if we added a test for our ptrace parent blocking in a ptrace
   attach of an ongoing exec, it still wouldn't help.

   That ptrace attach could legitimately come after the thread in
   question has stopped and notified it's parent it is stopped.



None of this is to say I like the semantics of PTRACE_EVENT_EXIT.  It is
just we will violate the no regressions rule if we don't stop there
during exec.

The normal case is that the strace or whomever is already attached to
all of the threads during exec and no deadlock occurs.  So the current
behavior is quite usable.



Maybe my memory is wrong that userspace cares but I really don't think
so.


Eric

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ