linux-kernel - Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <AM6PR03MB51708FD4226E07AB7CB0D6A7E4C10@AM6PR03MB5170.eurprd03.prod.outlook.com>
Date:   Thu, 9 Apr 2020 19:17:48 +0200
From:   Bernd Edlinger <bernd.edlinger@...mail.de>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Waiman Long <longman@...hat.com>, Ingo Molnar <mingo@...nel.org>,
        Will Deacon <will@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Alexey Gladkov <gladkov.alexey@...il.com>,
        Oleg Nesterov <oleg@...hat.com>
Subject: Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1


On 4/9/20 7:03 PM, Eric W. Biederman wrote:
> 
> Adding Oleg to the conversation if for no other reason that he can see
> it is happening.
> 
> Oleg has had a test case where this can happen for years and nothing
> has come out as an obvious proper fix for this deadlock issue.
> 

Just for reference, I used Oleg's test case,
and improved it a bit.  The test case  anticipates the
EAGAIN return code from PTRACE_ATTACH.  This is likely
to change somehow.
If Linus's idea works, you will probably have to
look at adjusting the test expectations again.

I would still be surprised if any other solution works.


Bernd. 

> Linus Torvalds <torvalds@...ux-foundation.org> writes:
> 
>> On Thu, Apr 9, 2020 at 9:15 AM Linus Torvalds
>> <torvalds@...ux-foundation.org> wrote:
>>>
>>> may_ptrace_stop() is supposed to stop the blocking exactly so that it
>>> doesn't deadlock.
>>>
>>> I wonder why that doesn't work..
>>>
>>> [ Goes and look ]
>>>
>>> Oh. I see.
>>>
>>> That ptrace_may_stop() only ever considered core-dumping, not execve().
>>>
>>> But if _that_ is the reason for the deadlock, then it's trivially fixed.
>>
>> So maybe may_ptrace_stop() should just do something like this
>> (ENTIRELY UNTESTED):
>>
>>         struct task_struct *me = current, *parent = me->parent;
>>
>>         if (!likely(me->ptrace))
>>                 return false;
>>
>>         /* If the parent is exiting or core-dumping, it's not
>> listening to our signals */
>>         if (parent->signal->flags & (SIGNAL_GROUP_EXIT | SIGNAL_GROUP_COREDUMP))
>>                 return false;
>>
>>         /* if the parent is going through a execve(), it's not listening */
>>         if (parent->signal->group_exit_task)
>>                 return false;
>>
>>         return true;
>>
>> instead of the fairly ad-hoc tests for core-dumping.
>>
>> The above is hand-wavy - I didn't think a lot about locking.
>> may_ptrace_stop() is already called under the tasklist_lock, so the
>> parent won't change, but maybe it should take the signal lock?
>>
>> So the above very much is *not* meant to be a "do it like this", more
>> of a "this direction, maybe"?
>>
>> The existing code is definitely broken. It special-cases core-dumping
>> probably simply because that's the only case people had realized, and
>> not thought of the execve() thing.
> 
> 
> I don't see how there can be a complete solution in may_ptrace_stop.
> 
> a) We must stop in PTRACE_EVENT_EXIT during exec or userspace *breaks*.
> 
>    Those are the defined semantics and I believe it is something
>    as common as strace that depends on them.
> 
> b) Even if we added a test for our ptrace parent blocking in a ptrace
>    attach of an ongoing exec, it still wouldn't help.
> 
>    That ptrace attach could legitimately come after the thread in
>    question has stopped and notified it's parent it is stopped.
> 
> 
> 
> None of this is to say I like the semantics of PTRACE_EVENT_EXIT.  It is
> just we will violate the no regressions rule if we don't stop there
> during exec.
> 
> The normal case is that the strace or whomever is already attached to
> all of the threads during exec and no deadlock occurs.  So the current
> behavior is quite usable.
> 
> 
> 
> Maybe my memory is wrong that userspace cares but I really don't think
> so.
> 
> 
> Eric
>