Date:   Fri, 12 May 2017 08:26:27 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Vovo Yang <vovoy@...gle.com>
Cc:     Guenter Roeck <linux@...ck-us.net>, Ingo Molnar <mingo@...nel.org>,
        linux-kernel@...r.kernel.org
Subject: Re: Threads stuck in zap_pid_ns_processes()

Vovo Yang <vovoy@...gle.com> writes:

> On Fri, May 12, 2017 at 7:19 AM, Eric W. Biederman
> <ebiederm@...ssion.com> wrote:
>> Guenter Roeck <linux@...ck-us.net> writes:
>>
>>> What I know so far is
>>> - We see this condition on a regular basis in the field. Regular is
>>>   relative, of course - let's say maybe 1 in a Million Chromebooks
>>>   per day reports a crash because of it. That is not that many,
>>>   but it adds up.
>>> - We are able to reproduce the problem with a performance benchmark
>>>   which opens 100 chrome tabs. While that is a lot, it should not
>>>   result in a kernel hang/crash.
>>> - Vovo provided the test code last night. I don't know if this is
>>>   exactly what is observed in the benchmark, or how it relates to the
>>>   benchmark in the first place, but it is the first time we are actually
>>>   able to reliably create a condition where the problem is seen.
>>
>> Thank you.  It will be interesting to hear what is happening in the
>> chrome performance benchmark that triggers this.
>>
> What's happening in the benchmark:
> 1. A chrome renderer process was created with CLONE_NEWPID
> 2. The process crashed
> 3. Chrome breakpad service calls ptrace(PTRACE_ATTACH, ..) to attach to every
>   thread of the crashed process to dump info
> 4. When breakpad detaches from the crashed process, the crashed process gets
>   stuck in zap_pid_ns_processes()

Very interesting, thank you.

So the question is specifically which interaction is causing this.

In the test case provided, it was a sibling task in the pid namespace
dying and not being reaped, which may be what is happening with
breakpad.  So far I have yet to see a kernel bug, but I won't rule one out.

Eric
