Message-ID: <CAA25o9Q-HvjQ_5pFJgYNeutaCoYgPu=e=k7EHq=6-+jeEuhzoA@mail.gmail.com>
Date:	Tue, 5 Nov 2013 23:17:16 -0800
From:	Luigi Semenzato <semenzato@...gle.com>
To:	Sameer Nanda <snanda@...omium.org>, msb@...ebook.com
Cc:	David Rientjes <rientjes@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>, mhocko@...e.cz,
	Johannes Weiner <hannes@...xchg.org>,
	Rusty Russell <rusty@...tcorp.com.au>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm, oom: Fix race when selecting process to kill

Regarding other fixes: would it be possible to have the thread
iterator insert a dummy marker element in the thread list before the
scan?  There would be one such dummy element per CPU, so that multiple
CPUs can scan the list in parallel.  The loop would skip such
elements, and each dummy element would be removed at the end of each
scan.

I think this would work, i.e. it would have all the right properties,
but I don't have a sense of whether the performance impact is
acceptable.  Probably not, or it would have been proposed earlier.
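
A minimal user-space sketch of the marker idea above, to make the intended
behavior concrete.  Everything in it is hypothetical: struct node, scan_begin(),
scan_next() and remove_real() are illustrative names, not kernel APIs, and the
locking or RCU protection a real implementation would need around list
updates is left out on purpose.  The scanner's position is its own marker
element, so removing the element it just visited does not strand the scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread-list entry; not the kernel's task_struct. */
struct node {
    struct node *prev, *next;
    bool is_marker;   /* true for a scanner's dummy element */
    int id;           /* payload for real elements */
};

static void list_init(struct node *head)
{
    head->prev = head->next = head;
    head->is_marker = false;
    head->id = -1;
}

static void insert_after(struct node *pos, struct node *n)
{
    n->prev = pos;
    n->next = pos->next;
    pos->next->prev = n;
    pos->next = n;
}

static void unlink_node(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;   /* the removed node no longer points into the list */
}

/* Remove a real element; markers are only ever removed by their own scanner. */
static void remove_real(struct node *n)
{
    assert(!n->is_marker);
    unlink_node(n);
}

/* Start a scan by placing this scanner's marker right after the list head. */
static void scan_begin(struct node *head, struct node *marker)
{
    marker->is_marker = true;
    marker->id = -1;
    insert_after(head, marker);
}

/*
 * Return the next real element after the marker and move the marker past it.
 * Other scanners' markers are skipped.  At the end of the list the marker is
 * removed and NULL is returned.
 */
static struct node *scan_next(struct node *head, struct node *marker)
{
    struct node *n = marker->next;

    while (n != head && n->is_marker)
        n = n->next;
    unlink_node(marker);
    if (n == head)
        return NULL;
    insert_after(n, marker);
    return n;
}

/* Scan ids 1..5, removing elements 2 and 3 while element 2 is being visited. */
static int demo_scan_with_removal(int *sum_out)
{
    struct node head, items[5], marker, *n;
    int visited = 0, sum = 0;

    list_init(&head);
    for (int i = 0; i < 5; i++) {
        items[i].is_marker = false;
        items[i].id = i + 1;
        insert_after(head.prev, &items[i]);   /* append at the tail */
    }

    scan_begin(&head, &marker);
    while ((n = scan_next(&head, &marker)) != NULL) {
        visited++;
        sum += n->id;
        if (n->id == 2) {
            remove_real(&items[1]);   /* the element just visited */
            remove_real(&items[2]);   /* the element that would have come next */
        }
    }
    *sum_out = sum;
    return visited;
}
```

Calling demo_scan_with_removal() from a test program walks elements 1, 2, 4
and 5 and terminates normally, where an iterator that kept a pointer to
element 2 would be left following pointers of a node that is no longer on the
list.  The cost is two extra list updates per step, which is the performance
question raised above.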



On Tue, Nov 5, 2013 at 8:45 PM, Luigi Semenzato <semenzato@...gle.com> wrote:
> It's interesting that this was known for 3+ years, but nobody bothered
> adding a small warning to the code.
>
> We noticed this because it's actually happening on Chromebooks in the
> field.  We try to minimize OOM kills, but we can deal with them.  Of
> course, a hung kernel we cannot deal with.
>
> On Tue, Nov 5, 2013 at 7:04 PM, Sameer Nanda <snanda@...omium.org> wrote:
>>
>>
>>
>> On Tue, Nov 5, 2013 at 5:27 PM, David Rientjes <rientjes@...gle.com> wrote:
>>>
>>> On Tue, 5 Nov 2013, Luigi Semenzato wrote:
>>>
>>> > It's not enough to hold a reference to the task struct, because it can
>>> > still be taken out of the circular list of threads.  The RCU
>>> > assumptions don't hold in that case.
>>> >
>>>
>>> Could you please post a proper bug report that isolates this at the cause?
>>
>>
>> We've been running into this issue on Chrome OS. crbug.com/256326 has
>> additional details.  The issue manifests itself as a soft lockup.
>>
>> The kernel we've been seeing this on is 3.8.
>>
>> We have a pretty consistent repro currently.  Happy to try out other
>> suggestions for a fix.
>>
>>>
>>>
>>> Thanks.
>>
>>
>>
>>
>> --
>> Sameer