linux-kernel - Re: [PATCH] mm, oom: Fix race when selecting process to kill

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Wed, 6 Nov 2013 08:58:44 -0800
From:	Sameer Nanda <snanda@...omium.org>
To:	Luigi Semenzato <semenzato@...gle.com>
Cc:	msb@...ebook.com, David Rientjes <rientjes@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>, mhocko@...e.cz,
	Johannes Weiner <hannes@...xchg.org>,
	Rusty Russell <rusty@...tcorp.com.au>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm, oom: Fix race when selecting process to kill

(adding back context from thread history)
On Tue, Nov 5, 2013 at 5:18 PM, David Rientjes <rientjes@...gle.com> wrote:
> On Tue, 5 Nov 2013, Sameer Nanda wrote:
>
>> The selection of the process to be killed happens in two spots -- first
>> in select_bad_process and then a further refinement by looking for
>> child processes in oom_kill_process. Since this is a two step process,
>> it is possible that the process selected by select_bad_process may get a
>> SIGKILL just before oom_kill_process executes. If this were to happen,
>> __unhash_process deletes this process from the thread_group list. This
>> then results in oom_kill_process getting stuck in an infinite loop when
>> traversing the thread_group list of the selected process.
>>
>> Fix this race by holding the tasklist_lock across the calls to both
>> select_bad_process and oom_kill_process.
>>
>> Change-Id: I8f96b106b3257b5c103d6497bac7f04f4dff4e60
>> Signed-off-by: Sameer Nanda <snanda@...omium.org>
>
> Nack, we had to avoid taking tasklist_lock for this duration since it
> stalls out forks and exits on other cpus trying to take the writeside with
> irqs disabled to avoid watchdog problems.
>

David -- I think we can make the duration that the tasklist_lock is
held smaller by consolidating the process selection logic that is
currently split across select_bad_process and oom_kill_process into
one place in select_bad_process.  The tasklist_lock would then need to
be held only when the thread lists are being traversed.  Would you be
ok with that?  I can re-spin the patch if that sounds like a workable
option.

> What kernel version are you patching?  If you check the latest Linus tree,
> we hold a reference to the task_struct of the chosen process before
> calling oom_kill_process() so the hypothesis would seem incorrect.


On Tue, Nov 5, 2013 at 11:17 PM, Luigi Semenzato <semenzato@...gle.com> wrote:
> Regarding other fixes: would it be possible to have the thread
> iterator insert a dummy marker element in the thread list before the
> scan?  There would be one such dummy element per CPU, so that multiple
> CPUs can scan the list in parallel.  The loop would skip such
> elements, and each dummy element would be removed at the end of each
> scan.
>
> I think this would work, i.e. it would have all the right properties,
> but I don't have a sense of whether the performance impact is
> acceptable.  Probably not, or it would have been proposed earlier.
>
>
>
> On Tue, Nov 5, 2013 at 8:45 PM, Luigi Semenzato <semenzato@...gle.com> wrote:
>> It's interesting that this was known for 3+ years, but nobody bothered
>> adding a small warning to the code.
>>
>> We noticed this because it's actually happening on Chromebooks in the
>> field.  We try to minimize OOM kills, but we can deal with them.  Of
>> course, a hung kernel we cannot deal with.
>>
>> On Tue, Nov 5, 2013 at 7:04 PM, Sameer Nanda <snanda@...omium.org> wrote:
>>>
>>>
>>>
>>> On Tue, Nov 5, 2013 at 5:27 PM, David Rientjes <rientjes@...gle.com> wrote:
>>>>
>>>> On Tue, 5 Nov 2013, Luigi Semenzato wrote:
>>>>
>>>> > It's not enough to hold a reference to the task struct, because it can
>>>> > still be taken out of the circular list of threads.  The RCU
>>>> > assumptions don't hold in that case.
>>>> >
>>>>
>>>> Could you please post a proper bug report that isolates this at the cause?
>>>
>>>
>>> We've been running into this issue on Chrome OS. crbug.com/256326 has
>>> additional
>>> details.  The issue manifests itself as a soft lockup.
>>>
>>> The kernel we've been seeing this on is 3.8.
>>>
>>> We have a pretty consistent repro currently.  Happy to try out other
>>> suggestions
>>> for a fix.
>>>
>>>>
>>>>
>>>> Thanks.
>>>
>>>
>>>
>>>
>>> --
>>> Sameer



-- 
Sameer
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/