linux-kernel - Re: [PATCH] vmalloc: back off only when the current task is OOM killed

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Tue, 10 Oct 2017 21:47:02 +0900
From:   Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To:     mhocko@...nel.org
Cc:     hannes@...xchg.org, akpm@...ux-foundation.org,
        alan@...yncelyn.cymru, hch@....de, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, kernel-team@...com
Subject: Re: [PATCH] vmalloc: back off only when the current task is OOM killed

Michal Hocko wrote:
> On Tue 10-10-17 19:58:53, Tetsuo Handa wrote:
> > Commit 5d17a73a2ebeb8d1 ("vmalloc: back off when the current task is
> > killed") revealed two bugs [1] [2] that were not ready to fail vmalloc()
> > upon SIGKILL. But since the intent of that commit was to avoid unlimited
> > access to memory reserves, we should have checked tsk_is_oom_victim()
> > rather than fatal_signal_pending().
> > 
> > Note that even with commit cd04ae1e2dc8e365 ("mm, oom: do not rely on
> > TIF_MEMDIE for memory reserves access"), it is possible to trigger
> > "complete depletion of memory reserves"
> 
> How would that be possible? OOM victims are not allowed to consume whole
> reserves and the vmalloc context would have to do something utterly
> wrong like PF_MEMALLOC to make this happen. Protecting from such a code
> is simply pointless.

Oops. I was confused when writing that part.
Indeed, "complete" was demonstrated without commit cd04ae1e2dc8e365.

> 
> > and "extra OOM kills due to depletion of memory reserves"
> 
> and this is simply the case for the most vmalloc allocations because
> they are not reflected in the oom selection so if there is a massive
> vmalloc consumer it is very likely that we will kill a large part the
> userspace before hitting the user context on behalf which the vmalloc
> allocation is performed.

If there is a massive alloc_page() loop it is as well very likely that
we will kill a large part the userspace before hitting the user context
on behalf which the alloc_page() allocation is performed.

I think that massive vmalloc() consumers should be (as well as massive
alloc_page() consumers) careful such that they will be chosen as first OOM
victim, for vmalloc() does not abort as soon as an OOM occurs. Thus, I used
set_current_oom_origin()/clear_current_oom_origin() when I demonstrated
"complete" depletion.

> 
> I have tried to explain this is not really needed before but you keep
> insisting which is highly annoying. The patch as is is not harmful but
> it is simply _pointless_ IMHO.

Then, how can massive vmalloc() consumers become careful?
Explicitly use __vmalloc() and pass __GFP_NOMEMALLOC ?
Then, what about adding some comment like "Never try to allocate large
memory using plain vmalloc(). Use __vmalloc() with __GFP_NOMEMALLOC." ?