linux-kernel - Re: [PATCH] mm,oom: fix oom invocation issues

Date:   Thu, 18 May 2017 11:00:39 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:     guro@...com, hannes@...xchg.org, vdavydov.dev@...il.com,
        kernel-team@...com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm,oom: fix oom invocation issues

On Thu 18-05-17 10:47:29, Michal Hocko wrote:
> On Thu 18-05-17 07:03:36, Tetsuo Handa wrote:
> > Roman Gushchin wrote:
> > > On Wed, May 17, 2017 at 06:14:46PM +0200, Michal Hocko wrote:
> > > > On Wed 17-05-17 16:26:20, Roman Gushchin wrote:
> > > > [...]
> > > > > [   25.781882] Out of memory: Kill process 492 (allocate) score 899 or sacrifice child
> > > > > [   25.783874] Killed process 492 (allocate) total-vm:2052368kB, anon-rss:1894576kB, file-rss:4kB, shmem-rss:0kB
> > > > 
> > > > Are there any oom_reaper messages? Could you provide the full kernel log
> > > > please?
> > > 
> > > Sure. Sorry, it was too bulky, so I've cut the line about oom_reaper by mistake.
> > > Here it is:
> > > --------------------------------------------------------------------------------
> > > [   25.721494] allocate invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null),  order=0, oom_score_adj=0
> > > [   25.725658] allocate cpuset=/ mems_allowed=0
> > 
> > > [   25.759892] Node 0 DMA32 free:44700kB min:44704kB low:55880kB high:67056kB active_anon:1944216kB inactive_anon:204kB active_file:592kB inactive_file:0kB unevictable:0kB writepending:304kB present:2080640kB managed:2031972kB mlocked:0kB slab_reclaimable:11336kB slab_unreclaimable:9784kB kernel_stack:1776kB pagetables:6932kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> > 
> > > [   25.781882] Out of memory: Kill process 492 (allocate) score 899 or sacrifice child
> > > [   25.783874] Killed process 492 (allocate) total-vm:2052368kB, anon-rss:1894576kB, file-rss:4kB, shmem-rss:0kB
> > 
> > > [   25.785680] allocate: page allocation failure: order:0, mode:0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null)
> > > [   25.786797] allocate cpuset=/ mems_allowed=0
> > 
> > This is a side effect of commit 9a67f6488eca926f ("mm: consolidate GFP_NOFAIL
> > checks in the allocator slowpath") which I noticed at
> > http://lkml.kernel.org/r/e7f932bf-313a-917d-6304-81528aca5994@I-love.SAKURA.ne.jp .
> 
> Hmm, I guess you are right. I hadn't realized that pagefault_out_of_memory
> can race and pick another victim. For some reason I thought that the
> page fault would break out on a pending fatal signal, but we don't do that (we
> used to in the past). Now that I think about it more, we should
> probably remove out_of_memory from pagefault_out_of_memory completely.
> It is racy and it basically doesn't have any allocation context, so we
> might kill a task from a different domain. So can we do this instead?
> There is a slight risk that somebody might return VM_FAULT_OOM
> without doing an allocation, but from my quick look nobody does that
> currently.

If this is considered too risky, then we can do what Roman was proposing:
check tsk_is_oom_victim in pagefault_out_of_memory and bail out.
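
To illustrate the bail-out idea, here is a minimal userspace sketch in C. It
is not kernel code and not the actual patch: struct task_model, the
oom_victim flag, out_of_memory_model() and pagefault_out_of_memory_model()
are hypothetical stand-ins for task_struct, tsk_is_oom_victim(),
out_of_memory() and pagefault_out_of_memory(). It only models the control
flow: a task already selected as an OOM victim returns early, so no second
victim is picked.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's task_struct. */
struct task_model {
    bool oom_victim;    /* models what tsk_is_oom_victim() would report */
};

/* Counts how many times the modelled OOM killer has been invoked. */
static int oom_killer_invocations;

/* Stand-in for out_of_memory(): it only records that it was called. */
static void out_of_memory_model(void)
{
    oom_killer_invocations++;
}

/*
 * Models the proposed check in pagefault_out_of_memory(): if the
 * faulting task has already been chosen as an OOM victim, bail out
 * instead of invoking the OOM killer again.
 * Returns true if the modelled OOM killer was invoked.
 */
static bool pagefault_out_of_memory_model(const struct task_model *current_task)
{
    if (current_task->oom_victim)
        return false;   /* already dying, do not select another victim */

    out_of_memory_model();
    return true;
}
```

In this model, calling pagefault_out_of_memory_model() for a task with
oom_victim set leaves oom_killer_invocations unchanged, while calling it for
any other task increments the counter once.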
-- 
Michal Hocko
SUSE Labs