linux-kernel - Re: [PATCH 0/3] OOM detection rework v4

Message-Id: <201603120220.GFJ00000.QOLVOtJOMFFSHF@I-love.SAKURA.ne.jp>
Date:	Sat, 12 Mar 2016 02:20:36 +0900
From:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To:	mhocko@...nel.org
Cc:	akpm@...ux-foundation.org, torvalds@...ux-foundation.org,
	hannes@...xchg.org, mgorman@...e.de, rientjes@...gle.com,
	hillf.zj@...baba-inc.com, kamezawa.hiroyu@...fujitsu.com,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/3] OOM detection rework v4

Michal Hocko wrote:
> On Sat 12-03-16 01:49:26, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > What happens without this patch applied. In other words, it all smells
> > > like the IO got stuck somewhere and the direct reclaim cannot perform it
> > > so we have to wait for the flushers to make a progress for us. Are those
> > > stuck? Is the IO making any progress at all or it is just too slow and
> > > it would finish actually.  Wouldn't we just wait somewhere else in the
> > > direct reclaim path instead.
> > 
> > As of next-20160311, CPU usage becomes 0% when this problem occurs.
> > 
> > If I remove
> > 
> >   mm-use-watermak-checks-for-__gfp_repeat-high-order-allocations-checkpatch-fixes
> >   mm: use watermark checks for __GFP_REPEAT high order allocations
> >   mm: throttle on IO only when there are too many dirty and writeback pages
> >   mm-oom-rework-oom-detection-checkpatch-fixes
> >   mm, oom: rework oom detection
> > 
> > then CPU usage becomes 60% and most of allocating tasks
> > are looping at
> > 
> >         /*
> >          * Acquire the oom lock.  If that fails, somebody else is
> >          * making progress for us.
> >          */
> >         if (!mutex_trylock(&oom_lock)) {
> >                 *did_some_progress = 1;
> >                 schedule_timeout_uninterruptible(1);
> >                 return NULL;
> >         }
> > 
> > in __alloc_pages_may_oom() (i.e. OOM-livelock due to the OOM reaper disabled).
> 
> OK, that would suggest that the oom rework patches are not really
> related. They just moved from the livelock to a sleep which is good in
> general IMHO. We even know that it is most probably the IO that is the
> problem because we know that more than half of the reclaimable memory is
> either dirty or under writeback. That is where you should be looking.
> Why the IO is not making progress or such a slow progress.
> 

Excuse me, but I can't understand why you think the oom rework patches are not
related. This problem occurs immediately after the OOM killer is invoked, which
means that there is little reclaimable memory.

  Node 0 DMA32 free:3648kB min:3780kB low:4752kB high:5724kB active_anon:783216kB inactive_anon:6376kB active_file:33388kB inactive_file:40292kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:1032064kB managed:980816kB mlocked:0kB dirty:40232kB writeback:120kB mapped:34720kB shmem:6628kB slab_reclaimable:10528kB slab_unreclaimable:39068kB kernel_stack:20512kB pagetables:8000kB unstable:0kB bounce:0kB free_pcp:1648kB local_pcp:116kB free_cma:0kB writeback_tmp:0kB pages_scanned:964952 all_unreclaimable? yes
  Node 0 DMA32: 860*4kB (UME) 16*8kB (UME) 1*16kB (M) 0*32kB 1*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3648kB

If I remove the oom rework patches, the OOM killer is invoked as well (but
nothing happens due to TIF_MEMDIE), which again means that there is little
reclaimable memory.

My understanding is that memory allocation requests needed for doing I/O cannot
be satisfied because free: is below min:. And since kswapd is stuck, nobody can
perform the operations needed to make 2*(writeback + dirty) > reclaimable false.
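
To make the two conditions above concrete, here is a minimal userspace sketch
that plugs in the kB values from the "Node 0 DMA32" line. The struct and helper
names are hypothetical, not kernel code, and it assumes that "reclaimable"
counts only file-backed pages (active_file + inactive_file + isolated(file)),
i.e. that no swap is available; the kernel's real accounting works in pages
and may differ.

```c
#include <stdbool.h>

/* Hypothetical snapshot of one zone; all values in kB, copied from the log. */
struct zone_snapshot {
	unsigned long free_kb;
	unsigned long min_kb;
	unsigned long active_file_kb;
	unsigned long inactive_file_kb;
	unsigned long isolated_file_kb;
	unsigned long dirty_kb;
	unsigned long writeback_kb;
};

/* Numbers taken from the "Node 0 DMA32" line quoted in this mail. */
static const struct zone_snapshot dma32 = {
	.free_kb          = 3648,
	.min_kb           = 3780,
	.active_file_kb   = 33388,
	.inactive_file_kb = 40292,
	.isolated_file_kb = 128,
	.dirty_kb         = 40232,
	.writeback_kb     = 120,
};

/* True when free memory is below the min watermark. */
static bool below_min_watermark(const struct zone_snapshot *z)
{
	return z->free_kb < z->min_kb;
}

/* Assumption: without swap, only file-backed pages count as reclaimable. */
static unsigned long reclaimable_kb(const struct zone_snapshot *z)
{
	return z->active_file_kb + z->inactive_file_kb + z->isolated_file_kb;
}

/* The check discussed above: 2*(writeback + dirty) > reclaimable. */
static bool throttle_on_io(const struct zone_snapshot *z)
{
	return 2 * (z->writeback_kb + z->dirty_kb) > reclaimable_kb(z);
}
```

Under these assumptions, below_min_watermark(&dma32) is true (3648 < 3780) and
throttle_on_io(&dma32) is true as well (2 * 40352 = 80704 > 73808), so both
conditions hold at the same time for this zone.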