Message-ID: <20160308090818.GA13542@dhcp22.suse.cz>
Date:	Tue, 8 Mar 2016 10:08:18 +0100
From:	Michal Hocko <mhocko@...nel.org>
To:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Cc:	Hugh Dickins <hughd@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Mel Gorman <mgorman@...e.de>,
	David Rientjes <rientjes@...gle.com>,
	Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
	Hillf Danton <hillf.zj@...baba-inc.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	linux-mm@...ck.org, LKML <linux-kernel@...r.kernel.org>,
	Joonsoo Kim <js1304@...il.com>,
	Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [PATCH] mm, oom: protect !costly allocations some more (was: Re:
 [PATCH 0/3] OOM detection rework v4)

On Tue 08-03-16 12:51:04, Sergey Senozhatsky wrote:
> Hello Michal,
> 
> On (03/07/16 17:08), Michal Hocko wrote:
> > On Mon 29-02-16 22:02:13, Michal Hocko wrote:
> > > Andrew,
> > > could you queue this one as well, please? This is more a band aid than a
> > > real solution which I will be working on as soon as I am able to
> > > reproduce the issue but the patch should help to some degree at least.
> > 
> > Joonsoo wasn't very happy about this approach so let me try a different
> > way. What do you think about the following? Hugh, Sergey does it help
> > for your load? I have tested it with the Hugh's load and there was no
> > major difference from the previous testing so at least nothing has blown
> > up as I am not able to reproduce the issue here.
> 
> (next-20160307 + "[PATCH] mm, oom: protect !costly allocations some more")
> 
> seems it's significantly less likely to oom-kill now, but I still can see
> something like this

Thanks for the testing. This is highly appreciated. If you are able to
reproduce this, then collecting compaction-related tracepoints would be
really helpful.
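
One possible way to switch those tracepoints on is sketched below. It
assumes tracefs is reachable at /sys/kernel/debug/tracing (some systems
mount it at /sys/kernel/tracing instead) and that it runs as root; the
function name is made up for illustration.

```python
from pathlib import Path

def enable_compaction_events(tracefs_root="/sys/kernel/debug/tracing"):
    """Turn on every event in the 'compaction' trace group.

    tracefs_root is an assumption: adjust it to wherever tracefs is
    mounted on the test machine. Writing "1" needs root privileges.
    """
    enable = Path(tracefs_root) / "events" / "compaction" / "enable"
    enable.write_text("1")
    return enable

# Typical use on the test box (as root), before reproducing the load:
#   enable_compaction_events()
# The events can then be read from <tracefs_root>/trace_pipe.
```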

> [  501.942745] coretemp-sensor invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=2, oom_score_adj=0
[...]
> [  501.942853] active_anon:151312 inactive_anon:54791 isolated_anon:0
>                 active_file:31213 inactive_file:302048 isolated_file:0
>                 unevictable:0 dirty:44 writeback:221 unstable:0
>                 slab_reclaimable:43570 slab_unreclaimable:5651
>                 mapped:16660 shmem:29495 pagetables:2542 bounce:0
>                 free:10884 free_pcp:214 free_cma:0
[...]
> [  501.942867] DMA32 free:23664kB min:6232kB low:9332kB high:12432kB active_anon:516228kB inactive_anon:129136kB active_file:96508kB inactive_file:954780kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3194880kB managed:3107512kB mlocked:0kB dirty:136kB writeback:440kB mapped:51816kB shmem:91488kB slab_reclaimable:129856kB slab_unreclaimable:13876kB kernel_stack:2160kB pagetables:7888kB unstable:0kB bounce:0kB free_pcp:724kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:128 all_unreclaimable? no
> [  501.942870] lowmem_reserve[]: 0 0 824 824
> [  501.942876] Normal free:4784kB min:1696kB low:2540kB high:3384kB active_anon:89020kB inactive_anon:90028kB active_file:28248kB inactive_file:253308kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:917504kB managed:844512kB mlocked:0kB dirty:40kB writeback:444kB mapped:14700kB shmem:26492kB slab_reclaimable:44396kB slab_unreclaimable:8620kB kernel_stack:1328kB pagetables:2280kB unstable:0kB bounce:0kB free_pcp:244kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:60 all_unreclaimable? no

Both the DMA32 and Normal zones are above their high watermarks, so this
OOM is due to memory fragmentation.
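
The comparison behind that statement can be reproduced from the report
itself. This is only a rough sketch: it compares free against high for
each zone line (shortened here to the leading fields) and ignores
lowmem_reserve and per-order accounting. The helper names are
hypothetical.

```python
import re

def zone_watermarks(line):
    """Extract free/min/low/high (in kB) from one zone line of the report."""
    fields = dict(re.findall(r"(\w+):(\d+)kB", line))
    return {key: int(fields[key]) for key in ("free", "min", "low", "high")}

def above_high(line):
    """True if the zone's free memory exceeds its high watermark."""
    wm = zone_watermarks(line)
    return wm["free"] > wm["high"]

# Leading fields of the two zone lines quoted above.
dma32 = "DMA32 free:23664kB min:6232kB low:9332kB high:12432kB"
normal = "Normal free:4784kB min:1696kB low:2540kB high:3384kB"

print(above_high(dma32), above_high(normal))  # prints: True True
```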

> [  501.942912] DMA32: 564*4kB (UME) 2700*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 23856kB
> [  501.942921] Normal: 959*4kB (ME) 128*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4860kB

No order-2 or larger pages are available, even though we know that
compaction was active and didn't back out early. I might be missing
something, of course, and the patch might still be tweaked to be more
conservative. Tracepoints should tell us more, though.
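
For reference, the free lists quoted above can be checked mechanically.
The sketch below assumes 4kB base pages and uses made-up helper names; it
maps each "count*sizekB" entry to its order and tests whether any block
of the requested order or higher is free.

```python
import re

PAGE_KB = 4  # assumption: 4 kB base pages, as in the report

def free_blocks_by_order(line):
    """Map buddy order -> number of free blocks from a 'N*SIZEkB' list."""
    counts = {}
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", line):
        order = (int(size_kb) // PAGE_KB).bit_length() - 1
        counts[order] = int(count)
    return counts

def can_satisfy(line, order):
    """True if at least one free block of the given order or larger exists."""
    return any(n > 0 for o, n in free_blocks_by_order(line).items() if o >= order)

def total_free_kb(line):
    """Sum of all free blocks in kB; should match the '= NkB' total."""
    return sum(n * (PAGE_KB << o) for o, n in free_blocks_by_order(line).items())

dma32 = ("DMA32: 564*4kB (UME) 2700*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB "
         "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 23856kB")
normal = ("Normal: 959*4kB (ME) 128*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB "
          "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4860kB")

print(can_satisfy(dma32, 2), can_satisfy(normal, 2))  # prints: False False
print(total_free_kb(dma32), total_free_kb(normal))    # prints: 23856 4860
```

Both zones still have order-0 and order-1 blocks free, which is why the
totals look healthy while the order=2 request fails.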

Thanks!
-- 
Michal Hocko
SUSE Labs
