Date:	Tue, 16 Aug 2016 16:10:10 +0200
From:	Michal Hocko <mhocko@...nel.org>
To:	arekm@...en.pl
Cc:	linux-ext4@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] mm, oom: report compaction/migration stats for higher
 order requests

On Tue 16-08-16 13:18:25, Arkadiusz Miskiewicz wrote:
> On Monday 15 of August 2016, Michal Hocko wrote:
> > [Fixing up linux-mm]
> > 
> > Oops, I had a c&p error in the previous patch. Here is an updated patch.
> 
> 
> Going to apply this patch now and report again. In the meantime, what I have is a
> 
> while true; do
>     echo "XX date"; date; echo "XX SLAB"; cat /proc/slabinfo
>     echo "XX VMSTAT"; cat /proc/vmstat; echo "XX free"; free
>     echo "XX DMESG"; dmesg -T | tail -n 50; /bin/sleep 60
> done 2>&1 | tee log
> 
> loop gathering some data while a few OOM conditions happened.
> 
> I was doing "rm -rf copyX; cp -al original copyX" 10x in parallel.
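
Just to make sure I read the workload right, that would be something
like the following sketch (assuming the ten copies are simply named
copy1..copy10 next to "original" and that each rm/cp pair runs as a
background job; the exact invocation is yours, so treat this as an
illustration only):

    # hypothetical sketch: 10 parallel "remove + hard-link copy" jobs
    for i in $(seq 1 10); do
        ( rm -rf "copy$i"; cp -al original "copy$i" ) &
    done
    wait    # wait for all ten jobs to finish
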
> 
> https://ixion.pld-linux.org/~arekm/p2/ext4/log-20160816.txt

David was right in assuming it would be the ext4 inode cache that
consumes a large portion of the memory. /proc/slabinfo shows
ext4_inode_cache consuming between 2.5 and 4.6G of memory.
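
For reference, that estimate can be pulled straight from the slabinfo
snapshots in the log with something like the sketch below; it assumes
the usual slabinfo 2.x column layout ($3 = num_objs, $4 = objsize) and
ignores per-slab overhead, so it is only a rough number:

    # rough ext4_inode_cache footprint: objects x object size, in GiB
    awk '$1 == "ext4_inode_cache" {
             printf "%s: ~%.1f GiB (%d objs x %d bytes)\n",
                    $1, $3 * $4 / 1024 / 1024 / 1024, $3, $4
         }' /proc/slabinfo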

counter                   first value   delta (last - first)
pgmigrate_success             1861785                2157917
pgmigrate_fail                 335344                1400384
compact_isolated              4106390                5777027
compact_migrate_scanned     113962774              446290647
compact_daemon_wake             17039                  43981
compact_fail                      645                   1039
compact_free_scanned        381701557              793430119
compact_success                   217                    307
compact_stall                     862                   1346

which means that we have invoked compaction 1346 times and failed in
77% of cases. It is interesting that the migration wasn't all that
unsuccessful: we migrated roughly 1.5x as many pages as failed to
migrate. It smells like the compaction just backs off. Could you try to
test with the patch from
http://lkml.kernel.org/r/20160816031222.GC16913@js1304-P5Q-DELUXE
please? Ideally on top of linux-next. You can add both the compaction
counters patch in the oom report and the high-order atomic reserves
patch on top.
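
For the record, the numbers above are just the vmstat deltas over the
logging period; a minimal sketch of that arithmetic, assuming the first
and last "XX VMSTAT" snapshots from the log were saved into the
(hypothetical) files first.vmstat and last.vmstat:

    # compute per-counter deltas and the two ratios quoted above
    awk 'NR == FNR { first[$1] = $2; next }
         { delta[$1] = $2 - first[$1] }
         END {
             printf "compaction stalls: %d, failure rate: %.0f%%\n",
                    delta["compact_stall"],
                    100 * delta["compact_fail"] / delta["compact_stall"]
             printf "migrate success/fail ratio: %.1f\n",
                    delta["pgmigrate_success"] / delta["pgmigrate_fail"]
         }' first.vmstat last.vmstat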

Thanks
-- 
Michal Hocko
SUSE Labs
