Message-ID: <c0c067900911231357u16af88cahce74b755df791146@mail.gmail.com>
Date:	Mon, 23 Nov 2009 16:57:09 -0500
From:	Dan Merillat <dan.merillat@...il.com>
To:	Tomasz Chmielewski <mangoo@...g.org>
Cc:	linux-kernel@...r.kernel.org, Rik van Riel <riel@...hat.com>,
	Norbert Preining <preining@...ic.at>,
	Sven-Haegar Koch <haegar@...net.de>,
	Dave Chinner <david@...morbit.com>,
	KOSAKI Motohiro <kosaki.motohiro@...il.com>
Subject: Re: Linux 2.6.31 - very swap-happy with plenty of free RAM

On Thu, Nov 19, 2009 at 1:43 AM, Tomasz Chmielewski <mangoo@...g.org> wrote:

> What is also interesting, that a normal software RAID-1 sync (i.e. from a
> degraded state) does not seem to make any visible effect on system
> responsiveness.
>
> Uncompress a big tar file, or VM writes out lots of data - system becomes
> really unresponsive.

Setup:

XFS -> LV -> md0 (degraded) -> ST3500630AS  ver 3.AA  (500 GB)
There was basically no other disk activity while running these tests.

barriers on:
$ time git checkout -f
Checking out files: 100% (29108/29108), done.

real	2m51.913s
user	0m3.128s
sys	0m3.004s
$ time rm -r *

real	1m52.562s
user	0m0.072s
sys	0m2.980s
$ sudo mount /usr/src -o remount,nobarrier
$ time git checkout -f
Checking out files: 100% (29108/29108), done.

real	0m9.782s
user	0m2.944s
sys	0m2.984s
$ time rm -r *

real	0m24.996s
user	0m0.076s
sys	0m2.808s
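
To put a number on the difference, here is a minimal sketch that converts
the "real" times above to seconds and prints the barrier slowdown per
command. The helper names (parse_real, slowdown, RUNS) are made up for
illustration; only the four timings come from the runs above.

```python
import re


def parse_real(value: str) -> float:
    """Convert a bash `time` field such as '2m51.913s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognized time format: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# "real" times copied from the runs above: (barriers on, nobarrier).
RUNS = {
    "git checkout -f": ("2m51.913s", "0m9.782s"),
    "rm -r *": ("1m52.562s", "0m24.996s"),
}


def slowdown(with_barriers: str, without_barriers: str) -> float:
    """Ratio of wall-clock time with barriers to time without them."""
    return parse_real(with_barriers) / parse_real(without_barriers)


if __name__ == "__main__":
    for command, (on, off) in RUNS.items():
        print(f"{command}: {slowdown(on, off):.1f}x slower with barriers")
```

That works out to roughly 17.6x for the checkout and 4.5x for the rm,
each from a single run.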

So XFS + barriers is part of the problem here, but I was only using
XFS for /usr/src.  I reformatted that to ext4 after I found that little
nugget of joy.  Ext4 + barriers isn't anywhere near as dramatic a hit;
it's a much more reasonable speed/safety tradeoff.  Again, that's
only for /usr/src; the rest of my system is using ext4, so it doesn't
explain any other workload problems.  btrfs doesn't seem to take much
of a hit at all with barriers, but I'd need to test that properly:
checkout takes under 6 seconds there, and the differences would be in
the noise without averaging multiple runs.
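
Averaging multiple runs could look something like the sketch below. It
is not what I actually ran; time_runs and summarize are hypothetical
helper names, and the default command is just the checkout from the
tests above.

```python
import statistics
import subprocess
import sys
import time


def time_runs(cmd, runs=5):
    """Run cmd `runs` times and return the wall-clock duration of each run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return (mean, sample standard deviation); the deviation is 0.0 for one sample."""
    mean = statistics.mean(samples)
    spread = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return mean, spread


if __name__ == "__main__":
    # Assumption: the command is passed on the command line, defaulting
    # to the checkout used in the tests above.
    cmd = sys.argv[1:] or ["git", "checkout", "-f"]
    mean, spread = summarize(time_runs(cmd))
    print(f"{' '.join(cmd)}: mean {mean:.3f}s, stdev {spread:.3f}s")
```

As written it does not reset the tree or drop caches between runs, so
for a fair comparison the working tree would need to be emptied before
each checkout, as in the rm -r * step above.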

On Thu, Nov 19, 2009 at 9:36 AM, KOSAKI Motohiro
<kosaki.motohiro@...il.com> wrote:
> Hi Dan,
>
> Umm, very strange.
> I made two debug patch. can you please apply it and post following
> command output?
>
> % cat /proc/meminfo
> % cat /proc/vmstat
> % cat /proc/zoneinfo
> # cat /proc/filecache | sort -nr -k3 |head -30

Unfortunately not: the filecache patch doesn't compile on 2.6.31, which
means I'd have to re-port vmware & fglrx to a newer kernel just to test it.

  CC      fs/proc/filecache.o
fs/proc/filecache.c: In function ‘iwin_fill’:
fs/proc/filecache.c:108: error: ‘bdi_lock’ undeclared (first use in this function)
fs/proc/filecache.c:108: error: (Each undeclared identifier is reported only once
fs/proc/filecache.c:108: error: for each function it appears in.)
fs/proc/filecache.c:109: error: ‘struct backing_dev_info’ has no member named ‘bdi_list’
fs/proc/filecache.c:109: warning: type defaults to ‘int’ in declaration of ‘__mptr’
fs/proc/filecache.c:109: error: ‘bdi_list’ undeclared (first use in this function)
fs/proc/filecache.c:109: error: ‘struct backing_dev_info’ has no member named ‘bdi_list’
fs/proc/filecache.c:109: error: ‘struct backing_dev_info’ has no member named ‘bdi_list’

And I can't port the patch to 2.6.31 without major work, since the whole
bdi_writeback structure was introduced post 2.6.31.  The recent-rotated
patch does work; I'll include that data as soon as I get the memory
pressure back up.

Right now it's behaving correctly after the reboot - the usual sign of
a problem is free memory going way up while swapping like mad.  I'm
putting a lot of memory pressure on the kernel post-reboot but swap is
behaving normally.   I hope this doesn't take multiple days of uptime
to get back into that state.
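
For watching for that symptom without staring at top, something along
these lines might do. It is only a sketch: MemFree in /proc/meminfo and
pswpin/pswpout in /proc/vmstat are real fields, but the function names,
the 10 second interval and the 512 MB "plenty of free RAM" threshold are
arbitrary assumptions of mine.

```python
import time


def parse_proc_counters(text):
    """Parse 'Name: value kB' (meminfo) or 'name value' (vmstat) lines into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.replace(":", " ").split()
        if len(parts) >= 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def looks_swap_happy(before, after, free_kb, min_free_kb=512 * 1024):
    """True if pages were swapped in or out while free memory was still high.

    min_free_kb is a hypothetical threshold, not a kernel-defined value.
    """
    swapped = (after["pswpin"] - before["pswpin"]) + (
        after["pswpout"] - before["pswpout"]
    )
    return swapped > 0 and free_kb >= min_free_kb


def read_counters(path):
    with open(path) as handle:
        return parse_proc_counters(handle.read())


if __name__ == "__main__":
    before = read_counters("/proc/vmstat")
    time.sleep(10)  # sampling interval, chosen arbitrarily
    after = read_counters("/proc/vmstat")
    free_kb = read_counters("/proc/meminfo")["MemFree"]
    print("swap-happy" if looks_swap_happy(before, after, free_kb) else "normal")
```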
