Message-ID: <20090319044751.GA12918@localhost>
Date:	Thu, 19 Mar 2009 12:47:51 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Paul Evans <paul@...elecom.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: Slow long-term increase in dirty pages

On Wed, Mar 18, 2009 at 02:53:47PM +0000, Paul Evans wrote:
> We have a server whose dirty page count keeps increasing all the time,
> to the point where 'sync' takes ages to flush the pages:
> 
>   root@...ehand:~# time sync
> 
>   real    1m15.570s
>   user    0m0.000s
>   sys     0m0.052s
> 
> We have some graphs of the dirty page count, as captured
> from /proc/vmstat's "nr_dirty" entry:
> 
>   http://opensource.mxtelecom.com/tmp/freehand-dirty-day.png
>   http://opensource.mxtelecom.com/tmp/freehand-dirty-week.png
> 
> I have tuned the dirty page flushing sysctls to the following:
> 
>   root@...ehand:~# for F in /proc/sys/vm/dirty_*; do echo -n "$F: "; cat $F; done
>   /proc/sys/vm/dirty_background_ratio: 1
>   /proc/sys/vm/dirty_expire_centisecs: 3000
>   /proc/sys/vm/dirty_ratio: 3
>   /proc/sys/vm/dirty_writeback_centisecs: 500
> 
> The role of the machine itself is that it handles a large amount of
> kernel iptables routing/firewalling traffic, and runs a set of apache
> servers as HTTP<->Tomcat gateways.
> 
>   root@...ehand:~# uname -r
>   2.6.27-fes
> 
> (this is a build of stock 2.6.27 source, with some extra iptables
> patches. There shouldn't be anything mm-related here)
> 
> By my understanding of the dirty page flush algorithm, we shouldn't be
> accumulating these pages all the time; any page older than 30 seconds
> ought to be written out, yes?
> 
> If we manually 'sync', as above, then the count drops to zero, but then
> slowly starts ramping up again as observed.
> 
> As a temporary workaround I've put 'sync' in cron every 10 minutes, but
> is there some more tuning I can do; or at least probing to see where
> these pages are being accumulated from?

Hi Paul,

The attached filecache patch should help identify the dirty files/pages.

Usage:
        # run patched kernel
        modprobe filecache
        cat /proc/filecache

Dirty files will have the 'D' flag in their "state" field.
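
To narrow the listing down to just the dirty files, something like the
following may help. It is only a sketch under assumptions: it assumes
/proc/filecache prints a header line starting with '#' that names the
columns (including one called "state"), followed by whitespace-separated
rows. The function name list_dirty_files is made up for illustration and
is not part of the patch; check the output against the real header on
your kernel before relying on it.

```shell
#!/bin/sh
# list_dirty_files [FILE]
# Hypothetical helper, not part of the filecache patch.
# Prints the rows of FILE (default: /proc/filecache) whose "state"
# column contains the 'D' flag.
# Assumption: a '#'-prefixed header line names the columns and the
# data rows are whitespace-separated in the same order.
list_dirty_files() {
    src="${1:-/proc/filecache}"
    awk '
        /^#/ {
            # Strip the leading "#" so header names line up with data fields.
            hdr = $0
            sub(/^#[ \t]*/, "", hdr)
            n = split(hdr, names, " ")
            for (i = 1; i <= n; i++)
                if (names[i] == "state")
                    col = i
            next
        }
        # Print a row only once the state column is known and holds a D.
        col && index($col, "D") { print }
    ' "$src"
}
```

With the module loaded, running list_dirty_files with no argument reads
/proc/filecache directly. Sorting the result by the cached-pages column
would then show which files hold the most dirty data, again assuming such
a column is present.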

Thanks,
Fengguang

View attachment "filecache-2.6.27.patch" of type "text/x-diff" (34598 bytes)
