Message-ID: <20090429235623.GA17191@Krystal>
Date:	Wed, 29 Apr 2009 19:56:23 -0400
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	Linus Torvalds <torvalds@...ux-foundation.org>,
	akpm@...ux-foundation.org, Nick Piggin <nickpiggin@...oo.com.au>
Cc:	Ingo Molnar <mingo@...e.hu>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>, thomas.pi@...or.dea,
	Yuriy Lalym <ylalym@...il.com>, linux-kernel@...r.kernel.org,
	ltt-dev@...ts.casi.polymtl.ca
Subject: Re: [PATCH] Fix dirty page accounting in
	redirty_page_for_writepage()

* Mathieu Desnoyers (mathieu.desnoyers@...ymtl.ca) wrote:
> Basically, the following execution :
> 
> dd if=/dev/zero of=/tmp/testfile
> 
> will slowly fill _all_ ram available without taking into account memory
> pressure.
> 
> This is because the dirty page accounting is incorrect in
> redirty_page_for_writepage.
> 
> This patch adds missing dirty page accounting in redirty_page_for_writepage().
> This should fix a _lot_ of issues involving machines becoming slow under heavy
> write I/O. No surprise : eventually the system starts swapping.
> 
> Linux kernel 2.6.30-rc2
> 
> The /proc/meminfo picture I had before applying this patch after filling my
> memory with the dd execution was :
> 
> MemTotal:       16433732 kB
> MemFree:        10919700 kB

Darn, I did not take this meminfo snapshot at the appropriate moment.

I actually have to double-check whether 2.6.30-rc still shows the bogus
behavior I identified in the 2.6.28-2.6.29 days. Then I'll check with
earlier 2.6.29.x. I know there has been some improvement on the ext3
side since then. I'll come back when I have that information.

Sorry.

Mathieu

> Buffers:           12492 kB
> Cached:          5262508 kB
> SwapCached:            0 kB
> Active:            37096 kB
> Inactive:        5254384 kB
> Active(anon):      16716 kB
> Inactive(anon):        0 kB
> Active(file):      20380 kB
> Inactive(file):  5254384 kB
> Unevictable:           0 kB
> Mlocked:               0 kB
> SwapTotal:      19535024 kB
> SwapFree:       19535024 kB
> Dirty:           2125956 kB
> Writeback:         50476 kB
> AnonPages:         16660 kB
> Mapped:             9560 kB
> Slab:             189692 kB
> SReclaimable:     166688 kB
> SUnreclaim:        23004 kB
> PageTables:         3396 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:    27751888 kB
> Committed_AS:      53904 kB
> VmallocTotal:   34359738367 kB
> VmallocUsed:       10764 kB
> VmallocChunk:   34359726963 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> DirectMap4k:        3456 kB
> DirectMap2M:    16773120 kB
> 
> After applying my patch, the same test case steadily leaves between 8
> and 500MB ram free in the steady-state (when pressure is reached).
> 
> MemTotal:       16433732 kB
> MemFree:           85144 kB
> Buffers:           23148 kB
> Cached:         15766280 kB
> SwapCached:            0 kB
> Active:            51500 kB
> Inactive:       15755140 kB
> Active(anon):      15540 kB
> Inactive(anon):     1824 kB
> Active(file):      35960 kB
> Inactive(file): 15753316 kB
> Unevictable:           0 kB
> Mlocked:               0 kB
> SwapTotal:      19535024 kB
> SwapFree:       19535024 kB
> Dirty:           2501644 kB
> Writeback:         33280 kB
> AnonPages:         17280 kB
> Mapped:             9272 kB
> Slab:             505524 kB
> SReclaimable:     485596 kB
> SUnreclaim:        19928 kB
> PageTables:         3396 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:    27751888 kB
> Committed_AS:      54508 kB
> VmallocTotal:   34359738367 kB
> VmallocUsed:       10764 kB
> VmallocChunk:   34359726715 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> DirectMap4k:        3456 kB
> DirectMap2M:    16773120 kB
> 
> The pressure pattern I see with the patch applied is :
> (16GB ram total)
> 
> - Inactive(file) fills up to 15.7GB.
> - Dirty fills up to 1.7GB.
> - Writeback varies between 0 and 600MB.
> 
> sync() behavior :
> 
> - Dirty down to ~6MB.
> - Writeback increases to 1.6GB, then shrinks down to ~0MB.
> 
> References :
> This insanely huge
> http://bugzilla.kernel.org/show_bug.cgi?id=12309
> [Bug 12309] Large I/O operations result in slow performance and high iowait times
> (yes, I've been in CC all along)
> 
> Special thanks to Linus Torvalds and Nick Piggin and Thomas Pi for their
> suggestions on previous patch iterations.
> 
> Special thanks to the LTTng community, which helped me get LTTng up to its
> current usability level. It's been tremendously useful in understanding those
> problematic I/O workloads and generating fio test cases.
> 
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
> CC: Linus Torvalds <torvalds@...ux-foundation.org>
> CC: akpm@...ux-foundation.org
> CC: Nick Piggin <nickpiggin@...oo.com.au>
> CC: Ingo Molnar <mingo@...e.hu>
> CC: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
> CC: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> CC: thomas.pi@...or.dea
> CC: Yuriy Lalym <ylalym@...il.com>
> ---
>  mm/page-writeback.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> Index: linux-2.6-lttng/mm/page-writeback.c
> ===================================================================
> --- linux-2.6-lttng.orig/mm/page-writeback.c	2009-04-29 18:14:48.000000000 -0400
> +++ linux-2.6-lttng/mm/page-writeback.c	2009-04-29 18:23:59.000000000 -0400
> @@ -1237,6 +1237,12 @@ int __set_page_dirty_nobuffers(struct pa
>  		if (!mapping)
>  			return 1;
>  
> +		/*
> +		 * Take care of setting back page accounting correctly.
> +		 */
> +		inc_zone_page_state(page, NR_FILE_DIRTY);
> +		inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
> +
>  		spin_lock_irq(&mapping->tree_lock);
>  		mapping2 = page_mapping(page);
>  		if (mapping2) { /* Race with truncate? */
> 
> -- 
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
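
The /proc/meminfo snapshots quoted above are easier to compare when the
relevant counters are reduced to a share of MemTotal. The following is a
minimal sketch of one way to do that, not something taken from the patch or
from the thread: the helper names (parse_meminfo, summarize, read_live) and
the choice of fields are assumptions made for illustration. It accepts
either live /proc/meminfo text or lines pasted from a quoted mail (a leading
"> " is tolerated).

```python
import re

# Fields of interest for the dirty-page discussion (an illustrative choice).
FIELDS = ("MemFree", "Dirty", "Writeback")

# Matches e.g. "Dirty:   2125956 kB" or "> HugePages_Total:  0".
LINE_RE = re.compile(r"^(?:>\s*)?([A-Za-z0-9_()]+):\s+(\d+)(?:\s+kB)?\s*$")


def parse_meminfo(text):
    """Return {field_name: value} for every meminfo-style line in text."""
    values = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def summarize(values):
    """Express the selected fields as a percentage of MemTotal."""
    total = values["MemTotal"]
    return {
        name: round(100.0 * values[name] / total, 1)
        for name in FIELDS
        if name in values
    }


def read_live(path="/proc/meminfo"):
    """Parse the running system's counters (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

Calling summarize(read_live()) at intervals while the dd test case runs
gives a series of snapshots, which makes it easier to pick the one taken
when memory pressure has actually been reached.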

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68