lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20091030141811.1c77571b.akpm@linux-foundation.org>
Date:	Fri, 30 Oct 2009 14:18:11 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Jeff Moyer <jmoyer@...hat.com>
Cc:	linux-kernel@...r.kernel.org, linux-aio@...ck.org,
	zach.brown@...cle.com
Subject: Re: [patch] aio: Don't zero out the pages array inside struct dio

On Fri, 30 Oct 2009 09:39:55 -0400
Jeff Moyer <jmoyer@...hat.com> wrote:

> Hi,
> 
> Intel reported a performance regression caused by the following commit:
> 
> commit 848c4dd5153c7a0de55470ce99a8e13a63b4703f
> Author: Zach Brown <zach.brown@...cle.com>
> Date:   Mon Aug 20 17:12:01 2007 -0700
> 
>     dio: zero struct dio with kzalloc instead of manually
> 
>     This patch uses kzalloc to zero all of struct dio rather than
>     manually trying to track which fields we rely on being zero.  It
>     passed aio+dio stress testing and some bug regression testing on
>     ext3.
> 
>     This patch was introduced by Linus in the conversation that lead up
>     to Badari's minimal fix to manually zero .map_bh.b_state in commit:
> 
>       6a648fa72161d1f6468dabd96c5d3c0db04f598a
> 
>     It makes the code a bit smaller.  Maybe a couple fewer cachelines to
>     load, if we're lucky:
> 
>        text    data     bss     dec     hex filename
>     3285925  568506 1304616 5159047  4eb887 vmlinux
>     3285797  568506 1304616 5158919  4eb807 vmlinux.patched
> 
>     I was unable to measure a stable difference in the number of cpu
>     cycles spent in blockdev_direct_IO() when pushing aio+dio 256K reads
>     at ~340MB/s.
> 
>     So the resulting intent of the patch isn't a performance gain but to
>     avoid exposing ourselves to the risk of finding another field like
>     .map_bh.b_state where we rely on zeroing but don't enforce it in the
>     code.
> 
> Zach surmised that zeroing out the page array was what caused most of
> the problem, and suggested the approach taken in the attached patch for
> resolving the issue.  Intel re-tested with this patch and saw a 0.6%
> performance gain (the original regression was 0.5%).
> 
> Comments, as always, are appreciated.
> 

You forgot something:

--- a/fs/direct-io.c~aio-dont-zero-out-the-pages-array-inside-struct-dio-fix
+++ a/fs/direct-io.c
@@ -130,6 +130,12 @@ struct dio {
 	unsigned head;			/* next page to process */
 	unsigned tail;			/* last valid page + 1 */
 	int page_errors;		/* errno from get_user_pages() */
+
+	/*
+	 * pages[] (and any fields placed after it) are not zeroed out at
+	 * allocation time.  Don't add new fields after pages[] unless you
+	 * wish that they not be zeroed.
+	 */
 	struct page *pages[DIO_PAGES];	/* page buffer */
 };
 
_

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ