linux-kernel - Re: patch: improve generic_file_buffered

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <200709081731.51263.nickpiggin@yahoo.com.au>
Date:	Sat, 8 Sep 2007 17:31:50 +1000
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Goswin von Brederlow <brederlo@...ormatik.uni-tuebingen.de>
Cc:	Bernd Schubert <bs@...eap.de>,
	Randy Dunlap <randy.dunlap@...cle.com>,
	linux-kernel@...r.kernel.org,
	"J. Bruce Fields" <bfields@...ldses.org>, brian@...sterfs.com
Subject: Re: patch: improve generic_file_buffered_write() (2nd try 1/2)

On Saturday 08 September 2007 17:25, Nick Piggin wrote:
> On Saturday 08 September 2007 07:12, Goswin von Brederlow wrote:
> > Nick Piggin <nickpiggin@...oo.com.au> writes:
> > > On Saturday 08 September 2007 06:01, Goswin von Brederlow wrote:
> > >>   b) a segment boundary
> > >
> > > This is done, as I said, because of the deadlock issue. While the issue
> > > is more completely fixed in -mm, a special case for kernel memory (eg.
> > > nfsd) is in the latest mainline kernels.
> >
> > Can you tell me where to get the fix from -mm? If it is completly
> > fixed there then that could make our patch obsolete.
>
> In the latest -mm series file, they start at
> mm-revert-kernel_ds-buffered-write-optimisation.patch
> ...
> and go to
> ocfs2-convert-to-new-aops.patch
>
> > >> What actually locks the page? Is it __grab_cache_page or
> > >> a_ops->prepare_write?
> > >
> > > prepare_write must be given a locked page.
> >
> > Then that means __grab_cache_page does return a locked page because
> > there is nothing between the two calls that would.
>
> That's right.
>
> > > No it would be included earlier. The "segment_eq" check should be
> > > allowing kernel writes (nfsd) to write multiple segments. If you have a
> > > patch which changes this significantly, then it would indicate the
> > > existing logic has a problem (or you've got a userspace application
> > > doing the writev, which should be fixed by the write_begin patches in
> > > -mm).
> >
> > I've got userspace application doing the writev. To be exact 14% of
> > the commits were saved by combining multiple segments into a single
> > prepare/write pair. Since the kernel segments don't fragment anymore
> > in 2.6.23-rc5 those savings must come from user space stuff.
> >
> > From the stats posted earlier you can see that there is a substantial
> > amount of calls with 6 segments all (alot) smaller than a page. Lots
> > of calls our patch or the write_begin/end will save.
>
> OK. The write_begin/write_end patchset is intrusive, no question. I'm not
> sure what you're intending to do with it. They have been tested in -mm for
> quite a while now, but just going with a simple patch that tries to copy
> more segments might be OK for you if you're backporting. The deadlock is
> pretty uncommon.

Lustre should probably have to be ported over to write_begin/write_end in
order to use it too. With the patches in -mm, if a filesystem is still using
prepare_write/commit_write, the vm reverts to a safe path which avoids
the deadlock (and allows multi-seg io copies), but copies the data twice.

OTOH, this is very likely to go upstream, so your filesystem will need to be
ported over sooner or later anyway.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/