linux-kernel - Re: patch: improve generic_file_buffered

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87odgeckjf.fsf@informatik.uni-tuebingen.de>
Date:	Fri, 07 Sep 2007 22:01:08 +0200
From:	Goswin von Brederlow <brederlo@...ormatik.uni-tuebingen.de>
To:	Nick Piggin <nickpiggin@...oo.com.au>
Cc:	Bernd Schubert <bs@...eap.de>,
	Randy Dunlap <randy.dunlap@...cle.com>,
	linux-kernel@...r.kernel.org,
	"J. Bruce Fields" <bfields@...ldses.org>, brian@...sterfs.com,
	Goswin von Brederlow <brederlo@...ormatik.uni-tuebingen.de>
Subject: Re: patch: improve generic_file_buffered_write() (2nd try 1/2)

Nick Piggin <nickpiggin@...oo.com.au> writes:

> On Thursday 06 September 2007 03:41, Bernd Schubert wrote:
> Minor nit: when resubmitting a patch, you should include everything
> (ie. the full changelog of problem statement and fix description) in a
> single mail. It's just a bit easier...

Will do next time.

> So I believe the problem is that for a multi-segment iovec, we currently
> prepare_write/commit_write once for each segment, right? We do this

It is more complex.

Currently a __grab_cache_page, a_ops->prepare_write,
filemap_copy_from_user[_iovec] and a_ops->commit_write is done
whenever we hit

  a) a page boundary
  b) a segment boundary

Those two cases don't have to, and from the stats basically never,
coincide. For NFSd this means we do this TWICE per segment and TWICE
per page.

> because there is a nasty deadlock in the VM (copy_from_user being
> called with a page locked), and copying multiple segs dramatically
> increases the chances that one of these copies will cause a page fault
> and thus potentially deadlock.

What actually locks the page? Is it __grab_cache_page or
a_ops->prepare_write?

Note that the patch does not change the number of copy_from_user calls
being made nor does it change their arguments. If we need 2 (or more)
segments to fill a page we still do 2 seperate calls to
filemap_copy_from_user_iovec, both only spanning (part of) one
segment.

What the patch changes is the number of copy_from_user calls between
__grab_cache_page and a_ops->commit_write.

Copying a full PAGE_SIZE bytes from multiple segments in one go would
be a further improvement if that is possible.

> The fix you have I don't think can work because a filesystem must be
> notified of the modification _before_ it has happened. (If I understand
> correctly, you are skipping the prepare_write potentially until after
> some data is copied?).

Yes. We changed the order of copy_from_user calls and
a_ops->prepare_write by mistake. We will rectify that and do the
prepare_write for the full page (when possible) before copying the
data into the page.

> Anyway, there are fixes for this deadlock in Andrew's -mm tree, but
> also a workaround for the NFSD problem in git commit 29dbb3fc. Did
> you try a later kernel to see if it is fixed there?

Later than 2.6.23-rc5?

> Thanks,
> Nick

MfG
        Goswin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/