Message-ID: <1262913752.2659.100.camel@localhost>
Date: Thu, 07 Jan 2010 20:22:32 -0500
From: Trond Myklebust <Trond.Myklebust@...app.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andi Kleen <andi@...stfloor.org>, linux-kernel@...r.kernel.org
Subject: Re: [GIT PULL] Please pull NFS client bugfixes....

On Thu, 2010-01-07 at 17:12 -0800, Linus Torvalds wrote:
>
> On Thu, 7 Jan 2010, Trond Myklebust wrote:
> > >
> > > Because it means that you can trivially take page faults before the thing
> > > is validated (think threads).
> >
> > Which would mean that another process/thread already has part of the
> > file mmapped on the same client. I'm not arguing that we have to revalidate
> > in _that_ case.
>
> No, I'm talking about the new mapping. Nothing else.
>
> If the mmap'ing thread releases mmap_sem, and then does the revalidate,
> then you can have
>
>     thread1                         thread2
>     -------                         -------
>
>     mmap
>       map it in
>       release mmap_sem
>                                     page-fault the mapping before it got validated
>       ->post_mmap()
>         revalidate outside mmap_sem
>
> See? No "already part of the file mmapped" case at all. The exact mmap
> that you just set up - without the revalidation having happened.
>
> In fact, because of this kind of _fundamental_ race, I don't see why I
> would ever accept any patches that add multiple mmap() down-calls at
> different phases to the filesystem at the VFS layer.
>
> A filesystem that depends on the different phases would be a fundamentally
> buggy filesystem. Right now mmap is "atomic", and you can pre-populate (or
> pre-verify, like NFS does) the mapping in the _knowledge_ that there are
> no page faults that will populate it concurrently. Exactly because we hold
> the mmap_sem for writing.

I don't think anyone has been advocating doing the revalidation _after_
the call to mmap_region(). All I want is to be able to do it as part of
the mmap() syscall. It would be quite OK to add a ->pre_mmap() (which is
what I believe Peter's patches do).
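
The difference between the two orderings can be illustrated with a toy user-space model. This is not kernel code: `toy_mapping`, `hook_order`, and both functions are hypothetical names, and the model assumes that a page fault can only run while mmap_sem is not held for writing and that a ->pre_mmap() hook would run before the mapping becomes visible.

```c
#include <stdbool.h>

/* Hypothetical orderings of the revalidation relative to the mapping. */
enum hook_order { REVALIDATE_BEFORE_MAP, REVALIDATE_AFTER_UNLOCK };

struct toy_mapping {
    bool mapped;     /* mapping is visible to page faults */
    bool validated;  /* page cache has been revalidated */
    bool sem_held;   /* models mmap_sem held for writing */
};

/* True if a concurrent fault at this instant would hit a mapping
 * that exists but has not been revalidated yet. */
static bool fault_sees_unvalidated(const struct toy_mapping *m)
{
    return !m->sem_held && m->mapped && !m->validated;
}

/* Step through the mmap sequence and report whether there is any
 * point at which a concurrent fault could see an unvalidated mapping. */
static bool has_race_window(enum hook_order order)
{
    struct toy_mapping m = { false, false, false };
    bool window = false;

    if (order == REVALIDATE_BEFORE_MAP) {
        m.validated = true;                 /* assumed ->pre_mmap() step */
        window |= fault_sees_unvalidated(&m);
    }

    m.sem_held = true;                      /* take mmap_sem for writing */
    m.mapped = true;                        /* map it in */
    window |= fault_sees_unvalidated(&m);

    m.sem_held = false;                     /* release mmap_sem */
    window |= fault_sees_unvalidated(&m);

    if (order == REVALIDATE_AFTER_UNLOCK) {
        m.validated = true;                 /* ->post_mmap() step */
        window |= fault_sees_unvalidated(&m);
    }
    return window;
}
```

In this model only the revalidate-after-unlock ordering leaves a window, which is the race in the quoted diagram. The model says nothing about whether data validated before the mapping can go stale again; that is what the application-level lock is for.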

All I want to ensure is that people who use non-posix-lock based
synchronisation can set the 'noac' flag, and be assured that if mmap()
is called _after_ they have grabbed their lock, then the page cache will
be duly revalidated (under the lock), and the fresh data will be made
available.
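
A minimal user-space sketch of that usage pattern, under stated assumptions: the lock here is a plain lock file created with O_CREAT | O_EXCL, which stands in for whatever non-POSIX-lock scheme the application really uses, and the function names and paths are hypothetical. Whether the NFS client revalidates the page cache inside mmap() is exactly what is being discussed in this thread, so the code only shows the ordering the application relies on: take the lock first, then call mmap().

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical application-level lock: succeeds only if the lock
 * file does not exist yet. Returns 0 on success, -1 otherwise. */
static int app_lock(const char *lock_path)
{
    int fd = open(lock_path, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd < 0)
        return -1;
    close(fd);
    return 0;
}

static int app_unlock(const char *lock_path)
{
    return unlink(lock_path);
}

/* Take the lock, then open and mmap() the data file and copy up to
 * len bytes out of the mapping. Returns the number of bytes copied,
 * or -1 if the lock is held or any step fails. */
static long read_mapped_under_lock(const char *lock_path,
                                   const char *data_path,
                                   char *buf, size_t len)
{
    long copied = -1;

    if (app_lock(lock_path) != 0)
        return -1;

    int fd = open(data_path, O_RDONLY);
    if (fd >= 0) {
        struct stat st;
        if (fstat(fd, &st) == 0 && st.st_size > 0) {
            size_t n = (size_t)st.st_size < len ? (size_t)st.st_size : len;
            /* mmap() is called only after the lock has been taken. */
            void *map = mmap(NULL, n, PROT_READ, MAP_SHARED, fd, 0);
            if (map != MAP_FAILED) {
                memcpy(buf, map, n);
                munmap(map, n);
                copied = (long)n;
            }
        }
        close(fd);
    }

    app_unlock(lock_path);
    return copied;
}
```

The sketch assumes the data file lives on an NFS mount using the 'noac' option; on a local filesystem it still runs but exercises none of the revalidation behaviour.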

Trond