Message-ID: <CAHk-=wir89LPH6A4H2hkxVXT20+dpcw2qQq0GtQJvy87ARga-g@mail.gmail.com>
Date: Mon, 21 Sep 2020 09:20:25 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Jan Kara <jack@...e.cz>
Cc: Dave Chinner <david@...morbit.com>,
Hugh Dickins <hughd@...gle.com>,
Amir Goldstein <amir73il@...il.com>,
Andreas Gruenbacher <agruenba@...hat.com>,
Theodore Tso <tytso@....edu>,
Martin Brandenburg <martin@...ibond.com>,
Mike Marshall <hubcap@...ibond.com>,
Damien Le Moal <damien.lemoal@....com>,
Jaegeuk Kim <jaegeuk@...nel.org>,
Qiuyang Sun <sunqiuyang@...wei.com>,
linux-xfs <linux-xfs@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Linux MM <linux-mm@...ck.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Matthew Wilcox <willy@...radead.org>,
"Kirill A. Shutemov" <kirill@...temov.name>,
Andrew Morton <akpm@...ux-foundation.org>,
Al Viro <viro@...iv.linux.org.uk>, nborisov@...e.de
Subject: Re: More filesystem need this fix (xfs: use MMAPLOCK around filemap_map_pages())
On Mon, Sep 21, 2020 at 2:11 AM Jan Kara <jack@...e.cz> wrote:
>
> Except that on truncate, we have to unmap these
> anonymous pages in private file mappings as well...
I'm actually not 100% sure we strictly would need to care.
Once we've faulted in a private file mapping page, that page is
"ours". That's kind of what MAP_PRIVATE means.
If we haven't written to it, we do keep things coherent with the file,
but that's actually not required by POSIX afaik - it's a QoI issue,
and a lot of (bad) Unixes didn't do it at all.
So as long as truncate _clears_ the pages it truncates, I think we'd
actually be ok.
The SIGBUS is supposed to happen, but that's really only relevant for
the _first_ access. Once we've accessed the page, and have it mapped,
the private part really means that there are no guarantees it stays
coherent.
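
[Illustration, not part of the original message.] The "first access" case above can be sketched in user space as follows. This is a minimal sketch assuming Linux behaviour: the function name `first_touch_after_truncate_sigbus`, the temporary file path, and the `sigsetjmp`-based handler are all made up for this example. It maps one page of a file privately, truncates the file to zero before ever touching the page, and reports whether the first read raises SIGBUS.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf sigbus_jmp;

/* Hypothetical handler: jump back to the sigsetjmp() point on SIGBUS. */
static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(sigbus_jmp, 1);
}

/* Returns 1 if the first touch of a never-accessed MAP_PRIVATE page raises
 * SIGBUS after the file was truncated, 0 if the read succeeded, and -1 on a
 * setup error. */
int first_touch_after_truncate_sigbus(void)
{
    char path[] = "/tmp/sigbus_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* file lives only as long as fd */

    long page = sysconf(_SC_PAGESIZE);
    if (ftruncate(fd, page) != 0) {     /* give the file one page */
        close(fd);
        return -1;
    }

    volatile char *p = mmap(NULL, page, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Truncate before the page has ever been faulted in. */
    if (ftruncate(fd, 0) != 0) {
        munmap((void *)p, page);
        close(fd);
        return -1;
    }

    struct sigaction sa, old;
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGBUS, &sa, &old);

    int got_sigbus;
    if (sigsetjmp(sigbus_jmp, 1) == 0) {
        (void)p[0];                     /* first access: page is past EOF */
        got_sigbus = 0;
    } else {
        got_sigbus = 1;                 /* arrived here via the handler */
    }

    sigaction(SIGBUS, &old, NULL);      /* restore the previous handler */
    munmap((void *)p, page);
    close(fd);
    return got_sigbus;
}
```

The sketch only covers a page that was never touched before the truncate. What happens to pages that were already faulted in is exactly the question under discussion in this thread.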
In particular, obviously if we've written to a page, we've lost the
association with the original file entirely. And I'm pretty sure that
a private mapping is allowed to act as if it was a private copy
without that association in the first place.
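
[Illustration, not part of the original message.] A small user-space sketch of the "written page loses its association" point, assuming Linux copy-on-write behaviour. The function name `private_page_detaches_after_write`, the temporary path, and the test strings are invented for this example. It writes to a MAP_PRIVATE page, then changes the file with pwrite(), and checks that the mapping keeps its own copy while the file never sees the store made through the mapping.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if a written-to MAP_PRIVATE page ignores a later pwrite() to the
 * file and the file is unaffected by the private store, 0 otherwise, and -1
 * on a setup error. */
int private_page_detaches_after_write(void)
{
    char path[] = "/tmp/map_private_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                        /* file lives only as long as fd */

    long page = sysconf(_SC_PAGESIZE);
    if (ftruncate(fd, page) != 0 || pwrite(fd, "file-v1", 7, 0) != 7) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = 'X';                          /* store triggers copy-on-write */

    int result = -1;
    char buf[7];
    if (pwrite(fd, "FILE-V2", 7, 0) == 7 && pread(fd, buf, 7, 0) == 7) {
        /* The mapping should still hold the old contents plus our store. */
        int mapping_kept_copy = (memcmp(p, "Xile-v1", 7) == 0);
        /* The file should hold only what pwrite() put there. */
        int file_untouched = (memcmp(buf, "FILE-V2", 7) == 0);
        result = mapping_kept_copy && file_untouched;
    }

    munmap(p, page);
    close(fd);
    return result;
}
```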
That said, this _is_ a QoI thing, and in Linux we've generally tried
quite hard to stay as coherent as possible even with private mappings.
In fact, before we had real shared file mappings (in a distant past,
long long ago), we allowed read-only shared mappings because we
internally turned them into read-only private mappings and our private
mappings were coherent.
And those "fake" read-only shared mappings actually were much better
than some other platforms' "real" shared mappings (*cough*hpux*cough*)
and actually worked with things that mixed "write()" and "mmap()" and
expected coherency.
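
[Illustration, not part of the original message.] The kind of program that mixes write() and mmap() and expects coherency can be sketched roughly like this. It is a minimal sketch assuming a current Linux kernel with a unified page cache; `mapping_sees_later_write` and the temporary path are names invented here. It maps a file read-only, faults the page in, changes the file through pwrite(), and checks whether the mapping shows the new data.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* map_flags is MAP_SHARED or MAP_PRIVATE.
 * Returns 1 if a read-only mapping that was already faulted in shows data
 * written afterwards with pwrite(), 0 if it still shows the old data, and
 * -1 on a setup error. */
int mapping_sees_later_write(int map_flags)
{
    char path[] = "/tmp/mmap_coherency_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                        /* file lives only as long as fd */

    long page = sysconf(_SC_PAGESIZE);
    int result = -1;

    if (ftruncate(fd, page) == 0 && pwrite(fd, "old", 3, 0) == 3) {
        /* volatile forces a real load from the mapping on every access */
        volatile char *p = mmap(NULL, page, PROT_READ, map_flags, fd, 0);
        if (p != MAP_FAILED) {
            char before = p[0];          /* fault the page in, expect 'o' */
            if (pwrite(fd, "new", 3, 0) == 3)
                result = (before == 'o' && p[0] == 'n');
            munmap((void *)p, page);
        }
    }

    close(fd);
    return result;
}
```

Calling it with MAP_SHARED exercises the case a shared mapping is meant to handle. Calling it with MAP_PRIVATE exercises the quality-of-implementation behaviour described above for pages that have only been read, which other systems are not obliged to match.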
Admittedly the only case I'm aware of that did that was nntpd or
something like that. Exactly because a lot of Unixes were *not*
coherent (either because they had actual hardware cache coherency
issues, or because their VM was not set up for it).
Linus