linux-kernel - Re: [PATCH] mm for fs: add truncate_pagecache

Date:	Sun, 25 Mar 2012 14:55:36 -0700 (PDT)
From:	Hugh Dickins <hughd@...gle.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
cc:	Hugh Dickins <hughd@...gle.com>,
	Christoph Hellwig <hch@...radead.org>,
	Theodore Ts'o <tytso@....edu>,
	Al Viro <viro@...iv.linux.org.uk>,
	Alex Elder <elder@...nel.org>,
	Andreas Dilger <adilger.kernel@...ger.ca>,
	Ben Myers <bpm@....com>, Dave Chinner <david@...morbit.com>,
	Joel Becker <jlbec@...lplan.org>,
	Mark Fasheh <mfasheh@...e.com>, linux-kernel@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] mm for fs: add truncate_pagecache_range

On Sun, 25 Mar 2012, Andrew Morton wrote:
> On Sun, 25 Mar 2012 13:26:10 -0700 (PDT) Hugh Dickins <hughd@...gle.com> wrote:
> > truncate_pagecache_range() is just a drop-in replacement for
> > truncate_inode_pages_range(), and has no different locking needs.
> 
> Does anything prevent new pages from getting added to pagecache and
> perhaps faulted into VMAs after or during the execution of these
> functions?

If a page is faulted into a VMA after the unmap_mapping_range() but
before truncate_inode_pages_range() reaches it, then it gets unmapped
by the fallback unmap_mapping_range(), called from truncate_inode_page()
while holding the page lock.

A new page could be faulted in a moment after; but last year I did
change truncate_inode_pages_range() slightly, so that it pinches down
on the range instead of making a single ascending linear scan, and does
not return until the range is empty of pages (excepting RCU races, which
I think mean there's no exact instant of return that all CPUs would
agree upon).

A new page could be faulted in a moment after that, and then it survives:
unlike in the truncation case, there's no equivalent of i_size to
determine whether to SIGBUS.  (But even in the truncation case, a
truncate or write to increase i_size may follow an instant later.)
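
The "survives, no SIGBUS" behaviour can be observed from user space.
What follows is only a minimal sketch, not code from this thread: it
assumes a Linux filesystem that supports
fallocate(FALLOC_FL_PUNCH_HOLE), and the function name
punch_hole_then_fault() and the /tmp path are illustrative. It maps a
file, punches a hole in the middle, then touches the punched range
through the mapping.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper. Returns 1 if, after the hole punch, the punched
 * range reads back as zeros through the existing mapping, the data
 * outside the hole is intact and i_size is unchanged; 0 if any of that
 * is not so; -1 if setup failed or hole punching is unsupported here.
 */
int punch_hole_then_fault(void)
{
    char path[] = "/tmp/punchdemo-XXXXXX";   /* illustrative location */
    long pagesz = sysconf(_SC_PAGESIZE);
    size_t page, len, i;
    volatile unsigned char *map;
    struct stat st;
    void *addr;
    char *buf;
    int fd, ok, result = -1;

    assert(pagesz > 0);
    page = (size_t)pagesz;
    len = 4 * page;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                 /* file lives on until fd is closed */

    buf = malloc(len);
    if (!buf)
        goto out_close;
    memset(buf, 'x', len);
    if (write(fd, buf, len) != (ssize_t)len)
        goto out_free;

    addr = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED)
        goto out_free;
    map = addr;

    /* Fault a page of the future hole into the mapping first. */
    if (map[page] != 'x')
        goto out_unmap;

    /* Punch out pages 1 and 2, leaving i_size alone. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  (off_t)page, (off_t)(2 * page)) != 0)
        goto out_unmap;           /* e.g. unsupported on this fs */

    if (fstat(fd, &st) != 0)
        goto out_unmap;

    /* Touching the hole faults fresh pages in: zeros, not SIGBUS. */
    ok = 1;
    for (i = page; i < 3 * page; i++)
        if (map[i] != 0)
            ok = 0;

    /* Pages outside the hole keep their data; i_size is unchanged. */
    if (map[0] != 'x' || map[3 * page] != 'x' || st.st_size != (off_t)len)
        ok = 0;
    result = ok;

out_unmap:
    munmap(addr, len);
out_free:
    free(buf);
out_close:
    close(fd);
    return result;
}
```

A caller would treat a return of 1 as "the punched range was refaulted
as zero pages", and -1 as "could not run the experiment on this
filesystem".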

Individual filesystems may impose additional constraints to guarantee
their own internal consistency; and tmpfs certainly finds inode->i_mutex
useful for that, to serialize between holepunch and truncate and write.
I wouldn't be surprised if other filesystems found it useful too,
but that's up to them - truncate_pagecache_range() doesn't need it.

> 
> Also, I wonder what prevents pages in the range from being dirtied
> between ext4_ext_punch_hole()'s filemap_write_and_wait_range() and
> truncate_inode_pages_range().

I'm not going to guess on that, or whether it matters: Ted?

Hugh