Message-ID: <20200614162650.GP8681@bombadil.infradead.org>
Date: Sun, 14 Jun 2020 09:26:50 -0700
From: Matthew Wilcox <willy@...radead.org>
To: linux-fsdevel@...r.kernel.org
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Hugh Dickins <hughd@...gle.com>
Subject: Re: [RFC v6 00/51] Large pages in the page cache
On Wed, Jun 10, 2020 at 01:12:54PM -0700, Matthew Wilcox wrote:
> Another fortnight, another dump of my current large pages work.
The generic/127 test has pointed out to me that range writeback is
broken by this patchset. Here's how (may not be exactly what's going on,
but it's close):
- The page cache allocates an order-2 page covering indices 40-43.
- Bytes are written and the page is dirtied.
- The test then calls fallocate(FALLOC_FL_COLLAPSE_RANGE) for a range
  which starts in page 41 (userspace sketch below).
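For reference, the userspace trigger looks roughly like this (a hedged
sketch, not the actual generic/127 source; the file name and offsets
are illustrative, chosen to match the indices above):

#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <linux/falloc.h>

int main(void)
{
        int fd = open("testfile", O_RDWR);

        if (fd < 0)
                return 1;
        /* Collapse a block-sized range starting inside the dirty
         * order-2 page, i.e. at byte offset 41 * 4096.  Offset and
         * length must be multiples of the filesystem block size. */
        if (fallocate(fd, FALLOC_FL_COLLAPSE_RANGE, 41 * 4096L, 4096L))
                return 1;
        return close(fd);
}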
XFS calls filemap_write_and_wait_range() which calls
__filemap_fdatawrite_range() which calls
do_writepages() which calls
iomap_writepages() which calls
write_cache_pages() which calls
tag_pages_for_writeback() which calls
xas_for_each_marked() starting at index 41, which doesn't find the
page because when we dirtied the page covering indices 40-43, we only
set the dirty mark on index 40.
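To make that concrete, here's a minimal sketch in terms of the raw
XArray API (hedged; this is not the real page-cache code, and
XA_MARK_0 stands in for PAGECACHE_TAG_DIRTY):

#include <linux/xarray.h>

/* Store the same page at indices 40-43 (what we do today for an
 * order-2 page), but set the mark only on the head index, then
 * search from index 41. */
static void demo_mark_miss(struct xarray *xa, void *page)
{
        XA_STATE(xas, xa, 41);
        unsigned long i;
        void *entry;

        for (i = 40; i <= 43; i++)
                xa_store(xa, i, page, GFP_KERNEL);
        /* Only the head index gets the dirty mark. */
        xa_set_mark(xa, 40, XA_MARK_0);

        rcu_read_lock();
        xas_for_each_marked(&xas, entry, 43, XA_MARK_0)
                pr_info("marked entry at %lu\n", xas.xa_index);
        rcu_read_unlock();
        /* Prints nothing: index 41 holds an entry, but it's unmarked. */
}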
Annoyingly, the XArray actually handles this just fine: if we were
using multi-order entries, the marked search would find it. But we're
still storing 2^N individual entries for an order-N page.
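For comparison, the multi-order version (a hedged sketch; it assumes
CONFIG_XARRAY_MULTI): one order-2 entry covers indices 40-43, the
mark applies to the whole entry, and the same search starting at 41
finds it:

#include <linux/xarray.h>

static void demo_multi_order(struct xarray *xa, void *page)
{
        XA_STATE_ORDER(xas, xa, 40, 2);  /* one entry for 40-43 */
        void *entry;

        do {
                xas_lock(&xas);
                xas_store(&xas, page);
                xas_set_mark(&xas, XA_MARK_0);
                xas_unlock(&xas);
        } while (xas_nomem(&xas, GFP_KERNEL));

        xas_set(&xas, 41);               /* restart the walk at 41 */
        rcu_read_lock();
        xas_for_each_marked(&xas, entry, 43, XA_MARK_0)
                pr_info("marked entry at %lu\n", xas.xa_index);
        rcu_read_unlock();
        /* Finds the order-2 entry even though the walk began at 41. */
}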
I can see two ways to fix this. One is to bite the bullet and do the
conversion of the page cache to use multi-order entries. The second
is to set and clear the marks on all entries. I'm concerned about the
performance of the latter solution. Not so bad for order-2 pages, but for
an order-9 page we have 520 bits to set, spread over 9 non-consecutive
cachelines. Also, I'm unenthusiastic about writing code that I want to
throw away as quickly as possible.
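For the record, the stopgap would look something like this hedged
sketch (a real version would batch the bit-sets within each node
rather than walk the tree once per index):

#include <linux/xarray.h>

/* Set the mark on every index covered by the compound page.  For an
 * order-9 page that's 512 bits across 8 leaf nodes of 64 slots each,
 * plus the 8 corresponding bits in their parent node: the 520 bits
 * in 9 cachelines mentioned above. */
static void set_mark_on_all_entries(struct xarray *xa,
                                    unsigned long index,
                                    unsigned int order)
{
        unsigned long i;

        for (i = index; i < index + (1UL << order); i++)
                xa_set_mark(xa, i, XA_MARK_0);
}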
So unless somebody has a really good alternative idea, I'm going to
convert the page cache over to multi-order entries. This will have
several positive effects:
- Get DAX and regular page cache using the xarray in a more similar way
- Saves about 4.5kB of memory for every 2MB page in tmpfs/shmem
  (rough arithmetic below)
- Prep work for converting hugetlbfs to use the page cache the same
way as tmpfs
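Rough arithmetic for the 4.5kB number (assuming 64-bit, where a
struct xa_node is 576 bytes): an order-9 page currently occupies 512
slots, which need 8 leaf nodes of 64 slots each; 8 * 576 = 4608
bytes, about 4.5kB, all of which a single multi-order entry makes
unnecessary.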