Message-ID: <Zy0onj9R_VJnk17p@casper.infradead.org>
Date: Thu, 7 Nov 2024 20:52:46 +0000
From: Matthew Wilcox <willy@...radead.org>
To: Dave Chinner <david@...morbit.com>
Cc: Asahi Lina <lina@...hilina.net>, Jan Kara <jack@...e.cz>,
Dan Williams <dan.j.williams@...el.com>,
Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>,
Sergio Lopez Pascual <slp@...hat.com>,
linux-fsdevel@...r.kernel.org, nvdimm@...ts.linux.dev,
linux-kernel@...r.kernel.org, asahi@...ts.linux.dev
Subject: Re: [PATCH] dax: Allow block size > PAGE_SIZE
On Tue, Nov 05, 2024 at 09:16:40AM +1100, Dave Chinner wrote:
> The DAX infrastructure needs the same changes for fsb > page size
> support. We have a limited number of bits we can use for DAX entry
> state:
>
> /*
> * DAX pagecache entries use XArray value entries so they can't be mistaken
> * for pages. We use one bit for locking, one bit for the entry size (PMD)
> * and two more to tell us if the entry is a zero page or an empty entry that
> * is just used for locking. In total four special bits.
> *
> * If the PMD bit isn't set the entry has size PAGE_SIZE, and if the ZERO_PAGE
> * and EMPTY bits aren't set the entry is a normal DAX entry with a filesystem
> * block allocation.
> */
> #define DAX_SHIFT (4)
> #define DAX_LOCKED (1UL << 0)
> #define DAX_PMD (1UL << 1)
> #define DAX_ZERO_PAGE (1UL << 2)
> #define DAX_EMPTY (1UL << 3)
>
> I *think* that we have at most PAGE_SHIFT worth of bits we can
> use because we only store the pfn part of the pfn_t in the dax
> entry. There are PAGE_SHIFT high bits in the pfn_t that hold
> pfn state that we mask out.
We're a lot more constrained than that on 32-bit. We support up to 40
bits of physical address on arm32 (well, the hardware supports it ...
Linux is not very good with that amount of physical space). Assuming a
PAGE_SHIFT of 12, we've got 3 bits (yes, the current DAX doesn't support
the 40th bit on arm32). Fortunately, we don't need more than that.
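A rough sketch of that bit budget, in case the arithmetic is not obvious. The helper name is hypothetical and not kernel code; the only kernel fact it relies on is that an XArray value entry spends one bit of the unsigned long on the value tag.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many low bits of an XArray value entry are
 * left for flags once the pfn has been stored in it.
 */
static unsigned int dax_spare_bits(unsigned int bits_per_long,
				   unsigned int phys_addr_bits,
				   unsigned int page_shift)
{
	/* xa_mk_value() uses one bit to mark the entry as a value. */
	unsigned int value_bits = bits_per_long - 1;
	/* A pfn is a physical address with the page offset shifted out. */
	unsigned int pfn_bits = phys_addr_bits - page_shift;

	assert(value_bits >= pfn_bits);
	return value_bits - pfn_bits;
}
```

Called as dax_spare_bits(32, 40, 12) this gives 3, while 39 bits of physical address gives 4, which matches the current DAX_SHIFT of 4 not covering the 40th bit.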
There is a set of encodings which doesn't seem to have a name (perhaps
I should name it after myself) that can encode any power-of-two that is
naturally aligned by using just one extra bit. I've documented it here:
https://kernelnewbies.org/MatthewWilcox/NaturallyAlignedOrder
So we can just recycle the DAX_PMD bit as bit 0 of the encoding.
We can also reclaim DAX_EMPTY by using the "No object" encoding as
DAX_EMPTY. So that gives us a bit back.
i.e. the functions I'd actually have in dax.c would be:
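#define DAX_LOCKED	1
#define DAX_ZERO_PAGE	2

/* Order of the entry: index of the lowest set bit above the two flag bits. */
unsigned int dax_entry_order(void *entry)
{
	return ffsl(xa_to_value(entry) >> 2) - 1;
}

/* Clear the order marker (the lowest set bit); what remains is pfn * 2. */
unsigned long dax_to_pfn(void *entry)
{
	unsigned long v = xa_to_value(entry) >> 2;

	return (v & (v - 1)) / 2;
}

void *dax_make_entry(pfn_t pfn, unsigned int order, unsigned long flags)
{
	/* The pfn must be naturally aligned to the order. */
	VM_BUG_ON(pfn_t_to_pfn(pfn) & ((1UL << order) - 1));
	flags |= (4UL << order) | (pfn_t_to_pfn(pfn) * 8);
	return xa_mk_value(flags);
}

/* The "No object" encoding: neither an order bit nor any pfn bits are set. */
bool dax_is_empty_entry(void *entry)
{
	return (xa_to_value(entry) >> 2) == 0;
}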
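
For anyone who wants to poke at the encoding outside the kernel, here is a minimal userspace sketch of the same functions. Everything in it is a stand-in I have assumed for illustration: mk_value()/to_value() mimic xa_mk_value()/xa_to_value(), sketch_ffsl() replaces ffsl(), and a plain unsigned long replaces both pfn_t and the void * entry.

```c
#include <assert.h>

#define DAX_LOCKED	1UL
#define DAX_ZERO_PAGE	2UL

/* Stand-ins for xa_mk_value()/xa_to_value(): bit 0 tags a value entry. */
static unsigned long mk_value(unsigned long v) { return (v << 1) | 1; }
static unsigned long to_value(unsigned long e) { return e >> 1; }

/* Portable stand-in for ffsl(): 1-based index of the lowest set bit, 0 if none. */
static int sketch_ffsl(unsigned long v)
{
	int i = 1;

	if (!v)
		return 0;
	while (!(v & 1)) {
		v >>= 1;
		i++;
	}
	return i;
}

static unsigned long make_entry(unsigned long pfn, unsigned int order,
				unsigned long flags)
{
	/* The pfn must be naturally aligned to the order. */
	assert((pfn & ((1UL << order) - 1)) == 0);
	return mk_value(flags | (4UL << order) | (pfn * 8));
}

static unsigned int entry_order(unsigned long e)
{
	return sketch_ffsl(to_value(e) >> 2) - 1;
}

static unsigned long entry_pfn(unsigned long e)
{
	unsigned long v = to_value(e) >> 2;

	/* Clearing the lowest set bit removes the order marker. */
	return (v & (v - 1)) / 2;
}

static int entry_is_empty(unsigned long e)
{
	return (to_value(e) >> 2) == 0;
}
```

A PMD-sized entry such as make_entry(0x200, 9, DAX_LOCKED) round-trips through entry_order() and entry_pfn(), and an entry made of flag bits alone, such as mk_value(DAX_LOCKED), reads back as empty.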