Message-ID: <Zy0onj9R_VJnk17p@casper.infradead.org>
Date: Thu, 7 Nov 2024 20:52:46 +0000
From: Matthew Wilcox <willy@...radead.org>
To: Dave Chinner <david@...morbit.com>
Cc: Asahi Lina <lina@...hilina.net>, Jan Kara <jack@...e.cz>,
	Dan Williams <dan.j.williams@...el.com>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Christian Brauner <brauner@...nel.org>,
	Sergio Lopez Pascual <slp@...hat.com>,
	linux-fsdevel@...r.kernel.org, nvdimm@...ts.linux.dev,
	linux-kernel@...r.kernel.org, asahi@...ts.linux.dev
Subject: Re: [PATCH] dax: Allow block size > PAGE_SIZE

On Tue, Nov 05, 2024 at 09:16:40AM +1100, Dave Chinner wrote:
> The DAX infrastructure needs the same changes for fsb > page size
> support. We have a limited number of bits we can use for DAX entry
> state:
> 
> /*
>  * DAX pagecache entries use XArray value entries so they can't be mistaken
>  * for pages.  We use one bit for locking, one bit for the entry size (PMD)
>  * and two more to tell us if the entry is a zero page or an empty entry that
>  * is just used for locking.  In total four special bits.
>  *
>  * If the PMD bit isn't set the entry has size PAGE_SIZE, and if the ZERO_PAGE
>  * and EMPTY bits aren't set the entry is a normal DAX entry with a filesystem
>  * block allocation.
>  */
> #define DAX_SHIFT       (4)
> #define DAX_LOCKED      (1UL << 0)
> #define DAX_PMD         (1UL << 1)
> #define DAX_ZERO_PAGE   (1UL << 2)
> #define DAX_EMPTY       (1UL << 3)
> 
> I *think* that we have at most PAGE_SHIFT worth of bits we can
> use because we only store the pfn part of the pfn_t in the dax
> entry. There are PAGE_SHIFT high bits in the pfn_t that hold
> pfn state that we mask out.

We're a lot more constrained than that on 32-bit.  We support up to 40
bits of physical address on arm32 (well, the hardware supports it ...
Linux is not very good with that amount of physical space).  Assuming a
PAGE_SHIFT of 12, we've got 3 bits (yes, the current DAX doesn't support
the 40th bit on arm32).  Fortunately, we don't need more than that.
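
For illustration, the arithmetic behind the "3 bits" figure can be sketched
in userspace C. dax_spare_bits() is a hypothetical helper, not a kernel
function; it assumes only that an XArray value entry gives up one bit of
the unsigned long to the value tag, so 31 bits are usable on 32-bit.

```c
/*
 * Hypothetical helper (not in the kernel): how many bits remain for
 * DAX flags once a full pfn is stored in an XArray value entry.
 */
static int dax_spare_bits(int bits_per_long, int phys_addr_bits,
			  int page_shift)
{
	/* xa_mk_value() shifts the value left by one to tag it */
	int value_bits = bits_per_long - 1;
	/* bits needed to represent any pfn in the physical address space */
	int pfn_bits = phys_addr_bits - page_shift;

	return value_bits - pfn_bits;
}
```

With 32-bit longs, 40 physical address bits and a PAGE_SHIFT of 12 this
yields 3; with 39 address bits it yields 4, which matches the current
DAX_SHIFT of 4 not covering the 40th bit.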

There is a set of encodings which doesn't seem to have a name (perhaps
I should name it after myself) that can encode both the address and the
size of any naturally aligned power-of-two object using just one extra
bit.  I've documented it here:

https://kernelnewbies.org/MatthewWilcox/NaturallyAlignedOrder

So we can just recycle the DAX_PMD bit as bit 0 of the encoding.
We can also reclaim DAX_EMPTY by using the "No object" encoding as
DAX_EMPTY.  So that gives us a bit back.

i.e. the functions I'd actually have in dax.c would be:

#define DAX_LOCKED	1
#define DAX_ZERO_PAGE	2

unsigned int dax_entry_order(void *entry)
{
	return ffsl(xa_to_value(entry) >> 2) - 1;
}

unsigned long dax_to_pfn(void *entry)
{
	unsigned long v = xa_to_value(entry) >> 2;
	return (v & (v - 1)) / 2;
}

void *dax_make_entry(pfn_t pfn, unsigned int order, unsigned long flags)
{
	/* != binds tighter than &, so test the masked value directly */
	VM_BUG_ON(pfn_t_to_pfn(pfn) & ((1UL << order) - 1));
	flags |= (4UL << order) | (pfn_t_to_pfn(pfn) * 8);
	return xa_mk_value(flags);
}

bool dax_is_empty_entry(void *entry)
{
	return (xa_to_value(entry) >> 2) == 0;
}
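
A userspace sketch of the same bit layout may help when checking the
round trip by hand. The demo_* names are hypothetical stand-ins, the
XArray value tagging (xa_mk_value/xa_to_value) is left out, plain
unsigned long replaces pfn_t, and GCC/Clang's __builtin_ffsl() stands in
for ffsl().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the flags above. */
#define DEMO_LOCKED	1UL
#define DEMO_ZERO_PAGE	2UL

/* Layout: bits 0-1 flags, one marker bit at (2 + order), pfn above it. */
static unsigned long demo_make(unsigned long pfn, unsigned int order,
			       unsigned long flags)
{
	/* the pfn must be naturally aligned to the order */
	assert((pfn & ((1UL << order) - 1)) == 0);
	return flags | (4UL << order) | (pfn * 8);
}

static unsigned int demo_order(unsigned long v)
{
	/* the lowest set bit above the flags is the order marker */
	return __builtin_ffsl(v >> 2) - 1;
}

static unsigned long demo_pfn(unsigned long v)
{
	v >>= 2;
	/* clear the marker bit, then drop the slot it occupied */
	return (v & (v - 1)) / 2;
}

static bool demo_is_empty(unsigned long v)
{
	/* no marker bit at all means "no object" */
	return (v >> 2) == 0;
}
```

For example, pfn 0x200 at order 9 with DEMO_LOCKED packs to 0x1801 and
decodes back to order 9 and pfn 0x200. An entry for pfn 0 at order 0
still has its marker bit set, so it stays distinguishable from the empty
encoding.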
