Message-ID: <20241104105711.mqk4of6frmsllarn@quack3>
Date: Mon, 4 Nov 2024 11:57:11 +0100
From: Jan Kara <jack@...e.cz>
To: Asahi Lina <lina@...hilina.net>
Cc: Dan Williams <dan.j.williams@...el.com>,
Matthew Wilcox <willy@...radead.org>, Jan Kara <jack@...e.cz>,
Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>,
Sergio Lopez Pascual <slp@...hat.com>,
linux-fsdevel@...r.kernel.org, nvdimm@...ts.linux.dev,
linux-kernel@...r.kernel.org, asahi@...ts.linux.dev
Subject: Re: [PATCH] dax: Allow block size > PAGE_SIZE
On Fri 01-11-24 21:22:31, Asahi Lina wrote:
> For virtio-dax, the file/FS blocksize is irrelevant. FUSE always uses
> large DAX blocks (2MiB), which will work with all host page sizes. Since
> we are mapping files into the DAX window on the host, the underlying
> block size of the filesystem and its block device (if any) are
> meaningless.
>
> For real devices with DAX, the only requirement should be that the FS
> block size is *at least* as large as PAGE_SIZE, to ensure that at least
> whole pages can be mapped out of the device contiguously.
>
> Fixes warning when using virtio-dax on a 4K guest with a 16K host,
> backed by tmpfs (which sets blksz == PAGE_SIZE on the host).
>
> Signed-off-by: Asahi Lina <lina@...hilina.net>
> ---
> fs/dax.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Well, I don't quite understand how just relaxing the check is enough. I
guess it may work with virtiofs (I don't know enough about virtiofs to
really tell either way), but for an ordinary DAX filesystem it would be
seriously wrong if DAX was used with blocksize > pagesize, as multiple
mapping entries could be pointing to the same PFN, which is going to have
weird results. If virtiofs can actually map 4k subpages out of a 16k page on
the host (and generally perform 4k granular tracking etc.), it would seem more
appropriate if virtiofs actually exposed a 4k filesystem block size instead
of a 16k blocksize? Or am I missing something?
Honza
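
To make the two checks under discussion concrete, here is a minimal
userspace sketch, not kernel code. Only i_blkbits and PAGE_SHIFT come from
the patch; struct geometry and the helper names are hypothetical, made up
for illustration. It models only the comparison and the number of pages one
block spans. It does not model how DAX mapping entries are actually created.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for inode->i_blkbits and PAGE_SHIFT. */
struct geometry {
	unsigned int blkbits;    /* log2 of the filesystem block size */
	unsigned int page_shift; /* log2 of the page size */
};

/* Check before the patch: passes only when block size == page size. */
static inline bool strict_check_ok(struct geometry g)
{
	return g.blkbits == g.page_shift;
}

/* Check after the patch: passes when block size >= page size. */
static inline bool relaxed_check_ok(struct geometry g)
{
	return g.blkbits >= g.page_shift;
}

/* Number of pages spanned by one filesystem block. */
static inline unsigned int pages_per_block(struct geometry g)
{
	/* Only meaningful when a block is at least one page. */
	assert(g.blkbits >= g.page_shift);
	return 1u << (g.blkbits - g.page_shift);
}
```

With blkbits = 14 and page_shift = 12 (a 4K guest seeing a 16K block size,
as in the report), strict_check_ok() is false, relaxed_check_ok() is true,
and pages_per_block() is 4. The open question above is whether anything
tracks those four pages individually.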
> diff --git a/fs/dax.c b/fs/dax.c
> index c62acd2812f8d4981aaba82acfeaf972f555362a..406fb75bdbe9d17a6e4bf3d4cb92683e90f05910 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -1032,7 +1032,7 @@ int dax_writeback_mapping_range(struct address_space *mapping,
> int ret = 0;
> unsigned int scanned = 0;
>
> - if (WARN_ON_ONCE(inode->i_blkbits != PAGE_SHIFT))
> + if (WARN_ON_ONCE(inode->i_blkbits < PAGE_SHIFT))
> return -EIO;
>
> if (mapping_empty(mapping) || wbc->sync_mode != WB_SYNC_ALL)
>
> ---
> base-commit: 81983758430957d9a5cb3333fe324fd70cf63e7e
> change-id: 20241101-dax-page-size-83a1073b4e1b
>
> Cheers,
> ~~ Lina
>
--
Jan Kara <jack@...e.com>
SUSE Labs, CR