Message-ID: <aRzYvYCLW66Zhcda@redhat.com>
Date: Tue, 18 Nov 2025 15:36:13 -0500
From: Benjamin Marzinski <bmarzins@...hat.com>
To: Mikulas Patocka <mpatocka@...hat.com>
Cc: "Uladzislau Rezki (Sony)" <urezki@...il.com>,
Alasdair Kergon <agk@...hat.com>, DMML <dm-devel@...ts.linux.dev>,
Andrew Morton <akpm@...ux-foundation.org>,
Mike Snitzer <snitzer@...hat.com>, Christoph Hellwig <hch@....de>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size

On Tue, Nov 18, 2025 at 06:45:55PM +0100, Mikulas Patocka wrote:
>
>
> On Mon, 17 Nov 2025, Benjamin Marzinski wrote:
>
> > On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> > >
> > >
> > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > >
> > > > When performing a read-modify-write (RMW) operation, any modification
> > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > >
> > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > device block size (ubs) defines the minimum read/write granularity. A
> > > > lower device can perform I/O only on regions which are fully aligned
> > > > and sized to ubs.
> > >
> > > Hi
> > >
> > > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > > users would also benefit from the fix.
> >
> > This looks to me like it should accomplish the same thing as
> > Uladzislau's patch. But I think there could still be problems with other
> > dm-bufio users, for devices where the blocksize is larger than 4k.
> >
> > In dm_bufio_client_create() I think we want to make sure that block_size
> > is a multiple of bdev_logical_block_size(bdev), instead of 512b.
>
> I could add WARN_ON(block_size < bdev_logical_block_size(bdev)) to
> dm_bufio_client_create. But I think it's too late in this development
> cycle; I would add it after the next merge window closes, when I open a
> new patch series for the kernel 6.20 (or 7.0).
>
> > Otherwise block_to_sector() can return sectors that are not addressable
> > on the device. Unfortunately, I don't think all users of dm-bufio will
> > pass in block_sizes that are larger than 4k (uds_make_bufio() in
> > dm-vdo/indexer/io-factory.c for instance).
> >
> > -Ben
> >
> > > Please try this patch - does it fix it?
> > >
> > > Mikulas
>
> I changed the patch below, so that it aligns write bios on
> max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev),
> bdev_physical_block_size(b->c->bdev)); - so that if physical block size is
> greater than logical block size, the writes are aligned so that the device
> doesn't do read-modify-write.

This will really only help if the bufio client block_size is a multiple
of the underlying device's physical block size, and the device is
aligned to the physical block size. Perhaps we should figure
out the alignment in dm_bufio_client_create(), with something like:

	c->align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(bdev));
	if (!(block_size & (bdev_physical_block_size(bdev) - 1)) &&
	    bdev_alignment_offset(bdev) == 0)
		c->align = bdev_physical_block_size(bdev);

I suppose pre-calculating this could cause problems if the underlying
device was another dm device, and it switched tables in a way that
changed its limits. I dunno if we care about that, however.
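
To make the conditions concrete, here is a rough userspace sketch of that
selection logic. It is not kernel code: struct fake_limits and
pick_write_align() are hypothetical names, WRITE_ALIGN is assumed to match
DM_BUFIO_WRITE_ALIGN (4096), and the sketch only ever raises the alignment,
which is a choice made here rather than something settled in this thread.

```c
/* Assumed to mirror DM_BUFIO_WRITE_ALIGN in drivers/md/dm-bufio.c. */
#define WRITE_ALIGN 4096u

/* Hypothetical stand-in for the values the bdev_*() helpers would return. */
struct fake_limits {
	unsigned int logical_block_size;	/* bytes, power of two */
	unsigned int physical_block_size;	/* bytes, power of two */
	int alignment_offset;			/* bytes, 0 when aligned */
};

/* Pick the write alignment for a client using the given block_size. */
static unsigned int pick_write_align(unsigned int block_size,
				     const struct fake_limits *lim)
{
	unsigned int align = WRITE_ALIGN;

	/* Never align on less than one logical block. */
	if (lim->logical_block_size > align)
		align = lim->logical_block_size;

	/*
	 * Go up to the physical block size only when block_size is a
	 * multiple of it and the device itself starts on a physical
	 * block boundary; otherwise the larger alignment buys nothing.
	 */
	if ((block_size & (lim->physical_block_size - 1)) == 0 &&
	    lim->alignment_offset == 0 &&
	    lim->physical_block_size > align)
		align = lim->physical_block_size;

	return align;
}
```

For example, a 16384-byte client block on a device with 4096-byte logical
and 16384-byte physical blocks gets 16384 when the alignment offset is 0,
and falls back to 4096 when the offset is non-zero.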

-Ben

> Mikulas
>
> > > From: Mikulas Patocka <mpatocka@...hat.com>
> > >
> > > There may be devices with logical block size larger than 4k. Fix
> > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > with 8k logical block size created with qemu parameters:
> > >
> > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > >
> > > Signed-off-by: Mikulas Patocka <mpatocka@...hat.com>
> > > Cc: stable@...r.kernel.org
> > >
> > > ---
> > > drivers/md/dm-bufio.c | 9 +++++----
> > > 1 file changed, 5 insertions(+), 4 deletions(-)
> > >
> > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > ===================================================================
> > > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > > {
> > > unsigned int n_sectors;
> > > sector_t sector;
> > > - unsigned int offset, end;
> > > + unsigned int offset, end, align;
> > >
> > > b->end_io = end_io;
> > >
> > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > > b->c->write_callback(b);
> > > offset = b->write_start;
> > > end = b->write_end;
> > > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > > - end &= -DM_BUFIO_WRITE_ALIGN;
> > > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > > + offset &= -align;
> > > + end += align - 1;
> > > + end &= -align;
> > > if (unlikely(end > b->c->block_size))
> > > end = b->c->block_size;
> > >
> > >
> >
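
For reference, the rounding done in the quoted submit_io() hunk can be tried
out in userspace. This is only a sketch of the arithmetic: struct byte_range
and round_dirty_range() are hypothetical names, and align is assumed to be a
power of two, as the block sizes discussed above are.

```c
/* Hypothetical helper type: a [start, end) byte range inside one buffer. */
struct byte_range {
	unsigned int start;
	unsigned int end;
};

/*
 * Round a dirty range outwards to 'align' bytes, then clamp the end to
 * the buffer size, following the same steps as the quoted hunk.
 * 'align' must be a power of two.
 */
static struct byte_range round_dirty_range(unsigned int write_start,
					   unsigned int write_end,
					   unsigned int align,
					   unsigned int block_size)
{
	struct byte_range r;

	r.start = write_start & -align;			/* round down */
	r.end = (write_end + align - 1) & -align;	/* round up */
	if (r.end > block_size)
		r.end = block_size;			/* stay inside the buffer */
	return r;
}
```

With a dirty range of 5000..6000 in a 16384-byte buffer, align 4096 gives
4096..8192 and align 8192 gives 0..8192. With a 4096-byte client block and
align 8192, a dirty range of 100..200 comes out as 0..4096: the clamp leaves
a write shorter than one logical block, which is the case a block_size check
in dm_bufio_client_create() would have to catch.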