Message-ID: <20221101235144.06a3dbd3@xps.demsh.org>
Date: Tue, 1 Nov 2022 23:51:44 +0300
From: Dmitrii Tcvetkov <me@...sh.org>
To: Keith Busch <kbusch@...nel.org>
Cc: Jens Axboe <axboe@...nel.dk>, Song Liu <song@...nel.org>,
linux-raid@...r.kernel.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [bisected] RAID1 direct IO redirecting sector loop since 6.0
On Tue, 1 Nov 2022 11:22:21 -0600
Keith Busch <kbusch@...nel.org> wrote:
> On Tue, Nov 01, 2022 at 12:15:58AM +0300, Dmitrii Tcvetkov wrote:
> >
> > # cat /proc/7906/stack
> > [<0>] submit_bio_wait+0xdb/0x140
> > [<0>] blkdev_direct_IO+0x62f/0x770
> > [<0>] blkdev_read_iter+0xc1/0x140
> > [<0>] vfs_read+0x34e/0x3c0
> > [<0>] __x64_sys_pread64+0x74/0xc0
> > [<0>] do_syscall_64+0x6a/0x90
> > [<0>] entry_SYSCALL_64_after_hwframe+0x4b/0xb5
> >
> > After "mdadm --fail" invocation the last line becomes:
> > [pid 7906] pread64(13, 0x627c34c8d200, 4096, 0) = -1 EIO (Input/output error)
>
> It looks like something isn't accounting for the IO size correctly
> when there's an offset. It may be something specific to one of the
> stacking drivers in your block setup. Does this still happen without
> the cryptsetup step?
>
I created the setup lvm(mdraid(gpt(HDD))):
# lsblk -t -a
NAME                  ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE  RA WSAME
...
sdd                           0    512      0     512     512    1 bfq        64 128    0B
├─sdd3                        0    512      0     512     512    1 bfq        64 128    0B
│ └─md1                       0    512      0     512     512    1           128 128    0B
│   ├─512lvmraid-zfs          0    512      0     512     512    1           128 128    0B
│   └─512lvmraid-wrk          0    512      0     512     512    1           128 128    0B
sde                           0    512      0     512     512    1 bfq        64 128    0B
├─sde3                        0    512      0     512     512    1 bfq        64 128    0B
│ └─md1                       0    512      0     512     512    1           128 128    0B
│   ├─512lvmraid-zfs          0    512      0     512     512    1           128 128    0B
│   └─512lvmraid-wrk          0    512      0     512     512    1           128 128    0B
where:
# mdadm --create --level=1 --metadata=1.2 \
--raid-devices=2 /dev/md1 /dev/sdd3 /dev/sde3
# pvcreate /dev/md1
# vgcreate 512lvmraid /dev/md1
In this case the problem doesn't reproduce; both guests start successfully.
It also doesn't reproduce with 4096-byte-sector loop devices:
# lsblk -t -a
NAME                  ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE  RA WSAME
loop0                         0   4096      0    4096    4096    0 none      128 128    0B
└─md2                         0   4096      0    4096    4096    0           128 128    0B
  ├─4096lvmraid-zfs           0   4096      0    4096    4096    0           128 128    0B
  └─4096lvmraid-wrk           0   4096      0    4096    4096    0           128 128    0B
loop1                         0   4096      0    4096    4096    0 none      128 128    0B
└─md2                         0   4096      0    4096    4096    0           128 128    0B
  ├─4096lvmraid-zfs           0   4096      0    4096    4096    0           128 128    0B
  └─4096lvmraid-wrk           0   4096      0    4096    4096    0           128 128    0B
where:
# losetup --sector-size 4096 -f /dev/sdd4
# losetup --sector-size 4096 -f /dev/sde4
# mdadm --create --level=1 --metadata=1.2 \
--raid-devices=2 /dev/md2 /dev/loop0 /dev/loop1
# pvcreate /dev/md2
# vgcreate 4096lvmraid /dev/md2
So it does indeed look like something is wrong in the LUKS layer.
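As a side note on the failing call quoted above: the pread64 buffer
address 0x627c34c8d200 is a multiple of 512 but not of 4096, which may
matter if one of the stacked devices expects 4096-byte alignment. A
minimal userspace sketch of a mask-based alignment check follows.
io_is_aligned() and its parameters are hypothetical names used for
illustration only; this is not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when the buffer address has no bits
 * set in addr_mask and the length has no bits set in len_mask.
 * Each mask is (alignment - 1) for a power-of-two alignment.
 */
bool io_is_aligned(uint64_t addr, size_t len,
		   uint64_t addr_mask, uint64_t len_mask)
{
	/* The bit test only makes sense for power-of-two alignments. */
	assert(((addr_mask + 1) & addr_mask) == 0);
	assert(((len_mask + 1) & len_mask) == 0);

	return (addr & addr_mask) == 0 && ((uint64_t)len & len_mask) == 0;
}
```

With the address and length from the strace line, the check passes for
a 512-byte mask (511) and fails for a 4096-byte mask (4095), because
the low 12 bits of the address are 0x200.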
> For a different experiment, it may be safer to just force all
> alignment for stacking drivers. Could you try the following and see
> if that gets it working again?
>
> ---
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 8bb9eef5310e..5c16fdb00c6f 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -646,6 +646,7 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
>  		t->misaligned = 1;
>  		ret = -1;
>  	}
> +	blk_queue_dma_alignment(t, t->logical_block_size - 1);
>  
>  	t->max_sectors = blk_round_down_sectors(t->max_sectors, t->logical_block_size);
>  	t->max_hw_sectors = blk_round_down_sectors(t->max_hw_sectors, t->logical_block_size);
> --
This doesn't compile:
CC block/blk-settings.o
block/blk-settings.c: In function ‘blk_stack_limits’:
block/blk-settings.c:649:33: error: passing argument 1 of ‘blk_queue_dma_alignment’ from incompatible pointer type [-Werror=incompatible-pointer-types]
  649 |         blk_queue_dma_alignment(t, t->logical_block_size - 1);
      |                                 ^
      |                                 |
      |                                 struct queue_limits *
In file included from block/blk-settings.c:9:
./include/linux/blkdev.h:956:37: note: expected ‘struct request_queue *’ but argument is of type ‘struct queue_limits *’
956 | extern void blk_queue_dma_alignment(struct request_queue *, int);
I didn't find an obvious way to get the request_queue pointer that corresponds to struct queue_limits *t.
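For illustration only, here is a userspace sketch of the
container_of-style pointer arithmetic that would recover a queue from
limits embedded in it. The mock_* types and mock_limits_to_queue() are
hypothetical stand-ins, not kernel definitions. The approach is only
valid under the assumption that the limits pointer really points into
a queue structure, which a function that receives a bare limits
pointer cannot verify.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structs, reduced to the minimum. */
struct mock_queue_limits {
	unsigned int logical_block_size;
};

struct mock_request_queue {
	int dma_alignment;               /* set per queue in this mock */
	struct mock_queue_limits limits; /* limits embedded in the queue */
};

/* Step back from a member pointer to the start of its containing struct. */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Only correct when lim is the `limits` member of a mock_request_queue.
 * For a standalone mock_queue_limits object the result is a bogus pointer.
 */
struct mock_request_queue *mock_limits_to_queue(struct mock_queue_limits *lim)
{
	assert(lim != NULL);
	return mock_container_of(lim, struct mock_request_queue, limits);
}
```

Because the result is only meaningful for embedded limits, this would
not be a safe fix unless every caller is known to pass limits that
live inside a queue.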