Message-ID: <e6a7e68ff9e23aee448003ee1a279a4ab13192c0.camel@physik.fu-berlin.de>
Date: Mon, 06 Oct 2025 15:34:22 +0200
From: John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>
To: Jens Axboe <axboe@...nel.dk>, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org
Cc: Andreas Larsson <andreas@...sler.com>, Anthony Yznaga
<anthony.yznaga@...cle.com>, Sam James <sam@...too.org>, "David S . Miller"
<davem@...emloft.net>, Michael Karcher
<kernel@...rcher.dialup.fu-berlin.de>, sparclinux@...r.kernel.org
Subject: Re: [PATCH v2] Revert "sunvdc: Do not spin in an infinite loop when
vio_ldc_send() returns EAGAIN"
Hi Jens,
On Mon, 2025-10-06 at 07:19 -0600, Jens Axboe wrote:
> > The problem is that giving up can lead to filesystem corruption which
> > is problem that was never observed before the change from what I know.
>
> Yes, I get that.
>
> > We have deployed a kernel with the change reverted on several LDOMs that
> > are seeing heavy use, such as cfarm202.cfarm.net, and we have not seen any
> > system lockups or similar.
>
> I believe you. I'm just wondering how long you generally need to spin,
> as per the question above: how many times does it generally spin where
> it would've failed before?
I don't have any data on that, unfortunately. All I can say is that we have seen
filesystem corruption when installing Linux inside an LDOM, and this particular
issue was eventually tracked down to this commit.
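
If it would help to get real numbers, I could run one of the affected LDOMs with
a small instrumentation hack along the lines of the sketch below and report how
often the loop actually retries. This is purely illustrative and assumes the
usual vio_ldc_send() signature; the wrapper name, the counter and the log
message are my own inventions and not part of the driver:

	#include <linux/delay.h>
	#include <linux/printk.h>

	/*
	 * Illustrative sketch only: count how often vio_ldc_send() returns
	 * -EAGAIN before it finally succeeds (or fails with a different error)
	 * and log the worst case seen, ratelimited, so we get real numbers.
	 */
	static unsigned int vdc_max_eagain_spins;	/* hypothetical, for logging only */

	static int vdc_ldc_send_counted(struct vio_driver_state *vio, void *data, int len)
	{
		unsigned int spins = 0;
		int err;

		do {
			err = vio_ldc_send(vio, data, len);
			if (err != -EAGAIN)
				break;
			spins++;
			udelay(10);	/* placeholder; the real loop backs off up to 128us */
		} while (1);

		if (spins > vdc_max_eagain_spins) {
			vdc_max_eagain_spins = spins;
			pr_info_ratelimited("sunvdc: vio_ldc_send() spun %u times\n", spins);
		}
		return err;
	}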
> > > Not that it's _really_ that important as this is a pretty niche driver,
> > > but still pretty ugly... Does it work reliably with a limit of 100
> > > spins? If things get truly stuck, spinning forever in that loop is not
> > > going to help. In any case, your patch should have
> >
> > Isn't it possible that the call to vio_ldc_send() will eventually succeed
> > which is why there is no need to abort in __vdc_tx_trigger()?
>
> Of course. Is it also possible that some conditions will lead it to
> never succeed?
I would assume that this would require the virtual disk server to stop responding,
which should never happen since it's a virtualized environment.
If hardware issues were to cause vio_ldc_send() to never succeed, these would
have to be handled by the virtualization environment, I guess.
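
For context, the loop in __vdc_tx_trigger() looks roughly like the sketch below
(paraphrased from memory, not verbatim driver code, so the constant name and
value should be double-checked; the real function builds the header from the TX
ring state itself). With my revert, the exponential udelay() backoff stays but
the bail-out goes away; with the reverted commit in place, the loop gives up
after a handful of retries and the -EAGAIN is then treated as a fatal error
further up:

	/* Paraphrased sketch of __vdc_tx_trigger(), not a verbatim copy. */
	static int __vdc_tx_trigger_sketch(struct vdc_port *port,
					   struct vio_dring_data *hdr)
	{
		int err, delay = 1;
		int retries = 0;	/* added by the commit this patch reverts */

		do {
			err = vio_ldc_send(&port->vio, hdr, sizeof(*hdr));
			if (err > 0)
				break;		/* sent successfully */
			udelay(delay);		/* back off: 1us, 2us, ..., capped at 128us */
			if ((delay <<= 1) > 128)
				delay = 128;
			/*
			 * The reverted commit bails out here after a fixed
			 * number of retries (VDC_MAX_RETRIES, if I remember
			 * correctly), turning a transient -EAGAIN into a
			 * failed request:
			 */
			if (retries++ > 10)
				break;
		} while (err == -EAGAIN);

		return err;
	}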
> The nicer approach would be to have sunvdc punt retries back up to the
> block stack, which would at least avoid a busy spin for the condition.
> Rather than return BLK_STS_IOERR which terminates the request and
> bubbles back the -EIO as per your log. IOW, if we've already spent
> 10.5ms in that busy loop as per the above rough calculation, perhaps
> we'd be better off restarting the queue and hence this operation after a
> small sleeping delay instead. That would seem a lot saner than hammering
> on it.
I generally agree with this remark. I just wonder whether this behavior should
apply to a logical domain as well. I guess if a request doesn't succeed immediately,
that's only an urgent problem if the logical domain actually locks up, isn't it?
I have to admit, though, that I'm absolutely not an expert on the block layer.
> > And unlike the change in adddc32d6fde ("sunvnet: Do not spin in an infinite
> > loop when vio_ldc_send() returns EAGAIN"), we can't just drop data as this
> > driver concerns a block device while the other driver concerns a network
> > device. Dropping network packets is expected, dropping bytes from a block
> > device driver is not.
>
> Right, but we can sanely retry it rather than sit in a tight loop.
> Something like the entirely untested below diff.
>
> diff --git a/drivers/block/sunvdc.c b/drivers/block/sunvdc.c
> index db1fe9772a4d..aa49dffb1b53 100644
> --- a/drivers/block/sunvdc.c
> +++ b/drivers/block/sunvdc.c
> @@ -539,6 +539,7 @@ static blk_status_t vdc_queue_rq(struct blk_mq_hw_ctx *hctx,
> struct vdc_port *port = hctx->queue->queuedata;
> struct vio_dring_state *dr;
> unsigned long flags;
> + int ret;
>
> dr = &port->vio.drings[VIO_DRIVER_TX_RING];
>
> @@ -560,7 +561,13 @@ static blk_status_t vdc_queue_rq(struct blk_mq_hw_ctx *hctx,
> return BLK_STS_DEV_RESOURCE;
> }
>
> - if (__send_request(bd->rq) < 0) {
> + ret = __send_request(bd->rq);
> + if (ret == -EAGAIN) {
> + spin_unlock_irqrestore(&port->vio.lock, flags);
> + /* already spun for 10msec, defer 10msec and retry */
> + blk_mq_delay_kick_requeue_list(hctx->queue, 10);
> + return BLK_STS_DEV_RESOURCE;
> + } else if (ret < 0) {
> spin_unlock_irqrestore(&port->vio.lock, flags);
> return BLK_STS_IOERR;
> }
We could add this particular change on top of mine once it has been extensively tested.
For now, I would propose picking up my patch to revert the previous change. I can then
take your proposed change, deploy it for extensive testing, and see whether it has any
side effects.
Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913