[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6ed7112cb4f338ba02d9ab67c14e7a3af4afbca0.camel@physik.fu-berlin.de>
Date: Mon, 06 Oct 2025 16:12:11 +0200
From: John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>
To: Jens Axboe <axboe@...nel.dk>, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org
Cc: Andreas Larsson <andreas@...sler.com>, Anthony Yznaga
<anthony.yznaga@...cle.com>, Sam James <sam@...too.org>, "David S . Miller"
<davem@...emloft.net>, Michael Karcher
<kernel@...rcher.dialup.fu-berlin.de>, sparclinux@...r.kernel.org
Subject: Re: [PATCH v2] Revert "sunvdc: Do not spin in an infinite loop when
vio_ldc_send() returns EAGAIN"
On Mon, 2025-10-06 at 08:03 -0600, Jens Axboe wrote:
> > To be fair, the sunvdc driver is fairly old and I'm not sure whether these
> > tools already existed back then. FWIW, Oracle engineers did actually work
> > on the Linux for SPARC code for a while and it might be possible that their
> > UEK kernel tree [1] contains some improvements in this regard.
>
> Requeueing and retry has always been available on the block side. It's
> not an uncommon thing for a driver to need, in case of resource
> starvation. And sometimes those resources can be unrelated to the IO, eg
> iommu shortages. Or this busy condition.
I see. Makes sense.
> But that's fine, it's not uncommon for drivers to miss things like that,
> and then we fix them up when noticed. It was probably written by someone
> not super familiar with the IO stack.
FWIW, Oracle engineers actually made some significant changes to the driver
that they never upstreamed, see:
https://github.com/oracle/linux-uek/commits/uek4/qu7/drivers/block/sunvdc.c
In particular, they added support for out-of-order execution:
https://github.com/oracle/linux-uek/commit/68f7c9c17fb80d29cbc1e5110f6c021f8da8d610
and they also changed the driver to use the BIO-based interface for VDC I/O requests:
https://github.com/oracle/linux-uek/commit/4b725eb64cc10a4877f2af75ff3a776586f68eb7
Could you review these two changes and tell me whether these would actually implement
the changes you would want to see? I think the BIO layer is a generic interface of
the block layer in the kernel, isn't it?
> > > > > > And unlike the change in adddc32d6fde ("sunvnet: Do not spin in an infinite
> > > > > > loop when vio_ldc_send() returns EAGAIN"), we can't just drop data as this
> > > > > > driver concerns a block device while the other driver concerns a network
> > > > > > device. Dropping network packages is expected, dropping bytes from a block
> > > > > > device driver is not.
> > > > >
> > > > > Right, but we can sanely retry it rather than sit in a tight loop.
> > > > > Something like the entirely untested below diff.
> > > > >
> > > > > diff --git a/drivers/block/sunvdc.c b/drivers/block/sunvdc.c
> > > > > index db1fe9772a4d..aa49dffb1b53 100644
> > > > > --- a/drivers/block/sunvdc.c
> > > > > +++ b/drivers/block/sunvdc.c
> > > > > @@ -539,6 +539,7 @@ static blk_status_t vdc_queue_rq(struct blk_mq_hw_ctx *hctx,
> > > > > struct vdc_port *port = hctx->queue->queuedata;
> > > > > struct vio_dring_state *dr;
> > > > > unsigned long flags;
> > > > > + int ret;
> > > > >
> > > > > dr = &port->vio.drings[VIO_DRIVER_TX_RING];
> > > > >
> > > > > @@ -560,7 +561,13 @@ static blk_status_t vdc_queue_rq(struct blk_mq_hw_ctx *hctx,
> > > > > return BLK_STS_DEV_RESOURCE;
> > > > > }
> > > > >
> > > > > - if (__send_request(bd->rq) < 0) {
> > > > > + ret = __send_request(bd->rq);
> > > > > + if (ret == -EAGAIN) {
> > > > > + spin_unlock_irqrestore(&port->vio.lock, flags);
> > > > > + /* already spun for 10msec, defer 10msec and retry */
> > > > > + blk_mq_delay_kick_requeue_list(hctx->queue, 10);
> > > > > + return BLK_STS_DEV_RESOURCE;
> > > > > + } else if (ret < 0) {
> > > > > spin_unlock_irqrestore(&port->vio.lock, flags);
> > > > > return BLK_STS_IOERR;
> > > > > }
> > > >
> > > > We could add this particular change on top of mine after we have
> > > > extensively tested it.
> > >
> > > It doesn't really make sense on top of yours, as that removes the
> > > limited looping that sunvdc would do...
> >
> > Why not? From what I understood, you're moving the limited looping to a
> > different part of the driver where it can delegate the request back up
> > the stack meaning that the current place to implement the limitation is
> > not correct anyway, is it?
>
> Because your change never gives up, hence it'd never trigger the softer
> retry condition. It'll just keep busy looping.
Ah, that makes sense.
> > > > For now, I would propose to pick up my patch to revert the previous
> > > > change. I can then pick up your proposed change and deploy it for
> > > > extensive testing and see if it has any side effects.
> > >
> > > Why not just test this one and see if it works? As far as I can tell,
> > > it's been 6.5 years since this change went in, I can't imagine there's a
> > > huge sense of urgency to fix it up that can't wait for testing a more
> > > proper patch rather than a work-around?
> >
> > Well, the thing is that a lot of people have been running older kernels
> > on SPARC because of issues like these and I have started working on trying
> > to track down all of these issues now [2] for users to be able to run a
> > current kernel. So, the 6.5 years existence of this change shouldn't
> > be an argument I think.
>
> While I agree that the bug is unfortunate, it's also a chance to
> properly fix it rather than just go back to busy looping. How difficult
> is it to test an iteration of the patch? It'd be annoying to queue a
> bandaid only to have to revert that again for a real fix. If this was a
> regression from the last release or two then that'd be a different
> story, but the fact that this has persisted for 6.5 years and is only
> bubbling back up to mainstream now would seem to indicate that we should
> spend a bit of extra time to just get it right the first time.
We could do that for sure. But I would like to hear your opinion on the changes
contributed by Oracle engineers first. Maybe their improvements are much better
so that it might make sense to try to upstream them.
Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Powered by blists - more mailing lists