Message-ID: <a80a1c5f-da21-4437-b956-a9f659c355a4@kernel.dk>
Date: Mon, 6 Oct 2025 07:19:40 -0600
From: Jens Axboe <axboe@...nel.dk>
To: John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>,
 linux-block@...r.kernel.org, linux-kernel@...r.kernel.org
Cc: Andreas Larsson <andreas@...sler.com>,
 Anthony Yznaga <anthony.yznaga@...cle.com>, Sam James <sam@...too.org>,
 "David S . Miller" <davem@...emloft.net>,
 Michael Karcher <kernel@...rcher.dialup.fu-berlin.de>,
 sparclinux@...r.kernel.org
Subject: Re: [PATCH v2] Revert "sunvdc: Do not spin in an infinite loop when
 vio_ldc_send() returns EAGAIN"

On 10/6/25 7:00 AM, John Paul Adrian Glaubitz wrote:
> Hi Jens,
> 
> On Mon, 2025-10-06 at 06:48 -0600, Jens Axboe wrote:
>> When you apply this patch and things work, how many times does it
>> generally spin where it would've failed before? It's a bit unnerving to
>> have a never-ending spin loop for this, with udelay spinning in between
>> as well. Looking at vio_ldc_send() as well, that spins for potentially
>> 1000 loops of 1usec each, which would be 1ms. With the current limit of
>> 10 retries, the driver would end up doing udelays of:
>>
>> 1
>> 2
>> 4
>> 8
>> 16
>> 32
>> 64
>> 128
>> 128
>> 128
>>
>> which is 511 usec on top, for 10.5ms in total spinning time before
>> giving up. That is kind of mind boggling, that's an eternity.
> 
> The problem is that giving up can lead to filesystem corruption, which is
> a problem that was never observed before the change, as far as I know.

Yes, I get that.

> We have deployed a kernel with the change reverted on several LDOMs that
> are seeing heavy use, such as cfarm202.cfarm.net, and we have not seen any
> system lock-ups or similar.

I believe you. I'm just wondering how long you generally need to spin,
as per the question above: how many times does it generally spin where
it would've failed before?
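
For context, here is a minimal userspace sketch of the worst-case arithmetic
above (illustration only; the constants mirror the numbers quoted in this
mail, not the exact macros in sunvdc.c or ldc.c): each vio_ldc_send() attempt
may itself spin up to 1000 x 1 usec, and the caller backs off with a doubling
udelay capped at 128 usec for up to 10 retries.

	/* Userspace sketch of the worst-case busy time, not driver code. */
	#include <stdio.h>

	int main(void)
	{
		const int max_retries = 10;	/* retry limit quoted above (assumed) */
		const int ldc_spin_us = 1000;	/* worst case inside vio_ldc_send() */
		int delay_us = 1;		/* doubling backoff, capped at 128 usec */
		long total_us = 0;

		for (int attempt = 0; attempt < max_retries; attempt++) {
			total_us += ldc_spin_us;	/* vio_ldc_send() spins and fails */
			total_us += delay_us;		/* udelay() between attempts */
			if ((delay_us <<= 1) > 128)
				delay_us = 128;
		}
		printf("worst-case busy spin: %ld usec (~%.1f ms)\n",
		       total_us, total_us / 1000.0);
		return 0;
	}

This prints 10511 usec, i.e. the ~10.5ms figure above.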

>> Not that it's _really_ that important as this is a pretty niche driver,
>> but still pretty ugly... Does it work reliably with a limit of 100
>> spins? If things get truly stuck, spinning forever in that loop is not
>> going to help. In any case, your patch should have
> 
> Isn't it possible that the call to vio_ldc_send() will eventually succeed,
> which is why there is no need to abort in __vdc_tx_trigger()?

Of course. Is it also possible that some conditions will lead it to
never succeed?

The nicer approach would be to have sunvdc punt retries back up to the
block stack, which would at least avoid a busy spin for the condition,
rather than returning BLK_STS_IOERR, which terminates the request and
bubbles the -EIO back up as per your log. IOW, if we've already spent
10.5ms in that busy loop as per the rough calculation above, perhaps
we'd be better off restarting the queue, and hence this operation, after
a small sleeping delay instead. That would seem a lot saner than
hammering on it.

> And unlike the change in adddc32d6fde ("sunvnet: Do not spin in an infinite
> loop when vio_ldc_send() returns EAGAIN"), we can't just drop data, as this
> driver concerns a block device while the other driver concerns a network
> device. Dropping network packets is expected; dropping bytes from a block
> device driver is not.

Right, but we can sanely retry it rather than sit in a tight loop.
Something like the entirely untested below diff.

diff --git a/drivers/block/sunvdc.c b/drivers/block/sunvdc.c
index db1fe9772a4d..aa49dffb1b53 100644
--- a/drivers/block/sunvdc.c
+++ b/drivers/block/sunvdc.c
@@ -539,6 +539,7 @@ static blk_status_t vdc_queue_rq(struct blk_mq_hw_ctx *hctx,
 	struct vdc_port *port = hctx->queue->queuedata;
 	struct vio_dring_state *dr;
 	unsigned long flags;
+	int ret;
 
 	dr = &port->vio.drings[VIO_DRIVER_TX_RING];
 
@@ -560,7 +561,13 @@ static blk_status_t vdc_queue_rq(struct blk_mq_hw_ctx *hctx,
 		return BLK_STS_DEV_RESOURCE;
 	}
 
-	if (__send_request(bd->rq) < 0) {
+	ret = __send_request(bd->rq);
+	if (ret == -EAGAIN) {
+		spin_unlock_irqrestore(&port->vio.lock, flags);
+		/* already spun for 10msec, defer 10msec and retry */
+		blk_mq_delay_kick_requeue_list(hctx->queue, 10);
+		return BLK_STS_DEV_RESOURCE;
+	} else if (ret < 0) {
 		spin_unlock_irqrestore(&port->vio.lock, flags);
 		return BLK_STS_IOERR;
 	}

-- 
Jens Axboe
