Message-ID: <4e45e3182c4718cafad1166e9ef8dcca1c301651.camel@physik.fu-berlin.de>
Date: Mon, 06 Oct 2025 15:00:10 +0200
From: John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>
To: Jens Axboe <axboe@...nel.dk>, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org
Cc: Andreas Larsson <andreas@...sler.com>, Anthony Yznaga
<anthony.yznaga@...cle.com>, Sam James <sam@...too.org>, "David S . Miller"
<davem@...emloft.net>, Michael Karcher
<kernel@...rcher.dialup.fu-berlin.de>, sparclinux@...r.kernel.org
Subject: Re: [PATCH v2] Revert "sunvdc: Do not spin in an infinite loop when
vio_ldc_send() returns EAGAIN"
Hi Jens,
On Mon, 2025-10-06 at 06:48 -0600, Jens Axboe wrote:
> When you apply this patch and things work, how many times does it
> generally spin where it would've failed before? It's a bit unnerving to
> have a never ending spin loop for this, with udelay spinning in between
> as well. Looking at vio_ldc_send() as well, that spins for potentially
> 1000 loops of 1usec each, which would be 1ms. With the current limit of
> 10 retries, the driver would end up doing udelays of:
>
> 1
> 2
> 4
> 8
> 16
> 32
> 64
> 128
> 128
> 128
>
> which is 511 usec on top, for 10.5ms in total spinning time before
> giving up. That is kind of mind boggling, that's an eternity.
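As a sanity check on the figures quoted above, here is a minimal userspace sketch of the arithmetic. It is not driver code: the macro and function names are made up for illustration, and the values (10 retries, a delay that doubles up to 128 usec, up to 1000 loops of 1 usec per send) are taken from the quoted text only.

```c
/* Illustrative values taken from the quoted message, not from the driver. */
#define MAX_BACKOFF_USEC  128u   /* cap on the doubling delay between retries */
#define SEND_SPIN_USEC    1000u  /* up to 1000 loops of 1 usec per send attempt */

/* Sum of the doubling, capped delays over the given number of retries (usec). */
static unsigned int backoff_total_usec(unsigned int retries)
{
	unsigned int delay = 1, total = 0;

	for (unsigned int i = 0; i < retries; i++) {
		total += delay;
		delay <<= 1;
		if (delay > MAX_BACKOFF_USEC)
			delay = MAX_BACKOFF_USEC;
	}
	return total;
}

/* Worst case: every send attempt also spins for its full budget (usec). */
static unsigned int worst_case_total_usec(unsigned int retries)
{
	return retries * SEND_SPIN_USEC + backoff_total_usec(retries);
}
```

With 10 retries, backoff_total_usec(10) gives 511 and worst_case_total_usec(10) gives 10511, which matches the roughly 10.5ms mentioned above.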
The problem is that giving up can lead to filesystem corruption, which
is a problem that was never observed before the change, as far as I know.
We have deployed a kernel with the change reverted on several LDOMs that
are seeing heavy use, such as cfarm202.cfarm.net, and we have not seen any
system lock-ups or similar.
> Not that it's _really_ that important as this is a pretty niche driver,
> but still pretty ugly... Does it work reliably with a limit of 100
> spins? If things get truly stuck, spinning forever in that loop is not
> going to help. In any case, your patch should have
Isn't it possible that the call to vio_ldc_send() will eventually succeed,
which is why there is no need to abort in __vdc_tx_trigger()?
And unlike the change in adddc32d6fde ("sunvnet: Do not spin in an infinite
loop when vio_ldc_send() returns EAGAIN"), we can't just drop data as this
driver concerns a block device while the other driver concerns a network
device. Dropping network packets is expected, dropping bytes from a block
device driver is not.
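To illustrate the difference between the two retry policies, here is a rough userspace sketch. Everything in it is hypothetical: fake_channel and fake_send() merely stand in for a channel that returns -EAGAIN for a while, the delays between attempts are left out, and none of this is the actual sunvdc code.

```c
#include <errno.h>

/* Hypothetical stand-in for a channel that stays busy for a while. */
struct fake_channel {
	int busy_attempts;	/* number of sends that will still fail */
};

/* Returns -EAGAIN while the channel is busy, 0 once it accepts data. */
static int fake_send(struct fake_channel *ch)
{
	if (ch->busy_attempts > 0) {
		ch->busy_attempts--;
		return -EAGAIN;
	}
	return 0;
}

/* Bounded policy: give up after max_attempts, the request is lost. */
static int send_bounded(struct fake_channel *ch, int max_attempts)
{
	int err = -EAGAIN;

	for (int i = 0; i < max_attempts && err == -EAGAIN; i++)
		err = fake_send(ch);
	return err;
}

/* Unbounded policy: keep retrying until the channel accepts the data. */
static int send_unbounded(struct fake_channel *ch)
{
	int err;

	do {
		err = fake_send(ch);
	} while (err == -EAGAIN);
	return err;
}
```

For a channel that is busy for 25 attempts, send_bounded() with a limit of 10 returns -EAGAIN and the caller has to fail the request, while send_unbounded() eventually returns 0. The obvious trade-off is that the unbounded variant never returns if the channel never recovers.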
> Cc: stable@...r.kernel.org
> Fixes: a11f6ca9aef9 ("sunvdc: Do not spin in an infinite loop when vio_ldc_send() returns EAGAIN")
>
> tags added.
Will do.
Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913