Message-ID: <CAHk-=wi8+Ecn9VJH8WYPb7BR4ECYRZGKiiWdhcCjTKZbNkbTkQ@mail.gmail.com>
Date: Fri, 11 Jul 2025 14:46:01 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Jakub Kicinski <kuba@...nel.org>, Frederic Weisbecker <frederic@...nel.org>,
Valentin Schneider <vschneid@...hat.com>, Nam Cao <namcao@...utronix.de>,
Christian Brauner <brauner@...nel.org>
Cc: Thomas Zimmermann <tzimmermann@...e.de>, Simona Vetter <simona@...ll.ch>,
Dave Airlie <airlied@...il.com>, davem@...emloft.net, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, pabeni@...hat.com,
dri-devel <dri-devel@...ts.freedesktop.org>
Subject: Re: [GIT PULL] Networking for v6.16-rc6 (follow up)

On Fri, 11 Jul 2025 at 13:35, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> Indeed. It turns out that the problem actually started somewhere
> between rc4 and rc5, and all my previous bisections never even came
> close, because kernels usually work well enough that I never realized
> that it went back that far.

It looks like it's actually due to commit 8c44dac8add7 ("eventpoll:
Fix priority inversion problem"), and it's been going on for a while
now and the behavior was just too subtle for me to have noticed.

Does not look hardware-specific, except in the sense that it probably
needs several CPUs along with the odd startup pattern to trigger
this.

It's possible that the bisection ended up wrong, and when it appeared
to start going off in the weeds I was like "this is broken again", but
before I marked a kernel "good" I tested it several times, and then in
the end that "eventpoll: Fix priority inversion problem" kind of makes
sense after all.

I would never have guessed at that commit otherwise (well, considering
that I blamed both the drm code and the netlink code first, that goes
without saying), but at the same time, that *is* the kind of change
that would certainly make user space get hung up with odd timeouts.

I've only tested the previous commit being good twice now, but I'll go
back to the head of tree and try a revert to verify that this is
really it. Because maybe it's the now Nth time I found something that
hides the problem, not the real issue.

Fingers crossed that this very timing-dependent odd problem really did
bisect right finally, after many false starts.

             Linus