[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250516235509.GU2023217@ZenIV>
Date: Sat, 17 May 2025 00:55:09 +0100
From: Al Viro <viro@...iv.linux.org.uk>
To: Ozgur Kara <ozgur@...sey.org>
Cc: Christian Brauner <brauner@...nel.org>, Jens Axboe <axboe@...nel.dk>,
Jeff Layton <jlayton@...nel.org>, Song Liu <song@...nel.org>,
Joel Granados <joel.granados@...nel.org>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] kernel: fix acct.c first test openai codex
On Fri, May 16, 2025 at 11:15:49PM +0000, Ozgur Kara wrote:
> From: Ozgur Karatas <ozgur@...sey.org>
>
> Hello,
>
> i want to try out the openai codex and it seemed like a logical
> process so with rcu_read_lock() a protection is started but we dont
> call rcu_read_unlock(); it has to be called to end rcu read lock.
>
> I guess, this means rcu stays open forever and data structures are not
> cleaned which causes performance degradation.
>
> Regards
>
> Ozgur
>
> Reported-by: Ozgur Karatas <ozgur@...sey.org>
... showing that one needs *something* that would be able to reason somewhere
in the loop.
Your "I guess, this means" is an excellent example - you've got something
(openai codex, visual examination, whatever) pointing you to unusual pattern -
rcu_read_lock() + call of function with no rcu_read_unlock() in sight.
Next step: hypothesis that unlock might be missing.
Next step: blindly send a patch adding an unlock on the strength of that
hypothesis?
Sorry, no. It's not just that hypothesis is wrong - it is, but that's not
the real issue. It's that instead of
* asking how the hell could that work
* trying to trigger that and checking whether the hypothesis is
corrent
* looking around for similar places to see whether it looks intentional
(in which case it still might be a bug, but if it's inconsistent, the odds of
a bug are going up)
* looking through the history (e.g. with git blame) to see if there
might be any explanations in commit message
* looking through the function in question (git grep would immediately
point to fs/fs_pin.c, and the first glance would show rcu_read_unlock() on
all paths through it, with the total balance of lock/unlock being -1 on
each)
you've chosen to post a patch "fixing" the problem as a way to see whether
it's actually there.
Folks, programming tools can be very useful in directing to places that might
be worth checking. The codebase is huge, so any heuristics useful in triage
are Good Things(tm); if instead of "bug found because somebody looked at
random line picked out of millions such and it turned out to be broken" you
have "bug found because such-and-such heuristic had picked that line as
odd-looking and it turned out to be broken", you win even if it lists
20 places and only one of them turns out to be actually broken.
But you really can't treat that as anything beyond a hint to bump that
particular place in the list of things to look through. It might tell you
what looks unusual, but that's it. Same as with any warning from any
source.
Asking around is fine; so's experimenting, so's reasoning, etc.
But you can't go from "tool gives a warning, I guess it translates into plain
language this way" to "modify that place so that the tool doesn't point to
that place anymore". You really need to understand what's going on first.
Powered by blists - more mailing lists