Message-ID: <ZfSWqXKWnalm9wE5@redhat.com>
Date: Fri, 15 Mar 2024 13:42:49 -0500
From: David Teigland <teigland@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org, gfs2@...ts.linux.dev
Subject: Re: [GIT PULL] dlm fixes for 6.9
On Fri, Mar 15, 2024 at 10:10:00AM -0700, Linus Torvalds wrote:
> Now, if the issue is that you want to clean up something that is never
> getting cleaned up by anybody else, and this is a fatal error, and
> you're just trying to fix things up (badly), and you know that this is
> all racy but the code is trying to kill a dead data structure, then
> you should
>
> (a) need a damn big comment (bigger than the comment is already)
>
> (b) should *NOT* pretend to do some stupid "atomic decrement and test" loop

Yes, that looks pretty messed up; the counter should not be an atomic_t.
I was a bit wary of making it atomic when it wasn't necessary, but didn't
push back enough on that change:

commit 75a7d60134ce84209f2c61ec4619ee543aa8f466
Author: Alexander Aring <aahringo@...hat.com>
Date:   Mon May 29 17:44:38 2023 -0400

    Currently the lkb_wait_count is locked by the rsb lock and it should be
    fine to handle lkb_wait_count as non atomic_t value. However for the
    overall process of reducing locking this patch converts it to an
    atomic_t value.

... and the result is that the primitives get abused and the code becomes
crazy.
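
To illustrate the shape of the problem (a sketch of the kind of pattern
being criticized, not the actual dlm code): spinning on
atomic_dec_and_test() to drive a counter down to zero either races with
other users of the counter or, if there are none left, is just an
obfuscated way of writing zero:

	/* sketch of the anti-pattern: loop dec-and-test until the
	 * counter hits zero; the atomic primitive provides no useful
	 * ordering or exclusion guarantee here
	 */
	while (!atomic_dec_and_test(&lkb->lkb_wait_count))
		;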
My initial plan is to go back to a non-atomic counter there. It is indeed a
recovery situation that involves a forced reset of state, but I'll need to go
back and study that case further before I can say what it should finally look
like. Whatever that looks like, it'll have a very good comment :) Dropping
the pull is fine, there's a chance I may resend with the other patch and a new
fix, we'll see.
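
The eventual shape would be roughly the following (a sketch only; the
field stays lkb_wait_count, but the res_lock name and the WARN_ON are
assumptions for illustration), with the counter serialized by the rsb
lock that the commit message above says already covers it:

	/* normal path: plain int, protected by the rsb lock */
	spin_lock(&r->res_lock);
	lkb->lkb_wait_count--;
	WARN_ON(lkb->lkb_wait_count < 0);
	spin_unlock(&r->res_lock);

	/* recovery path: the outstanding replies will never arrive, so
	 * force-reset the wait state; this is where the very big
	 * comment belongs
	 */
	spin_lock(&r->res_lock);
	lkb->lkb_wait_count = 0;
	spin_unlock(&r->res_lock);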
Thanks,
Dave