[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <875yfhs4km.ffs@tglx>
Date: Mon, 14 Nov 2022 20:45:29 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Steven Rostedt <rostedt@...dmis.org>, linux-kernel@...r.kernel.org,
Stephen Boyd <sboyd@...nel.org>,
Guenter Roeck <linux@...ck-us.net>,
Anna-Maria Gleixner <anna-maria@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Julia Lawall <Julia.Lawall@...ia.fr>,
Eric Dumazet <edumazet@...gle.com>,
Marcel Holtmann <marcel@...tmann.org>
Subject: Re: [PATCH v6 4/6] timers: Add timer_shutdown_sync() to be called
before freeing timers
Linus!
On Mon, Nov 14 2022 at 09:16, Linus Torvalds wrote:
> On Mon, Nov 14, 2022 at 7:42 AM Thomas Gleixner <tglx@...utronix.de> wrote:
>>
>> So if we want to make this solid and make the life of driver writers
>> easier, then we cannot issue a warning as I said in the original thread
>> already.
>
> So I think that there are two issues at play:
>
> (a) do we want to *find* problem places after the conversion
>
> (b) do we want to make driver writing easier
>
> and (a) argues for warning on timer re-arming, but (b) just says
> "don't warn, just ignore it, the driver is being shut down".
>
> I'm personally ok with either of those approaches, and it's literally
> just a question of mindset.
Correct. I'm very much for (b). Look at the bluetooth example. The "fix"
was obviously right and then introduced a new subtle bug which will only
happen every 7th half-moon.
But if you turn it around then:
timer_shutdown();
destroy_workqueue();
will trigger the warning in mod_timer() every 6.5th half-moon.
And then you have to go and sprinkle 'if (mydev->inshutdown)'
conditionals all over the place with a high probability that they will
not cut it completely. Or you end up with the reverse order of shutdown
calls which is wrong too.
So I rather have the very simple semantics that attempts to arm a
shutdown timer are silently ignored. As I said to Steven in the other
mail, I'm sure that the vast majority of teardown sites will not depend
on the timer(s) being functional. The two other esoteric cases will have
to be treated special.
>> The semantics of timer_shutdown_sync() have to be:
>>
>> After return:
>> - the timer is not queued
>> - the timer callbacks is not running
>> - the timer cannot be enqueued again
>
> Yes, but that last case is literally a "do we expect the *driver* to
> not enqueue it and warn if it tries, or do we just silently enforce
> it"?
>
> I agree with all three points. I'm just not sure about who we expect
> to do the "don't enqueue again".
>
> There's a big argument for "make it easy for driver writers" in just
> saying "make mod_timer() silently just ignore a re-arming". Making
> things easier for driver writers is a good thing.
>
> But maybe it's a "you shouldn't have done that in the first place"
> thing, and merits a warning?
See above.
> I have no strong opinions on that.
>
> What I *do* still want to happen is for subsystems to be able to start
> doing the conversion one by one. Which is why I'd still prefer to have
> the new names available just so that we don't have to have one
> 50-patch series, but we can have subsystems apply the obvious cases.
>
> And I'd still like the mindless "let's get the non-semantic changes
> out of the way" as one single patch, to get rid of mindless noise.
>
> And honestly, for that to happen I'd be perfectly happy with something like
>
> #define timer_shutdown(t) del_timer(t)
> #define timer_shutdown_sync(t) del_timer_sync(t)
>
> (obviously with the patches that first remove the existing
> 'timer_shutdown()' uses first). That wouldn't introduce the *new*
> semantics, but it would at least allow the different subsystems to do
> the obvious cases, and let the networking people wonder about the much
> less obvious ones.
As we are at -rc5 now and the core code is not yet ready, I suggest that
we get the core changes done for the next merge window and have some
obvious fixes which demonstrate the usage, e.g. the borked BT fix
replacement, and then subsystem people can queue their stuff for 6.3 or
send in the obvious bugfixes during the 6.2-rc series.
I'm not a fan of having
#define timer_shutdown_sync(t) del_timer_sync(t)
as a gap measure right now. That's just going to make things worse
because the semantical difference between the both functions is
significant and I don't want people to run around and replace their
'if (mydev->in_shutdown)' conditionals prematurely or do any other fancy
"fixes" which cause more problems than they solve.
This problem exists for ever so there is no need to rush this just
because.
If we all agree that the semantics of timer_shutdown_sync() are:
After return:
- the timer is not queued
- the timer callback is not running
- the timer cannot be enqueued again. Any attempts to do
so are silently ignored (needs some more explanation...)
and the semantics of timer_shutdown() are:
After return:
- the timer is not queued
- the timer cannot be enqueued again. Any attempts to do
so are silently ignored (needs some more explanation...)
- the timer callback might be still running
then we can definitly get this in shape for 6.2.
Thanks,
tglx
Powered by blists - more mailing lists