[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEXW_YTYzoAfOcgYA-N7VJYJoZTVX2=GtDrZc19RXoAXidbvyA@mail.gmail.com>
Date: Tue, 10 Oct 2023 22:44:16 -0400
From: Joel Fernandes <joel@...lfernandes.org>
To: paulmck@...nel.org
Cc: "Liam R. Howlett" <Liam.Howlett@...cle.com>,
Naresh Kamboju <naresh.kamboju@...aro.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
stable@...r.kernel.org, patches@...ts.linux.dev,
linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
akpm@...ux-foundation.org, linux@...ck-us.net, shuah@...nel.org,
patches@...nelci.org, lkft-triage@...ts.linaro.org, pavel@...x.de,
jonathanh@...dia.com, f.fainelli@...il.com,
sudipm.mukherjee@...il.com, srw@...dewatkins.net, rwarsow@....de,
conor@...nel.org, Chengming Zhou <zhouchengming@...edance.com>,
Peter Zijlstra <peterz@...radead.org>,
Ovidiu Panait <ovidiu.panait@...driver.com>,
Ingo Molnar <mingo@...nel.org>, rcu <rcu@...r.kernel.org>
Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review
On Sun, Oct 8, 2023 at 9:20 PM Paul E. McKenney <paulmck@...nel.org> wrote:
[...]
> > > > How frequent is this function called? We could check something for
> > > > early boot... or track down where the cpu is put online and restore idle
> > > > before that happens?
> > >
> > > Once per RCU Tasks Trace grace period per reader seen to be blocking
> > > that grace period. Its performance is as issue, but not to anywhere
> > > near the same extent as (say) rcu_read_lock_trace().
> > >
> > > > > > It's also worth noting that the bug this fixes wasn't exposed until the
> > > > > > maple tree (added in v6.1) was used for the IRQ descriptors (added in
> > > > > > v6.5).
> > > > >
> > > > > Lots of latent bugs, to be sure, even with rcutorture. :-/
> > > >
> > > > The Right Thing is to fix the bug all the way back to the introduction,
> > > > but what fallout makes the backport less desirable than living with the
> > > > unexposed bug?
> > >
> > > You are quite right that it is possible for the risk of a backport to
> > > exceed the risk of the original bug.
> > >
> > > I defer to Joel (CCed) on how best to resolve this in -stable.
> >
> > Maybe I am missing something but this issue should also be happening
> > in mainline right?
> >
> > Even though mainline has 897ba84dc5aa ("rcu-tasks: Handle idle tasks
> > for recently offlined CPUs") , the warning should still be happening
> > due to Liam's "kernel/sched: Modify initial boot task idle setup"
> > because the warning is just rearranged a bit but essentially the same.
> >
> > IMHO, the right thing to do then is to drop Liam's patch from 5.15 and
> > fix it in mainline (using the ideas described in this thread), then
> > backport both that new fix and Liam's patch to 5.15.
> >
> > Or is there a reason this warning does not show up on the mainline?
> >
> > My impression is that dropping Liam's patch for the stable release and
> > revisiting it later is a better approach since tiny RCU is used way
> > less in the wild than tree/tasks RCU. Thoughts?
>
> I think that this one is strange enough that we need to write down the
> situation in detail, make sure we have all the corner cases covered in
> both mainline and -stable, and decide what to do from there.
>
> Yes, I know, this email thread contains much of this information, but
> a little organizing of it would be good.
>
> Would you like to put that together, or should I? If me, I will get
> a draft out by the end of this coming Tuesday, Pacific Time.
I apologize, I haven't been able to do any real work as I was OOO for
the most part due to dental issues. I am about 25% back now. I will
review your other email writeup and thanks for putting it together!
- Joel
Powered by blists - more mailing lists