Message-ID: <ZgstlCZn0l9wSv7H@pavilion.home>
Date: Mon, 1 Apr 2024 23:56:36 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...nel.org>,
Anna-Maria Behnsen <anna-maria@...utronix.de>
Subject: Re: [PATCH 2/2] timers: Fix removed self-IPI on global timer's
enqueue in nohz_full
On Mon, Apr 01, 2024 at 02:26:25PM -0700, Paul E. McKenney wrote:
> > > _ The RCU CPU Stall report. I strongly suspect the cause is the hrtimer
> > > enqueue to an offline CPU. Let's solve that and we'll see if it still
> > > triggers.
> >
> > Sounds like a plan!
>
> Just checking in on this one. I did reproduce your RCU CPU stall report
> and also saw a TREE03 OOM that might (or might not) be related. Please
> let me know if hammering TREE03 harder or adding some debug would help.
> Otherwise, I will assume that you are getting sufficient bug reports
> from your own testing to be getting along with.
Hehe, there are a lot indeed :-)
So there has been some discussion about cpuset vs. hotplug, since a problem
there is likely the cause of the hrtimer warning you saw, which in turn might
be the cause of the RCU stalls.
Do you always see the hrtimer warning along with the RCU stalls? If so, this
might help:
https://lore.kernel.org/lkml/20240401145858.2656598-1-longman@redhat.com/T/#m1bed4d298715d1a6b8289ed48e9353993c63c896
Thanks.
>
> Thanx, Paul