[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKfTPtA31Z0N9hd4z_GPvoZwK=KTf4fPbx_jDbK653mdVDLEbw@mail.gmail.com>
Date: Thu, 18 Jan 2024 17:35:28 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Julia Lawall <julia.lawall@...ia.fr>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>, Mel Gorman <mgorman@...e.de>, linux-kernel@...r.kernel.org
Subject: Re: EEVDF and NUMA balancing
Hi Julia,
Sorry for the delay. I have been involved on other perf regression
On Fri, 5 Jan 2024 at 18:27, Julia Lawall <julia.lawall@...ia.fr> wrote:
>
>
>
> On Fri, 5 Jan 2024, Julia Lawall wrote:
>
> >
> >
> > On Fri, 5 Jan 2024, Vincent Guittot wrote:
> >
> > > On Fri, 5 Jan 2024 at 15:51, Julia Lawall <julia.lawall@...ia.fr> wrote:
> > > >
> > > > > Your system is calling the polling mode and not the default
> > > > > cpuidle_idle_call() ? This could explain why I don't see such problem
> > > > > on my system which doesn't have polling
> > > > >
> > > > > Are you forcing the use of polling mode ?
> > > > > If yes, could you check that this problem disappears without forcing
> > > > > polling mode ?
> > > >
> > > > I expanded the code in do_idle to:
> > > >
> > > > if (cpu_idle_force_poll) { c1++;
> > > > tick_nohz_idle_restart_tick();
> > > > cpu_idle_poll();
> > > > } else if (tick_check_broadcast_expired()) { c2++;
> > > > tick_nohz_idle_restart_tick();
> > > > cpu_idle_poll();
> > > > } else { c3++;
> > > > cpuidle_idle_call();
> > > > }
> > > >
> > > > Later, I have:
> > > >
> > > > trace_printk("force poll: %d: c1: %d, c2: %d, c3: %d\n",cpu_idle_force_poll, c1, c2, c3);
> > > > flush_smp_call_function_queue();
> > > > schedule_idle();
> > > >
> > > > force poll, c1 and c2 are always 0, and c3 is always some non-zero value.
> > > > Sometimes small (often 1), and sometimes large (304 or 305).
> > > >
> > > > So I don't think it's calling cpu_idle_poll().
> > >
> > > I agree that something else
> > >
> > > >
> > > > x86 has TIF_POLLING_NRFLAG defined to be a non zero value, which I think
> > > > is sufficient to cause the issue.
> > >
> > > Could you trace trace_sched_wake_idle_without_ipi() ans csd traces as well ?
> > > I don't understand what set need_resched() in your case; having in
> > > mind that I don't see the problem on my Arm systems and IIRC Peter
> > > said that he didn't face the problem on his x86 system.
> >
> > TIF_POLLING_NRFLAG doesn't seem to be defined on Arm.
> >
> > Peter said that he didn't see the problem, but perhaps that was just
> > random. It requires a NUMA move to occur. I make 20 runs to be sure to
> > see the problem at least once. But another machine might behave
> > differently.
> >
> > I believe the call chain is:
> >
> > scheduler_tick
> > trigger_load_balance
> > nohz_balancer_kick
> > kick_ilb
> > smp_call_function_single_async
> > generic_exec_single
> > __smp_call_single_queue
> > send_call_function_single_ipi
> > call_function_single_prep_ipi
> > set_nr_if_polling <====== sets need_resched
> >
> > I'll make a trace to reverify that.
>
> This is what I see at a tick, which corresponds to the call chain shown
> above:
>
> bt.B.x-4184 [046] 466.410605: bputs: scheduler_tick: calling trigger_load_balance
> bt.B.x-4184 [046] 466.410605: bputs: trigger_load_balance: calling nohz_balancer_kick
> bt.B.x-4184 [046] 466.410605: bputs: trigger_load_balance: calling kick_ilb
> bt.B.x-4184 [046] 466.410607: bprint: trigger_load_balance: calling smp_call_function_single_async 22
> bt.B.x-4184 [046] 466.410607: bputs: smp_call_function_single_async: calling generic_exec_single
> bt.B.x-4184 [046] 466.410607: bputs: generic_exec_single: calling __smp_call_single_queue
> bt.B.x-4184 [046] 466.410608: bputs: __smp_call_single_queue: calling send_call_function_single_ipi
> bt.B.x-4184 [046] 466.410608: bputs: __smp_call_single_queue: calling call_function_single_prep_ipi
> bt.B.x-4184 [046] 466.410608: bputs: call_function_single_prep_ipi: calling set_nr_if_polling
> bt.B.x-4184 [046] 466.410609: sched_wake_idle_without_ipi: cpu=22
I don't know if you have made progress on this in the meantime.
Regarding the trace above, do you know if anything happens on CPU22
just before the scheduler tried to kick the ILB on it ?
Have you found why TIF_POLLING_NRFLAG seems to be always set when the
kick_ilb happens ? It should be cleared once entering the idle state.
Could you check your cpuidle driver ?
Vincent
>
> julia
Powered by blists - more mailing lists