Date: Fri, 5 Jan 2024 17:00:28 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Julia Lawall <julia.lawall@...ia.fr>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Dietmar Eggemann <dietmar.eggemann@....com>, Mel Gorman <mgorman@...e.de>, linux-kernel@...r.kernel.org
Subject: Re: EEVDF and NUMA balancing

On Fri, 5 Jan 2024 at 15:51, Julia Lawall <julia.lawall@...ia.fr> wrote:
>
> > Your system is calling the polling mode and not the default
> > cpuidle_idle_call() ? This could explain why I don't see such problem
> > on my system which doesn't have polling
> >
> > Are you forcing the use of polling mode ?
> > If yes, could you check that this problem disappears without forcing
> > polling mode ?
>
> I expanded the code in do_idle to:
>
>                 if (cpu_idle_force_poll) { c1++;
>                         tick_nohz_idle_restart_tick();
>                         cpu_idle_poll();
>                 } else if (tick_check_broadcast_expired()) { c2++;
>                         tick_nohz_idle_restart_tick();
>                         cpu_idle_poll();
>                 } else { c3++;
>                         cpuidle_idle_call();
>                 }
>
> Later, I have:
>
>         trace_printk("force poll: %d: c1: %d, c2: %d, c3: %d\n",cpu_idle_force_poll, c1, c2, c3);
>         flush_smp_call_function_queue();
>         schedule_idle();
>
> force poll, c1 and c2 are always 0, and c3 is always some non-zero value.
> Sometimes small (often 1), and sometimes large (304 or 305).
>
> So I don't think it's calling cpu_idle_poll().

I agree that something else must be going on.

>
> x86 has TIF_POLLING_NRFLAG defined to be a non zero value, which I think
> is sufficient to cause the issue.

Could you trace trace_sched_wake_idle_without_ipi() and the csd traces as well?
I don't understand what sets need_resched() in your case, keeping in
mind that I don't see the problem on my Arm systems and, IIRC, Peter
said that he didn't hit the problem on his x86 system.
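
For reference, the mechanism behind the "wake idle without IPI" tracepoint is
roughly the following. This is only a simplified userspace model using C11
atomics, not the kernel code: the flag bit values and the _model suffix are
made up for illustration, and the real logic (set_nr_if_polling() in
kernel/sched/core.c, as far as I remember) works on the idle task's
thread_info flags. The idea is that a remote CPU sets the need-resched bit
directly when the target advertises that it is polling, and only falls back
to an IPI otherwise.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit positions for this model; the kernel's are per-arch. */
#define TIF_NEED_RESCHED_BIT   (1u << 0)
#define TIF_POLLING_NRFLAG_BIT (1u << 1)

/*
 * Model of the waker side: try to set NEED_RESCHED on an idle CPU's flags
 * without sending an IPI.
 *
 * Returns true if the target is polling its flags (so NEED_RESCHED is now
 * set, either by us or by someone else, and no IPI is required).
 * Returns false if the target is not polling and the caller must send an IPI.
 */
static bool set_nr_if_polling_model(atomic_uint *flags)
{
    unsigned int val = atomic_load(flags);

    for (;;) {
        if (!(val & TIF_POLLING_NRFLAG_BIT))
            return false;           /* not polling: an IPI is needed */
        if (val & TIF_NEED_RESCHED_BIT)
            return true;            /* already set by another waker */
        /* On failure, val is reloaded with the current flags and we retry. */
        if (atomic_compare_exchange_weak(flags, &val,
                                         val | TIF_NEED_RESCHED_BIT))
            return true;            /* we set it; the poller will notice */
    }
}
```

If the flags word of the idle CPU has the polling bit set at wakeup time, this
path sets need_resched() with no IPI, which is the case the tracepoint should
make visible.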

>
> julia
