Message-ID: <Z3jd8ohf_05k1ie3@gpd3>
Date: Sat, 4 Jan 2025 08:06:26 +0100
From: Andrea Righi <arighi@...dia.com>
To: Tejun Heo <tj@...nel.org>
Cc: David Vernet <void@...ifault.com>, Changwoo Min <changwoo@...lia.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched_ext: Refresh idle state when kicking CPUs

On Fri, Jan 03, 2025 at 11:39:51AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Fri, Jan 03, 2025 at 09:55:14AM +0100, Andrea Righi wrote:
> ...
> > > When the put_prev/set_next paths were reorganized, we lost the signal on the
> > > CPU re-entering idle from idle. However, that signal is still available if
> > > we hook into idle_class->pick_task(), right? So, if we move
> > > update_idle(true) call there and make sure that we don't generate an event
> > > on busy->busy transitions, we should be able to restore the previous
> > > behavior?
> > 
> > Which is basically what I did here:
> > https://lore.kernel.org/lkml/20241015111539.12136-1-andrea.righi@linux.dev/
> > 
> > We didn't fully like this, because it'd introduce unbalanced transitions,
> > as update_idle(cpu, true) can be generated multiple times. But it's
> > probably fine, at the end we would just restore the original behavior and
> > it'd allow to solve both the "pick_idle + kick CPU" and the "kick from
> > update_idle()" scenarios.
> > 
> > If we like this approach I can send a new patch updating the comment to
> > better clarify the scenarios that we are trying to solve. What do you
> > think?
> 
> Maybe we can solve the unbalanced transitions by tracking per-cpu idle state
> separately and invoking ops.update_idle() only on actual transitions?

We could call scx_update_idle() from pick_task_idle() to refresh the idle
cpumasks, but skip the call to ops.update_idle() to avoid unbalanced
transitions.
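
A minimal userspace sketch of that idea, combined with the per-CPU state
tracking suggested in the quoted text above. This is not kernel code:
update_idle(), pick_idle_cpu(), idle_mask, ops_idle[] and nr_callbacks are
hypothetical stand-ins for scx_update_idle(), scx_bpf_pick_idle_cpu() and
the built-in idle cpumasks. The mask is refreshed on every call, while the
simulated ops.update_idle() callback fires only on an actual transition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical, kept small for the sketch */

static uint64_t idle_mask;	/* stand-in for the built-in idle cpumask */
static bool ops_idle[NR_CPUS];	/* last state reported to the callback */
static int nr_callbacks;	/* number of simulated ops.update_idle() calls */

/* Stand-in for the BPF scheduler's ops.update_idle() callback. */
static void ops_update_idle(int cpu, bool idle)
{
	(void)cpu;
	(void)idle;
	nr_callbacks++;
}

/* Always refresh the idle mask; notify only on a real state change. */
static void update_idle(int cpu, bool idle)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	if (idle)
		idle_mask |= 1ULL << cpu;
	else
		idle_mask &= ~(1ULL << cpu);

	if (ops_idle[cpu] != idle) {
		ops_idle[cpu] = idle;
		ops_update_idle(cpu, idle);
	}
}

/* Claim the first idle CPU by clearing its bit, or return -1. */
static int pick_idle_cpu(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (idle_mask & (1ULL << cpu)) {
			idle_mask &= ~(1ULL << cpu);
			return cpu;
		}
	}
	return -1;
}
```

With this shape, a CPU that is picked, kicked and then re-enters idle
without running anything gets its mask bit set again, and the callback
sees no extra update_idle(cpu, true).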

However, if a scheduler implements a custom idle tracking policy through
ops.update_idle(), we might face a similar issue: the typical sequence
scx_bpf_pick_idle_cpu() + scx_bpf_kick_cpu() + CPU going back to idle state
without dispatching a task would incorrectly leave the CPU marked as busy
in the scheduler's own tracking.
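
The sequence can be reproduced in a small userspace model, assuming a
scheduler that keeps its own per-CPU idle flags. All names here
(sched_idle[], sched_pick_idle_cpu(), kick_and_reenter_idle()) are
hypothetical and only illustrate the stale state; they are not sched_ext
APIs.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical, kept small for the sketch */

/* The scheduler's own idle tracking, fed only by ops.update_idle(). */
static bool sched_idle[NR_CPUS];

/* Stand-in for a scheduler's ops.update_idle() implementation. */
static void ops_update_idle(int cpu, bool idle)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	sched_idle[cpu] = idle;
}

/* Custom pick: claim the first CPU tracked as idle and mark it busy. */
static int sched_pick_idle_cpu(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (sched_idle[cpu]) {
			sched_idle[cpu] = false;
			return cpu;
		}
	}
	return -1;
}

/*
 * The CPU is kicked while idle, finds nothing to run and re-enters idle.
 * @notify selects whether the core reports this idle->idle re-entry to
 * the scheduler through ops.update_idle(cpu, true).
 */
static void kick_and_reenter_idle(int cpu, bool notify)
{
	if (notify)
		ops_update_idle(cpu, true);
}
```

When the re-entry is not reported (notify == false), sched_idle[cpu] stays
false even though the CPU is idle again.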

The issue is that we call ops.update_idle() when a CPU enters or exits the
idle sched class, whereas it should ideally be called when the CPU
transitions in/out of the idle state. So perhaps a kick from idle should
trigger ops.update_idle(cpu, false)? Still, I'm not sure if that would
provide any benefit... after all, do you see any practical scenarios where
having unbalanced transitions could be a problem?
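
To make "balanced" concrete, here is a userspace sketch of the
kick-from-idle variant, under the assumption that a kick of an idle CPU
would be reported as update_idle(cpu, false). enter_idle(), kick_cpu() and
balance[] are hypothetical names; balance[] just counts true events minus
false events as a scheduler would observe them.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical, kept small for the sketch */

static bool cpu_idle[NR_CPUS];	/* core's view of each CPU */
static int balance[NR_CPUS];	/* +1 per (cpu, true), -1 per (cpu, false) */

/* Stand-in for ops.update_idle(): only records the event balance. */
static void ops_update_idle(int cpu, bool idle)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	balance[cpu] += idle ? 1 : -1;
}

/* The CPU enters (or re-enters) idle; always reported as true. */
static void enter_idle(int cpu)
{
	cpu_idle[cpu] = true;
	ops_update_idle(cpu, true);
}

/*
 * Kick a CPU. If it is idle and @notify_on_kick is set, report the kick
 * as an idle exit so that the next enter_idle() is a balanced transition.
 */
static void kick_cpu(int cpu, bool notify_on_kick)
{
	if (cpu_idle[cpu] && notify_on_kick) {
		cpu_idle[cpu] = false;
		ops_update_idle(cpu, false);
	}
}
```

Running idle, kick, idle on a CPU leaves balance at 1 when the kick is
reported and at 2 when it is not, which is the unbalanced case discussed
above.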

Thanks,
-Andrea
