Message-ID: <ZnnijsMAQYgCnrZF@slm.duckdns.org>
Date: Mon, 24 Jun 2024 11:18:06 -1000
From: Tejun Heo <tj@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: torvalds@...ux-foundation.org, mingo@...hat.com, juri.lelli@...hat.com,
	vincent.guittot@...aro.org, dietmar.eggemann@....com,
	rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
	bristot@...hat.com, vschneid@...hat.com, ast@...nel.org,
	daniel@...earbox.net, andrii@...nel.org, martin.lau@...nel.org,
	joshdon@...gle.com, brho@...gle.com, pjt@...gle.com,
	derkling@...gle.com, haoluo@...gle.com, dvernet@...a.com,
	dschatzberg@...a.com, dskarlat@...cmu.edu, riel@...riel.com,
	changwoo@...lia.com, himadrics@...ia.fr, memxor@...il.com,
	andrea.righi@...onical.com, joel@...lfernandes.org,
	linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
	kernel-team@...a.com
Subject: Re: [PATCH 09/39] sched: Add @reason to
 sched_class->rq_{on|off}line()

Hello, Peter.

On Mon, Jun 24, 2024 at 01:32:12PM +0200, Peter Zijlstra wrote:
> On Wed, May 01, 2024 at 05:09:44AM -1000, Tejun Heo wrote:
> > ->rq_{on|off}line are called either during CPU hotplug or cpuset partition
> > updates. A planned BPF extensible sched_class wants to tell the BPF
> > scheduler progs about CPU hotplug events in a way that's synchronized with
> > rq state changes.
> > 
> > As the BPF scheduler progs aren't necessarily affected by cpuset partition
> > updates, we need a way to distinguish the two types of events. Let's add an
> > argument to tell them apart.
> 
> That would be a bug. Must not be able to ignore partitions.

So, first of all, this implementation was brittle in assuming that CPU
hotplug events would be called first, and it broke after recent cpuset
changes. In v7, it's replaced by hooks in sched_cpu_[de]activate(), which
has the extra benefit of allowing the BPF hotplug methods to be sleepable.

Taking a step back to the sched domains: they don't translate well to
sched_ext schedulers, where task-to-CPU associations are often more dynamic
(e.g. multiple CPUs sharing a task queue) and load balancing operations can
be implemented pretty differently from CFS. The benefit of exposing sched
domains directly to the BPF schedulers is unclear, as most of the relevant
information can already be obtained from userspace.
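
As a rough illustration of the userspace route (a sketch only, not code from
this patch series), CPU topology can be read from the standard sysfs files
under /sys/devices/system/cpu/cpuN/topology/. The helper names
parse_cpulist() and siblings() below are hypothetical, and the parser
assumes the plain "0-3,8-11" list form that these files normally contain.

```python
from pathlib import Path


def parse_cpulist(text):
    """Parse a cpulist string such as "0-3,8-11" into a sorted list of CPU ids.

    Assumption: only single ids and "lo-hi" ranges appear, which is the form
    the sysfs *_list files normally use.
    """
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty input or a stray comma
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def siblings(cpu, kind="core_siblings_list", root="/sys/devices/system/cpu"):
    """Return the CPUs listed in cpu<N>/topology/<kind>, or None if unreadable.

    `kind` is a file name under the topology directory, for example
    "thread_siblings_list" (SMT siblings) or "core_siblings_list"
    (CPUs in the same package).
    """
    path = Path(root) / f"cpu{cpu}" / "topology" / kind
    try:
        return parse_cpulist(path.read_text())
    except OSError:
        return None  # file missing, e.g. CPU not present or sysfs not mounted


if __name__ == "__main__":
    print("SMT siblings of cpu0:", siblings(0, "thread_siblings_list"))
    print("package siblings of cpu0:", siblings(0, "core_siblings_list"))
```

A userspace component of a scheduler could gather this kind of information
once at startup and pass it to the BPF side, instead of having sched domains
exposed to the BPF scheduler directly.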

The cgroup support side isn't fully developed yet (e.g. cpu.weight is
available but I haven't added cpu.max yet) and plans can always change, but
I was thinking of taking an approach similar to cpu.weight for cpuset's
isolation features, i.e. give the BPF scheduler a way to access the user's
configuration and let it implement whatever it wants to implement.

Thanks.

-- 
tejun
