Message-ID: <20240925170956.GC26346@maniforge>
Date: Wed, 25 Sep 2024 12:09:56 -0500
From: David Vernet <void@...ifault.com>
To: Tejun Heo <tj@...nel.org>
Cc: kernel-team@...a.com, linux-kernel@...r.kernel.org, sched-ext@...a.com
Subject: Re: [PATCH 2/5] sched_ext: Allow only user DSQs for
 scx_bpf_consume(), scx_bpf_dsq_nr_queued() and bpf_iter_scx_dsq_new()

On Tue, Sep 24, 2024 at 02:06:04PM -1000, Tejun Heo wrote:

Hi Tejun,

> SCX_DSQ_GLOBAL is special in that it can't be used as a priority queue and
> is consumed implicitly, but all BPF DSQ related kfuncs could be used on it.
> SCX_DSQ_GLOBAL will be split per-node for scalability and those operations
> won't make sense anymore. Disallow SCX_DSQ_GLOBAL on scx_bpf_consume(),
> scx_bpf_dsq_nr_queued() and bpf_iter_scx_dsq_new(). This means that
> SCX_DSQ_GLOBAL can only be used as a dispatch target from BPF schedulers.

This API impedance mismatch, where you can dispatch to the DSQ but not consume
from it, feels unnatural and a bit leaky. I understand why we don't want to
allow BPF to consume it -- we're already doing it for the user as part of (and
before) the dispatch loop. That's also one-off logic that's separate from the
normal interface for DSQs though, and because of that, SCX_DSQ_GLOBAL imposes
a cognitive overhead that IMO outweighs the convenience it provides.

I'm still of the opinion that we should just hide SCX_DSQ_GLOBAL from the user
altogether. I can see why we'd need it as a backup DSQ for when we're e.g.
unloading the scheduler, but as a building block that the kernel provides to
the user, I'm not sure. In a realistic production scenario where you're running
a scheduler that's latency sensitive and cares about deadlines, I'm not sure it
would ever be viable, let alone optimal, to throw a task in a global DSQ and
tolerate it being consumed before other higher-priority tasks that were
enqueued in normal DSQs. Or at the very least, I could see users being
surprised by the semantics, having instead expected it to be just a built-in,
pre-created DSQ that otherwise behaves like any other.
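
To make the ordering concern concrete, here is a rough userspace toy model. It
is not kernel or BPF code, and every name in it (toy_task, toy_dsq,
toy_pick_next, and so on) is made up for illustration. It only assumes what is
described above: the global DSQ is plain FIFO and is drained implicitly before
the scheduler's own DSQs are looked at, while the user DSQ is ordered by
deadline.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
#define TOY_DSQ_CAP 16

struct toy_task {
	int id;
	unsigned long long deadline;	/* smaller means more urgent */
};

struct toy_dsq {
	struct toy_task tasks[TOY_DSQ_CAP];
	size_t nr;
};

/* Append a task at the tail of the queue. */
static void toy_dsq_push(struct toy_dsq *dsq, struct toy_task t)
{
	assert(dsq->nr < TOY_DSQ_CAP);
	dsq->tasks[dsq->nr++] = t;
}

/* Remove and return the task at index idx, preserving the remaining order. */
static struct toy_task toy_dsq_remove(struct toy_dsq *dsq, size_t idx)
{
	struct toy_task t = dsq->tasks[idx];

	for (size_t i = idx + 1; i < dsq->nr; i++)
		dsq->tasks[i - 1] = dsq->tasks[i];
	dsq->nr--;
	return t;
}

/*
 * Pick the next task to run. The global queue is drained first, in FIFO
 * order and without looking at deadlines. Only when it is empty is the
 * user queue consulted, earliest deadline first. Returns the task id, or
 * -1 if both queues are empty.
 */
static int toy_pick_next(struct toy_dsq *global, struct toy_dsq *user)
{
	size_t best = 0;

	if (global->nr)
		return toy_dsq_remove(global, 0).id;
	if (!user->nr)
		return -1;

	for (size_t i = 1; i < user->nr; i++)
		if (user->tasks[i].deadline < user->tasks[best].deadline)
			best = i;
	return toy_dsq_remove(user, best).id;
}
```

In this model, if task 1 with deadline 10 sits in the user queue and task 2
with deadline 1000 is then pushed to the global queue, toy_pick_next() hands
back task 2 first, even though task 1 is far more urgent. That is the kind of
inversion I'd expect to surprise people.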

Thanks,
David
