linux-kernel - Re: [PATCH 11/13] sched_ext: Add scx


Message-ID: <aRJT1dbGTmPRw4-p@gpd4>
Date: Mon, 10 Nov 2025 22:06:29 +0100
From: Andrea Righi <arighi@...dia.com>
To: Tejun Heo <tj@...nel.org>
Cc: David Vernet <void@...ifault.com>, Changwoo Min <changwoo@...lia.com>,
	Dan Schatzberg <schatzberg.dan@...il.com>,
	Emil Tsalapatis <etsal@...a.com>, sched-ext@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 11/13] sched_ext: Add scx_cpu0 example scheduler

On Mon, Nov 10, 2025 at 08:44:12AM -1000, Tejun Heo wrote:
> On Mon, Nov 10, 2025 at 09:36:46AM +0100, Andrea Righi wrote:
> > > +void BPF_STRUCT_OPS(cpu0_enqueue, struct task_struct *p, u64 enq_flags)
> > > +{
> > > +	if (p->nr_cpus_allowed < nr_cpus) {
> > 
> > We could be even more aggressive with DSQ_CPU0 and check
> > bpf_cpumask_test_cpu(0, p->cpus_ptr), but this is fine as well.
> 
> I did the following instead:
> 
>   void BPF_STRUCT_OPS(cpu0_enqueue, struct task_struct *p, u64 enq_flags)
>   {
>           /*
>            * select_cpu() always picks CPU0. If @p is not on CPU0, it can't run on
>            * CPU 0. Queue on whichever CPU it's currently on.
>            */
>           if (scx_bpf_task_cpu(p) != 0) {
>                   stat_inc(0);	/* count local queueing */
>                   scx_bpf_dsq_insert(p, SCX_DSQ_LOCAL, SCX_SLICE_DFL, 0);
>                   return;
>           }
> 
>           stat_inc(1);	/* count cpu0 queueing */
>           scx_bpf_dsq_insert(p, DSQ_CPU0, SCX_SLICE_DFL, enq_flags);
>   }
> 
> This should be safe against migration-disabled tasks and so on.

Looks good.
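
For reference, the three checks discussed above differ only in which tasks
they divert to the local DSQ. Here's a rough userspace sketch in plain C
(not BPF) of how I read them. The struct and the helper names are made up
for illustration, and the affinity mask is modelled as a 64-bit bitmap
rather than a real cpumask:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of a task; not the kernel's task_struct. */
struct model_task {
	uint64_t allowed;	/* bit N set => task may run on CPU N */
	int cpu;		/* CPU the task is currently assigned to */
};

/* Count set bits in the affinity mask (models p->nr_cpus_allowed). */
static int model_nr_allowed(uint64_t mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Original patch: any task with restricted affinity is queued locally. */
static bool local_by_nr_allowed(const struct model_task *t, int nr_cpus)
{
	return model_nr_allowed(t->allowed) < nr_cpus;
}

/* Suggested alternative: queue locally only if CPU 0 is not allowed. */
static bool local_by_cpumask(const struct model_task *t)
{
	return !(t->allowed & 1ULL);
}

/* Revised version: queue locally if the task is not currently on CPU 0. */
static bool local_by_task_cpu(const struct model_task *t)
{
	return t->cpu != 0;
}
```

In this model, a task allowed on CPUs 0-1 of a 4-CPU system and currently
on CPU 0 is queued locally only by the first check. A task that is
currently on CPU 2 but whose mask still includes CPU 0 is queued locally
only by the third one.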

> 
> > > +		stat_inc(0);	/* count local queueing */
> > > +		scx_bpf_dsq_insert(p, SCX_DSQ_LOCAL, SCX_SLICE_DFL, 0);
> > 
> > > And this is why I was suggesting to automatically fall back to the new
> > > global default time slice internally. In this case do we want to preserve
> > > the old 20ms default or automatically switch to the new one?
> 
> Maybe SCX_SLICE_DFL can become runtime loaded const volatile but anyone
> who's using it is just saying "I don't care". As long as it's not something
> that breaks the system left and right, does it matter what exact value it
> is?

I agree that if a scheduler uses SCX_SLICE_DFL it shouldn't care too much
about the exact value.

My concern was more about schedulers that are quite paranoid about
latency: even if something isn't handled properly (a direct dispatch to the
wrong CPU, a task being rescheduled internally, etc.), we'd still have a
guarantee that a task's time slice can't exceed a known upper bound. But
this could be handled by providing some way to set a default time slice,
and it can be addressed separately.
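
Just to illustrate the kind of guarantee I mean, here's a minimal userspace
sketch in plain C. Nothing here is an existing sched_ext interface: the
model_* names are hypothetical, and the only value taken from this thread
is the 20ms default:

```c
#include <assert.h>
#include <stdint.h>

/* The 20ms default mentioned above, in nanoseconds. */
#define MODEL_SLICE_DFL_NS	(20ULL * 1000 * 1000)

/* Hypothetical per-scheduler upper bound on any time slice, in ns. */
static uint64_t model_slice_max_ns = MODEL_SLICE_DFL_NS;

/* Let the scheduler pick its own upper bound; zero makes no sense. */
static void model_set_slice_max(uint64_t max_ns)
{
	assert(max_ns > 0);
	model_slice_max_ns = max_ns;
}

/* Whatever slice is requested, never hand out more than the bound. */
static uint64_t model_effective_slice(uint64_t requested_ns)
{
	return requested_ns < model_slice_max_ns ?
		requested_ns : model_slice_max_ns;
}
```

With the bound set to 5ms, a request for the 20ms default would be cut
down to 5ms, while a 1ms request would go through unchanged.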

So yeah, in this case the exact value of SCX_SLICE_DFL probably doesn't
really matter.

-Andrea