linux-kernel - Re: [PATCH] sched_ext: introduce cpu tick

Message-ID: <aEkYdGS3qohjfEE8@gpd4>
Date: Wed, 11 Jun 2025 07:47:32 +0200
From: Andrea Righi <arighi@...dia.com>
To: liuwenfang <liuwenfang@...or.com>
Cc: 'Tejun Heo' <tj@...nel.org>, 'David Vernet' <void@...ifault.com>,
	'Changwoo Min' <changwoo@...lia.com>,
	'Ingo Molnar' <mingo@...hat.com>,
	'Peter Zijlstra' <peterz@...radead.org>,
	'Juri Lelli' <juri.lelli@...hat.com>,
	'Vincent Guittot' <vincent.guittot@...aro.org>,
	'Dietmar Eggemann' <dietmar.eggemann@....com>,
	'Steven Rostedt' <rostedt@...dmis.org>,
	'Ben Segall' <bsegall@...gle.com>, 'Mel Gorman' <mgorman@...e.de>,
	'Valentin Schneider' <vschneid@...hat.com>,
	"'linux-kernel@...r.kernel.org'" <linux-kernel@...r.kernel.org>,
	Kumar Kartikeya Dwivedi <memxor@...il.com>,
	Joel Fernandes <joelagnelf@...dia.com>
Subject: Re: [PATCH] sched_ext: introduce cpu tick

On Wed, Jun 11, 2025 at 02:22:11AM +0000, liuwenfang wrote:
> Thanks for your feedback.
> 
> Another issue is that if a runnable local SCX task has p->nr_cpus_allowed equal to 1,
> and there are RT tasks on this CPU's runqueue, we need a chance to let the BPF scheduler adjust
> the RT throttling parameters (or use some other method), so that the locally bound SCX task
> is scheduled in time. This is important for the mobile scenario, to render smoothly at 120
> frames per second. scx_bpf_reenqueue_local() will not work for the local SCX task when
> p->nr_cpus_allowed == 1.
> 
> Some tradeoffs can also be made to balance performance:
> If the running SCX task is preempted by a short-running RT task (predicted from its history),
> then it is better for the BPF scheduler to keep this SCX task on its local dsq, rather than directly calling
> scx_bpf_reenqueue_local(). However, we still need protection for this situation in case the
> short RT task becomes a long-running task (perhaps due to some exception).
> 
> Any suggestions and comments are welcome!

This will all be addressed by the DL server work that Joel is doing:
https://lore.kernel.org/all/20250602180110.816225-10-joelagnelf@nvidia.com/

Thanks,
-Andrea

> 
> Best regards
> 
> > 
> > Hello,
> > 
> > On Tue, Jun 10, 2025 at 08:59:45AM +0000, liuwenfang wrote:
> > > Assume a CPU is running an RT task and has a runnable scx task on its
> > > local dsq. The scx task cannot be scheduled until the RT task goes to
> > > sleep. If the RT task runs for 100ms, the scx task should be migrated
> > > to another dsq, so that it has a chance to be scheduled by other CPUs.
> > >
> > > So cpu_tick is added to notify the BPF scheduler to check for scx tasks
> > > that have been runnable for a long time on its local dsq, so that a
> > > related policy can be applied to improve performance.
> > 
> > (cc'ing Kumar as we discussed similar issue recently)
> > 
> > There are some race conditions we need to address but calling
> > scx_bpf_reenqueue_local() from ops.cpu_release() is the intended way of
> > handling these situations. I don't think periodically polling from ticks is a good
> > approach, especially given that ticks can be skipped w/ nohz_full.
> > 
> > Thanks.
> > 
> > --
> > tejun
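
For illustration, here is a minimal user-space sketch of the behaviour discussed
in this thread: when a higher-priority class takes the CPU, tasks waiting on the
local dsq are re-enqueued so that other CPUs can pick them up, while a task with
nr_cpus_allowed == 1 has nowhere else to go. This is a toy model only, not the
sched_ext API. The names struct task, struct dsq, dsq_push() and
cpu_release_model() are hypothetical, and sending pinned tasks straight back to
the local queue is an assumption made here to keep the model small.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 16

/* Hypothetical task model: only the fields this sketch needs. */
struct task {
	int pid;
	int nr_cpus_allowed;	/* 1 means the task is pinned to one CPU */
};

/* Hypothetical fixed-size FIFO standing in for a dispatch queue (dsq). */
struct dsq {
	struct task tasks[MAX_TASKS];
	size_t len;
};

static void dsq_push(struct dsq *q, struct task t)
{
	assert(q->len < MAX_TASKS);
	q->tasks[q->len++] = t;
}

/*
 * Toy model of a reaction to the CPU being taken by a higher-priority
 * class (e.g. RT): drain the local queue and re-enqueue every task.
 * Migratable tasks go to the shared queue, where another CPU can run
 * them. Pinned tasks (assumption of this model) go back to the local
 * queue because no other CPU may run them.
 *
 * Returns the number of tasks moved to the shared queue.
 */
static size_t cpu_release_model(struct dsq *local, struct dsq *shared)
{
	struct dsq drained = *local;	/* snapshot, then refill in order */
	size_t moved = 0;

	local->len = 0;
	for (size_t i = 0; i < drained.len; i++) {
		struct task t = drained.tasks[i];

		if (t.nr_cpus_allowed > 1) {
			dsq_push(shared, t);
			moved++;
		} else {
			dsq_push(local, t);
		}
	}
	return moved;
}
```

In this model a pinned task is still waiting on the local queue after the
release step, which is the case raised above for p->nr_cpus_allowed == 1:
re-enqueueing alone does not help it, so some other mechanism is needed.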