lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ef409c01-1588-6e3a-aa3c-f142c80657c7@amd.com>
Date: Wed, 25 Sep 2024 08:47:47 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Tejun Heo <tj@...nel.org>
CC: <void@...ifault.com>, <linux-kernel@...r.kernel.org>,
	<kernel-team@...a.com>, <sched-ext@...a.com>, <peterz@...radead.org>, Pat
 Somaru <patso@...ewhatevs.io>
Subject: Re: [PATCH sched_ext/for-6.12-fixes] Disable SM_IDLE/rq empty path
 when scx_enabled

Hello Tejun,

On 9/25/2024 3:51 AM, Tejun Heo wrote:
> Hello,
> 
> On Tue, Sep 24, 2024 at 09:10:02AM +0530, K Prateek Nayak wrote:
>>>    	prev_state = READ_ONCE(prev->__state);
>>>    	if (sched_mode == SM_IDLE) {
>>> -		if (!rq->nr_running) {
>>> +		/* SCX must consult the BPF scheduler to tell if rq is empty */
>>
>> I was wondering if sched_ext case could simply do:
>>
>> 		if (scx_enabled())
>> 			prev_balance(rq, prev, rf);
>>
>> and use "rq->scx.flags" to skip balancing in balance_scx() later when
>> __pick_next_task() calls prev_balance() but (and please correct me if
>> I'm wrong here) balance_scx() calls balance_one() which can call
>> consume_dispatch_q() to pick a task from global / user-defined dispatch
>> queue, and in doing so, it does not update "rq->nr_running".
> 
> Hmm... would that be a meaningful optimization? prev_balance() calls into
> SCX's dispatch path and there can be quite a bit going on there. I'm not
> sure whether it'd worth much to save a trip through __pick_next_task().

Probably not worth it given balance_scx() is indeed very complex and can
release and re-acquire the rq-lock (I don't believe it should be a
problem in SM_IDLE path but the given he complexity, I could have easily
missed something again :)

> 
>> I could only see add_nr_running() being called from enqueue_task_scx()
>> and this is even before the ext core calls do_enqueue_task() which hooks
>> into the bpf layer which makes the decision where the task actually
>> goes.
>>
>> Is my understanding correct that whichever CPU is the target for the
>> enqueue_task_scx() callback initially is the one that accounts the
>> enqueue in "rq->nr_running" until the task is dequeued or did I miss
>> something?
> 
> Whenever a task is dispatched to a local DSQ of a CPU including from
> balance_one(), if the task is not on that CPU already,
> move_remote_task_to_local_dsq() is called which migrates the task to the
> target CPU by deactivating and then re-activating it. As deactivating and
> re-activating involves dequeueing and re-enqueueing, rq->running gets
> updated accordingly.

Ah! I gave up too soon going down the call chain. Thank you for
clarifying.

> 
> Thanks.
> 

-- 
Thanks and Regards,
Prateek

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ