linux-kernel - Re: [PATCH 17/24] sched/fair: Implement delayed dequeue

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <b8ed9d5d-4186-4b2a-aabf-d027d183c44a@amd.com>
Date: Tue, 5 Nov 2024 09:52:54 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Mike Galbraith <efault@....de>, Phil Auld <pauld@...hat.com>
CC: Peter Zijlstra <peterz@...radead.org>, <mingo@...hat.com>,
	<juri.lelli@...hat.com>, <vincent.guittot@...aro.org>,
	<dietmar.eggemann@....com>, <rostedt@...dmis.org>, <bsegall@...gle.com>,
	<mgorman@...e.de>, <vschneid@...hat.com>, <linux-kernel@...r.kernel.org>,
	<wuyun.abel@...edance.com>, <youssefesmat@...omium.org>, <tglx@...utronix.de>
Subject: Re: [PATCH 17/24] sched/fair: Implement delayed dequeue

Hello Mike,

On 11/5/2024 9:35 AM, Mike Galbraith wrote:
> On Mon, 2024-11-04 at 08:05 -0500, Phil Auld wrote:
>> On Sat, Nov 02, 2024 at 05:32:14AM +0100 Mike Galbraith wrote:
>>
>>>
>>> The buddy being preempted certainly won't be wakeup migrated...
>>
>> Not the waker who gets preempted but the wakee may be a bit more
>> sticky on his current cpu and thus stack more since he's still
>> in that runqueue.
> 
> Ah, indeed, if wakees don't get scraped off before being awakened, they
> can and do miss chances at an idle CPU according to trace_printk().
> 
> I'm undecided if overall it's boon, bane or even matters, as there is
> still an ample supply of wakeup migration, but seems it can indeed
> inject wakeup latency needlessly, so <sharpens stick>...

I had tried this out a while back but I was indiscriminately doing a
DEQUEUE_DELAYED and letting delayed tasks go through a full ttwu cycle
which did not yield any improvements on hackbench. Your approach to
selectively do it might indeed be better (more thoughts below)

> 
> My box booted and neither become exceptionally noisy nor inexplicably
> silent in.. oh, minutes now, so surely yours will be perfectly fine.
> 
> After one minute of lightly loaded box browsing, trace_printk() said:
> 
>    645   - racy peek says there is a room available
>     11   - cool, reserved room is free
>    206   - no vacancy or wakee pinned
> 38807   - SIS accommodates room seeker
> 
> The below should improve the odds, but high return seems unlikely.
> 
> ---
>   kernel/sched/core.c |    9 ++++++++-
>   1 file changed, 8 insertions(+), 1 deletion(-)
> 
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3790,7 +3790,13 @@ static int ttwu_runnable(struct task_str
>   	rq = __task_rq_lock(p, &rf);
>   	if (task_on_rq_queued(p)) {
>   		update_rq_clock(rq);
> -		if (p->se.sched_delayed)
> +		/*
> +		 * If wakee is mobile and the room it reserved is occupied, let it try to migrate.
> +		 */
> +		if (p->se.sched_delayed && rq->nr_running > 1 && cpumask_weight(p->cpus_ptr) > 1) {

Would checking "p->nr_cpus_allowed > 1" be enough instead of doing a
"cpumask_weight(p->cpus_ptr) > 1"?

I was thinking, since the task is indeed delayed, there has to be more
than one task on the runqueue right since a single task by itself cannot
be ineligible and be marked for delayed dequeue? The only time we
encounter a delayed task with "rq->nr_running == 1" is if the other
tasks have been fully dequeued and pick_next_task() is in the process of
picking off all the delayed task, but since that is done with the rq
lock held in schedule(), it is even possible for the
"rq->nr_running > 1" to be false here?

> +			dequeue_task(rq, p, DEQUEUE_SLEEP | DEQUEUE_DELAYED | DEQUEUE_NOCLOCK);
> +			goto out_unlock;
> +		} else if (p->se.sched_delayed)
>   			enqueue_task(rq, p, ENQUEUE_NOCLOCK | ENQUEUE_DELAYED);
>   		if (!task_on_cpu(rq, p)) {
>   			/*
> @@ -3802,6 +3808,7 @@ static int ttwu_runnable(struct task_str
>   		ttwu_do_wakeup(p);
>   		ret = 1;
>   	}
> +out_unlock:
>   	__task_rq_unlock(rq, &rf);
> 
>   	return ret;
> 
> 

-- 
Thanks and Regards,
Prateek