linux-kernel - Re: [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED


Message-ID: <aa2205ab-de8e-4ea0-b9fc-56208ccd30ba@linux.alibaba.com>
Date: Wed, 19 Mar 2025 17:05:44 +0800
From: Tianchen Ding <dtcccc@...ux.alibaba.com>
To: Xuewen Yan <xuewen.yan@...soc.com>
Cc: vincent.guittot@...aro.org, peterz@...radead.org, mingo@...hat.com,
 juri.lelli@...hat.com, dietmar.eggemann@....com, rostedt@...dmis.org,
 bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com,
 linux-kernel@...r.kernel.org, ke.wang@...soc.com, di.shen@...soc.com,
 xuewen.yan94@...il.com
Subject: Re: [RFC PATCH V2 1/3] sched/fair:
 Fixup-wake_up_sync-vs-DELAYED_DEQUEUE

Hi Xuewen,

On 3/3/25 6:52 PM, Xuewen Yan wrote:
> Delayed dequeued feature keeps a sleeping task enqueued until its
> lag has elapsed. As a result, it stays also visible in rq->nr_running.
> So when in wake_affine_idle(), we should use the real running-tasks
> in rq to check whether we should place the wake-up task to
> current cpu.
> On the other hand, add a helper function to return the nr-delayed.
> 
> Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
> Signed-off-by: Xuewen Yan <xuewen.yan@...soc.com>

We noticed that your patch fixes a regression in lmbench lat_ctx that was
introduced by DELAY_DEQUEUE.

Here is the performance data from running
`./lat_ctx -P $(nproc) 96`
on an Intel SPR server with 192 CPUs (smaller is better):

DELAY_DEQUEUE                 9.71
NO_DELAY_DEQUEUE              4.02
DELAY_DEQUEUE + this_patch    3.86

Also on an aarch64 server with 128 CPUs:

DELAY_DEQUEUE                 14.82
NO_DELAY_DEQUEUE               5.62
DELAY_DEQUEUE + this_patch     4.66
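
To put the two tables in relative terms, here is a minimal sketch that
computes how many times slower one result is than another. The helper name
latency_ratio() is made up for illustration; the inputs are just the numbers
reported above.

```c
#include <stdio.h>

/* How many times larger the latency `slow` is compared to `fast`. */
double latency_ratio(double slow, double fast)
{
	return slow / fast;
}

/* Print the ratios for one machine's three lat_ctx results. */
void print_ratios(const char *machine, double delay, double no_delay,
		  double patched)
{
	printf("%s: DELAY_DEQUEUE vs NO_DELAY_DEQUEUE: %.2fx\n",
	       machine, latency_ratio(delay, no_delay));
	printf("%s: DELAY_DEQUEUE vs DELAY_DEQUEUE + this_patch: %.2fx\n",
	       machine, latency_ratio(delay, patched));
}
```

Calling print_ratios("x86", 9.71, 4.02, 3.86) and
print_ratios("aarch64", 14.82, 5.62, 4.66) shows that DELAY_DEQUEUE alone is
roughly 2.4x to 2.6x slower than NO_DELAY_DEQUEUE, and roughly 2.5x to 3.2x
slower than DELAY_DEQUEUE with the patch applied.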


We found the lmbench lat_ctx regression after enabling DELAY_DEQUEUE:
cpu-migrations increase by more than 100x, nr_wakeups_migrate,
nr_wakeups_remote, nr_wakeups_affine and nr_wakeups_affine_attempts go up,
and nr_wakeups_local goes down.

We think this benchmark prefers the waker and wakee to stay on the same CPU,
but WA_IDLE failed to achieve this because sched_delayed tasks, which are
still counted in rq->nr_running, add noise to its check. So your patch does
fix it.
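
The effect described above can be modelled with a small, self-contained
sketch. This is not the kernel code: struct toy_rq, its nr_delayed field and
both function names are hypothetical stand-ins, and only the idea (a sync
wakeup stays local when the waker is the only really running task) is taken
from the patch description.

```c
#include <stdbool.h>

/* Toy stand-in for a per-CPU runqueue; NOT the kernel's struct rq. */
struct toy_rq {
	unsigned int nr_running; /* all enqueued tasks, delayed ones included */
	unsigned int nr_delayed; /* sleeping tasks kept enqueued (hypothetical) */
};

/* Check without the patch: every enqueued task counts as running. */
bool sync_wakeup_stays_local_old(const struct toy_rq *rq, bool sync)
{
	return sync && rq->nr_running == 1;
}

/* Check in the spirit of the patch: ignore delayed-dequeue tasks. */
bool sync_wakeup_stays_local_new(const struct toy_rq *rq, bool sync)
{
	return sync && rq->nr_running - rq->nr_delayed == 1;
}
```

With nr_running = 2 and nr_delayed = 1 (the waker plus one delayed task), the
old check rejects the current CPU and the wakee may be migrated, while the
new check keeps the wakee local.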

Feel free to add
Reviewed-and-tested-by: Tianchen Ding <dtcccc@...ux.alibaba.com>

Thanks.