Message-ID: <CABcWv98gu6HqbOEoOPBL4tJrcGmU=1x3=fN3Fpho8wiY+D22CQ@mail.gmail.com>
Date: Tue, 28 Oct 2025 03:49:32 -0400
From: Tingjia Cao <tjcao980311@...il.com>
To: mingo@...hat.com, peterz@...radead.org, 
	Vincent Guittot <vincent.guittot@...aro.org>, dietmar.eggemann@....com, 
	linux-kernel@...r.kernel.org
Subject: [BUG] select_idle_sibling() doesn't consider sync wakeup logic

Hello,

We have observed an issue in the CFS scheduler's task placement logic. It
is present in kernel v6.14 and in the latest v6.18-rc3.

The function select_idle_sibling() in *fair.c* does not correctly handle
the WF_SYNC wakeup flag, leading to suboptimal placement of newly awakened
child tasks.

The core issue lies in a logical contradiction between wake_affine_idle()
and select_idle_sibling() when sync is true.

1. Intended Behavior (wake_affine_idle): During a sync wakeup (WF_SYNC is
set), the scheduler intends to place the child task on this_cpu (the
parent's current CPU) if the parent is the only runnable task there. The
rationale is that the parent is expected to go to sleep almost immediately,
making its CPU available. This also keeps the child task on a CPU with a
hot cache.

    static int wake_affine_idle(int this_cpu, int prev_cpu, int sync) {
        ...
        if ((rq->nr_running - cfs_h_nr_delayed(rq)) == 1)
            return this_cpu;
        ...
    }

2. Flawed Behavior (select_idle_sibling): When select_task_rq_fair() later
calls select_idle_sibling() with this_cpu as the target, the target is
rejected because it is not currently idle (the parent is still running).
The function instead searches for an idle sibling CPU within the same LLC
domain. During a sync wakeup, however, the parent is expected to sleep
immediately, so select_idle_sibling() should treat the parent's CPU as
effectively available when the parent is the only runnable task on it.

3. The Consequence: The wakee is placed on a remote idle sibling rather
than on the parent's CPU, which is about to become idle, and so loses cache
locality. The remote CPU may also have been idle in a deeper C-state and/or
at a lower frequency, further hurting the child's performance.
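
The three points above can be illustrated with a small user-space model.
This is only a sketch under stated assumptions, not kernel code and not the
attached patch: struct toy_rq and every toy_* function are hypothetical
stand-ins, the LLC domain is reduced to a flat array of three CPUs, and the
sync-aware variant is just one possible reading of "treat the parent's CPU
as effectively available".

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 3	/* assumption: mirrors the 3-vCPU QEMU guest */

/* Hypothetical stand-in for the kernel's struct rq. */
struct toy_rq {
	int nr_running;	/* runnable tasks, including the one on the CPU */
};

/*
 * Model of the sync branch of wake_affine_idle(): prefer the waker's CPU
 * when the waker is the only runnable task there. Returns -1 when this
 * branch expresses no preference.
 */
static int toy_wake_affine_sync(const struct toy_rq *rqs, int this_cpu,
				bool sync)
{
	assert(this_cpu >= 0 && this_cpu < NR_CPUS);
	if (sync && rqs[this_cpu].nr_running == 1)
		return this_cpu;
	return -1;
}

/*
 * Model of the behavior described in point 2: the target is accepted only
 * if it is idle right now; otherwise the first idle sibling is chosen.
 */
static int toy_select_idle_sibling(const struct toy_rq *rqs, int target)
{
	if (rqs[target].nr_running == 0)
		return target;
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (rqs[cpu].nr_running == 0)
			return cpu;
	return target;	/* no idle sibling: fall back to the target */
}

/*
 * Hypothetical sync-aware variant: on a sync wakeup, a target that is the
 * waker's own CPU and holds only the waker is treated as available.
 */
static int toy_select_idle_sibling_sync(const struct toy_rq *rqs, int target,
					int this_cpu, bool sync)
{
	if (sync && target == this_cpu && rqs[target].nr_running == 1)
		return target;
	return toy_select_idle_sibling(rqs, target);
}
```

With the parent alone on CPU 0 and CPUs 1 and 2 idle, the model's
toy_wake_affine_sync() picks CPU 0, toy_select_idle_sibling() then moves the
wakee to CPU 1, and the sync-aware variant keeps it on CPU 0.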

Kernel Info
===============
Host OS: Ubuntu 24.04, running QEMU with "-smp cpus=3,cores=3"
Processor: Two Intel Xeon Silver 4114 10-core CPUs at 2.20 GHz
Kernel Version: v6.14 and latest v6.18-rc3

===============
We have attached a patch that fixes this issue. Thank you for your time
and effort!

Best,
Tingjia


Download attachment "0001-consider-sync-wakeup-when-selecting-idle-sibling.patch" of type "application/octet-stream" (2161 bytes)
