Message-ID: <ad666d65-6b67-4068-b429-12bd6273954c@arm.com>
Date: Wed, 28 Jan 2026 12:24:19 +0000
From: Ryan Roberts <ryan.roberts@....com>
To: Vincent Guittot <vincent.guittot@...aro.org>, mingo@...hat.com,
 peterz@...radead.org, juri.lelli@...hat.com, dietmar.eggemann@....com,
 rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
 vschneid@...hat.com, linux-kernel@...r.kernel.org,
 mgorman@...hsingularity.net, vineethr@...ux.ibm.com, clm@...a.com,
 Christian.Loehle@....com
Subject: Re: [PATCH] sched/fair: revert force wakeup preemption

On 23/01/2026 10:28, Vincent Guittot wrote:
> This aggressively bypasses run_to_parity and slice protection with the
> assumption that this is what the waker wants, but there is no guarantee that
> the wakee will be the next to run. It is a better choice to use
> yield_to_task or WF_SYNC in such a case.
> 
> This increases the number of rescheds and preemptions because a task quickly
> becomes "ineligible" when it runs: we update the task's vruntime periodically,
> and before the task has exhausted its slice or at least a minimum quantum.
> 
> Example:
> Two tasks A and B wake up simultaneously with lag == 0. Both are
> eligible. Task A runs first and wakes up task C. The scheduler updates task
> A's vruntime, which becomes greater than the average vruntime because all the
> others have lag == 0 and haven't run yet. Now task A is ineligible because
> it has received more runtime than the other tasks, but it has not yet
> exhausted its slice nor a min quantum. We force preemption and disable
> protection, but task B will run first, not task C.
> 
> Sidenote, DELAY_ZERO increases this effect by clearing positive lag at
> wake up.
> 
> Fixes: e837456fdca8 ("sched/fair: Reimplement NEXT_BUDDY to align with EEVDF goals")
> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>

I see that this is already merged for -rc7 (which is great - thanks for the fast 
turnaround!). Here are the performance results I promised.

TL;DR: This patch combined with the NEXT_BUDDY disablement patch fixes all the 
regressions I originally reported.


6-18-0 (base)		(baseline)
6-19-0-rc6		(New NEXT_BUDDY implementation enabled)
6-19-0-rc6+p1		(New NEXT_BUDDY implementation disabled)
6-19-0-rc6+p1+p2	(+ this patch)


Multi-node SUT (workload running across 2 machines):

+---------------------------------+----------------------------------------------------+---------------+-------------+---------------+------------------+
| Benchmark                       | Result Class                                       | 6-18-0 (base) |  6-19-0-rc6 | 6-19-0-rc6+p1 | 6-19-0-rc6+p1+p2 |
+=================================+====================================================+===============+=============+===============+==================+
| repro-collection/mysql-workload | db transaction rate (transactions/min)             |     646267.33 |  (R) -0.89% |     (I) 4.01% |        (I) 6.03% |
|                                 | new order rate (orders/min)                        |     213256.50 |  (R) -0.89% |     (I) 3.94% |        (I) 6.05% |
+---------------------------------+----------------------------------------------------+---------------+-------------+---------------+------------------+

Single-node SUT (workload running on single machine):

+---------------------------------+----------------------------------------------------+---------------+-------------+---------------+------------------+
| Benchmark                       | Result Class                                       | 6-18-0 (base) |  6-19-0-rc6 | 6-19-0-rc6+p1 | 6-19-0-rc6+p1+p2 |
+=================================+====================================================+===============+=============+===============+==================+
| specjbb/composite               | critical-jOPS (jOPS)                               |      94700.00 |  (R) -4.12% |     (I) 3.07% |        (I) 1.27% |
|                                 | max-jOPS (jOPS)                                    |     113984.50 |  (R) -2.80% |     (I) 1.94% |        (I) 1.94% |
+---------------------------------+----------------------------------------------------+---------------+-------------+---------------+------------------+
| repro-collection/mysql-workload | db transaction rate (transactions/min)             |     245438.25 |  (R) -3.07% |        -1.34% |            0.23% |
|                                 | new order rate (orders/min)                        |      80985.75 |  (R) -3.06% |        -1.29% |            0.25% |
+---------------------------------+----------------------------------------------------+---------------+-------------+---------------+------------------+
| pts/pgbench                     | Scale: 1 Clients: 1 Read Only (TPS)                |      63124.00 |   (I) 2.67% |         2.58% |        (I) 2.69% |
|                                 | Scale: 1 Clients: 1 Read Only - Latency (ms)       |         0.016 |       4.35% |         4.35% |            4.35% |
|                                 | Scale: 1 Clients: 1 Read Write (TPS)               |        974.92 |       0.03% |         0.11% |           -0.06% |
|                                 | Scale: 1 Clients: 1 Read Write - Latency (ms)      |          1.03 |       0.01% |         0.14% |           -0.04% |
|                                 | Scale: 1 Clients: 250 Read Only (TPS)              |    1915931.58 |  (R) -3.28% |    (R) -3.92% |            1.23% |
|                                 | Scale: 1 Clients: 250 Read Only - Latency (ms)     |          0.13 |  (R) -3.33% |    (R) -3.93% |            1.16% |
|                                 | Scale: 1 Clients: 250 Read Write (TPS)             |        855.67 |       0.27% |        -0.49% |           -1.44% |
|                                 | Scale: 1 Clients: 250 Read Write - Latency (ms)    |        292.39 |       0.32% |        -0.49% |           -1.40% |
|                                 | Scale: 1 Clients: 1000 Read Only (TPS)             |    1534130.08 | (R) -12.20% |   (R) -11.85% |            0.45% |
|                                 | Scale: 1 Clients: 1000 Read Only - Latency (ms)    |          0.65 | (R) -12.19% |   (R) -11.87% |            0.46% |
|                                 | Scale: 1 Clients: 1000 Read Write (TPS)            |        578.75 |       0.85% |         1.60% |           -5.23% |
|                                 | Scale: 1 Clients: 1000 Read Write - Latency (ms)   |       1736.98 |       1.12% |         1.52% |           -4.91% |
|                                 | Scale: 100 Clients: 1 Read Only (TPS)              |      57170.33 |       1.64% |         2.16% |            1.69% |
|                                 | Scale: 100 Clients: 1 Read Only - Latency (ms)     |         0.018 |       1.94% |         1.94% |            2.94% |
|                                 | Scale: 100 Clients: 1 Read Write (TPS)             |        836.58 |       0.27% |         0.07% |            0.13% |
|                                 | Scale: 100 Clients: 1 Read Write - Latency (ms)    |          1.20 |       0.27% |         0.06% |            0.15% |
|                                 | Scale: 100 Clients: 250 Read Only (TPS)            |    1773440.67 |  (R) -2.54% |    (R) -2.94% |            1.00% |
|                                 | Scale: 100 Clients: 250 Read Only - Latency (ms)   |          0.14 |  (R) -2.42% |    (R) -2.87% |            1.08% |
|                                 | Scale: 100 Clients: 250 Read Write (TPS)           |       5505.50 |      -1.51% |         0.17% |           -0.03% |
|                                 | Scale: 100 Clients: 250 Read Write - Latency (ms)  |         45.42 |      -1.52% |         0.17% |           -0.03% |
|                                 | Scale: 100 Clients: 1000 Read Only (TPS)           |    1393037.50 | (R) -10.08% |   (R) -10.36% |            0.60% |
|                                 | Scale: 100 Clients: 1000 Read Only - Latency (ms)  |          0.72 | (R) -10.07% |   (R) -10.35% |            0.60% |
|                                 | Scale: 100 Clients: 1000 Read Write (TPS)          |       5085.92 |       0.70% |        -2.32% |           -0.28% |
|                                 | Scale: 100 Clients: 1000 Read Write - Latency (ms) |        196.79 |       0.72% |        -2.27% |           -0.29% |
+---------------------------------+----------------------------------------------------+---------------+-------------+---------------+------------------+
| mmtests/hackbench               | hackbench-process-pipes-1 (seconds)                |          0.14 |      -1.28% |         0.35% |           -1.85% |
|                                 | hackbench-process-pipes-4 (seconds)                |          0.44 |   (I) 8.20% |     (I) 5.72% |        (I) 7.23% |
|                                 | hackbench-process-pipes-7 (seconds)                |          0.68 | (R) -18.31% |   (R) -24.54% |            1.56% |
|                                 | hackbench-process-pipes-12 (seconds)               |          1.24 | (R) -19.52% |   (R) -24.55% |           -0.25% |
|                                 | hackbench-process-pipes-21 (seconds)               |          1.81 |  (R) -7.33% |   (R) -13.58% |           -1.14% |
|                                 | hackbench-process-pipes-30 (seconds)               |          2.39 |  (R) -7.86% |   (R) -13.21% |           -0.23% |
|                                 | hackbench-process-pipes-48 (seconds)               |          3.18 | (R) -10.72% |   (R) -12.63% |            1.22% |
|                                 | hackbench-process-pipes-79 (seconds)               |          3.84 |  (R) -9.52% |   (R) -10.31% |           -0.07% |
|                                 | hackbench-process-pipes-110 (seconds)              |          4.68 |  (R) -6.78% |    (R) -7.15% |            1.30% |
|                                 | hackbench-process-pipes-141 (seconds)              |          5.75 |  (R) -5.50% |    (R) -5.60% |            1.11% |
|                                 | hackbench-process-pipes-172 (seconds)              |          6.80 |  (R) -4.67% |    (R) -4.79% |            1.61% |
|                                 | hackbench-process-pipes-203 (seconds)              |          7.94 |  (R) -4.01% |    (R) -3.74% |        (I) 2.08% |
|                                 | hackbench-process-pipes-234 (seconds)              |          9.02 |  (R) -3.69% |    (R) -3.63% |            1.67% |
|                                 | hackbench-process-pipes-256 (seconds)              |          9.78 |  (R) -3.80% |    (R) -3.19% |            1.65% |
|                                 | hackbench-process-sockets-1 (seconds)              |          0.29 |      -0.38% |        -0.43% |            0.03% |
|                                 | hackbench-process-sockets-4 (seconds)              |          0.76 |  (I) 17.71% |    (I) 18.69% |       (I) 19.52% |
|                                 | hackbench-process-sockets-7 (seconds)              |          1.16 |  (I) 12.10% |    (I) 11.37% |       (I) 13.52% |
|                                 | hackbench-process-sockets-12 (seconds)             |          1.86 |  (I) 10.19% |     (I) 9.31% |       (I) 12.83% |
|                                 | hackbench-process-sockets-21 (seconds)             |          3.12 |   (I) 9.59% |     (I) 8.99% |       (I) 12.15% |
|                                 | hackbench-process-sockets-30 (seconds)             |          4.30 |   (I) 6.23% |     (I) 6.75% |        (I) 8.88% |
|                                 | hackbench-process-sockets-48 (seconds)             |          6.58 |   (I) 2.39% |     (I) 2.98% |        (I) 4.39% |
|                                 | hackbench-process-sockets-79 (seconds)             |         10.56 |   (I) 3.44% |     (I) 3.10% |        (I) 3.94% |
|                                 | hackbench-process-sockets-110 (seconds)            |         13.85 |      -0.77% |         0.44% |        (I) 2.50% |
|                                 | hackbench-process-sockets-141 (seconds)            |         19.23 |      -0.47% |         1.54% |            2.95% |
|                                 | hackbench-process-sockets-172 (seconds)            |         26.33 |   (I) 3.44% |     (I) 4.25% |        (I) 3.21% |
|                                 | hackbench-process-sockets-203 (seconds)            |         30.27 |       0.36% |         1.67% |            0.90% |
|                                 | hackbench-process-sockets-234 (seconds)            |         35.12 |       2.05% |     (I) 3.11% |        (I) 2.45% |
|                                 | hackbench-process-sockets-256 (seconds)            |         38.74 |      -0.39% |         1.48% |            2.13% |
|                                 | hackbench-thread-pipes-1 (seconds)                 |          0.17 |      -0.38% |        -0.76% |           -1.51% |
|                                 | hackbench-thread-pipes-4 (seconds)                 |          0.45 |   (I) 7.85% |     (I) 6.15% |        (I) 9.93% |
|                                 | hackbench-thread-pipes-7 (seconds)                 |          0.74 |  (R) -7.22% |    (R) -9.98% |        (I) 6.47% |
|                                 | hackbench-thread-pipes-12 (seconds)                |          1.32 |  (R) -7.62% |   (R) -14.42% |            1.27% |
|                                 | hackbench-thread-pipes-21 (seconds)                |          1.95 |  (R) -3.00% |    (R) -7.93% |           -1.67% |
|                                 | hackbench-thread-pipes-30 (seconds)                |          2.50 |  (R) -4.79% |   (R) -11.99% |           -1.72% |
|                                 | hackbench-thread-pipes-48 (seconds)                |          3.32 |  (R) -5.49% |   (R) -11.45% |            1.15% |
|                                 | hackbench-thread-pipes-79 (seconds)                |          4.04 |  (R) -6.16% |    (R) -8.88% |           -0.56% |
|                                 | hackbench-thread-pipes-110 (seconds)               |          4.94 |  (R) -2.62% |    (R) -4.92% |            0.63% |
|                                 | hackbench-thread-pipes-141 (seconds)               |          6.04 |  (R) -2.05% |    (R) -3.56% |            0.51% |
|                                 | hackbench-thread-pipes-172 (seconds)               |          7.15 |      -0.74% |        -1.93% |            0.91% |
|                                 | hackbench-thread-pipes-203 (seconds)               |          8.31 |      -1.20% |        -1.41% |            0.91% |
|                                 | hackbench-thread-pipes-234 (seconds)               |          9.49 |      -0.65% |        -1.21% |            0.92% |
|                                 | hackbench-thread-pipes-256 (seconds)               |         10.30 |      -0.56% |        -0.92% |            0.88% |
|                                 | hackbench-thread-sockets-1 (seconds)               |          0.31 |       0.16% |        -0.05% |           -0.48% |
|                                 | hackbench-thread-sockets-4 (seconds)               |          0.79 |  (I) 18.70% |    (I) 19.30% |       (I) 19.79% |
|                                 | hackbench-thread-sockets-7 (seconds)               |          1.16 |  (I) 12.35% |    (I) 11.90% |       (I) 12.91% |
|                                 | hackbench-thread-sockets-12 (seconds)              |          1.87 |  (I) 12.75% |    (I) 11.66% |       (I) 14.43% |
|                                 | hackbench-thread-sockets-21 (seconds)              |          3.16 |  (I) 11.55% |    (I) 11.06% |       (I) 14.41% |
|                                 | hackbench-thread-sockets-30 (seconds)              |          4.32 |   (I) 7.66% |     (I) 6.58% |       (I) 10.15% |
|                                 | hackbench-thread-sockets-48 (seconds)              |          6.45 |   (I) 2.62% |         1.92% |        (I) 4.10% |
|                                 | hackbench-thread-sockets-79 (seconds)              |         10.15 |       1.85% |        -0.20% |            1.54% |
|                                 | hackbench-thread-sockets-110 (seconds)             |         13.45 |      -0.29% |        -0.41% |            0.08% |
|                                 | hackbench-thread-sockets-141 (seconds)             |         17.87 |      -1.84% |        -1.01% |            1.33% |
|                                 | hackbench-thread-sockets-172 (seconds)             |         24.38 |       0.82% |         1.33% |            3.68% |
|                                 | hackbench-thread-sockets-203 (seconds)             |         28.38 |      -1.29% |         0.72% |            1.58% |
|                                 | hackbench-thread-sockets-234 (seconds)             |         32.75 |      -1.01% |         1.00% |            0.94% |
|                                 | hackbench-thread-sockets-256 (seconds)             |         36.49 |      -0.99% |         1.22% |            1.00% |
+---------------------------------+----------------------------------------------------+---------------+-------------+---------------+------------------+

Thanks,
Ryan


> ---
>  kernel/sched/fair.c | 10 ----------
>  1 file changed, 10 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 04993c763a06..16ecc3475fe2 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8822,16 +8822,6 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
>  	if ((wake_flags & WF_FORK) || pse->sched_delayed)
>  		return;
>  
> -	/*
> -	 * If @p potentially is completing work required by current then
> -	 * consider preemption.
> -	 *
> -	 * Reschedule if waker is no longer eligible. */
> -	if (in_task() && !entity_eligible(cfs_rq, se)) {
> -		preempt_action = PREEMPT_WAKEUP_RESCHED;
> -		goto preempt;
> -	}
> -
>  	/* Prefer picking wakee soon if appropriate. */
>  	if (sched_feat(NEXT_BUDDY) &&
>  	    set_preempt_buddy(cfs_rq, wake_flags, pse, se)) {

