Message-ID: <20250916160036.584174-1-sieberf@amazon.com>
Date: Tue, 16 Sep 2025 18:00:35 +0200
From: Fernand Sieber <sieberf@...zon.com>
To: <sieberf@...zon.com>
CC: <bsegall@...gle.com>, <dietmar.eggemann@....com>, <dwmw@...zon.co.uk>,
	<graf@...zon.com>, <jschoenh@...zon.de>, <juri.lelli@...hat.com>,
	<linux-kernel@...r.kernel.org>, <mingo@...hat.com>, <peterz@...radead.org>,
	<tanghui20@...wei.com>, <vincent.guittot@...aro.org>,
	<vineethr@...ux.ibm.com>, <wangtao554@...wei.com>, <zhangqiao22@...wei.com>
Subject: Re: [PATCH v2] sched/fair: Forfeit vruntime on yield

After further testing I think we should stick with the original approach or
iterate on the vruntime forfeiting.

The vruntime forfeiting doesn't work well with core scheduling. The core
scheduler picks the best task across the SMT mask, and then the siblings run a
matching task no matter what. This means the core scheduler can keep picking
the yielding task on the sibling even after it becomes ineligible (because it's
preferable to force idle). In this scenario the vruntime of the yielding
task runs away rapidly, which causes problematic imbalances later on.

Perhaps an alternative is to forfeit the vruntime (set it to the deadline), but
only once, i.e. don't do it if the task is already ineligible? If the task is
ineligible then we simply increment the deadline as in my original patch?

Peter, let me know your thoughts on this.

Test data below shows that the vruntime forfeit yields bad max run delays:
vruntime forfeit:
• **yield_loop**: 4.37s runtime, max delay 272.99ms
• **busy_loop**: 13.54s runtime, max delay 552.01ms

deadline clamp:
• **busy_loop**: 9.26s runtime, max delay 4.11ms
• **yield_loop**: 9.25s runtime, max delay 7.77ms

Test program:
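#include <sched.h>
#include <stdlib.h>
#include <sys/prctl.h>
#include <time.h>
#include <unistd.h>

/* Fallbacks for older headers that lack the core scheduling constants */
#ifndef PR_SCHED_CORE
#define PR_SCHED_CORE 62
#define PR_SCHED_CORE_GET 0
#define PR_SCHED_CORE_CREATE 1
#define PR_SCHED_CORE_SHARE_TO 2
#endif
#ifndef PR_SCHED_CORE_SCOPE_THREAD
#define PR_SCHED_CORE_SCOPE_THREAD 0
#define PR_SCHED_CORE_SCOPE_THREAD_GROUP 1
#endif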

int main(int argc, char *argv[]) {
    int should_yield = (argc > 1) ? atoi(argv[1]) : 1;
    time_t program_start = time(NULL);

    // Create core cookie for current process
    prctl(PR_SCHED_CORE, PR_SCHED_CORE_CREATE, 0, PR_SCHED_CORE_SCOPE_THREAD, 0);

    pid_t pid = fork();

    if (pid == 0) {
        // Child: yield for 5s then busy loop (if should_yield is 1)
        if (should_yield) {
            time_t start = time(NULL);
            while (time(NULL) - start < 5 && time(NULL) - program_start < 30) {
                sched_yield();
            }
        }
        while (time(NULL) - program_start < 30) {
            // busy loop
        }
    } else {
        // Parent: share cookie with child, then busy loop
        prctl(PR_SCHED_CORE, PR_SCHED_CORE_SHARE_TO, pid, PR_SCHED_CORE_SCOPE_THREAD, 0);
        while (time(NULL) - program_start < 30) {
            // busy loop
        }
    }

    return 0;
}

Repro:
taskset -c 0,1 core_yield_loop 1 &  #arg 1 = do yield
taskset -c 0,1 core_yield_loop 0 &  #arg 0 = don't yield



Amazon Development Centre (South Africa) (Proprietary) Limited
29 Gogosoa Street, Observatory, Cape Town, Western Cape, 7925, South Africa
Registration Number: 2004 / 034463 / 07
