Message-ID: <ZvXnl29kN4zPtGxU@slm.duckdns.org>
Date: Thu, 26 Sep 2024 13:00:39 -1000
From: Tejun Heo <tj@...nel.org>
To: void@...ifault.com
Cc: kernel-team@...a.com, linux-kernel@...r.kernel.org, sched-ext@...a.com
Subject: Re: [PATCHSET sched_ext/for-6.12-fixes] sched_ext: Split
 %SCX_DSQ_GLOBAL per-node

On Tue, Sep 24, 2024 at 02:06:02PM -1000, Tejun Heo wrote:
> In the bypass mode, the global DSQ is used to schedule all tasks in simple
> FIFO order. All tasks are queued into the global DSQ and all CPUs try to
> execute tasks from it. This creates a lot of cross-node cacheline accesses
> and scheduling across the node boundaries, and can lead to livelock
> conditions where the system takes tens of minutes to disable the BPF
> scheduler while executing in the bypass mode.
> 
> This patchset splits the global DSQ per NUMA node. Each node has its own
> global DSQ. When a task is dispatched to SCX_DSQ_GLOBAL, it's put into the
> global DSQ local to the task's CPU and all CPUs in a node only consume its
> node-local global DSQ.
> 
> This resolves a livelock condition which could be reliably triggered on a
> 2x EPYC 7642 system by running `stress-ng --race-sched 1024` together with
> `stress-ng --workload 80 --workload-threads 10` while repeatedly enabling
> and disabling an SCX scheduler.
> 
> This patchset contains the following patches:
> 
>  0001-scx_flatcg-Use-a-user-DSQ-for-fallback-instead-of-SC.patch
>  0002-sched_ext-Allow-only-user-DSQs-for-scx_bpf_consume-s.patch
>  0003-sched_ext-Relocate-find_user_dsq.patch
>  0004-sched_ext-Split-the-global-DSQ-per-NUMA-node.patch
>  0005-sched_ext-Use-shorter-slice-while-bypassing.patch
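
For readers following along, the per-node split described in the cover letter
can be pictured with the small userspace sketch below. It is not the kernel
code from the patches: `fifo_dsq`, `global_dsqs`, `cpu_to_node_sketch()`,
`dispatch_global()` and `consume_global()` are hypothetical names, and the
topology (2 nodes, 4 CPUs each) is an assumption made only for illustration.
The idea it shows is that a dispatch to the global DSQ lands in the queue of
the task's CPU's node, and a CPU only consumes from its own node's queue.

```c
#include <assert.h>

/* Assumed topology for illustration only: 2 nodes, 4 CPUs per node. */
#define NR_NODES      2
#define CPUS_PER_NODE 4
#define MAX_TASKS     16

/* Hypothetical stand-in for a dispatch queue: a bounded FIFO of task IDs. */
struct fifo_dsq {
	int tasks[MAX_TASKS];
	int head;	/* index of the next task to consume */
	int tail;	/* index of the next free slot */
};

/* One "global" queue per node instead of a single system-wide queue. */
static struct fifo_dsq global_dsqs[NR_NODES];

/* Hypothetical CPU-to-node mapping; a real system reads this from topology. */
static int cpu_to_node_sketch(int cpu)
{
	assert(cpu >= 0 && cpu < NR_NODES * CPUS_PER_NODE);
	return cpu / CPUS_PER_NODE;
}

/* Queue a task on the global queue local to the node of the task's CPU. */
static int dispatch_global(int task_cpu, int task_id)
{
	struct fifo_dsq *q = &global_dsqs[cpu_to_node_sketch(task_cpu)];

	if (q->tail - q->head >= MAX_TASKS)
		return -1;	/* queue full */
	q->tasks[q->tail % MAX_TASKS] = task_id;
	q->tail++;
	return 0;
}

/* A CPU consumes only from its own node's queue; returns -1 if it is empty. */
static int consume_global(int cpu)
{
	struct fifo_dsq *q = &global_dsqs[cpu_to_node_sketch(cpu)];
	int task_id;

	if (q->head == q->tail)
		return -1;
	task_id = q->tasks[q->head % MAX_TASKS];
	q->head++;
	return task_id;
}
```

In this sketch a task dispatched from CPU 0 is only ever picked up by CPUs 0-3,
which is the property that keeps queue accesses within a node.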

Applied to sched_ext/for-6.12-fixes.

Thanks.

-- 
tejun
