Message-ID: <ZvXnl29kN4zPtGxU@slm.duckdns.org>
Date: Thu, 26 Sep 2024 13:00:39 -1000
From: Tejun Heo <tj@...nel.org>
To: void@...ifault.com
Cc: kernel-team@...a.com, linux-kernel@...r.kernel.org, sched-ext@...a.com
Subject: Re: [PATCHSET sched_ext/for-6.12-fixes] sched_ext: Split %SCX_DSQ_GLOBAL per-node
On Tue, Sep 24, 2024 at 02:06:02PM -1000, Tejun Heo wrote:
> In the bypass mode, the global DSQ is used to schedule all tasks in simple
> FIFO order. All tasks are queued into the global DSQ and all CPUs try to
> execute tasks from it. This creates a lot of cross-node cacheline accesses
> and scheduling across the node boundaries, and can lead to livelock
> conditions where the system takes tens of minutes to disable the BPF
> scheduler while executing in the bypass mode.
>
> This patchset splits the global DSQ per NUMA node. Each node has its own
> global DSQ. When a task is dispatched to SCX_DSQ_GLOBAL, it's put into the
> global DSQ local to the task's CPU and all CPUs in a node only consume its
> node-local global DSQ.
>
> This resolves a livelock condition which could be reliably triggered on a
> 2x EPYC 7642 system by running `stress-ng --race-sched 1024` together with
> `stress-ng --workload 80 --workload-threads 10` while repeatedly enabling
> and disabling an SCX scheduler.
>
> This patchset contains the following patches:
>
> 0001-scx_flatcg-Use-a-user-DSQ-for-fallback-instead-of-SC.patch
> 0002-sched_ext-Allow-only-user-DSQs-for-scx_bpf_consume-s.patch
> 0003-sched_ext-Relocate-find_user_dsq.patch
> 0004-sched_ext-Split-the-global-DSQ-per-NUMA-node.patch
> 0005-sched_ext-Use-shorter-slice-while-bypassing.patch
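
For readers following the thread, the per-node split described in the quoted
cover letter can be illustrated with a small user-space toy model. This is only
a sketch under stated assumptions, not the kernel code from the patches: the
names toy_dsq, global_dsqs, cpu_to_node_toy, dispatch_global and
consume_global are hypothetical, the 4-CPU/2-node topology is made up, and
locking is omitted entirely.

```c
#include <assert.h>

#define NR_NODES 2   /* assumed topology: 2 NUMA nodes */
#define NR_CPUS  4   /* assumed topology: CPUs 0-1 on node 0, CPUs 2-3 on node 1 */
#define DSQ_CAP  16  /* arbitrary capacity for the toy FIFO */

/* Hypothetical FIFO standing in for one node's global DSQ. */
struct toy_dsq {
	int tasks[DSQ_CAP];
	unsigned int head;  /* next slot to consume */
	unsigned int tail;  /* next slot to fill */
};

/* One "global" DSQ per node instead of a single system-wide queue. */
static struct toy_dsq global_dsqs[NR_NODES];

/* Hypothetical CPU-to-node mapping for the assumed topology. */
static int cpu_to_node_toy(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu / (NR_CPUS / NR_NODES);
}

/*
 * Queue a task on the global DSQ of the node that the task's CPU belongs to.
 * Returns 0 on success, -1 if that node's queue is full.
 */
static int dispatch_global(int task_id, int task_cpu)
{
	struct toy_dsq *dsq = &global_dsqs[cpu_to_node_toy(task_cpu)];

	if (dsq->tail - dsq->head >= DSQ_CAP)
		return -1;
	dsq->tasks[dsq->tail % DSQ_CAP] = task_id;
	dsq->tail++;
	return 0;
}

/*
 * A CPU only looks at its own node's global DSQ.
 * Returns the task id in FIFO order, or -1 if the node-local queue is empty.
 */
static int consume_global(int cpu)
{
	struct toy_dsq *dsq = &global_dsqs[cpu_to_node_toy(cpu)];
	int task_id;

	if (dsq->head == dsq->tail)
		return -1;
	task_id = dsq->tasks[dsq->head % DSQ_CAP];
	dsq->head++;
	return task_id;
}
```

In this model, a task dispatched from CPU 0 can be picked up by CPU 1 (same
node) but never by CPU 2 or 3, so no queue is touched from across the node
boundary.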
Applied to sched_ext/for-6.12-fixes.
Thanks.
--
tejun