Message-ID: <67dde65a-75fa-4c34-a8a7-02260c394bf2@efficios.com>
Date: Fri, 23 Aug 2024 16:45:38 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Yury Norov <yury.norov@...il.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
linux-kernel@...r.kernel.org, Valentin Schneider <vschneid@...hat.com>,
Mel Gorman <mgorman@...e.de>, Steven Rostedt <rostedt@...dmis.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>, Ben Segall
<bsegall@...gle.com>, Rasmus Villemoes <linux@...musvillemoes.dk>,
Shuah Khan <skhan@...uxfoundation.org>
Subject: Re: [RFC PATCH v1 4/6] sched: NUMA-aware per-memory-map concurrency IDs
On 2024-08-23 22:14, Yury Norov wrote:
> On Fri, Aug 23, 2024 at 02:59:44PM -0400, Mathieu Desnoyers wrote:
>> The issue addressed by this change is the non-locality of NUMA accesses
>> to data structures indexed by concurrency IDs: for example, in a
>> scenario where a process has two threads, and they periodically run one
>> after the other on different NUMA nodes, each will be assigned mm_cid=0.
>> As a consequence, they will end up accessing the same pages, and thus at
>> least one of the threads will need to perform remote NUMA accesses,
>> which is inefficient.
>>
>> That being said, the same issue theoretically exists due to false
>> sharing of cache lines by threads running on after another on different
>
> running one after another you mean?
Yes, you are correct. I will fix this typo in the next round.
Thanks,
Mathieu
>
>> cores/CPUs within a single NUMA node, but the extent of the performance
>> impact is lesser than remote NUMA accesses.
>>
>> Solve this by making the rseq concurrency ID (mm_cid) NUMA-aware. On
>> NUMA systems, when a NUMA-aware concurrency ID is observed by user-space
>> to be associated with a NUMA node, guarantee that it never changes NUMA
>> node unless either a kernel-level NUMA configuration change happens, or
>> scheduler migrations end up migrating tasks across NUMA nodes.
>>
>> There is a tradeoff between NUMA locality and compactness of the
>> concurrency ID allocation. Favor compactness over NUMA locality when
>> the scheduler migrates tasks across NUMA nodes, as this does not cause
>> the frequent remote NUMA accesses behavior. This is done by limiting the
>> concurrency ID range to minimum between the number of threads belonging
>> to the process and the number of allowed CPUs.
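To make the range limit described above concrete, here is a minimal sketch. The helper name and its parameters are hypothetical and are not taken from the patch; it only illustrates the "minimum between the number of threads and the number of allowed CPUs" rule, under the assumption that IDs are allocated from zero.

```c
/*
 * Hypothetical helper, not code from the patch: upper bound on the
 * concurrency ID range of a process. IDs handed out would fall in
 * [0, limit), where limit is the smaller of the thread count and the
 * number of CPUs the process is allowed to run on.
 */
static unsigned int cid_range_limit(unsigned int nr_threads,
				    unsigned int nr_allowed_cpus)
{
	return nr_threads < nr_allowed_cpus ? nr_threads : nr_allowed_cpus;
}
```

With this bound, a two-thread process on a 64-CPU machine would only ever see IDs 0 and 1, which keeps the allocation compact even when the threads migrate across NUMA nodes.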
>>
>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
>> Cc: Peter Zijlstra <peterz@...radead.org>
>> Cc: Ingo Molnar <mingo@...hat.com>
>> Cc: Valentin Schneider <vschneid@...hat.com>
>> Cc: Mel Gorman <mgorman@...e.de>
>> Cc: Steven Rostedt <rostedt@...dmis.org>
>> Cc: Vincent Guittot <vincent.guittot@...aro.org>
>> Cc: Dietmar Eggemann <dietmar.eggemann@....com>
>> Cc: Ben Segall <bsegall@...gle.com>
>> ---
>> Changes since v0:
>> - Rename "notandnot" to "nor".
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com