linux-kernel - Re: [RFC 1/1] mm/mempolicy: introduce system default interleave weights


Message-ID: <87a5nme9c1.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Tue, 27 Feb 2024 13:59:26 +0800
From: "Huang, Ying" <ying.huang@...el.com>
To: Gregory Price <gregory.price@...verge.com>
Cc: Gregory Price <gourry.memverge@...il.com>,  <linux-mm@...ck.org>,
  <linux-kernel@...r.kernel.org>,  <hannes@...xchg.org>,
  <dan.j.williams@...el.com>,  <dave.jiang@...el.com>
Subject: Re: [RFC 1/1] mm/mempolicy: introduce system default interleave
 weights

Gregory Price <gregory.price@...verge.com> writes:

> On Tue, Feb 27, 2024 at 08:38:19AM +0800, Huang, Ying wrote:
>> Gregory Price <gregory.price@...verge.com> writes:
>> > Where are the 100 nodes coming from?
>> 
>> If you have a really large machine with more than 100 nodes, and some of
>> them are CXL memory nodes, then it's possible that most nodes will have
>> interleave weight "1" because the sum of all interleave weights is
>> "100".  Then, even if you use only one socket, the interleave weights of
>> DRAM and CXL MEM could all be "1", leading to a useless default value.
>> So, I suggest not capping the sum of interleave weights.
>
> I have to press this issue: Is this an actual, practical, concern?

I don't know who has a large machine like that.  But I guess that it's
possible in the long run.
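
To make the concern concrete, here is a minimal sketch of what happens
when the sum of weights is capped at 100.  The function name, the node
counts, and the bandwidth figures are hypothetical and only illustrate
the arithmetic; this is not the code from the RFC.

```python
def capped_weights(bandwidths, total=100):
    """Scale per-node bandwidths so the weights sum to roughly `total`.

    Every node gets at least weight 1, since a weight of 0 would
    exclude the node from interleaving.
    """
    bw_sum = sum(bandwidths)
    return [max(1, round(total * bw / bw_sum)) for bw in bandwidths]


# Hypothetical large machine: 64 DRAM nodes (300 GB/s each)
# and 64 CXL nodes (60 GB/s each).
large = capped_weights([300] * 64 + [60] * 64)

# Hypothetical small machine: one DRAM node and one CXL node.
small = capped_weights([300, 60])

print(set(large))  # every node collapses to the same weight
print(small)       # the 5:1 bandwidth ratio is still visible
```

On the small machine the weights keep the bandwidth ratio, while on the
large one every node ends up with weight 1, so DRAM and CXL are no
longer distinguished even within a single socket.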

> It seems to me in this type of scenario, there are larger, more complex
> numa topology issues that make the use of the general, global weighted
> mempolicy system entirely impractical.  This is a bit outside the scope

It's possible to solve the problem step by step.  For example, we could
add per-task interleave weights at some point.

>> > So, long-winded way of saying:
>> > - Could we use a larger default number? Yes.
>> > - Does that actually help us? Not really, we want smaller numbers.
>> 
>> The larger number will be reduced once the weights are divided by their GCD.
>>
>
> I suppose another strategy is to calculate the interleave weights
> un-bounded from the raw bandwidth - but continuously force reductions
> (through some yet-undefined algorithm) until at least one node reaches a
> weight of `1`.  This suffers from the opposite problem: what if the top
> node has a value greater than 255? Do we just cap it at 255? That seems
> the opposite form of problematic.
>
> (Large numbers are quite pointless, as it is essentially the antithesis
> of interleave)

Yes.  So I suggest using a relatively small number as the default weight
to start with for normal DRAM.  We will have to floor/ceiling the weight
values.
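
A rough sketch of that idea, assuming weights are scaled relative to
DRAM bandwidth, rounded to the nearest integer (floor or ceiling would
work the same way), clamped to the range 1..255 mentioned above, and
then divided by their GCD.  The function name, the `dram_weight`
parameter, and the bandwidth figures are hypothetical.

```python
from functools import reduce
from math import gcd

MAX_WEIGHT = 255  # upper bound on a single weight, as discussed in the thread


def default_weights(bandwidths, dram_bw, dram_weight=4):
    """Derive interleave weights with DRAM pinned to a small base weight.

    Each node's weight is its bandwidth relative to DRAM, scaled by
    `dram_weight`, rounded, and clamped to [1, MAX_WEIGHT].  The result
    is then divided by the GCD of all weights to keep numbers small.
    """
    raw = [
        max(1, min(MAX_WEIGHT, round(dram_weight * bw / dram_bw)))
        for bw in bandwidths
    ]
    divisor = reduce(gcd, raw)
    return [w // divisor for w in raw]


# Hypothetical topology: two DRAM nodes (300 GB/s), two CXL nodes (60 GB/s).
print(default_weights([300, 300, 60, 60], dram_bw=300))                  # [4, 4, 1, 1]
print(default_weights([300, 300, 60, 60], dram_bw=300, dram_weight=10))  # [5, 5, 1, 1]
```

Because nothing caps the sum, the result does not depend on how many
nodes the machine has, and a larger starting weight is brought back
down by the GCD step when the ratios allow it.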

--
Best Regards,
Huang, Ying