linux-kernel - Re: [External Mail] [RFC PATCH v2] Weighted interleave auto-tuning

Message-ID: <87ikr8abhn.fsf@DESKTOP-5N7EMDA>
Date: Wed, 25 Dec 2024 07:48:36 +0800
From: "Huang, Ying" <ying.huang@...ux.alibaba.com>
To: Gregory Price <gourry@...rry.net>
Cc: Hyeonggon Yoo <hyeonggon.yoo@...com>,  Joshua Hahn
 <joshua.hahnjy@...il.com>,  kernel_team@...ynix.com,  42.hyeyoo@...il.com,
  "rafael@...nel.org" <rafael@...nel.org>,  "lenb@...nel.org"
 <lenb@...nel.org>,  "gregkh@...uxfoundation.org"
 <gregkh@...uxfoundation.org>,  "akpm@...ux-foundation.org"
 <akpm@...ux-foundation.org>,  Honggyu Kim <honggyu.kim@...com>,  Rakie Kim
 <rakie.kim@...com>,  "dan.j.williams@...el.com"
 <dan.j.williams@...el.com>,  "Jonathan.Cameron@...wei.com"
 <Jonathan.Cameron@...wei.com>,  "dave.jiang@...el.com"
 <dave.jiang@...el.com>,  "horen.chuang@...ux.dev"
 <horen.chuang@...ux.dev>,  "hannes@...xchg.org" <hannes@...xchg.org>,
  "linux-mm@...ck.org" <linux-mm@...ck.org>,
  "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
  "linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
  "kernel-team@...a.com" <kernel-team@...a.com>
Subject: Re: [External Mail] [RFC PATCH v2] Weighted interleave auto-tuning

Gregory Price <gourry@...rry.net> writes:

> On Sun, Dec 22, 2024 at 03:21:32PM +0800, Huang, Ying wrote:
>> Hyeonggon Yoo <hyeonggon.yoo@...com> writes:
>> 
>> > On this server, ideally weighted interleaving should be configured
>> > within a socket (e.g. local NUMA node + local CXL node) because
>> > weighted interleaving does not consider the bandwidth when accessed
>> > from a remote socket.
>> 
>> If multiple sockets are considered, what is the best behavior?
>> 
>> The process may be cross-socket too.  So, we will need to use
>> set_mempolicy() to bind tasks to sockets first.  Then, it may be
>> better to use per-task weights.
>>
>
> If we want to revisit this, we might be able to make task-local weights
> work without a new syscall, but the use case was not clear enough, which
> is why it was soft-nak'd originally.

Yes.  That is doable.  However, the challenge is the lack of use cases.  I
guess that we can wait until more use cases emerge?

> vma-local weights are arguably more usable, but require the task to be
> numa-aware and probably require a new mempolicy syscall because mbind
> has no remaining arguments.
>
> recall my original testing results from stream:
> https://lore.kernel.org/linux-mm/20240202170238.90004-1-gregory.price@memverge.com/
>
> Stream Benchmark (vs DRAM, 1 Socket + 1 CXL Device)
> Default interleave : -78% (slower than DRAM)
> Global weighting   : -6% to +4% (workload dependent)
> Targeted weights   : +2.5% to +4% (consistently better than DRAM)
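
For readers unfamiliar with the difference between default and weighted
interleave in the numbers above, here is a minimal user-space sketch of
the page placement idea.  It is only an illustration, not the kernel
implementation: the function names, node IDs, and the 3:1 weights are
hypothetical and are not taken from this thread or from the patch.

```python
from collections import Counter
from itertools import islice


def weighted_interleave(weights):
    """Yield node IDs forever, weights[node] consecutive pages per node per round.

    Simplified model only: the kernel tracks this state per task and
    handles nodemask changes, which this sketch ignores.
    """
    nodes = sorted(n for n, w in weights.items() if w > 0)
    if not nodes:
        raise ValueError("at least one node needs a positive weight")
    while True:
        for node in nodes:
            for _ in range(weights[node]):
                yield node


def placement(weights, npages):
    """Count how many of npages land on each node."""
    return Counter(islice(weighted_interleave(weights), npages))


# Hypothetical weights: node 0 = local DRAM, node 1 = CXL memory.
default_interleave = {0: 1, 1: 1}   # plain interleave: every node gets an equal share
weighted = {0: 3, 1: 1}             # assumed 3:1 bandwidth ratio, for illustration

print(placement(default_interleave, 1000))
print(placement(weighted, 1000))
```

With equal weights, half of the pages land on the slower node regardless
of its bandwidth; with 3:1 weights only a quarter do.  Which weights are
appropriate for a given machine is exactly the tuning question discussed
in this thread.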

---
Best Regards,
Huang, Ying