Message-ID: <ZS33ClT00KsHKsXQ@memverge.com>
Date: Mon, 16 Oct 2023 22:52:58 -0400
From: Gregory Price <gregory.price@...verge.com>
To: "Huang, Ying" <ying.huang@...el.com>
Cc: Gregory Price <gourry.memverge@...il.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linux-cxl@...r.kernel.org,
akpm@...ux-foundation.org, sthanneeru@...ron.com
Subject: Re: [RFC PATCH v2 0/3] mm: mempolicy: Multi-tier weighted
interleaving
On Wed, Oct 18, 2023 at 04:29:02PM +0800, Huang, Ying wrote:
> Gregory Price <gregory.price@...verge.com> writes:
>
> > There are at least 5 proposals that I know of at the moment:
> >
> > 1) mempolicy
> > 2) memory-tiers
> > 3) memory-block interleaving? (weighting among blocks inside a node)
> > Maybe relevant if Dynamic Capacity devices arrive, but it seems
> > like the wrong place to do this.
> > 4) multi-device nodes (e.g. cxl create-region ... mem0 mem1...)
> > 5) "just do it in hardware"
>
> It may be easier to start with the use case.  What are the practical
> use cases in your mind that cannot be satisfied with a simple
> per-memory-tier weight?  Can you compare the memory layout under the
> different proposals?
>
Before I delve in, one clarifying question: when you asked whether
weights should be part of node or memory-tiers, I took that to mean
whether they should be part of mempolicy or memory-tiers.

Were you suggesting that weights should actually be part of
drivers/base/node.c?

I had not considered that. It seems reasonable and easy to implement,
and it would not require tying mempolicy.c to memory-tiers.c.
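If so, the attribute itself looks straightforward. A minimal sketch of
what it could look like, assuming a hypothetical per-node weight array
in drivers/base/node.c (and ignoring the access0/ subdirectory for
brevity); names are illustrative, not from any posted patch:

static unsigned int node_weights[MAX_NUMNODES];

static ssize_t interleave_weight_show(struct device *dev,
				      struct device_attribute *attr,
				      char *buf)
{
	/* dev->id is the nid for node devices */
	return sysfs_emit(buf, "%u\n", node_weights[dev->id]);
}

static ssize_t interleave_weight_store(struct device *dev,
				       struct device_attribute *attr,
				       const char *buf, size_t count)
{
	unsigned int weight;

	if (kstrtouint(buf, 0, &weight))
		return -EINVAL;
	node_weights[dev->id] = weight;
	return count;
}
static DEVICE_ATTR_RW(interleave_weight);

/* dev_attr_interleave_weight would then be added to the node device's
 * attribute group. */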
Beyond this, I think there have been 3 imagined use cases so far (now
4, including this one):
a)
numactl --weighted-interleave=Node:weight,0:16,1:4,...
b)
echo weight > /sys/.../memory-tiers/memtier/access0/interleave_weight
numactl --interleave=0,1
c)
echo weight > /sys/bus/node/node0/access0/interleave_weight
numactl --interleave=0,1
d)
options b or c, but with --weighted-interleave=0,1 instead
This requires libnuma changes to pick it up, but it retains
--interleave as-is to avoid user confusion. (A rough userspace sketch
of the b/c flow follows below.)
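For b and c, the userspace flow stays simple. A rough sketch, assuming
the proposed sysfs attribute from (c) (which does not exist today) and
that the kernel applies the weights to ordinary MPOL_INTERLEAVE:

#include <stdio.h>
#include <numaif.h>	/* set_mempolicy(), MPOL_INTERLEAVE; link with -lnuma */

int main(void)
{
	/* Proposed attribute from (c); hypothetical path. */
	FILE *f = fopen("/sys/bus/node/node0/access0/interleave_weight", "w");

	if (f) {
		fprintf(f, "16\n");
		fclose(f);
	}

	/* Plain interleave over nodes 0-1; weights come from sysfs. */
	unsigned long nodemask = (1UL << 0) | (1UL << 1);

	if (set_mempolicy(MPOL_INTERLEAVE, &nodemask, sizeof(nodemask) * 8))
		perror("set_mempolicy");

	/* ... mmap() and touch memory as usual ... */
	return 0;
}

The sysfs write needs appropriate privileges, and numactl would do the
set_mempolicy() part for you.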
The downside of an approach like A (which was my original approach) is
that the weights cannot really change should a node be hotplugged.
Tasks would need to detect the event and change the policy themselves,
which is not a good solution.
In both B's and C's designs, however, weights can be rebalanced in
response to any number of events. Ultimately B and C are equivalent,
but placing the weights in the nodes is cleaner and more intuitive. If
memory-tiers wants to use or change this information, nothing prevents
it from doing so.
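To make "rebalanced in response to any number of events" concrete: a
kernel-side sketch using the existing memory hotplug notifier chain.
rebalance_node_weights() is a hypothetical helper, and the policy here
is purely illustrative:

#include <linux/memory.h>
#include <linux/notifier.h>

static int weight_memory_callback(struct notifier_block *nb,
				  unsigned long action, void *data)
{
	struct memory_notify *mn = data;

	switch (action) {
	case MEM_ONLINE:
	case MEM_OFFLINE:
		/* status_change_nid >= 0 means a node just gained its
		 * first (or lost its last) memory block. */
		if (mn->status_change_nid >= 0)
			rebalance_node_weights();	/* hypothetical */
		break;
	}
	return NOTIFY_OK;
}

static struct notifier_block weight_memory_nb = {
	.notifier_call = weight_memory_callback,
};

/* In some init path: register_memory_notifier(&weight_memory_nb); */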
Assuming this is your meaning, I agree and I will pivot to this.
~Gregory