Date:   Tue, 10 May 2022 17:14:23 +0530
From:   "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>
To:     Wei Xu <weixugc@...gle.com>,
        Hesham Almatary <hesham.almatary@...wei.com>
Cc:     Yang Shi <shy828301@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Huang Ying <ying.huang@...el.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Linux MM <linux-mm@...ck.org>,
        Greg Thelen <gthelen@...gle.com>,
        Jagdish Gediya <jvgediya@...ux.ibm.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Alistair Popple <apopple@...dia.com>,
        Davidlohr Bueso <dave@...olabs.net>,
        Michal Hocko <mhocko@...nel.org>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        Brice Goglin <brice.goglin@...il.com>,
        Feng Tang <feng.tang@...el.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>
Subject: Re: RFC: Memory Tiering Kernel Interfaces

Wei Xu <weixugc@...gle.com> writes:

> On Mon, May 9, 2022 at 7:32 AM Hesham Almatary
> <hesham.almatary@...wei.com> wrote:
>>

....

> > nearest lower tier before demoting to lower lower tiers.
>> There might still be simple cases/topologies where we might want to "skip"
>> the very next lower tier. For example, assume we have a 3-tiered memory
>> system as follows:
>>
>> node 0 has a CPU and DDR memory in tier 0, node 1 has a GPU and DDR memory
>> in tier 0, node 2 has NVMM memory in tier 1, and node 3 has some sort of
>> bigger memory (could be a bigger DDR or something) in tier 2. The
>> distances are as follows:
>>
>> --------------          --------------
>> |   Node 0   |          |   Node 1   |
>> |  -------   |          |  -------   |
>> | |  DDR  |  |          | |  DDR  |  |
>> |  -------   |          |  -------   |
>> |            |          |            |
>> --------------          --------------
>>         | 20               | 120    |
>>         v                  v        |
>> ----------------------------       |
>> | Node 2     PMEM          |       | 100
>> ----------------------------       |
>>         | 100                       |
>>         v                           v
>> --------------------------------------
>> | Node 3    Large mem                |
>> --------------------------------------
>>
>> node distances:
>> node    0    1    2    3
>>    0   10   20   20  120
>>    1   20   10  120  100
>>    2   20  120   10  100
>>    3  120  100  100   10
>>
>> /sys/devices/system/node/memory_tiers
>> 0-1
>> 2
>> 3
>>
>> N_TOPTIER_MEMORY: 0-1
>>
>>
>> In this case, we want to be able to "skip" the demotion path from Node 1
>> to Node 2 and make demotion go directly to Node 3, as it is closer
>> distance-wise. How can we accommodate this scenario (or at least not
>> rule it out as future work) with the current RFC?
>
> This is an interesting example.  I think one way to support this is to
> allow all the lower tier nodes to be the demotion targets of a node in
> the higher tier.  We can then use the allocation fallback order to
> select the best demotion target.
>
> For this example, we will have the demotion targets of each node as:
>
> node 0: allowed=2-3, order (based on allocation fallback order): 2, 3
> node 1: allowed=2-3, order (based on allocation fallback order): 3, 2
> node 2: allowed=3, order (based on allocation fallback order): 3
> node 3: allowed=empty
>
> What do you think?
>

Can we simplify this further with the following tier assignment?

tier 0 -> empty (no HBM/GPU)
tier 1 -> Node0, Node1
tier 2 -> Node2, Node3

Hence:

 node 0: allowed=2-3, order (based on allocation fallback order): 2, 3
 node 1: allowed=2-3, order (based on allocation fallback order): 3, 2
 node 2: allowed=empty
 node 3: allowed=empty
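
To make the two proposals easier to compare, here is a minimal sketch that
derives the per-node demotion targets from a tier assignment and the node
distance table quoted above. It assumes the allocation fallback order can be
approximated by sorting candidate nodes by distance alone; the kernel's real
fallback order may take more into account. The function name
demotion_targets and the list-of-lists tier layout are hypothetical and
exist only for this illustration.

```python
# Node distance table from the example above (row = source, column = target).
DISTANCE = [
    [10, 20, 20, 120],
    [20, 10, 120, 100],
    [20, 120, 10, 100],
    [120, 100, 100, 10],
]


def demotion_targets(tiers, distance):
    """Return {node: [allowed demotion targets, nearest first]}.

    tiers is a list of node lists, highest tier first. Every node in any
    lower tier is an allowed target. ASSUMPTION: ordering by distance only
    stands in for the allocation fallback order.
    """
    targets = {}
    for rank, nodes in enumerate(tiers):
        # All nodes in tiers strictly below the current one.
        lower = [n for tier in tiers[rank + 1:] for n in tier]
        for node in nodes:
            # Ties are broken by node id to keep the result deterministic.
            targets[node] = sorted(lower, key=lambda t: (distance[node][t], t))
    return targets


# Three tiers, as in the quoted example: {0, 1}, {2}, {3}.
three_tier = demotion_targets([[0, 1], [2], [3]], DISTANCE)

# Simplified assignment: tier 0 empty, tier 1 = {0, 1}, tier 2 = {2, 3}.
two_tier = demotion_targets([[], [0, 1], [2, 3]], DISTANCE)

for name, result in (("three tiers", three_tier), ("two tiers", two_tier)):
    print(name)
    for node in sorted(result):
        print(f"  node {node}: order = {result[node]}")
```

Under these assumptions both assignments give the same order for nodes 0
and 1 (2, 3 and 3, 2 respectively). The only difference is node 2, which
can demote to node 3 with three tiers and has no demotion target with the
simplified assignment.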

-aneesh
