Message-ID: <87sfl6y4d0.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Mon, 05 Sep 2022 09:52:43 +0800
From: "Huang, Ying" <ying.huang@...el.com>
To: Aneesh Kumar K V <aneesh.kumar@...ux.ibm.com>
Cc: Wei Xu <weixugc@...gle.com>, Johannes Weiner <hannes@...xchg.org>,
Linux MM <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Yang Shi <shy828301@...il.com>,
Davidlohr Bueso <dave@...olabs.net>,
Tim C Chen <tim.c.chen@...el.com>,
Michal Hocko <mhocko@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Hesham Almatary <hesham.almatary@...wei.com>,
Dave Hansen <dave.hansen@...el.com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Alistair Popple <apopple@...dia.com>,
Dan Williams <dan.j.williams@...el.com>,
jvgediya.oss@...il.com, Bharata B Rao <bharata@....com>,
Greg Thelen <gthelen@...gle.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Rafael J. Wysocki" <rafael@...nel.org>
Subject: Re: [PATCH v3 updated] mm/demotion: Expose memory tier details via
sysfs

Aneesh Kumar K V <aneesh.kumar@...ux.ibm.com> writes:

> On 9/2/22 2:34 PM, Huang, Ying wrote:
>> Aneesh Kumar K V <aneesh.kumar@...ux.ibm.com> writes:
>>
>>> On 9/2/22 1:27 PM, Huang, Ying wrote:
>>>> Wei Xu <weixugc@...gle.com> writes:
>>>>
>>>>> On Thu, Sep 1, 2022 at 11:44 PM Aneesh Kumar K V
>>>>> <aneesh.kumar@...ux.ibm.com> wrote:
>>>>>>
>>>>>> On 9/2/22 12:10 PM, Huang, Ying wrote:
>>>>>>> Aneesh Kumar K V <aneesh.kumar@...ux.ibm.com> writes:
>>>>>>>
>>>>>>>> On 9/2/22 11:42 AM, Huang, Ying wrote:
>>>>>>>>> Aneesh Kumar K V <aneesh.kumar@...ux.ibm.com> writes:
>>>>>>>>>
>>>>>>>>>> On 9/2/22 11:10 AM, Huang, Ying wrote:
>>>>>>>>>>> Aneesh Kumar K V <aneesh.kumar@...ux.ibm.com> writes:
>>>>>>>>>>>
>>>>>>>>>>>> On 9/2/22 10:39 AM, Wei Xu wrote:
>>>>>>>>>>>>> On Thu, Sep 1, 2022 at 5:33 PM Huang, Ying <ying.huang@...el.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Aneesh Kumar K V <aneesh.kumar@...ux.ibm.com> writes:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 9/1/22 12:31 PM, Huang, Ying wrote:
>>>>>>>>>>>>>>>> "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com> writes:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This patch adds /sys/devices/virtual/memory_tiering/ where all memory tier
>>>>>>>>>>>>>>>>> related details can be found. All allocated memory tiers will be listed
>>>>>>>>>>>>>>>>> there as /sys/devices/virtual/memory_tiering/memory_tierN/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The nodes which are part of a specific memory tier can be listed via
>>>>>>>>>>>>>>>>> /sys/devices/virtual/memory_tiering/memory_tierN/nodes
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think "memory_tier" is a better subsystem/bus name than
>>>>>>>>>>>>>>>> memory_tiering. Because we have a set of memory_tierN devices inside.
>>>>>>>>>>>>>>>> "memory_tier" sounds more natural. I know this is subjective, just my
>>>>>>>>>>>>>>>> preference.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I missed replying to this earlier. I will keep memory_tiering as the subsystem name in v4
>>>>>>>>>>>> because we want it to be a subsystem where all memory tiering related details can be found,
>>>>>>>>>>>> including memory type in the future. This is as per the discussion at
>>>>>>>>>>>>
>>>>>>>>>>>> https://lore.kernel.org/linux-mm/CAAPL-u9TKbHGztAF=r-io3gkX7gorUunS2UfstudCWuihrA=0g@mail.gmail.com
>>>>>>>>>>>
>>>>>>>>>>> I don't think that it's a good idea to mix 2 types of devices in one
>>>>>>>>>>> subsystem (bus). If my understanding were correct, that breaks the
>>>>>>>>>>> driver core convention.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> All these are virtual devices. I am not sure I follow what you mean by 2 types of devices.
>>>>>>>>>> memory_tiering is a subsystem that represents all the details w.r.t. memory tiering. It shows
>>>>>>>>>> details of memory tiers and can possibly contain details of different memory types.
>>>>>>>>>
>>>>>>>>> IMHO, memory_tier and memory_type are 2 kinds of devices. They have
>>>>>>>>> almost totally different attributes (sysfs files). So, we should create
>>>>>>>>> 2 buses for them. Each has its own attribute group. "virtual" itself
>>>>>>>>> isn't a subsystem.
>>>>>>>>
>>>>>>>> Considering that both sets of details are related to memory tiering, wouldn't it be much simpler if we
>>>>>>>> consolidated them within the same subdirectory? I am still not clear why you are suggesting they need
>>>>>>>> to be in different sysfs hierarchies. It doesn't break any driver core convention as you mentioned earlier.
>>>>>>>>
>>>>>>>> /sys/devices/virtual/memory_tiering/memory_tierN
>>>>>>>> /sys/devices/virtual/memory_tiering/memory_typeN
>>>>>>>
>>>>>>> I think we should add
>>>>>>>
>>>>>>> /sys/devices/virtual/memory_tier/memory_tierN
>>>>>>> /sys/devices/virtual/memory_type/memory_typeN
>>>>>>>
>>>>>>
>>>>>> I am trying to find if there is a technical reason to do the same?
>>>>>>
>>>>>>> I don't think this is complex. Devices of same bus/subsystem should
>>>>>>> have mostly same attributes. This is my understanding of driver core
>>>>>>> convention.
>>>>>>>
>>>>>>
>>>>>> I was not looking at this from a code complexity point of view. Instead of having multiple directories
>>>>>> with details w.r.t. memory tiering, I was looking at consolidating the details
>>>>>> within the directory /sys/devices/virtual/memory_tiering (similar to how all virtual devices
>>>>>> are consolidated within /sys/devices/virtual/).
>>>>>>
>>>>>> -aneesh
>>>>>
>>>>> Here is an example of /sys/bus/nd/devices (I know it is not under
>>>>> /sys/devices/virtual, but it can still serve as a reference):
>>>>>
>>>>> ls -1 /sys/bus/nd/devices
>>>>>
>>>>> namespace2.0
>>>>> namespace3.0
>>>>> ndbus0
>>>>> nmem0
>>>>> nmem1
>>>>> region0
>>>>> region1
>>>>> region2
>>>>> region3
>>>>>
>>>>> So I think it is not unreasonable if we want to group memory tiering
>>>>> related interfaces within a single top directory.
>>>>
>>>> Thanks for pointing this out. My original understanding of driver core
>>>> isn't correct.
>>>>
>>>> But I still think it's better to separate instead of mixing memory_tier
>>>> and memory_type. Per my understanding, memory_type shows information
>>>> (abstract distance, latency, bandwidth, etc.) of memory types (and
>>>> nodes), it can be useful even without memory tiers. That is, memory
>>>> types describe the physical characteristics, while memory tier reflects
>>>> the policy.
>>>>
>>>
>>> The latency and bandwidth details are already exposed via
>>>
>>> /sys/devices/system/node/nodeY/access0/initiators/
>>>
>>> Documentation/admin-guide/mm/numaperf.rst
>>>
>>> That is the interface that libraries like libmemkind will look at for finding
>>> details w.r.t latency/bandwidth
>>
>> Yes. With only that, it's still inconvenient to find out which nodes
>> belong to the same memory type (have the same performance, same topology, managed
>> by same driver, etc). So memory types can still provide useful
>> information even without memory tiering.
>>
>
> I am not sure I quite follow what to conclude from your reply. I used the subsystem name
> "memory_tiering" so that all memory tiering related information can be consolidated there.
> I guess you agreed to the above part, that we can consolidate things like that.

Personally, I just prefer to keep the memory_tier and memory_type sysfs
directories separate, because memory_type describes the physical memory
types and their performance, while memory_tier is more about the policy
for grouping memory_types.
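
As a rough illustration of the tier side of this discussion, here is a
minimal Python sketch that lists each memory tier and the nodes in it. It
assumes the layout proposed in the patch
(/sys/devices/virtual/memory_tiering/memory_tierN/nodes), which is still
under discussion in this thread, and it assumes that the nodes file holds a
kernel-style list such as "0-1,3". Both the helper names and the file format
are assumptions made for illustration, not something the patch confirms.

```python
import os
import re


def parse_nodelist(text):
    """Expand a kernel-style list such as "0-1,3" into [0, 1, 3].

    The list format is an assumption; adjust if the final ABI differs.
    """
    nodes = []
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        nodes.extend(range(int(lo), int(hi or lo) + 1))
    return nodes


def list_tiers(root="/sys/devices/virtual/memory_tiering"):
    """Return {"memory_tierN": [node, ...]} for every tier under root.

    The default root is the path proposed in the patch (hypothetical until
    merged). An absent root yields an empty dict.
    """
    tiers = {}
    if not os.path.isdir(root):
        return tiers
    for name in sorted(os.listdir(root)):
        if not re.fullmatch(r"memory_tier\d+", name):
            continue  # skip anything that is not a tier device
        try:
            with open(os.path.join(root, name, "nodes")) as f:
                tiers[name] = parse_nodelist(f.read())
        except OSError:
            continue  # tier without a readable nodes file
    return tiers


if __name__ == "__main__":
    for tier, nodes in list_tiers().items():
        print(tier, nodes)
```

Passing a different root makes it easy to try the same code against
whichever directory name is finally chosen.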

> We might end up adding memory_type there if we allow changing "abstract distance" of a
> memory type from userspace later. Otherwise, I don't see a reason for memory type to be
> exposed. But then we don't have to decide on this now.

As above, because I think memory_type can provide value even outside of
memory_tier, I personally prefer to add a memory_type sysfs interface
anyway.
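
To make the inconvenience mentioned earlier concrete, the following Python
sketch shows what userspace has to do today to guess which nodes share a
memory type: read the per-node attributes under
/sys/devices/system/node/nodeY/access0/initiators/ and group nodes whose
values match. The attribute names (read_latency, write_latency,
read_bandwidth, write_bandwidth) are taken from
Documentation/admin-guide/mm/numaperf.rst. Treating identical numbers as
"same memory type" is only an assumption for illustration; it ignores
topology and the managing driver. The function names are hypothetical.

```python
import os
import re
from collections import defaultdict

# Attribute names as described in numaperf.rst.
ATTRS = ("read_latency", "write_latency", "read_bandwidth", "write_bandwidth")


def read_int(path):
    """Read one integer sysfs attribute, or return None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def group_nodes_by_perf(root="/sys/devices/system/node"):
    """Group node IDs by their access0 initiator performance attributes.

    Returns {(rlat, wlat, rbw, wbw): [node, ...]}. Nodes that do not
    report all four attributes are skipped.
    """
    groups = defaultdict(list)
    if not os.path.isdir(root):
        return {}
    for name in os.listdir(root):
        m = re.fullmatch(r"node(\d+)", name)
        if not m:
            continue
        base = os.path.join(root, name, "access0", "initiators")
        key = tuple(read_int(os.path.join(base, a)) for a in ATTRS)
        if None in key:
            continue  # incomplete data, cannot classify this node
        groups[key].append(int(m.group(1)))
    return {k: sorted(v) for k, v in groups.items()}


if __name__ == "__main__":
    for perf, nodes in group_nodes_by_perf().items():
        print(dict(zip(ATTRS, perf)), "nodes:", nodes)
```

A dedicated memory_type interface would let the kernel report this grouping
directly instead of leaving it to such heuristics.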

Best Regards,
Huang, Ying