lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 6 Jun 2022 14:32:17 +0530
From:   Aneesh Kumar K V <aneesh.kumar@...ux.ibm.com>
To:     Ying Huang <ying.huang@...el.com>
Cc:     Greg Thelen <gthelen@...gle.com>, Yang Shi <shy828301@...il.com>,
        Davidlohr Bueso <dave@...olabs.net>,
        Tim C Chen <tim.c.chen@...el.com>,
        Brice Goglin <brice.goglin@...il.com>,
        Michal Hocko <mhocko@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Hesham Almatary <hesham.almatary@...wei.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Jonathan Cameron <Jonathan.Cameron@...wei.com>,
        Alistair Popple <apopple@...dia.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Feng Tang <feng.tang@...el.com>,
        Jagdish Gediya <jvgediya@...ux.ibm.com>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        David Rientjes <rientjes@...gle.com>, linux-mm@...ck.org,
        akpm@...ux-foundation.org
Subject: Re: [RFC PATCH v4 1/7] mm/demotion: Add support for explicit memory
 tiers

On 6/6/22 2:22 PM, Ying Huang wrote:
....
>>>> I can move the patch "mm/demotion/dax/kmem: Set node's memory tier to
>>>> MEMORY_TIER_PMEM" before switching the demotion logic so that on systems
>>>> with two memory tiers (DRAM and pmem) the demotion continues to work
>>>> as expected after patch 3 ("mm/demotion: Build demotion targets based on
>>>> explicit memory tiers"). With that, there will not be any regression in
>>>> between the patch series.
>>>>
>>>
>>> Thanks!  Please do that.  And I think you can add sysfs interface after
>>> that patch too.  That is, in [1/7]
>>>
>>
>> I am not sure why you insist on moving sysfs interfaces later. They are
>> introduced based on the helper added. It make patch review easier to
>> look at both the helpers and the user of the helper together in a patch.
> 
> Yes.  We should introduce a function and its user in one patch for
> review.  But this doesn't mean that we should introduce the user space
> interface as the first step.  I think the user space interface should
> output correct information when we expose it.
> 

If you look at this patchset we are not exposing any wrong information.

patch 1 -> adds ability to register the memory tiers and expose details 
of registered memory tier. At this point the patchset only support DRAM 
tier and hence only one tier is shown

patch 2 -> adds per node memtier attribute. So only DRAM nodes shows the 
details, because the patchset yet has not introduced a slower memory 
tier like PMEM.

patch 4 -> introducing demotion. Will make that patch 5

patch 5 -> add dax kmem numa nodes as slower memory tier. Now this 
becomes patch 4 at which point we will correctly show two memory tiers 
in the system.


>>> +struct memory_tier {
>>> +	nodemask_t nodelist;
>>> +};
>>>
>>> And struct device can be added after the kernel has switched the
>>> implementation based on explicit memory tiers.
>>>
>>> +struct memory_tier {
>>> +	struct device dev;
>>> +	nodemask_t nodelist;
>>> +};
>>>
>>
>>
>> Can you elaborate on this? or possibly review the v5 series indicating
>> what change you are suggesting here?
>>
>>
>>> But I don't think it's a good idea to have "struct device" embedded in
>>> "struct memory_tier".  We don't have "struct device" embedded in "struct
>>> pgdata_list"...
>>>
>>
>> I avoided creating an array for memory_tier (memory_tier[]) so that we
>> can keep it dynamic. Keeping dev embedded in struct memory_tier simplify
>> the life cycle management of that dynamic list. We free the struct
>> memory_tier allocation via device release function (memtier->dev.release
>> = memory_tier_device_release )
>>
>> Why do you think it is not a good idea?
> 
> I think that we shouldn't bind our kernel internal implementation with
> user space interface too much.  Yes.  We can expose kernel internal
> implementation to user space in a direct way.  I suggest you to follow
> the style of "struct pglist_data" and "struct node".  If we decouple
> "struct memory_tier" and "struct memory_tier_dev" (or some other name),
> we can refer to "struct memory_tier" without depending on all device
> core.  Memory tier should be accessible inside the kernel even without a
> user interface.  And memory tier isn't a device in concept.
> 

memory_tiers are different from pglist_data and struct node in that we 
also allow the creation of them from userspace. That is the life time of 
a memory tier is driven from userspace and it is much easier to manage 
them via sysfs file lifetime mechanism rather than inventing an 
independent and more complex way of doing the same.

> For life cycle management, I think that we can do that without sysfs
> too.
> 

unless there are specific details that you think will be broken by 
embedding struct device inside struct memory_tier, IMHO I still consider 
the embedded implementation much simpler and in accordance with other 
kernel design patterns.

-aneesh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ