Date:   Mon, 1 Aug 2022 10:10:39 +0530
From:   Aneesh Kumar K V <aneesh.kumar@...ux.ibm.com>
To:     "Huang, Ying" <ying.huang@...el.com>
Cc:     linux-mm@...ck.org, akpm@...ux-foundation.org,
        Wei Xu <weixugc@...gle.com>, Yang Shi <shy828301@...il.com>,
        Davidlohr Bueso <dave@...olabs.net>,
        Tim C Chen <tim.c.chen@...el.com>,
        Michal Hocko <mhocko@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Hesham Almatary <hesham.almatary@...wei.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Jonathan Cameron <Jonathan.Cameron@...wei.com>,
        Alistair Popple <apopple@...dia.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Johannes Weiner <hannes@...xchg.org>, jvgediya.oss@...il.com
Subject: Re: [PATCH v11 4/8] mm/demotion/dax/kmem: Set node's abstract
 distance to MEMTIER_ADISTANCE_PMEM

On 8/1/22 7:36 AM, Huang, Ying wrote:
> "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com> writes:
> 
>> "Huang, Ying" <ying.huang@...el.com> writes:
>>
>>> "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com> writes:
>>>
>>>> By default, all nodes are assigned to the default memory tier which
>>>> is the memory tier designated for nodes with DRAM
>>>>
>>>> Set dax kmem device node's tier to slower memory tier by assigning
>>>> abstract distance to MEMTIER_ADISTANCE_PMEM. PMEM tier
>>>> appears below the default memory tier in demotion order.
>>>>
>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@...ux.ibm.com>
>>>> ---
>>>>  drivers/dax/kmem.c           |  9 +++++++++
>>>>  include/linux/memory-tiers.h | 19 ++++++++++++++++++-
>>>>  mm/memory-tiers.c            | 28 ++++++++++++++++------------
>>>>  3 files changed, 43 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
>>>> index a37622060fff..6b0d5de9a3e9 100644
>>>> --- a/drivers/dax/kmem.c
>>>> +++ b/drivers/dax/kmem.c
>>>> @@ -11,6 +11,7 @@
>>>>  #include <linux/fs.h>
>>>>  #include <linux/mm.h>
>>>>  #include <linux/mman.h>
>>>> +#include <linux/memory-tiers.h>
>>>>  #include "dax-private.h"
>>>>  #include "bus.h"
>>>>  
>>>> @@ -41,6 +42,12 @@ struct dax_kmem_data {
>>>>  	struct resource *res[];
>>>>  };
>>>>  
>>>> +static struct memory_dev_type default_pmem_type  = {
>>>
>>> Why is this named as default_pmem_type?  We will not change the memory
>>> type of a node usually.
>>>
>>
>> Any other suggestion? pmem_dev_type? 
> 
> Or dax_pmem_type?
> 
> DAX is used to enumerate the memory device.
> 
>>
>>>> +	.adistance = MEMTIER_ADISTANCE_PMEM,
>>>> +	.tier_sibiling = LIST_HEAD_INIT(default_pmem_type.tier_sibiling),
>>>> +	.nodes  = NODE_MASK_NONE,
>>>> +};
>>>> +
>>>>  static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
>>>>  {
>>>>  	struct device *dev = &dev_dax->dev;
>>>> @@ -62,6 +69,8 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
>>>>  		return -EINVAL;
>>>>  	}
>>>>  
>>>> +	init_node_memory_type(numa_node, &default_pmem_type);
>>>> +
>>>
>>> The memory hot-add below may fail.  So the error handling needs to be
>>> added.
>>>
>>> And, it appears that the memory type and memory tier of a node may be
>>> fully initialized here before NUMA hot-adding started.  So I suggest to
>>> set node_memory_types[] here only.  And set memory_dev_type->nodes in
>>> node hot-add callback.  I think there is the proper place to complete
>>> the initialization.
>>>
>>> And, in theory dax/kmem.c can be unloaded.  So we need to clear
>>> node_memory_types[] for nodes somewhere.
>>>
>>
>> I guess by module exit we can be sure that all the memory managed
>> by dax/kmem is hotplugged out. How about something like below?
> 
> Because we set node_memory_types[] in dev_dax_kmem_probe(), it's
> natural to clear it in dev_dax_kmem_remove().
> 

Most of the required reset/clear is done as part of memory hot-unplug. So
if we did manage to unplug the memory successfully, everything except
node_memory_types[node] should already be reset. That makes
clear_node_memory_type() look like the below.

void clear_node_memory_type(int node, struct memory_dev_type *memtype)
{
	mutex_lock(&memory_tier_lock);
	/*
	 * Only clear the entry if memory unplug already removed the node
	 * from the memtype and dax/kmem was the one that initialized this
	 * node's memory type.
	 */
	if (!node_isset(node, memtype->nodes) &&
	    node_memory_types[node] == memtype)
		node_memory_types[node] = NULL;
	mutex_unlock(&memory_tier_lock);
}
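
To make the two conditions in that check easier to reason about, here is a
rough userspace model of the same logic. It is only a sketch: MAX_NODES, the
unsigned long bitmask standing in for nodemask_t, model_node_isset() and the
pthread mutex are all stand-ins I made up for illustration, not the kernel
definitions.

```c
/* Userspace model only; kernel types and locking are replaced by stand-ins. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8	/* assumption: small fixed node count for the model */

struct memory_dev_type {
	unsigned long nodes;	/* stand-in for nodemask_t: bit n set == node n present */
};

/* Stand-ins for the kernel's memory_tier_lock and node_memory_types[]. */
static pthread_mutex_t memory_tier_lock = PTHREAD_MUTEX_INITIALIZER;
static struct memory_dev_type *node_memory_types[MAX_NODES];

/* Hypothetical helper modelling node_isset() on the bitmask stand-in. */
static bool model_node_isset(int node, unsigned long mask)
{
	return (mask >> node) & 1UL;
}

void clear_node_memory_type(int node, struct memory_dev_type *memtype)
{
	pthread_mutex_lock(&memory_tier_lock);
	/*
	 * Clear the entry only if the node is no longer in memtype->nodes
	 * (unplug succeeded) and the entry still points at this memtype.
	 */
	if (!model_node_isset(node, memtype->nodes) &&
	    node_memory_types[node] == memtype)
		node_memory_types[node] = NULL;
	pthread_mutex_unlock(&memory_tier_lock);
}
```

In this model the entry is left untouched if the node is still set in the
memtype's mask (unplug failed) or if the entry points at a different memtype.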

A module unload is effectively a forced removal of every use of that specific
memtype. Since unloading the module removes the memtype's users from the rest
of the kernel, and we already do all the required reset in memory hot-unplug,
do we still need the clear_node_memory_type() above?

-aneesh


