[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8a42d52c-6275-4798-19c0-dfc530c04b0e@linux.ibm.com>
Date: Wed, 8 Jun 2022 10:00:41 +0530
From: Aneesh Kumar K V <aneesh.kumar@...ux.ibm.com>
To: Tim Chen <tim.c.chen@...ux.intel.com>, linux-mm@...ck.org,
akpm@...ux-foundation.org
Cc: Wei Xu <weixugc@...gle.com>, Huang Ying <ying.huang@...el.com>,
Greg Thelen <gthelen@...gle.com>,
Yang Shi <shy828301@...il.com>,
Davidlohr Bueso <dave@...olabs.net>,
Tim C Chen <tim.c.chen@...el.com>,
Brice Goglin <brice.goglin@...il.com>,
Michal Hocko <mhocko@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Hesham Almatary <hesham.almatary@...wei.com>,
Dave Hansen <dave.hansen@...el.com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Alistair Popple <apopple@...dia.com>,
Dan Williams <dan.j.williams@...el.com>,
Feng Tang <feng.tang@...el.com>,
Jagdish Gediya <jvgediya@...ux.ibm.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH v5 1/9] mm/demotion: Add support for explicit memory tiers
On 6/8/22 12:13 AM, Tim Chen wrote:
> On Fri, 2022-06-03 at 19:12 +0530, Aneesh Kumar K.V wrote:
>>
>>
>> The nodes which are part of a specific memory tier can be listed
>> via
>> /sys/devices/system/memtier/memtierN/nodelist
>>
>> "Rank" is an opaque value. Its absolute value doesn't have any
>> special meaning. But the rank values of different memtiers can be
>> compared with each other to determine the memory tier order.
>>
>> For example, if we have 3 memtiers: memtier0, memtier1, memiter2, and
>> their rank values are 300, 200, 100, then the memory tier order is:
>> memtier0 -> memtier2 -> memtier1,
>
> Why is memtier2 (rank 100) higher than memtier1 (rank 200)? Seems like
> the order should be memtier0 -> memtier1 -> memtier2?
> (rank 300) (rank 200) (rank 100)
>
>> where memtier0 is the highest tier
>> and memtier1 is the lowest tier.
>
> I think memtier2 is the lowest as it has the lowest rank value.
typo error. Will fix that in the next update
>>
>> The rank value of each memtier should be unique.
>>
>>
>> +
>> +static void memory_tier_device_release(struct device *dev)
>> +{
>> + struct memory_tier *tier = to_memory_tier(dev);
>> +
>
> Do we need some ref counts on memory_tier?
> If there is another device still using the same memtier,
> free below could cause problem.
>
>> + kfree(tier);
>> +}
>> +
>>
> ...
>> +static struct memory_tier *register_memory_tier(unsigned int tier)
>> +{
>> + int error;
>> + struct memory_tier *memtier;
>> +
>> + if (tier >= MAX_MEMORY_TIERS)
>> + return NULL;
>> +
>> + memtier = kzalloc(sizeof(struct memory_tier), GFP_KERNEL);
>> + if (!memtier)
>> + return NULL;
>> +
>> + memtier->dev.id = tier;
>> + memtier->rank = get_rank_from_tier(tier);
>> + memtier->dev.bus = &memory_tier_subsys;
>> + memtier->dev.release = memory_tier_device_release;
>> + memtier->dev.groups = memory_tier_dev_groups;
>> +
>
> Should you take the mem_tier_lock before you insert to
> memtier-list?
Both register_memory_tier and unregister_memory_tier get called with
memory_tier_lock held.
>
>> + insert_memory_tier(memtier);
>> +
>> + error = device_register(&memtier->dev);
>> + if (error) {
>> + list_del(&memtier->list);
>> + put_device(&memtier->dev);
>> + return NULL;
>> + }
>> + return memtier;
>> +}
>> +
>> +__maybe_unused // temporay to prevent warnings during bisects
>> +static void unregister_memory_tier(struct memory_tier *memtier)
>> +{
>
> I think we should take mem_tier_lock before modifying memtier->list.
>
unregister_memory_tier get called with memory_tier_lock held.
>> + list_del(&memtier->list);
>> + device_unregister(&memtier->dev);
>> +}
>> +
>>
-aneesh
Powered by blists - more mailing lists