Message-ID: <26e84b0f-68ed-4e98-925e-5799a2ae1164@linux.alibaba.com>
Date: Mon, 22 Jul 2024 11:52:09 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Ryan Roberts <ryan.roberts@....com>,
Andrew Morton <akpm@...ux-foundation.org>, Hugh Dickins <hughd@...gle.com>,
Jonathan Corbet <corbet@....net>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
David Hildenbrand <david@...hat.com>, Barry Song <baohua@...nel.org>,
Lance Yang <ioworker0@...il.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v1 2/2] mm: mTHP stats for pagecache folio allocations
On 2024/7/14 17:05, Ryan Roberts wrote:
> On 13/07/2024 13:54, Baolin Wang wrote:
>>
>>
>> On 2024/7/13 19:00, Ryan Roberts wrote:
>>> [...]
>>>
>>>>> +static int thpsize_create(int order, struct kobject *parent)
>>>>>  {
>>>>>  	unsigned long size = (PAGE_SIZE << order) / SZ_1K;
>>>>> +	struct thpsize_child *stats;
>>>>>  	struct thpsize *thpsize;
>>>>>  	int ret;
>>>>> +	/*
>>>>> +	 * Each child object (currently only "stats" directory) holds a
>>>>> +	 * reference to the top-level thpsize object, so we can drop our ref to
>>>>> +	 * the top-level once stats is setup. Then we just need to drop a
>>>>> +	 * reference on any children to clean everything up. We can't just use
>>>>> +	 * the attr group name for the stats subdirectory because there may be
>>>>> +	 * multiple attribute groups to populate inside stats and overlaying
>>>>> +	 * using the name property isn't supported in that way; each attr group
>>>>> +	 * name, if provided, must be unique in the parent directory.
>>>>> +	 */
>>>>> +
>>>>>  	thpsize = kzalloc(sizeof(*thpsize), GFP_KERNEL);
>>>>> -	if (!thpsize)
>>>>> -		return ERR_PTR(-ENOMEM);
>>>>> +	if (!thpsize) {
>>>>> +		ret = -ENOMEM;
>>>>> +		goto err;
>>>>> +	}
>>>>> +	thpsize->order = order;
>>>>>  	ret = kobject_init_and_add(&thpsize->kobj, &thpsize_ktype, parent,
>>>>>  				   "hugepages-%lukB", size);
>>>>>  	if (ret) {
>>>>>  		kfree(thpsize);
>>>>> -		return ERR_PTR(ret);
>>>>> +		goto err;
>>>>>  	}
>>>>> -	ret = sysfs_create_group(&thpsize->kobj, &thpsize_attr_group);
>>>>> -	if (ret) {
>>>>> +	stats = kzalloc(sizeof(*stats), GFP_KERNEL);
>>>>> +	if (!stats) {
>>>>>  		kobject_put(&thpsize->kobj);
>>>>> -		return ERR_PTR(ret);
>>>>> +		ret = -ENOMEM;
>>>>> +		goto err;
>>>>>  	}
>>>>> -	ret = sysfs_create_group(&thpsize->kobj, &stats_attr_group);
>>>>> +	ret = kobject_init_and_add(&stats->kobj, &thpsize_child_ktype,
>>>>> +				   &thpsize->kobj, "stats");
>>>>> +	kobject_put(&thpsize->kobj);
>>>>>  	if (ret) {
>>>>> -		kobject_put(&thpsize->kobj);
>>>>> -		return ERR_PTR(ret);
>>>>> +		kfree(stats);
>>>>> +		goto err;
>>>>>  	}
>>>>> -	thpsize->order = order;
>>>>> -	return thpsize;
>>>>> +	if (BIT(order) & THP_ORDERS_ALL_ANON) {
>>>>> +		ret = sysfs_create_group(&thpsize->kobj, &thpsize_attr_group);
>>>>> +		if (ret)
>>>>> +			goto err_put;
>>>>> +
>>>>> +		ret = sysfs_create_group(&stats->kobj, &stats_attr_group);
>>>>> +		if (ret)
>>>>> +			goto err_put;
>>>>> +	}
>>>>> +
>>>>> +	if (BIT(order) & PAGECACHE_LARGE_ORDERS) {
>>>>> +		ret = sysfs_create_group(&stats->kobj, &file_stats_attr_group);
>>>>> +		if (ret)
>>>>> +			goto err_put;
>>>>> +	}
>>>>> +
>>>>> +	list_add(&stats->node, &thpsize_child_list);
>>>>> +	return 0;
>>>>> +err_put:
>>>>
>>>> IIUC, I think you should call 'sysfs_remove_group' to remove the group before
>>>> putting the kobject.
>>>
>>> Are you sure about that? As I understood it, sysfs_create_group() was
>>> conceptually modifying the state of the kobj, so when the kobj gets destroyed,
>>> all its state is tidied up. __kobject_del() (called on the last kobject_put())
>>> calls sysfs_remove_groups() and tidies up the sysfs state as far as I can see?
>>
>> IIUC, __kobject_del() only removes the ktype default groups by
>> 'sysfs_remove_groups(kobj, ktype->default_groups)', but your created groups are
>> not added into the ktype->default_groups. That means you should manually remove
>> them, or am I missing something?
>
> That was also putting doubt in my mind. But the sample at
> samples/kobject/kobject-example.c does not call sysfs_remove_group(). It just
> calls sysfs_create_group() in example_init() and calls kobject_put() in
> example_exit(). So I think that's the correct pattern.
>
> Looking at the code more closely, sysfs_create_group() just creates files for
> each of the attributes in the group. __kobject_del() calls sysfs_remove_dir(),
> whose comment states "we remove any files in the directory before we remove the
> directory" so I'm pretty sure sysfs_remove_group() is not required.
Thanks for the explanation, and I think you are right after checking the
code again. Sorry for the noise.
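
For anyone following the thread, the conclusion above can be sketched with a small userspace model. This is only a toy, not kernel code: struct toy_kobj, the toy_*() helpers and the attribute names stat_a/stat_b/stat_c are all hypothetical stand-ins, and the model assumes unnamed attribute groups whose files land directly in the object's directory. It mirrors the reasoning that the final put removes the directory, and removing the directory drops every file in it no matter which group created the file.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_FILES 8

/* Toy stand-in for a kobject plus its sysfs directory (NOT struct kobject). */
struct toy_kobj {
	int refcount;                     /* models the reference count */
	int dir_exists;                   /* models the sysfs directory */
	const char *files[TOY_MAX_FILES]; /* attribute files in the directory */
	int nr_files;
};

/* Models kobject_init_and_add(): one reference held, directory created. */
static void toy_init_and_add(struct toy_kobj *k)
{
	memset(k, 0, sizeof(*k));
	k->refcount = 1;
	k->dir_exists = 1;
}

/* Models creating an unnamed attribute group: one file per attribute. */
static int toy_create_group(struct toy_kobj *k, const char *const *attrs, int n)
{
	if (k->nr_files + n > TOY_MAX_FILES)
		return -1;
	for (int i = 0; i < n; i++)
		k->files[k->nr_files++] = attrs[i];
	return 0;
}

/* Models directory removal: all files go first, whoever created them. */
static void toy_remove_dir(struct toy_kobj *k)
{
	k->nr_files = 0;
	k->dir_exists = 0;
}

/* Models dropping a reference: the last put tears the directory down. */
static void toy_put(struct toy_kobj *k)
{
	assert(k->refcount > 0);
	if (--k->refcount == 0)
		toy_remove_dir(k);
}

/* Creates two groups, drops the last reference, returns leftover files. */
int toy_demo_leftover_files(void)
{
	static const char *const group_a[] = { "stat_a", "stat_b" };
	static const char *const group_b[] = { "stat_c" };
	struct toy_kobj stats;

	toy_init_and_add(&stats);
	if (toy_create_group(&stats, group_a, 2) == 0 &&
	    toy_create_group(&stats, group_b, 1) == 0)
		assert(stats.nr_files == 3);

	/* No explicit per-group removal, just the final put. */
	toy_put(&stats);
	return stats.nr_files;
}
```

Calling toy_demo_leftover_files() from a test harness returns the number of files still present after the final put. The model has no counterpart to named groups (which create a subdirectory), so it says nothing about that case.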