Message-ID: <1f3aa2ac-fba6-dc7a-d01d-7dd5331c8dc5@huawei.com>
Date: Fri, 21 Oct 2022 17:12:37 +0800
From: Yang Yingliang <yangyingliang@...wei.com>
To: Greg KH <gregkh@...uxfoundation.org>
CC: Luben Tuikov <luben.tuikov@....com>,
<linux-kernel@...r.kernel.org>, <qemu-devel@...gnu.org>,
<linux-f2fs-devel@...ts.sourceforge.net>,
<linux-erofs@...ts.ozlabs.org>, <ocfs2-devel@....oracle.com>,
<linux-mtd@...ts.infradead.org>, <amd-gfx@...ts.freedesktop.org>,
<rafael@...nel.org>, <somlo@....edu>, <mst@...hat.com>,
<jaegeuk@...nel.org>, <chao@...nel.org>,
<hsiangkao@...ux.alibaba.com>, <huangjianan@...o.com>,
<mark@...heh.com>, <jlbec@...lplan.org>,
<joseph.qi@...ux.alibaba.com>, <akpm@...ux-foundation.org>,
<alexander.deucher@....com>, <richard@....at>,
<liushixin2@...wei.com>
Subject: Re: [PATCH 00/11] fix memory leak while kset_register() fails
On 2022/10/21 16:36, Greg KH wrote:
> On Fri, Oct 21, 2022 at 04:24:23PM +0800, Yang Yingliang wrote:
>> On 2022/10/21 13:37, Greg KH wrote:
>>> On Fri, Oct 21, 2022 at 01:29:31AM -0400, Luben Tuikov wrote:
>>>> On 2022-10-20 22:20, Yang Yingliang wrote:
>>>>> The previous discussion link:
>>>>> https://lore.kernel.org/lkml/0db486eb-6927-927e-3629-958f8f211194@huawei.com/T/
>>>> The very first discussion on this was here:
>>>>
>>>> https://www.spinics.net/lists/dri-devel/msg368077.html
>>>>
>>>> Please use this link, and not the one you quoted above,
>>>> whose commit description is taken verbatim from this link.
>>>>
>>>>> kset_register() is currently used in some places without calling
>>>>> kset_put() in the error path, because the callers think this should be
>>>>> an internal kset matter, but the driver core cannot always know what
>>>>> the caller is doing with that memory. The memory could be freed
>>>>> both in kset_put() and in the caller's error path, if kset_put() were
>>>>> called in kset_register().
>>>> As I explained in the link above, the reason there's
>>>> a memory leak is that one cannot call kset_register() without
>>>> the kset->kobj.name being set--kobj_add_internal() returns -EINVAL,
>>>> in this case, i.e. kset_register() fails with -EINVAL.
>>>>
>>>> Thus, the most common usage is something like this:
>>>>
>>>> kobject_set_name(&kset->kobj, format, ...);
>>>> kset->kobj.kset = parent_kset;
>>>> kset->kobj.ktype = ktype;
>>>> res = kset_register(kset);
>>>>
>>>> So, what is being leaked is the memory allocated in kobject_set_name(),
>>>> by the common idiom shown above. This needs to be mentioned in
>>>> the documentation, at least, in case, in the future this is absolved
>>>> in kset_register() redesign, etc.
>>> Based on this, can kset_register() just clean up after itself when an
>>> error happens? Ideally that would be the case, as the odds of a kset
>>> being embedded in a larger structure are probably slim, but we would have
>>> to search the tree to make sure.
>> I have searched the whole tree. The ksets used in bus_register() (patch #3),
>> kset_create_and_add() (patch #4), __class_register() (patch #5),
>> fw_cfg_build_symlink() (patch #6) and amdgpu_discovery.c (patch #10)
>> are embedded in larger structures. In these cases, we cannot call
>> kset_put() in the error path of kset_register().
> Yes you can, as the kobject in the kset should NOT be controlling the
> lifespan of those larger objects.
Reading through the code, the only leak in this case is the name, so can we
free it directly in kset_register():
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -844,8 +844,11 @@ int kset_register(struct kset *k)
 	kset_init(k);
 	err = kobject_add_internal(&k->kobj);
-	if (err)
+	if (err) {
+		kfree_const(k->kobj.name);
+		k->kobj.name = NULL;
 		return err;
+	}
 	kobject_uevent(&k->kobj, KOBJ_ADD);
 	return 0;
 }
Or unset the ktype of the kobject, then call kset_put():
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -844,8 +844,11 @@ int kset_register(struct kset *k)
 	kset_init(k);
 	err = kobject_add_internal(&k->kobj);
-	if (err)
+	if (err) {
+		k->kobj.ktype = NULL;
+		kset_put(k);
 		return err;
+	}
 	kobject_uevent(&k->kobj, KOBJ_ADD);
 	return 0;
 }
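
To illustrate what the first variant would mean for a caller whose kset is
embedded in a larger structure, here is a rough userspace model. It is only a
sketch, not kernel code: struct subsys and every model_* function are
hypothetical stand-ins, malloc()/free() replace the kernel allocators, and the
fail_add flag exists only to force the error path. The point it tries to show
is that once the register function frees the name and sets it to NULL, the
caller can free its own container without leaking or double freeing the name.

```c
/* Userspace model only: simplified stand-ins, not the kernel's lib/kobject.c. */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct kobject { char *name; };
struct kset { struct kobject kobj; };

/* Hypothetical container standing in for a structure that embeds a kset. */
struct subsys {
	struct kset kset;
	int other_state;
};

/* Stand-in for kobject_set_name(): stores a heap copy of the name. */
static int model_set_name(struct kobject *kobj, const char *name)
{
	char *copy = malloc(strlen(name) + 1);

	if (!copy)
		return -ENOMEM;
	strcpy(copy, name);
	free(kobj->name);
	kobj->name = copy;
	return 0;
}

/* Stand-in for kobject_add_internal(); fail_add forces the error path. */
static int model_add_internal(struct kobject *kobj, int fail_add)
{
	if (!kobj->name || !kobj->name[0])
		return -EINVAL;
	return fail_add ? -EEXIST : 0;
}

/* Models the first variant: on failure, free only the name and clear it. */
static int model_kset_register(struct kset *k, int fail_add)
{
	int err = model_add_internal(&k->kobj, fail_add);

	if (err) {
		free(k->kobj.name);
		k->kobj.name = NULL;
		return err;
	}
	return 0;
}

/* Caller side: the container is owned and freed by the caller, not the kset. */
static int model_subsys_create(int fail_add, struct subsys **out)
{
	struct subsys *s = calloc(1, sizeof(*s));
	int err;

	*out = NULL;
	if (!s)
		return -ENOMEM;
	err = model_set_name(&s->kset.kobj, "demo");
	if (!err)
		err = model_kset_register(&s->kset, fail_add);
	if (err) {
		/* The name was either never set or already released above. */
		assert(s->kset.kobj.name == NULL);
		free(s);
		return err;
	}
	*out = s;
	return 0;
}

/* Normal teardown for the success case in this model. */
static void model_subsys_destroy(struct subsys *s)
{
	free(s->kset.kobj.name);
	free(s);
}
```

In this model, calling model_subsys_create(1, &s) returns the forced error and
leaves s set to NULL with nothing left allocated, while
model_subsys_create(0, &s) succeeds and is later undone with
model_subsys_destroy(s).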
>
> If it is, please point out the call chain here as I don't think that
> should be possible.
>
> Note all of this is a mess because the kobject name stuff was added much
> later, after the driver model had been created and running for a while.
> We missed this error path when adding the dynamic kobject name logic,
> thanks for looking into this.
>
> If you could test the patch posted with your error injection systems,
> that could make this all much simpler to solve.
>
> thanks,
>
> greg k-h
> .