[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7856e5a8-d84e-4f41-721b-80b6fc413919@alu.unizg.hr>
Date: Fri, 24 Feb 2023 16:12:31 +0100
From: Mirsad Todorovac <mirsad.todorovac@....unizg.hr>
To: Andy Shevchenko <andriy.shevchenko@...el.com>
Cc: Bartosz Golaszewski <brgl@...ev.pl>, linux-gpio@...r.kernel.org,
Linus Walleij <linus.walleij@...aro.org>,
linux-kernel@...r.kernel.org,
Thorsten Leemhuis <regressions@...mhuis.info>
Subject: Re: INFO: REPRODUCED: memory leak in gpio device in 6.2-rc6
On 2/21/23 16:39, Andy Shevchenko wrote:
> On Tue, Feb 21, 2023 at 02:52:38PM +0100, Mirsad Goran Todorovac wrote:
>> On 20. 02. 2023. 14:43, Andy Shevchenko wrote:
>>> On Mon, Feb 20, 2023 at 02:10:00PM +0100, Mirsad Todorovac wrote:
>>>> On 2/16/23 15:16, Bartosz Golaszewski wrote:
>
> ...
>
>>>> As Mr. McKenney once said, a bunch of monkeys with keyboard could
>>>> have done it in a considerable number of trials and errors ;-)
>>>>
>>>> But here I have something that could potentially leak as well. I could not devise a
>>>> reproducer due to the leak being lightly triggered only in extreme memory contention.
>>>>
>>>> See it for yourself:
>>>>
>>>> drivers/gpio/gpio-sim.c:
>>>> 301 static int gpio_sim_setup_sysfs(struct gpio_sim_chip *chip)
>>>> 302 {
>>>> 303 struct device_attribute *val_dev_attr, *pull_dev_attr;
>>>> 304 struct gpio_sim_attribute *val_attr, *pull_attr;
>>>> 305 unsigned int num_lines = chip->gc.ngpio;
>>>> 306 struct device *dev = chip->gc.parent;
>>>> 307 struct attribute_group *attr_group;
>>>> 308 struct attribute **attrs;
>>>> 309 int i, ret;
>>>> 310
>>>> 311 chip->attr_groups = devm_kcalloc(dev, sizeof(*chip->attr_groups),
>>>> 312 num_lines + 1, GFP_KERNEL);
>>>> 313 if (!chip->attr_groups)
>>>> 314 return -ENOMEM;
>>>> 315
>>>> 316 for (i = 0; i < num_lines; i++) {
>>>> 317 attr_group = devm_kzalloc(dev, sizeof(*attr_group), GFP_KERNEL);
>>>> 318 attrs = devm_kcalloc(dev, GPIO_SIM_NUM_ATTRS, sizeof(*attrs),
>>>> 319 GFP_KERNEL);
>>>> 320 val_attr = devm_kzalloc(dev, sizeof(*val_attr), GFP_KERNEL);
>>>> 321 pull_attr = devm_kzalloc(dev, sizeof(*pull_attr), GFP_KERNEL);
>>>> 322 if (!attr_group || !attrs || !val_attr || !pull_attr)
>>>> 323 return -ENOMEM;
>>>> 324
>>>> 325 attr_group->name = devm_kasprintf(dev, GFP_KERNEL,
>>>> 326 "sim_gpio%u", i);
>>>> 327 if (!attr_group->name)
>>>> 328 return -ENOMEM;
>>>>
>>>> Apparently, if the memory allocation only partially succeeds, in the theoretical case
>>>> that the system is close to its kernel memory exhaustion, `return -ENOMEM` would not
>>>> free the partially succeeded allocs, would it?
>>>>
>>>> To explain it better, I tried a version that is not yet full doing "all or nothing"
>>>> memory allocation for the gpio-sim driver, because I am not that familiar with the
>>>> driver internals.
>>>
>>> devm_*() mean that the resource allocation is made in a managed manner, so when
>>> it's done, it will be freed automatically.
>>
>> Didn't see that one coming ... :-/ "buzzing though the bush ..."
>>
>>> The question is: is the lifetime of the attr_groups should be lesser or the
>>> same as chip->gc.parent? Maybe it's incorrect to call devm_*() in the first place?
>>
>> Bona fide said, I hope that automatic deallocation does things in the right order.
>> I've realised that devm_kzalloc() calls devm_kmalloc() that registers allocations on
>> a per driver list. But I am not sure how chip->gc was allocated?
>>
>> Here is said it is allocated in drivers/gpio/gpio-sim.c:386 in gpio_sim_add_bank(),
>> as a part of
>>
>> struct gpio_sim_chip *chip;
>> struct gpio_chip *gc;
>>
>> gc = &chip->gc;
>>
>> and gc->parent is set to
>>
>> gc->parent = dev;
>>
>> in line 420, which appears called before gpio_sim_setup_sysfs() and the lines above.
>>
>> If I understood well, automatic deallocation on unloading the driver goes
>> in the reverse order, so lifetime of chip appears to be longer than attr_groups,
>> but I am really not that good at this ...
>
> So, the device is instantiated by platform_device_register_full().
>
> It should gone with the platform_device_unregister().
>
> In case of CONFIG_DEBUG_KOBJECT_RELEASE=y the ->release() can be called
> asynchronously.
>
> So, there are following questions:
> - is the put_device() is actually called?
> - is the above mentioned option is set to Y?
> - if it's in Y, does kmemleak take it into account?
> - if no, do you get anything new in `dmesg` when enable it?
Hi, Andy,
Having set CONFIG_DEBUG_KOBJECT=y.
CONFIG_DEBUG_KOBJECT_RELEASE=y and CONFIG_DEBUG_OBJECTS_TIMERS=y disappear after "make olddefconfig"
So, I cannot tell about whether release() was called asynchronously, all I get is (after driver unload):
[ 810.989742] kobject: 'gpio-sim' (00000000251afa19): kobject_cleanup, parent 00000000447da7a7
[ 810.990216] kobject: 'gpio-sim' (00000000251afa19): auto cleanup kobject_del
[ 810.990674] kobject: 'gpio-sim' (00000000251afa19): auto cleanup 'remove' event
[ 810.991175] kobject: 'gpio-sim' (00000000251afa19): kobject_uevent_env
[ 810.991674] kobject: 'gpio-sim' (00000000251afa19): fill_kobj_path: path = '/bus/platform/drivers/gpio-sim'
[ 810.992154] kobject: 'gpio-sim' (00000000251afa19): calling ktype release
[ 810.992644] kobject: 'gpio-sim': free name
I am still trying to convince "make olddefconfig" to accept the above values he did not like :-/
Regards,
Mirsad
>>> Or maybe the chip->gc.parent should be changed to something else (actual GPIO
>>> device, but then it's unclear how to provide the attributes in non-racy way
>> Really, dunno. I have to repeat that my learning curve cannot adapt so quickly.
>>
>> I merely gave the report of KMEMLEAK, otherwise I am not a Linux kernel
>> device expert nor would be appropriate to try the craft not earned ;-)
>
--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
Powered by blists - more mailing lists