Message-ID: <1e559ca6-4f77-8b48-e53e-12a8b498920d@alu.unizg.hr>
Date:   Tue, 21 Feb 2023 15:20:41 +0100
From:   Mirsad Goran Todorovac <mirsad.todorovac@....unizg.hr>
To:     Andy Shevchenko <andriy.shevchenko@...el.com>
Cc:     Bartosz Golaszewski <brgl@...ev.pl>, linux-gpio@...r.kernel.org,
        Linus Walleij <linus.walleij@...aro.org>,
        linux-kernel@...r.kernel.org,
        Thorsten Leemhuis <regressions@...mhuis.info>
Subject: Re: INFO: REPRODUCED: memory leak in gpio device in 6.2-rc6

On 21.2.2023. 14:52, Mirsad Goran Todorovac wrote:
> On 20. 02. 2023. 14:43, Andy Shevchenko wrote:
>> On Mon, Feb 20, 2023 at 02:10:00PM +0100, Mirsad Todorovac wrote:
>>> On 2/16/23 15:16, Bartosz Golaszewski wrote:
>>
>> ...
>>
>>> As Mr. McKenney once said, a bunch of monkeys with keyboard could
>>> have done it in a considerable number of trials and errors ;-)
>>>
>>> But here I have something that could potentially leak as well. I could not devise a
>>> reproducer due to the leak being lightly triggered only in extreme memory contention.
>>>
>>> See it for yourself:
>>>
>>> drivers/gpio/gpio-sim.c:
>>>  301 static int gpio_sim_setup_sysfs(struct gpio_sim_chip *chip)
>>>  302 {
>>>  303         struct device_attribute *val_dev_attr, *pull_dev_attr;
>>>  304         struct gpio_sim_attribute *val_attr, *pull_attr;
>>>  305         unsigned int num_lines = chip->gc.ngpio;
>>>  306         struct device *dev = chip->gc.parent;
>>>  307         struct attribute_group *attr_group;
>>>  308         struct attribute **attrs;
>>>  309         int i, ret;
>>>  310
>>>  311         chip->attr_groups = devm_kcalloc(dev, sizeof(*chip->attr_groups),
>>>  312                                          num_lines + 1, GFP_KERNEL);
>>>  313         if (!chip->attr_groups)
>>>  314                 return -ENOMEM;
>>>  315
>>>  316         for (i = 0; i < num_lines; i++) {
>>>  317                 attr_group = devm_kzalloc(dev, sizeof(*attr_group), GFP_KERNEL);
>>>  318                 attrs = devm_kcalloc(dev, GPIO_SIM_NUM_ATTRS, sizeof(*attrs),
>>>  319                                      GFP_KERNEL);
>>>  320                 val_attr = devm_kzalloc(dev, sizeof(*val_attr), GFP_KERNEL);
>>>  321                 pull_attr = devm_kzalloc(dev, sizeof(*pull_attr), GFP_KERNEL);
>>>  322                 if (!attr_group || !attrs || !val_attr || !pull_attr)
>>>  323                         return -ENOMEM;
>>>  324
>>>  325                 attr_group->name = devm_kasprintf(dev, GFP_KERNEL,
>>>  326                                                   "sim_gpio%u", i);
>>>  327                 if (!attr_group->name)
>>>  328                         return -ENOMEM;
>>>
>>> Apparently, if the memory allocation only partially succeeds, in the theoretical case
>>> that the system is close to its kernel memory exhaustion, `return -ENOMEM` would not
>>> free the partially succeeded allocs, would it?
>>>
>>> To explain it better, I tried a version that does not yet fully do "all or nothing"
>>> memory allocation for the gpio-sim driver, because I am not that familiar with the
>>> driver internals.
>>
>> devm_*() mean that the resource allocation is made in a managed manner, so when
>> it's done, it will be freed automatically.
>
> Didn't see that one coming ... :-/ "buzzing though the bush ..."
>
>> The question is: should the lifetime of the attr_groups be shorter than or the
>> same as that of chip->gc.parent? Maybe it's incorrect to call devm_*() in the first place?
>
> Bona fide said, I hope that automatic deallocation does things in the right order.
> I've realised that devm_kzalloc() calls devm_kmalloc(), which registers allocations on
> a per-device list. But I am not sure how chip->gc was allocated.
>
> It is allocated in gpio_sim_add_bank() (drivers/gpio/gpio-sim.c:386), as a part of chip:
>
>     struct gpio_sim_chip *chip;
>     struct gpio_chip *gc;
>
>     gc = &chip->gc;
>
> and gc->parent is set to
>
>     gc->parent = dev;
>
> in line 420, which appears to be called before gpio_sim_setup_sysfs() and the lines above.

P.S.

The exact line is:

	chip = devm_kzalloc(dev, sizeof(*chip), GFP_KERNEL);

so I guess it is reasonable to assume that chip will also be deallocated
after attr_groups.

chip->gc.parent appears to be a mere pointer to dev parameter in

	static int gpio_sim_add_bank(struct fwnode_handle *swnode, struct device *dev)

This is OTOH called from:

	static int gpio_sim_probe(struct platform_device *pdev)
	{
		struct device *dev = &pdev->dev;
		struct fwnode_handle *swnode;
		int ret;

		device_for_each_child_node(dev, swnode) {
			ret = gpio_sim_add_bank(swnode, dev);

Which means dev passed to chip->gc.parent is initialised with &pdev->dev
from pdev parm of gpio_sim_probe(). This is OTOH referenced from the very:

	static struct platform_driver gpio_sim_driver = {
		.driver = {
			.name = "gpio-sim",
			.of_match_table = gpio_sim_of_match,
		},
		.probe = gpio_sim_probe,
	};

Hope this helps. There's more to this than meets the eye, but
this is really an idiot's attempt to analyse a Linux kernel driver. :-)
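
To make the ordering argument concrete, here is a minimal user-space sketch of
a managed allocator that releases resources newest first. It is only a model:
model_dev, model_res, model_alloc() and model_release_all() are hypothetical
names invented for illustration, not kernel APIs, and the assumption that
managed resources are released in reverse order of registration is what the
sketch encodes, not something it proves about the driver core.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space model of a managed ("devm"-style) allocator.
 * None of these names are kernel APIs. */
struct model_res {
	struct model_res *prev;	/* resource registered just before this one */
	const char *label;	/* name used only for the release log */
};

struct model_dev {
	struct model_res *last;	/* most recently registered resource */
};

/* Allocate a tracked resource and push it onto the device's list. */
static struct model_res *model_alloc(struct model_dev *dev, const char *label)
{
	struct model_res *res = calloc(1, sizeof(*res));

	if (!res)
		return NULL;
	res->label = label;
	res->prev = dev->last;
	dev->last = res;
	return res;
}

/* Release everything, newest first. The labels are appended to log in
 * release order, each followed by a space. Returns the number released. */
static int model_release_all(struct model_dev *dev, char *log, size_t len)
{
	int count = 0;

	assert(len > 0);
	log[0] = '\0';
	while (dev->last) {
		struct model_res *res = dev->last;

		dev->last = res->prev;
		strncat(log, res->label, len - strlen(log) - 1);
		strncat(log, " ", len - strlen(log) - 1);
		free(res);
		count++;
	}
	return count;
}
```

Registering "chip" first and "attr_groups" second, as gpio_sim_add_bank() and
gpio_sim_setup_sysfs() appear to do on the same dev, makes the model release
attr_groups before chip, so chip outlives the attribute groups.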

> If I understood well, automatic deallocation on unloading the driver goes
> in the reverse order, so lifetime of chip appears to be longer than attr_groups,
> but I am really not that good at this ...
>
>> Or maybe the chip->gc.parent should be changed to something else (actual GPIO
>> device, but then it's unclear how to provide the attributes in a non-racy way)?
> Really, dunno. I have to repeat that my learning curve cannot adapt so quickly.
>
> I merely passed on the KMEMLEAK report; otherwise I am not a Linux kernel
> device expert, nor would it be appropriate to try a craft not earned ;-)
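
Coming back to the `return -ENOMEM` question about gpio_sim_setup_sysfs(): the
sketch below models why a bare early return need not leak when the allocations
are tracked per device. It is a user-space illustration under stated
assumptions: sketch_dev, sketch_alloc(), sketch_setup(), sketch_probe() and
sketch_release_all() are hypothetical names, the budget field only simulates
memory exhaustion, and sketch_probe() stands in for a caller that releases all
tracked resources when setup fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical model; none of these names are kernel APIs. */
struct sketch_node {
	struct sketch_node *prev;	/* previously tracked allocation */
};

struct sketch_dev {
	struct sketch_node *last;	/* newest tracked allocation */
	int live;			/* allocations not yet released */
	int budget;			/* allocations left before simulated OOM */
};

/* Tracked allocation: fails once the simulated budget is used up. */
static void *sketch_alloc(struct sketch_dev *dev, size_t size)
{
	struct sketch_node *node;

	if (dev->budget <= 0)
		return NULL;	/* simulated out-of-memory */
	node = calloc(1, sizeof(*node) + size);
	if (!node)
		return NULL;
	dev->budget--;
	node->prev = dev->last;
	dev->last = node;
	dev->live++;
	return node + 1;	/* payload follows the tracking header */
}

/* Free every tracked allocation, newest first. */
static void sketch_release_all(struct sketch_dev *dev)
{
	while (dev->last) {
		struct sketch_node *node = dev->last;

		assert(dev->live > 0);
		dev->last = node->prev;
		free(node);
		dev->live--;
	}
}

/* Same shape as the quoted loop: four allocations per line and a bare
 * return on failure, with no explicit cleanup. */
static int sketch_setup(struct sketch_dev *dev, unsigned int num_lines)
{
	unsigned int i;

	for (i = 0; i < num_lines; i++) {
		void *group = sketch_alloc(dev, 16);
		void *attrs = sketch_alloc(dev, 16);
		void *val = sketch_alloc(dev, 16);
		void *pull = sketch_alloc(dev, 16);

		if (!group || !attrs || !val || !pull)
			return -ENOMEM;
	}
	return 0;
}

/* Stand-in for the caller: on failure, drop everything that was tracked. */
static int sketch_probe(struct sketch_dev *dev, unsigned int num_lines)
{
	int ret = sketch_setup(dev, num_lines);

	if (ret)
		sketch_release_all(dev);
	return ret;
}
```

With a budget of six allocations and two lines, sketch_setup() fails on the
second line while six allocations are still live; sketch_probe() then releases
them, so nothing stays allocated after the failure.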

Regards,

Mirsad

-- 
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union

Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
