lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3d96e50b-ed17-9bf5-149b-8a50c7b4cca2@alu.unizg.hr>
Date:   Wed, 15 Feb 2023 15:45:43 +0100
From:   Mirsad Goran Todorovac <mirsad.todorovac@....unizg.hr>
To:     Andy Shevchenko <andriy.shevchenko@...el.com>
Cc:     linux-gpio@...r.kernel.org,
        Linus Walleij <linus.walleij@...aro.org>,
        Bartosz Golaszewski <brgl@...ev.pl>,
        linux-kernel@...r.kernel.org,
        Thorsten Leemhuis <regressions@...mhuis.info>
Subject: Re: INFO: REPRODUCED: memory leak in gpio device in 6.2-rc6

On 15.2.2023. 11:53, Andy Shevchenko wrote:
> On Tue, Feb 14, 2023 at 07:19:16PM +0100, Mirsad Goran Todorovac wrote:
>> On 12. 02. 2023. 15:19, Andy Shevchenko wrote:
>>> On Wed, Feb 08, 2023 at 08:55:24PM +0100, Mirsad Goran Todorovac wrote:
>>>> On 31. 01. 2023. 10:36, Mirsad Goran Todorovac wrote:
>>>>> I came across this memory leak apparently in the GPIO device driver.
>>>>> It is still present in 6.2-rc6 release candidate kernel (just ran kselftest).
>>>>>
>>>>> This is a vanilla Torvalds tree kernel with MGLRU and KMEMLEAK (obviously)
>>>>> enabled.
>>>>>
>>>>> If you think this bug is significant, I can attempt the bug bisect in the
>>>>> environment that triggered it (Lenovo LENOVO_MT_10TX_BU_Lenovo_FM_V530S-07ICB)
>>>>> with BIOS M22KT49A from 11/10/2022 and AlmaLinux 8.7.
>>>>>
>>>>> Here is the /sys/kernel/debug/kmemleak output:
>>>>>
>>>>> unreferenced object 0xffff9e67ad71f160 (size 32):
>>>>>    comm "gpio-sim.sh", pid 208926, jiffies 4372229685 (age 2101.564s)
>>>>>    hex dump (first 32 bytes):
>>>>>      67 70 69 6f 2d 73 69 6d 2e 30 2d 6e 6f 64 65 30  gpio-sim.0-node0
>>>>>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>>>>    backtrace:
>>>>>      [<0000000098bf3d1b>] slab_post_alloc_hook+0x91/0x320
>>>>>      [<00000000da3205c5>] __kmem_cache_alloc_node+0x1bf/0x2b0
>>>>>      [<00000000aa51a58a>] __kmalloc_node_track_caller+0x55/0x140
>>>>>      [<00000000bd682ecc>] kvasprintf+0x6b/0xd0
>>>>>      [<00000000a3431d55>] kasprintf+0x4e/0x70
>>>>>      [<00000000f52d2629>] gpio_sim_device_config_live_store+0x401/0x59d [gpio_sim]
>>>>>      [<00000000673fc6df>] configfs_write_iter+0xcc/0x130
>>>>>      [<000000001d5d0829>] vfs_write+0x2b4/0x3d0
>>>>>      [<00000000d2336251>] ksys_write+0x61/0xe0
>>>>>      [<00000000f7015bb1>] __x64_sys_write+0x1a/0x20
>>>>>      [<000000008ac743d2>] do_syscall_64+0x58/0x80
>>>>>      [<000000004d7b7d50>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
>> Sorry, but unfortunately this patch didn't fix the memleak. Please see the result:
> 
> Thank you for give a try!

No sweat. It was worth the effort.

> Yeah, that's why I put that I'm skeptical, because while patch is correct per
> se it wouldn't prevent the initial leakage (it seems it happens due to other
> circumstances).

I must admit that is looks like hieroglyphs to me. My learning curve had not
yet reached that point of debugging memory leaks in the kernel drivers ...

>> root@...vin-IdeaPad-3-15ITL6:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/gpio# echo clear > /sys/kernel/debug/kmemleak
>> root@...vin-IdeaPad-3-15ITL6:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/gpio# time ./gpio-sim.sh
>> trap: SIGTERM: bad trap
>> 1. chip_name and dev_name attributes
>> 1.1. Chip name is communicated to user
>> 1.2. chip_name returns 'none' if the chip is still pending
>> 1.3. Device name is communicated to user
>> 2. Creating and configuring simulated chips
>> 2.1. Default number of lines is 1
>> 2.2. Number of lines can be specified
>> 2.3. Label can be set
>> 2.4. Label can be left empty
>> 2.5. Line names can be configured
>> 2.6. Line config can remain unused if offset is greater than number of lines
>> 2.7. Line configfs directory names are sanitized
>> 2.8. Multiple chips can be created
>> 2.9. Can't modify settings when chip is live
>> 2.10. Can't create line items when chip is live
>> 2.11. Probe errors are propagated to user-space
>> 2.12. Cannot enable a chip without any GPIO banks
>> 2.13. Duplicate chip labels are not allowed
>> 2.14. Lines can be hogged
>> 3. Controlling simulated chips
>> 3.1. Pull can be set over sysfs
>> 3.2. Pull can be read from sysfs
>> 3.3. Incorrect input in sysfs is rejected
>> 3.4. Can't write to value
>> 4. Simulated GPIO chips are functional
>> 4.1. Values can be read from sysfs
>> 4.2. Bias settings work correctly
>> GPIO gpio-sim test PASS
>>
>> real	0m1.120s
>> user	0m0.283s
>> sys	0m0.842s
>> root@...vin-IdeaPad-3-15ITL6:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/gpio# echo scan > /sys/kernel/debug/kmemleak
>> root@...vin-IdeaPad-3-15ITL6:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/gpio# cat /sys/kernel/debug/kmemleak
>> unreferenced object 0xffff91f019219660 (size 32):
>>    comm "gpio-sim.sh", pid 11223, jiffies 4295028142 (age 87.304s)
>>    hex dump (first 32 bytes):
>>      67 70 69 6f 2d 73 69 6d 2e 30 2d 6e 6f 64 65 30  gpio-sim.0-node0
>>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>    backtrace:
>>      [<ffffffff86fcd408>] __kmem_cache_alloc_node+0x1d8/0x330
>>      [<ffffffff86f4c271>] __kmalloc_node_track_caller+0x51/0xd0
>>      [<ffffffff872fb359>] kvasprintf+0x69/0xd0
>>      [<ffffffff872fb4ce>] kasprintf+0x4e/0x70
>>      [<ffffffffc1864553>] gpio_sim_device_config_live_store+0x483/0x6b5 [gpio_sim]
>>      [<ffffffff870dde9c>] configfs_write_iter+0xcc/0x130
>>      [<ffffffff87016869>] vfs_write+0x1f9/0x3b0
>>      [<ffffffff87016dab>] ksys_write+0x6b/0xf0
>>      [<ffffffff87016e59>] __x64_sys_write+0x19/0x20
>>      [<ffffffff87b91188>] do_syscall_64+0x58/0x80
>>      [<ffffffff87c000aa>] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>> root@...vin-IdeaPad-3-15ITL6:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/gpio# uname -rms
>> Linux 6.2.0-rc8-lru-km-andy-00015-gf6feea56f66d-dirty x86_64
>> root@...vin-IdeaPad-3-15ITL6:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/gpio#
>>
>> I can verify that I build the right patch:
>>
>> marvin@...vin-IdeaPad-3-15ITL6:~/linux/kernel/linux_torvalds$ git diff
>> diff --git a/drivers/gpio/gpio-sim.c b/drivers/gpio/gpio-sim.c
>> index 60514bc5454f..7f79e49b23d7 100644
>> --- a/drivers/gpio/gpio-sim.c
>> +++ b/drivers/gpio/gpio-sim.c
>> @@ -954,9 +954,9 @@ static void gpio_sim_device_deactivate_unlocked(struct gpio_sim_device *dev)
>>   
>>          swnode = dev_fwnode(&dev->pdev->dev);
>>          platform_device_unregister(dev->pdev);
>> +       gpio_sim_remove_hogs(dev);
>>          gpio_sim_remove_swnode_recursive(swnode);
>>          dev->pdev = NULL;
>> -       gpio_sim_remove_hogs(dev);
>>   }
>>   
>>   static ssize_t
>> marvin@...vin-IdeaPad-3-15ITL6:~/linux/kernel/linux_torvalds$
>>
>> Alternatively, I could try to bisect if you think it's prudent to try that.
>> But first I need a stroll after this kernel build :-)
>>
>> Do you think that knowing when the bug was introduced might help find the culprit?
> 

-- 
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union

Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ