lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9f757a7b-6ac9-804a-063f-4cc2c6fc3f54@alu.unizg.hr>
Date:   Wed, 29 Mar 2023 15:31:16 +0200
From:   Mirsad Goran Todorovac <mirsad.todorovac@....unizg.hr>
To:     Armin Wolf <W_Armin@....de>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc:     "Rafael J. Wysocki" <rafael@...nel.org>,
        linux-kernel@...r.kernel.org, Hans de Goede <hdegoede@...hat.com>,
        Mark Gross <markgross@...nel.org>,
        platform-driver-x86@...r.kernel.org,
        Mark Pearson <mpearson-lenovo@...ebb.ca>
Subject: Re: [BUG] [BISECTED] [CORRECTION] systemd-devd triggers kernel
 memleak apparently in drivers/core/dd.c: driver_register()

Hi, again,

NOTE: I forgot to rewind to the first bad commit. So please ignore
the previous email.

On 29.3.2023. 15:22, Mirsad Goran Todorovac wrote:
> Hi, Armin, Mr. Greg,
> 
> On 29.3.2023. 10:13, Mirsad Goran Todorovac wrote:
>> On 28.3.2023. 21:55, Armin Wolf wrote:
>>> Am 28.03.23 um 21:06 schrieb Mirsad Goran Todorovac:
>>>
>>>> On 3/28/2023 6:53 PM, Armin Wolf wrote:
>>>>> Am 28.03.23 um 14:44 schrieb Mirsad Todorovac:
>>>>>
>>>>>> On 3/28/23 14:17, Greg Kroah-Hartman wrote:
>>>>>>> On Tue, Mar 28, 2023 at 02:08:06PM +0200, Mirsad Todorovac wrote:
>>>>>>>> On 3/28/23 13:59, Mirsad Todorovac wrote:
>>>>>>>>
>>>>>>>>> On 3/28/23 13:28, Greg Kroah-Hartman wrote:
>>>>>>>>>> On Tue, Mar 28, 2023 at 01:13:33PM +0200, Mirsad Todorovac wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> Here is another kernel memory leak report, just as I thought we
>>>>>>>>>>> have done with
>>>>>>>>>>> them by the xhci patch by Mathias.
>>>>>>>>>>>
>>>>>>>>>>> The memory leaks were caught on an AlmaLinux 8.7 (CentOS) fork
>>>>>>>>>>> system, running
>>>>>>>>>>> on a Lenovo desktop box (see lshw.txt) and the newest Linux
>>>>>>>>>>> kernel 6.3-rc4 commit
>>>>>>>>>>> g3a93e40326c8 with Mathias' patch for a xhci systemd-devd
>>>>>>>>>>> triggered leak.
>>>>>>>>>>>
>>>>>>>>>>>           See:
>>>>>>>>>>> <20230327095019.1017159-1-mathias.nyman@...ux.intel.com> on LKML.
>>>>>>>>>>>
>>>>>>>>>>> This leak is also systemd-devd triggered, except for the
>>>>>>>>>>> memstick_check() leaks
>>>>>>>>>>> which I was unable to bisect due to the box not booting older
>>>>>>>>>>> kernels (work in
>>>>>>>>>>> progress).
>>>>>>>>>>>
>>>>>>>>>>> unreferenced object 0xffff88ad12392710 (size 96):
>>>>>>>>>>>     comm "systemd-udevd", pid 735, jiffies 4294896759 (age
>>>>>>>>>>> 2257.568s)
>>>>>>>>>>>     hex dump (first 32 bytes):
>>>>>>>>>>>       53 65 72 69 61 6c 50 6f 72 74 31 41 64 64 72 65
>>>>>>>>>>> SerialPort1Addre
>>>>>>>>>>>       73 73 2c 33 46 38 2f 49 52 51 34 3b 5b 4f 70 74
>>>>>>>>>>> ss,3F8/IRQ4;[Opt
>>>>>>>>>>>     backtrace:
>>>>>>>>>>>       [<ffffffffae8fb26c>] slab_post_alloc_hook+0x8c/0x3e0
>>>>>>>>>>>       [<ffffffffae902b49>] __kmem_cache_alloc_node+0x1d9/0x2a0
>>>>>>>>>>>       [<ffffffffae8773c9>] __kmalloc_node_track_caller+0x59/0x180
>>>>>>>>>>>       [<ffffffffae866a1a>] kstrdup+0x3a/0x70
>>>>>>>>>>>       [<ffffffffc0d839aa>]
>>>>>>>>>>> tlmi_extract_output_string.isra.0+0x2a/0x60 [think_lmi]
>>>>>>>>>>>       [<ffffffffc0d83b64>] tlmi_setting.constprop.4+0x54/0x90
>>>>>>>>>>> [think_lmi]
>>>>>>>>>>>       [<ffffffffc0d842b1>] tlmi_probe+0x591/0xba0 [think_lmi]
>>>>>>>>>>>       [<ffffffffc051dc53>] wmi_dev_probe+0x163/0x230 [wmi]
>>>>>>>>>>
>>>>> Hi,
>>>>>
>>>>> this "SerialPort1Address" string looks like a BIOS setup option, and
>>>>> indeed think_lmi allows for
>>>>> changing BIOS setup options over sysfs. While looking at
>>>>> current_value_show() in think-lmi.c, i noticed
>>>>> that "item" holds a string which is allocated with kstrdup(), so it
>>>>> has to be freed using kfree().
>>>>> This however does not happen if strbrk() fails, so maybe the memory
>>>>> leak is caused by this?
>>>>>
>>>>> Armin Wolf
>>>>
>>>> Hi Armin,
>>>>
>>>> I tried your suggestion, and though it is an obvious improvement and a
>>>> leak fix, this
>>>> was not the one we were searching for.
>>>>
>>>> I tested the following patch:
>>>>
>>>> diff --git a/drivers/platform/x86/think-lmi.c
>>>> b/drivers/platform/x86/think-lmi.c
>>>> index c816646eb661..1e77ecb0cba8 100644
>>>> --- a/drivers/platform/x86/think-lmi.c
>>>> +++ b/drivers/platform/x86/think-lmi.c
>>>> @@ -929,8 +929,10 @@ static ssize_t current_value_show(struct kobject
>>>> *kobj, struct kobj_attribute *a
>>>>
>>>>         /* validate and split from `item,value` -> `value` */
>>>>         value = strpbrk(item, ",");
>>>> -       if (!value || value == item || !strlen(value + 1))
>>>> +       if (!value || value == item || !strlen(value + 1)) {
>>>> +               kfree(item);
>>>>                 return -EINVAL;
>>>> +       }
>>>>
>>>>         ret = sysfs_emit(buf, "%s\n", value + 1);
>>>>         kfree(item);
>>>>
>>>> (I would also object to the use of strlen() here, for it is inherently
>>>> insecure
>>>> against SEGFAULT in kernel space.)
>>>>
>>>> I still get:
>>>> [root@...mtodorov marvin]# uname -rms
>>>> Linux 6.3.0-rc4-armin-patch-00025-g3a93e40326c8-dirty x86_64
>>>> [root@...mtodorov marvin]# cat /sys/kernel/debug/kmemleak [edited]
>>>> unreferenced object 0xffff8eb008ef9260 (size 96):
>>>>   comm "systemd-udevd", pid 771, jiffies 4294896499 (age 74.880s)
>>>>   hex dump (first 32 bytes):
>>>>     53 65 72 69 61 6c 50 6f 72 74 31 41 64 64 72 65 SerialPort1Addre
>>>>     73 73 2c 33 46 38 2f 49 52 51 34 3b 5b 4f 70 74 ss,3F8/IRQ4;[Opt
>>>>   backtrace:
>>>>     [<ffffffff9eafb26c>] slab_post_alloc_hook+0x8c/0x3e0
>>>>     [<ffffffff9eb02b49>] __kmem_cache_alloc_node+0x1d9/0x2a0
>>>>     [<ffffffff9ea773c9>] __kmalloc_node_track_caller+0x59/0x180
>>>>     [<ffffffff9ea66a1a>] kstrdup+0x3a/0x70
>>>>     [<ffffffffc0eef9aa>] tlmi_extract_output_string.isra.0+0x2a/0x60
>>>> [think_lmi]
>>>>     [<ffffffffc0eefb64>] tlmi_setting.constprop.4+0x54/0x90 [think_lmi]
>>>>     [<ffffffffc0ef02c1>] tlmi_probe+0x591/0xba0 [think_lmi]
>>>>     [<ffffffffc0629c53>] wmi_dev_probe+0x163/0x230 [wmi]
>>>>     [<ffffffff9f1987eb>] really_probe+0x17b/0x3d0
>>>>     [<ffffffff9f198ad4>] __driver_probe_device+0x84/0x190
>>>>     [<ffffffff9f198c14>] driver_probe_device+0x24/0xc0
>>>>     [<ffffffff9f198ed2>] __driver_attach+0xc2/0x190
>>>>     [<ffffffff9f195ab1>] bus_for_each_dev+0x81/0xd0
>>>>     [<ffffffff9f197c62>] driver_attach+0x22/0x30
>>>>     [<ffffffff9f197354>] bus_add_driver+0x1b4/0x240
>>>>     [<ffffffff9f19a0a2>] driver_register+0x62/0x120
>>>> unreferenced object 0xffff8eb018ddbb40 (size 64):
>>>>   comm "systemd-udevd", pid 771, jiffies 4294896528 (age 74.780s)
>>>>   hex dump (first 32 bytes):
>>>>     55 53 42 50 6f 72 74 41 63 63 65 73 73 2c 45 6e USBPortAccess,En
>>>>     61 62 6c 65 64 3b 5b 4f 70 74 69 6f 6e 61 6c 3a abled;[Optional:
>>>>   backtrace:
>>>>     [<ffffffff9eafb26c>] slab_post_alloc_hook+0x8c/0x3e0
>>>>     [<ffffffff9eb02b49>] __kmem_cache_alloc_node+0x1d9/0x2a0
>>>>     [<ffffffff9ea773c9>] __kmalloc_node_track_caller+0x59/0x180
>>>>     [<ffffffff9ea66a1a>] kstrdup+0x3a/0x70
>>>>     [<ffffffffc0eef9aa>] tlmi_extract_output_string.isra.0+0x2a/0x60
>>>> [think_lmi]
>>>>     [<ffffffffc0eefb64>] tlmi_setting.constprop.4+0x54/0x90 [think_lmi]
>>>>     [<ffffffffc0ef02c1>] tlmi_probe+0x591/0xba0 [think_lmi]
>>>>     [<ffffffffc0629c53>] wmi_dev_probe+0x163/0x230 [wmi]
>>>>     [<ffffffff9f1987eb>] really_probe+0x17b/0x3d0
>>>>     [<ffffffff9f198ad4>] __driver_probe_device+0x84/0x190
>>>>     [<ffffffff9f198c14>] driver_probe_device+0x24/0xc0
>>>>     [<ffffffff9f198ed2>] __driver_attach+0xc2/0x190
>>>>     [<ffffffff9f195ab1>] bus_for_each_dev+0x81/0xd0
>>>>     [<ffffffff9f197c62>] driver_attach+0x22/0x30
>>>>     [<ffffffff9f197354>] bus_add_driver+0x1b4/0x240
>>>>     [<ffffffff9f19a0a2>] driver_register+0x62/0x120
>>>> unreferenced object 0xffff8eb006fe2b40 (size 64):
>>>>   comm "systemd-udevd", pid 771, jiffies 4294896542 (age 74.724s)
>>>>   hex dump (first 32 bytes):
>>>>     55 53 42 42 49 4f 53 53 75 70 70 6f 72 74 2c 45 USBBIOSSupport,E
>>>>     6e 61 62 6c 65 64 3b 5b 4f 70 74 69 6f 6e 61 6c nabled;[Optional
>>>>   backtrace:
>>>>     [<ffffffff9eafb26c>] slab_post_alloc_hook+0x8c/0x3e0
>>>>     [<ffffffff9eb02b49>] __kmem_cache_alloc_node+0x1d9/0x2a0
>>>>     [<ffffffff9ea773c9>] __kmalloc_node_track_caller+0x59/0x180
>>>>     [<ffffffff9ea66a1a>] kstrdup+0x3a/0x70
>>>>     [<ffffffffc0eef9aa>] tlmi_extract_output_string.isra.0+0x2a/0x60
>>>> [think_lmi]
>>>>     [<ffffffffc0eefb64>] tlmi_setting.constprop.4+0x54/0x90 [think_lmi]
>>>>     [<ffffffffc0ef02c1>] tlmi_probe+0x591/0xba0 [think_lmi]
>>>>     [<ffffffffc0629c53>] wmi_dev_probe+0x163/0x230 [wmi]
>>>>     [<ffffffff9f1987eb>] really_probe+0x17b/0x3d0
>>>>     [<ffffffff9f198ad4>] __driver_probe_device+0x84/0x190
>>>>     [<ffffffff9f198c14>] driver_probe_device+0x24/0xc0
>>>>     [<ffffffff9f198ed2>] __driver_attach+0xc2/0x190
>>>>     [<ffffffff9f195ab1>] bus_for_each_dev+0x81/0xd0
>>>>     [<ffffffff9f197c62>] driver_attach+0x22/0x30
>>>>     [<ffffffff9f197354>] bus_add_driver+0x1b4/0x240
>>>>     [<ffffffff9f19a0a2>] driver_register+0x62/0x120
>>>>
>>>> There are currently 84 wmi_dev_probe leaks, sized mostly 64 bytes, and
>>>> one 96 and two 192 bytes.
>>>>
>>>> I also cannot figure out the mechanism by which current_value_show()
>>>> is called, when it is static?
>>>>
>>>> Any idea?
>>>>
>>>> Thanks.
>>>>
>>>> Best regards,
>>>> Mirsad
>>>>
>>> Can you tell me how many BIOS settings think-lmi provides on your machine? Because according to the stacktrace,
>>> the other place where the leak could have occurred is inside tlmi_analyze(), which calls tlmi_setting().
>>
>> Yes, Sir!
>>
>> I think these could be the ones you need (totaling 83, which is close to 82 systemd-udevd leaks):
>>
>> [root@...mtodorov marvin]# ls -l /sys/devices/virtual/firmware-attributes/thinklmi/attributes | grep ^d
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 AfterPowerLoss
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 AlarmDate(MM\DD\YYYY)
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 AlarmDayofWeek
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 AlarmTime(HH:MM:SS)
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 ASPMSupport
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 AutomaticBootSequence
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 BIOSPasswordAtBootDeviceList
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 BIOSPasswordAtReboot
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 BIOSPasswordAtUnattendedBoot
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 BootMode
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 BootPriority
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 BootUpNum-LockStatus
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 C1ESupport
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 CardReader
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 ClearTCGSecurityFeature
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 ComputraceModuleActivation
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 ConfigurationChangeDetection
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 ConfigureSATAas
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 CoreMulti-Processing
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 CoverTamperDetected
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 CSM
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 CStateSupport
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 DeviceGuard
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 DustShieldAlert
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 EISTSupport
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 EnhancedPowerSavingMode
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 ErrorBootSequence
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 Friday
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 FrontUSBPorts
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 HardDiskPre-delay
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 Intel(R)SGXControl
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 InternalSpeaker
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 Monday
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 OnboardAudioController
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 OnboardEthernetController
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 OptionKeysDisplay
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 OptionKeysDisplayStyle
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 OSOptimizedDefaults
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 PasswordCountExceededError
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 PCIe16xSlotSpeed
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 PCIe1xSlot1Speed
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 PrimaryBootSequence
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 PXEIPV4NetworkStack
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 PXEIPV6NetworkStack
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 PXEOptionROM
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 RearUSBPorts
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 RequireAdmin.Pass.whenFlashing
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 RequireHDPonSystemBoot
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 SATAController
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 SATADrive1
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 SATADrive2
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 Saturday
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 SecureBoot
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 SecureRollBackPrevention
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 SecurityChip
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 SelectActiveVideo
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 SerialPort1Address
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 SmartUSBProtection
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 StartupDeviceMenuPrompt
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 StartupSequence
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 Sunday
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 Thursday
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 Tuesday
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 TurboMode
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 USBBIOSSupport
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 USBEnumerationDelay
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 USBPort1
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 USBPort2
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 USBPort3
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 USBPort4
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 USBPort5
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 USBPort6
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 USBPort7
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 USBPort8
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 USBPortAccess
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 UserDefinedAlarmTime(HH:MM:SS)
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 VirtualizationTechnology
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 VTdFeature
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 WakefromSerialPortRing
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 WakeOnLAN
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 WakeUponAlarm
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 Wednesday
>> drwxr-xr-x 2 root root    0 Mar 29 08:24 WindowsUEFIFirmwareUpdate
>> [root@...mtodorov marvin]# ls -l /sys/devices/virtual/firmware-attributes/thinklmi/attributes | grep ^d | wc -l
>> 83
>> [root@...mtodorov marvin]#
>>
>>> However, i have no idea on how *info is somehow leaked, it has to happen inside the for-loop tween the call
>>> to tlmi_setting() and strreplace(), because otherwise the strings would not contain the "/" character.
>>
>> I see. It is the line 1404.
>>
>>> Can you check if the problem is somehow solved by applying the following commit from the platform-drivers-x86
>>> for-next branch:
>>> da62908efe80 ("platform/x86: think-lmi: Properly interpret return value of tlmi_setting")
>>
>> It could possibly be that. But I do not recall seeing these messages in 6.3-rc3 ...
>>
>> ...
>>
>> Unfortunately, the build with Thomas' patch you referred to did not work:
>>
>> unreferenced object 0xffff9dff4d28bbc8 (size 192):
>>    comm "systemd-udevd", pid 769, jiffies 4294897473 (age 85.700s)
>>    hex dump (first 32 bytes):
>>      50 72 69 6d 61 72 79 42 6f 6f 74 53 65 71 75 65  PrimaryBootSeque
>>      6e 63 65 2c 4d 2e 32 20 44 72 69 76 65 20 31 3a  nce,M.2 Drive 1:
>>    backtrace:
>>      [<ffffffffa48fb26c>] slab_post_alloc_hook+0x8c/0x3e0
>>      [<ffffffffa4902b49>] __kmem_cache_alloc_node+0x1d9/0x2a0
>>      [<ffffffffa48773c9>] __kmalloc_node_track_caller+0x59/0x180
>>      [<ffffffffa4866a1a>] kstrdup+0x3a/0x70
>>      [<ffffffffc0c7f9aa>] tlmi_extract_output_string.isra.0+0x2a/0x60 [think_lmi]
>>      [<ffffffffc0c7fb64>] tlmi_setting.constprop.4+0x54/0x90 [think_lmi]
>>      [<ffffffffc0c802c1>] tlmi_probe+0x591/0xba0 [think_lmi]
>>      [<ffffffffc03c9c53>] wmi_dev_probe+0x163/0x230 [wmi]
>>      [<ffffffffa4f987eb>] really_probe+0x17b/0x3d0
>>      [<ffffffffa4f98ad4>] __driver_probe_device+0x84/0x190
>>      [<ffffffffa4f98c14>] driver_probe_device+0x24/0xc0
>>      [<ffffffffa4f98ed2>] __driver_attach+0xc2/0x190
>>      [<ffffffffa4f95ab1>] bus_for_each_dev+0x81/0xd0
>>      [<ffffffffa4f97c62>] driver_attach+0x22/0x30
>>      [<ffffffffa4f97354>] bus_add_driver+0x1b4/0x240
>>      [<ffffffffa4f9a0a2>] driver_register+0x62/0x120
>> unreferenced object 0xffff9dff4d28a008 (size 192):
>>    comm "systemd-udevd", pid 769, jiffies 4294897517 (age 85.540s)
>>    hex dump (first 32 bytes):
>>      45 72 72 6f 72 42 6f 6f 74 53 65 71 75 65 6e 63  ErrorBootSequenc
>>      65 2c 4e 65 74 77 6f 72 6b 20 31 3a 4d 2e 32 20  e,Network 1:M.2
>>    backtrace:
>>      [<ffffffffa48fb26c>] slab_post_alloc_hook+0x8c/0x3e0
>>      [<ffffffffa4902b49>] __kmem_cache_alloc_node+0x1d9/0x2a0
>>      [<ffffffffa48773c9>] __kmalloc_node_track_caller+0x59/0x180
>>      [<ffffffffa4866a1a>] kstrdup+0x3a/0x70
>>      [<ffffffffc0c7f9aa>] tlmi_extract_output_string.isra.0+0x2a/0x60 [think_lmi]
>>      [<ffffffffc0c7fb64>] tlmi_setting.constprop.4+0x54/0x90 [think_lmi]
>>      [<ffffffffc0c802c1>] tlmi_probe+0x591/0xba0 [think_lmi]
>>      [<ffffffffc03c9c53>] wmi_dev_probe+0x163/0x230 [wmi]
>>      [<ffffffffa4f987eb>] really_probe+0x17b/0x3d0
>>      [<ffffffffa4f98ad4>] __driver_probe_device+0x84/0x190
>>      [<ffffffffa4f98c14>] driver_probe_device+0x24/0xc0
>>      [<ffffffffa4f98ed2>] __driver_attach+0xc2/0x190
>>      [<ffffffffa4f95ab1>] bus_for_each_dev+0x81/0xd0
>>      [<ffffffffa4f97c62>] driver_attach+0x22/0x30
>>      [<ffffffffa4f97354>] bus_add_driver+0x1b4/0x240
>>      [<ffffffffa4f9a0a2>] driver_register+0x62/0x120
>> [root@...mtodorov marvin]# uname -rms
>> Linux 6.3.0-rc4-armin+tw-patch-00025-g3a93e40326c8-dirty x86_64
>> [root@...mtodorov marvin]#
>>
>> What was applied is:
>>
>> mtodorov@...ac:~/linux/kernel/linux_torvalds$ git diff
>> diff --git a/drivers/platform/x86/think-lmi.c b/drivers/platform/x86/think-lmi.c
>> index c816646eb661..9a3015f43aaf 100644
>> --- a/drivers/platform/x86/think-lmi.c
>> +++ b/drivers/platform/x86/think-lmi.c
>> @@ -929,8 +929,10 @@ static ssize_t current_value_show(struct kobject *kobj, struct kobj_attribute *a
>>
>>          /* validate and split from `item,value` -> `value` */
>>          value = strpbrk(item, ",");
>> -       if (!value || value == item || !strlen(value + 1))
>> +       if (!value || value == item || !strlen(value + 1)) {
>> +               kfree(item);
>>                  return -EINVAL;
>> +       }
>>
>>          ret = sysfs_emit(buf, "%s\n", value + 1);
>>          kfree(item);
>> @@ -1380,7 +1382,6 @@ static struct tlmi_pwd_setting *tlmi_create_auth(const char *pwd_type,
>>
>>   static int tlmi_analyze(void)
>>   {
>> -       acpi_status status;
>>          int i, ret;
>>
>>          if (wmi_has_guid(LENOVO_SET_BIOS_SETTINGS_GUID) &&
>> @@ -1417,8 +1418,8 @@ static int tlmi_analyze(void)
>>                  char *p;
>>
>>                  tlmi_priv.setting[i] = NULL;
>> -               status = tlmi_setting(i, &item, LENOVO_BIOS_SETTING_GUID);
>> -               if (ACPI_FAILURE(status))
>> +               ret = tlmi_setting(i, &item, LENOVO_BIOS_SETTING_GUID);
>> +               if (ret)
>>                          break;
>>                  if (!item)
>>                          break;
>>
>>
>>> Also current_value_show() is used by attr_current_val, the __ATTR_RW_MODE() macro arranges for that.
>>
>> Thanks.
>>
>> In this build:
>>
>> [root@...mtodorov marvin]# uname -rms
>> Linux 6.3.0-rc34tests-00001-g6981739a967c x86_64
>> [root@...mtodorov marvin]#
>>
>> ... the bug isn't present, so it might be something added recently:
>>
>> commit 8a02d70679fc1c434401863333c8ea7dbf201494
>> Author: Mark Pearson <mpearson-lenovo@...ebb.ca>
>> Date:   Sun Mar 19 20:32:21 2023 -0400
>>
>>      platform/x86: think-lmi: Add possible_values for ThinkStation
>>
>> commit cf337f27f3bfc4aeab4954c468239fd6233c7638
>> Author: Mark Pearson <mpearson-lenovo@...ebb.ca>
>> Date:   Sun Mar 19 20:32:20 2023 -0400
>>
>>      platform/x86: think-lmi: only display possible_values if available
>>
>> commit 45e21289bfc6e257885514790a8a8887da822d40
>> Author: Mark Pearson <mpearson-lenovo@...ebb.ca>
>> Date:   Sun Mar 19 20:32:19 2023 -0400
>>
>>      platform/x86: think-lmi: use correct possible_values delimiters
>>
>> commit 583329dcf22e568a328a944f20427ccfc95dce01
>> Author: Mark Pearson <mpearson-lenovo@...ebb.ca>
>> Date:   Sun Mar 19 20:32:18 2023 -0400
>>
>>      platform/x86: think-lmi: add missing type attribute
>>
>> I have CC:-ed the author of the commits.
>>
>> I can try bisect, but only after my day job.
> 
> I seem to have been right about the culprit commit.
> 
> Here is the bisect log:
> 
> mtodorov@...ac:~/linux/kernel/linux_torvalds$ git bisect log
> git bisect start '--' './drivers/platform/x86'
> # good: [caa0708a81d6a2217c942959ef40d515ec1d3108] bootconfig: Change message if no bootconfig with CONFIG_BOOT_CONFIG_FORCE=y
> git bisect good caa0708a81d6a2217c942959ef40d515ec1d3108
> # bad: [8a02d70679fc1c434401863333c8ea7dbf201494] platform/x86: think-lmi: Add possible_values for ThinkStation
> git bisect bad 8a02d70679fc1c434401863333c8ea7dbf201494
> # good: [1a0009abfa7893b9cfcd3884658af1cbee6b26ce] platform: mellanox: mlx-platform: Initialize shift variable to 0
> git bisect good 1a0009abfa7893b9cfcd3884658af1cbee6b26ce
> # good: [b7c994f8c35e916e27c60803bb21457bc1373500] platform/x86 (gigabyte-wmi): Add support for A320M-S2H V2
> git bisect good b7c994f8c35e916e27c60803bb21457bc1373500
> # good: [45e21289bfc6e257885514790a8a8887da822d40] platform/x86: think-lmi: use correct possible_values delimiters
> git bisect good 45e21289bfc6e257885514790a8a8887da822d40
> # good: [cf337f27f3bfc4aeab4954c468239fd6233c7638] platform/x86: think-lmi: only display possible_values if available
> git bisect good cf337f27f3bfc4aeab4954c468239fd6233c7638
> # first bad commit: [8a02d70679fc1c434401863333c8ea7dbf201494] platform/x86: think-lmi: Add possible_values for ThinkStation
> mtodorov@...ac:~/linux/kernel/linux_torvalds$
> 
> Please see below.
> 
> Apparently, this commit broke something on my Lenovo box:
> 
> diff --git a/drivers/platform/x86/think-lmi.c b/drivers/platform/x86/think-lmi.c
> index e190fec26021..3f0641360251 100644
> --- a/drivers/platform/x86/think-lmi.c
> +++ b/drivers/platform/x86/think-lmi.c
> @@ -941,9 +941,6 @@ static ssize_t possible_values_show(struct kobject *kobj, struct kobj_attribute
>   {
>          struct tlmi_attr_setting *setting = to_tlmi_attr_setting(kobj);
> 
> -       if (!tlmi_priv.can_get_bios_selections)
> -               return -EOPNOTSUPP;
> -
>          return sysfs_emit(buf, "%s\n", setting->possible_values);
>   }
> 
> @@ -1052,6 +1049,18 @@ static struct kobj_attribute attr_current_val = __ATTR_RW_MODE(current_value, 06
> 
>   static struct kobj_attribute attr_type = __ATTR_RO(type);
> 
> +static umode_t attr_is_visible(struct kobject *kobj,
> +                                            struct attribute *attr, int n)
> +{
> +       struct tlmi_attr_setting *setting = to_tlmi_attr_setting(kobj);
> +
> +       /* We don't want to display possible_values attributes if not available */
> +       if ((attr == &attr_possible_values.attr) && (!setting->possible_values))
> +               return 0;
> +
> +       return attr->mode;
> +}
> +
>   static struct attribute *tlmi_attrs[] = {
>          &attr_displ_name.attr,
>          &attr_current_val.attr,
> @@ -1061,6 +1070,7 @@ static struct attribute *tlmi_attrs[] = {
>   };
> 
>   static const struct attribute_group tlmi_attr_group = {
> +       .is_visible = attr_is_visible,
>          .attrs = tlmi_attrs,
>   };
> 
> Hope this helps narrow down the problem.
> 
> Best regards,
> Mirsad
> 
>>>>>>>>>> Why aren't you looking at the wmi.c driver?  That should be
>>>>>>>>>> where the
>>>>>>>>>> issue is, not the driver core, right?
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> greg k-h
>>>>>>>>>
>>>>>>>>> Hi, Mr. Greg,
>>>>>>>>>
>>>>>>>>> Thanks for the quick reply.
>>>>>>>>>
>>>>>>>>> I have added CC: for additional developers per
>>>>>>>>> drivers/platform/x86/wmi.c,
>>>>>>>>> however, this seems to me like hieroglyphs. There is nothing
>>>>>>>>> obvious, but
>>>>>>>>> I had not noticed it with v6.3-rc3?
>>>>>>>>>
>>>>>>>>> Maybe, there seems to be something off:
>>>>>>>>>
>>>>>>>>>       949 static int wmi_dev_probe(struct device *dev)
>>>>>>>>>       950 {
>>>>>>>>>       951         struct wmi_block *wblock = dev_to_wblock(dev);
>>>>>>>>>       952         struct wmi_driver *wdriver =
>>>>>>>>> drv_to_wdrv(dev->driver);
>>>>>>>>>       953         int ret = 0;
>>>>>>>>>       954         char *buf;
>>>>>>>>>       955
>>>>>>>>>       956         if (ACPI_FAILURE(wmi_method_enable(wblock, true)))
>>>>>>>>>       957                 dev_warn(dev, "failed to enable device
>>>>>>>>> -- probing anyway\n");
>>>>>>>>>       958
>>>>>>>>>       959         if (wdriver->probe) {
>>>>>>>>>       960                 ret = wdriver->probe(dev_to_wdev(dev),
>>>>>>>>>       961 find_guid_context(wblock, wdriver));
>>>>>>>>>       962                 if (ret != 0)
>>>>>>>>>       963                         goto probe_failure;
>>>>>>>>>       964         }
>>>>>>>>>       965
>>>>>>>>>       966         /* driver wants a character device made */
>>>>>>>>>       967         if (wdriver->filter_callback) {
>>>>>>>>>       968                 /* check that required buffer size
>>>>>>>>> declared by driver or MOF */
>>>>>>>>>       969                 if (!wblock->req_buf_size) {
>>>>>>>>>       970 dev_err(&wblock->dev.dev,
>>>>>>>>>       971                                 "Required buffer size
>>>>>>>>> not set\n");
>>>>>>>>>       972                         ret = -EINVAL;
>>>>>>>>>       973                         goto probe_failure;
>>>>>>>>>       974                 }
>>>>>>>>>       975
>>>>>>>>>       976                 wblock->handler_data =
>>>>>>>>> kmalloc(wblock->req_buf_size,
>>>>>>>>>       977 GFP_KERNEL);
>>>>>>>>>       978                 if (!wblock->handler_data) {
>>>>>>>>>       979                         ret = -ENOMEM;
>>>>>>>>>       980                         goto probe_failure;
>>>>>>>>>       981                 }
>>>>>>>>>       982
>>>>>>>>>       983                 buf = kasprintf(GFP_KERNEL, "wmi/%s",
>>>>>>>>> wdriver->driver.name);
>>>>>>>>>       984                 if (!buf) {
>>>>>>>>>       985                         ret = -ENOMEM;
>>>>>>>>>       986                         goto probe_string_failure;
>>>>>>>>>       987                 }
>>>>>>>>>       988                 wblock->char_dev.minor =
>>>>>>>>> MISC_DYNAMIC_MINOR;
>>>>>>>>>       989                 wblock->char_dev.name = buf;
>>>>>>>>>       990                 wblock->char_dev.fops = &wmi_fops;
>>>>>>>>>       991                 wblock->char_dev.mode = 0444;
>>>>>>>>>       992                 ret = misc_register(&wblock->char_dev);
>>>>>>>>>       993                 if (ret) {
>>>>>>>>>       994                         dev_warn(dev, "failed to
>>>>>>>>> register char dev: %d\n", ret);
>>>>>>>>>       995                         ret = -ENOMEM;
>>>>>>>>>       996                         goto probe_misc_failure;
>>>>>>>>>       997                 }
>>>>>>>>>       998         }
>>>>>>>>>       999
>>>>>>>>>      1000         set_bit(WMI_PROBED, &wblock->flags);
>>>>>>>>>      1001         return 0;
>>>>>>>>>      1002
>>>>>>>>>      1003 probe_misc_failure:
>>>>>>>>>      1004         kfree(buf);
>>>>>>>>>      1005 probe_string_failure:
>>>>>>>>>      1006         kfree(wblock->handler_data);
>>>>>>>>>      1007 probe_failure:
>>>>>>>>>      1008         if (ACPI_FAILURE(wmi_method_enable(wblock,
>>>>>>>>> false)))
>>>>>>>>>      1009                 dev_warn(dev, "failed to disable
>>>>>>>>> device\n");
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> char *buf is passed to kfree(buf) uninitialised if
>>>>>>>>> wdriver->filter_callback
>>>>>>>>> is not set.
>>>>>>>>>
>>>>>>>>> It seems like a logical error per se, but I don't believe this is
>>>>>>>>> the cause
>>>>>>>>> of the leak?
>>>>>>>>
>>>>>>>> CORRECTION:
>>>>>>>>
>>>>>>>> I overlooked the "return 0" in line 1001.
>>>>>>>
>>>>>>> Yeah, and the memory looks to be freed properly in the
>>>>>>> wmi_dev_remove()
>>>>>>> callback, right?
>>>>>>
>>>>>> It would appear so. To verify that:
>>>>>>
>>>>>> Alloc:
>>>>>> 976        wblock->handler_data = kmalloc(wblock->req_buf_size,
>>>>>>                            GFP_KERNEL);
>>>>>>         <check>
>>>>>>
>>>>>> 983        buf = kasprintf(GFP_KERNEL, "wmi/%s", wdriver->driver.name);
>>>>>>         <check>
>>>>>> 989        wblock->char_dev.name = buf;
>>>>>>
>>>>>> In lines 1022-1023:
>>>>>>
>>>>>> 1022        kfree(wblock->char_dev.name);
>>>>>> 1023        kfree(wblock->handler_data);
>>>>>>
>>>>>>>> This is why I don't think things should be rushed, but analysed
>>>>>>>> with clear and
>>>>>>>> cold head. And with as many eyes as possible :)
>>>>>>>>
>>>>>>>> The driver stuff is my long-term research interest. To state the
>>>>>>>> obvious,
>>>>>>>> the printing and multimedia education and industry in general
>>>>>>>> would benefit from
>>>>>>>> the open-source drivers for many instruments that still work, but
>>>>>>>> are obsoleted
>>>>>>>> by the producer and require unsupported versions of the OS.
>>>>>>>>
>>>>>>>> Thank you again for reviewing the bug report, however, ATM I do
>>>>>>>> not think I have
>>>>>>>> what it takes to hunt down the memleak. :-/
>>>>>>>
>>>>>>> Do you have a reproducer that you can use to show the problem better?
>>>>>>
>>>>>> Unfortunately, the problem doesn't seem to appear during the run of
>>>>>> a particular
>>>>>> test, but immediately on startup of the OS. This makes it awkward to
>>>>>> pinpoint the
>>>>>> exact service that triggered memory leaks. But they would appear to
>>>>>> have to do
>>>>>> with the initialisation of the USB devices, wouldn't they?
>>>>>>
>>>>>> There seem to be strings:
>>>>>>
>>>>>> "USBPortAccess,Enabled;[Optional:"
>>>>>> "USBBIOSSupport,Enabled;[Optional"
>>>>>> "USBEnumerationDelay,Disabled;[Op"
>>>>>>
>>>>>> This seems to be happening during USB initialisation and before any
>>>>>> services.
>>>>>> But I might as well be wrong.
>>>>>>
>>>>>>> Or can you test kernel patches to verify the problem is fixed or
>>>>>>> not if
>>>>>>> we send you patches to test?
>>>>>>
>>>>>> Certainly, Lord willing, I can test the patches in the same
>>>>>> environment that
>>>>>> mainfeted the bug (or memleak).

mtodorov@...ac:~/linux/kernel/linux_torvalds$ git bisect log
git bisect start '--' './drivers/platform/x86'
# good: [caa0708a81d6a2217c942959ef40d515ec1d3108] bootconfig: Change message if no bootconfig with CONFIG_BOOT_CONFIG_FORCE=y
git bisect good caa0708a81d6a2217c942959ef40d515ec1d3108
# bad: [8a02d70679fc1c434401863333c8ea7dbf201494] platform/x86: think-lmi: Add possible_values for ThinkStation
git bisect bad 8a02d70679fc1c434401863333c8ea7dbf201494
# good: [1a0009abfa7893b9cfcd3884658af1cbee6b26ce] platform: mellanox: mlx-platform: Initialize shift variable to 0
git bisect good 1a0009abfa7893b9cfcd3884658af1cbee6b26ce
# good: [b7c994f8c35e916e27c60803bb21457bc1373500] platform/x86 (gigabyte-wmi): Add support for A320M-S2H V2
git bisect good b7c994f8c35e916e27c60803bb21457bc1373500
# good: [45e21289bfc6e257885514790a8a8887da822d40] platform/x86: think-lmi: use correct possible_values delimiters
git bisect good 45e21289bfc6e257885514790a8a8887da822d40
# good: [cf337f27f3bfc4aeab4954c468239fd6233c7638] platform/x86: think-lmi: only display possible_values if available
git bisect good cf337f27f3bfc4aeab4954c468239fd6233c7638
# first bad commit: [8a02d70679fc1c434401863333c8ea7dbf201494] platform/x86: think-lmi: Add possible_values for ThinkStation
mtodorov@...ac:~/linux/kernel/linux_torvalds$

So the commit that triggered the bug on the Lenovo desktop box was:

diff --git a/drivers/platform/x86/think-lmi.c b/drivers/platform/x86/think-lmi.c
index 3f0641360251..c816646eb661 100644
--- a/drivers/platform/x86/think-lmi.c
+++ b/drivers/platform/x86/think-lmi.c
@@ -1450,6 +1450,26 @@ static int tlmi_analyze(void)
                         if (ret || !setting->possible_values)
                                 pr_info("Error retrieving possible values for %d : %s\n",
                                                 i, setting->display_name);
+               } else {
+                       /*
+                        * Older Thinkstations don't support the bios_selections API.
+                        * Instead they store this as a [Optional:Option1,Option2] section of the
+                        * name string.
+                        * Try and pull that out if it's available.
+                        */
+                       char *item, *optstart, *optend;
+
+                       if (!tlmi_setting(setting->index, &item, LENOVO_BIOS_SETTING_GUID)) {
+                               optstart = strstr(item, "[Optional:");
+                               if (optstart) {
+                                       optstart += strlen("[Optional:");
+                                       optend = strstr(optstart, "]");
+                                       if (optend)
+                                               setting->possible_values =
+                                                       kstrndup(optstart, optend - optstart,
+                                                                       GFP_KERNEL);
+                               }
+                       }
                 }
                 /*
                  * firmware-attributes requires that possible_values are separated by ';' but

Thousand apologies, once again.

Best regards,
Mirsad

-- 
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union

Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ