lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <84fcba70-396a-080c-9646-170f9925f099@roeck-us.net>
Date:   Wed, 29 Jun 2022 05:56:13 -0700
From:   Guenter Roeck <linux@...ck-us.net>
To:     Dmitry Osipenko <dmitry.osipenko@...labora.com>,
        Daniel Lezcano <daniel.lezcano@...aro.org>
Cc:     linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        Amit Kucheria <amitk@...nel.org>,
        Zhang Rui <rui.zhang@...el.com>,
        Thierry Reding <thierry.reding@...il.com>,
        Jonathan Hunter <jonathanh@...dia.com>,
        Dmitry Osipenko <digetx@...il.com>,
        "open list:TEGRA ARCHITECTURE SUPPORT" <linux-tegra@...r.kernel.org>,
        Hardware Monitoring <linux-hwmon@...r.kernel.org>,
        rafael@...nel.org
Subject: Re: [PATCH 2/3] thermal/drivers/tegra: Remove get_trend function

On 6/29/22 02:35, Dmitry Osipenko wrote:
> On 6/28/22 21:43, Guenter Roeck wrote:
>> On Tue, Jun 28, 2022 at 08:10:30AM -0700, Guenter Roeck wrote:
>>> On Tue, Jun 28, 2022 at 02:44:31PM +0300, Dmitry Osipenko wrote:
>>>> On 6/28/22 11:41, Daniel Lezcano wrote:
>>>>>
>>>>> Thierry, Dmitry,
>>>>>
>>>>> are fine with this patch?
>>>>
>>>> Seems should be good. I couldn't test it using recent the linux-next
>>>> because of a lockup in LM90 driver. There were quite a lot of changes in
>>>> LM90 recently, adding Guenter.
>>>>
>>>
>>> Weird, I tested those changes to death with real hardware, and I don't
>>> see a code path where the mutex can be left in blocked state unless the
>>> underlying i2c driver locks up for some reason. What is the platform,
>>> and can you point me to the devicetree file ? Also, is there anything
>>> else lm90 or i2c related in the kernel log ?
>>>
>>
>> Follow-up question: I see that various Tegra systems use lm90 compatible
>> chips, and the interrupt line is in general wired up. Can you check if
>> you get lots of interrupts on that interrupt line ? Also, can you check
>> what happens if you read hwmon attributes directly ?
> 
> The number of interrupt fires is okay. It's a Nexus 7 Tegra30 tablet
> device that I'm using for the testing.
> 
> Today I enabled the lockdep and it immediately showed where the problem is:
> 

Yes, that is an obvious problem. Thanks a lot for testing!

Guenter

> ======================================================
> WARNING: possible circular locking dependency detected
> 5.19.0-rc4-next-20220628-00002-g94e5dbbe1c58-dirty #24 Not tainted
> ------------------------------------------------------
> irq/91-lm90/130 is trying to acquire lock:
> c27ba380 (&tz->lock){+.+.}-{3:3}, at: thermal_zone_device_update+0x2c/0x64
> 
>                 but task is already holding lock:
> c27b42c8 (&data->update_lock){+.+.}-{3:3}, at: lm90_irq_thread+0x2c/0x68
> 
>                 which lock already depends on the new lock.
> 
> 
>                 the existing dependency chain (in reverse order) is:
> 
>                 -> #1 (&data->update_lock){+.+.}-{3:3}:
>         __mutex_lock+0x9c/0x984
>         mutex_lock_nested+0x2c/0x34
>         lm90_read+0x44/0x3e8
>         hwmon_thermal_get_temp+0x58/0x8c
>         of_thermal_get_temp+0x38/0x44
>         thermal_zone_get_temp+0x5c/0x7c
>         thermal_zone_device_update.part.0+0x48/0x5fc
>         thermal_zone_device_set_mode+0xa0/0xe4
>         thermal_zone_device_enable+0x1c/0x20
>         thermal_zone_of_sensor_register+0x18c/0x19c
>         devm_thermal_zone_of_sensor_register+0x68/0xa4
>         __hwmon_device_register+0x704/0x91c
>         hwmon_device_register_with_info+0x6c/0x80
>         devm_hwmon_device_register_with_info+0x78/0xb4
>         lm90_probe+0x618/0x8c0
>         i2c_device_probe+0x170/0x2e0
>         really_probe+0xd8/0x300
>         __driver_probe_device+0x94/0xf4
>         driver_probe_device+0x40/0x118
>         __device_attach_driver+0xc8/0x10c
>         bus_for_each_drv+0x90/0xdc
>         __device_attach+0xbc/0x1d4
>         device_initial_probe+0x1c/0x20
>         bus_probe_device+0x98/0xa0
>         deferred_probe_work_func+0x8c/0xbc
>         process_one_work+0x2b8/0x774
>         worker_thread+0x17c/0x56c
>         kthread+0x108/0x13c
>         ret_from_fork+0x14/0x28
>         0x0
> 
>                 -> #0 (&tz->lock){+.+.}-{3:3}:
>         __lock_acquire+0x173c/0x3198
>         lock_acquire+0x128/0x3f0
>         __mutex_lock+0x9c/0x984
>         mutex_lock_nested+0x2c/0x34
>         thermal_zone_device_update+0x2c/0x64
>         hwmon_notify_event+0x128/0x138
>         lm90_update_alarms_locked+0x35c/0x3b8
>         lm90_irq_thread+0x38/0x68
>         irq_thread_fn+0x2c/0x8c
>         irq_thread+0x190/0x29c >         kthread+0x108/0x13c
>         ret_from_fork+0x14/0x28
>         0x0
> 
>                 other info that might help us debug this:
> 
>   Possible unsafe locking scenario:
> 
>         CPU0                    CPU1
>         ----                    ----
>    lock(&data->update_lock);
>                                 lock(&tz->lock);
>                                 lock(&data->update_lock);
>    lock(&tz->lock);
> 
>                  *** DEADLOCK ***
> 
> 1 lock held by irq/91-lm90/130:
>   #0: c27b42c8 (&data->update_lock){+.+.}-{3:3}, at:
> lm90_irq_thread+0x2c/0x68
> 
>                 stack backtrace:
> CPU: 1 PID: 130 Comm: irq/91-lm90 Not tainted
> 5.19.0-rc4-next-20220628-00002-g94e5dbbe1c58-dirty #24
> Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> Backtrace:
>   dump_backtrace from show_stack+0x20/0x24
>   r7:c33d1b60 r6:00000080 r5:60000093 r4:c168c6a4
>   show_stack from dump_stack_lvl+0x68/0x98
>   dump_stack_lvl from dump_stack+0x18/0x1c
>   r7:c33d1b60 r6:c20328cc r5:c203c700 r4:c20328cc
>   dump_stack from print_circular_bug+0x2ec/0x33c
>   print_circular_bug from check_noncircular+0x104/0x168
>   r10:c1a14cc8 r9:c33d1240 r8:00000001 r7:00000000 r6:dfc3dcc0 r5:c33d1b60
>   r4:c33d1b80
>   check_noncircular from __lock_acquire+0x173c/0x3198
>   r7:c33d1b80 r6:c202bc98 r5:c33d1b60 r4:c21d92ac
>   __lock_acquire from lock_acquire+0x128/0x3f0
>   r10:60000013 r9:00000000 r8:00000000 r7:00000000 r6:dfc3dd40 r5:c19ac688
>   r4:c19ac688
>   lock_acquire from __mutex_lock+0x9c/0x984
>   r10:c27ba380 r9:00000000 r8:c21d92ac r7:c33d1240 r6:00000000 r5:00000000
>   r4:c27ba348
>   __mutex_lock from mutex_lock_nested+0x2c/0x34
>   r10:c27b4000 r9:00000000 r8:dfc3de87 r7:00000000 r6:c27ba348 r5:00000000
>   r4:c27ba000
>   mutex_lock_nested from thermal_zone_device_update+0x2c/0x64
>   thermal_zone_device_update from hwmon_notify_event+0x128/0x138
>   r7:00000000 r6:00000000 r5:c2d23ea4 r4:c33fd040
>   hwmon_notify_event from lm90_update_alarms_locked+0x35c/0x3b8
>   r8:c27b4378 r7:c2d23c08 r6:00000020 r5:00000000 r4:c27b4240
>   lm90_update_alarms_locked from lm90_irq_thread+0x38/0x68
>   r9:c01c2814 r8:00000001 r7:c33d2240 r6:c27b4290 r5:c27b4240 r4:c33fc200
>   lm90_irq_thread from irq_thread_fn+0x2c/0x8c
>   r7:c33d2240 r6:c27b4000 r5:c33d1240 r4:c33fc200
>   irq_thread_fn from irq_thread+0x190/0x29c
>   r7:c33d2240 r6:c33fc224 r5:c33d1240 r4:00000000
>   irq_thread from kthread+0x108/0x13c
>   r10:00000000 r9:df9ddbf4 r8:c31d2200 r7:c33fc200 r6:c01c2710 r5:c33d1240
>   r4:c33fc240
>   kthread from ret_from_fork+0x14/0x28
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ