Message-ID: <6be28a20.9cf4.1916a257e4a.Coremail.andyshrk@163.com>
Date: Mon, 19 Aug 2024 18:18:47 +0800 (CST)
From: "Andy Yan" <andyshrk@....com>
To: "Cristian Ciocaltea" <cristian.ciocaltea@...labora.com>
Cc: "Mark Brown" <broonie@...nel.org>, 
	"Greg Kroah-Hartman" <gregkh@...uxfoundation.org>, 
	"Rafael J. Wysocki" <rafael@...nel.org>, 
	Heiko Stübner <heiko@...ech.de>, 
	"Andy Yan" <andy.yan@...k-chips.com>, kernel@...labora.com, 
	linux-kernel@...r.kernel.org, linux-rockchip@...ts.infradead.org
Subject: Re:Re: [PATCH RFC] regmap: maple: Switch to use irq-safe locking


Hi Cristian,

At 2024-08-17 04:11:27, "Cristian Ciocaltea" <cristian.ciocaltea@...labora.com> wrote:
>On 8/14/24 10:04 PM, Mark Brown wrote:
>> On Wed, Aug 14, 2024 at 01:20:21AM +0300, Cristian Ciocaltea wrote:
>
>[...]
>
>> I'd have a bigger question here which is why the driver is using a
>> dynamically allocated register cache in a hardirq context, especially
>> with no defaults provided?  Anything except the flat cache might do
>> allocations at runtime which might include in interrupt context unless
>> the caller is very careful and since the lockdep warning triggered it's
>> clear that this driver isn't.  The core will be doing atomic allocations
>> for MMIO but that's not something we want to be doing as a matter of
>> course...  I would generally expect drivers to try to ensure that any
>> registers are cached outside of the interrupt handler, usually by
>> specifying defaults or touching all registers during setup.
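>> 
>> A sketch of what that would look like (hypothetical register list and
>> values, not the actual vop2 layout):
>> 
>> static const struct reg_default vop2_reg_defaults[] = {
>> 	{ .reg = 0x0000, .def = 0x00000000 },	/* hypothetical */
>> 	{ .reg = 0x0004, .def = 0x00000000 },	/* hypothetical */
>> };
>> 
>> static const struct regmap_config vop2_regmap_config = {
>> 	.reg_bits	  = 32,
>> 	.val_bits	  = 32,
>> 	.reg_stride	  = 4,
>> 	.cache_type	  = REGCACHE_MAPLE,
>> 	/* With a default for every cacheable register the cache is
>> 	 * fully populated at init time, so cached accesses from
>> 	 * hardirq context shouldn't need to allocate. */
>> 	.reg_defaults	  = vop2_reg_defaults,
>> 	.num_reg_defaults = ARRAY_SIZE(vop2_reg_defaults),
>> };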
>> 
>> Without having done a full analysis it also looks like the marking of
>> volatile registers isn't right, it's not immediately clear that the
>> interrupt status and clear registers are volatile and they ought to be.
>> None of the registers accessed in interrupt context look like they
>> should be cached at all unless there's something triggered via the DRM
>> vblank calls.
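>> 
>> Marking them volatile would look roughly like this (placeholder
>> register names):
>> 
>> static bool vop2_volatile_reg(struct device *dev, unsigned int reg)
>> {
>> 	switch (reg) {
>> 	case VOP2_INT_STATUS:	/* placeholders for the real offsets */
>> 	case VOP2_INT_CLEAR:
>> 		return true;	/* never cached, always hits the hardware */
>> 	default:
>> 		return false;
>> 	}
>> }
>> 
>> wired up via .volatile_reg = vop2_volatile_reg in the regmap_config,
>> so reads and writes of those registers bypass the cache entirely.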
>
>AFAICT, all registers accessed in IRQ context are volatile, hence the
>register cache should not be involved at that point.
>
>The deadlock scenario indicated by lockdep actually points to the lock
>acquired by regcache_maple_exit(), which has been triggered during module
>unload operation, and the lock acquired by regcache_maple_write(), in the
>context of vop2_plane_atomic_update() called within the DRM stack.
>
>[   48.466666] -> (&mt->ma_lock){+...}-{2:2} {
>[   48.467066]    HARDIRQ-ON-W at:
>[   48.467360]                     lock_acquire+0x1d4/0x320
>[   48.467849]                     _raw_spin_lock+0x50/0x70
>[   48.468337]                     regcache_maple_exit+0x6c/0xe0
>[   48.468864]                     regcache_exit+0x8c/0xa8
>[   48.469344]                     regmap_exit+0x24/0x160
>[   48.469815]                     devm_regmap_release+0x1c/0x28
>[   48.470339]                     release_nodes+0x68/0xa8
>[   48.470818]                     devres_release_group+0x120/0x180
>[   48.471364]                     component_unbind+0x54/0x70
>[   48.471867]                     component_unbind_all+0xb0/0xe8
>[   48.472400]                     rockchip_drm_unbind+0x44/0x80 [rockchipdrm]
>[   48.473059]                     component_del+0xc8/0x158
>[   48.473545]                     dw_hdmi_rockchip_remove+0x28/0x40 [rockchipdrm]
>
>[...]
>
>[   48.482058]    INITIAL USE at:
>[   48.482344]                    lock_acquire+0x1d4/0x320
>[   48.482824]                    _raw_spin_lock+0x50/0x70
>[   48.483304]                    regcache_maple_write+0x27c/0x330
>[   48.483844]                    regcache_write+0x6c/0x88
>[   48.484323]                    _regmap_read+0x198/0x1c8
>[   48.484801]                    _regmap_update_bits+0xc0/0x148
>[   48.485327]                    regmap_field_update_bits_base+0x74/0xb0
>[   48.485919]                    vop2_plane_atomic_update+0x9e8/0x1490 [rockchipdrm]
>[   48.486631]                    drm_atomic_helper_commit_planes+0x190/0x2f8 [drm_kms_helper]
>
>I experimented with a reduced scope of this patch by limiting the use of
>the irq-safe lock to regcache_maple_exit() only, and I can confirm this 
>was enough to make lockdep happy.
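>
>Roughly like this (paraphrased sketch, not the exact diff):
>
>static int regcache_maple_exit(struct regmap *map)
>{
>	struct maple_tree *mt = map->cache;
>	MA_STATE(mas, mt, 0, UINT_MAX);
>	unsigned int *entry;
>	unsigned long flags;
>
>	/* if we've managed to allocate the root then it's initialised */
>	if (!mt)
>		return 0;
>
>	/* irqsave so this cannot deadlock against a cached write taken
>	 * from hardirq context, as in the trace above */
>	spin_lock_irqsave(&mt->ma_lock, flags);
>	mas_for_each(&mas, entry, UINT_MAX)
>		kfree(entry);
>	__mt_destroy(mt);
>	spin_unlock_irqrestore(&mt->ma_lock, flags);
>
>	kfree(mt);
>	map->cache = NULL;
>
>	return 0;
>}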
>
>> It might be safer to fall back to the rbtree cache for this device since
>> rbtree doesn't force an extra level of locking on us, though like I say
>> I'm not convinced that what the driver is doing with caching is a super
>> good idea. That said, it should probably work.
>
>I actually gave the flat cache a try on a Rock 3A board and didn't
>encounter any (obvious) issues, but my testing capabilities are rather
>limited at the moment.
>
>@Andy: Could you please shed some light on the topic, i.e. the rationale
>behind going for an rbtree cache over a flat one, since the latter would
>be better suited for MMIO devices?
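>
>For reference, the flat cache would only need something like:
>
>	.cache_type   = REGCACHE_FLAT,
>	.max_register = VOP2_MAX_REG,	/* placeholder */
>
>since it allocates a single array covering the whole register range at
>init time and never goes back to the allocator afterwards.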

I encountered a similar issue when I added support for rk3588 [0].

Now I can see this issue when rockchipdrm loads with:
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_LOCKDEP=y

But I can't reproduce this issue at unload (with cmd: rmmod rockchipdrm).
I need to take a deeper look to understand the details.


[0] https://patchwork.kernel.org/project/linux-rockchip/patch/20231217084415.2373043-1-andyshrk@163.com/



> 
>> My first thought here is that if we've got a regmap using spinlocks for
>> the regmap lock and a maple tree cache we should arrange things so that
>> the maple tree lock is used for the regmap's lock.  That would however
>> involve some unpleasant abstraction violation, and possibly some macro
>> fun since we'd need to elide the locking from the cache itself when
>> using the same lock at the regmap level.  I think that's going to be a
>> case of choosing the least unpleasant option.
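>> 
>> The shape of that idea, using the custom locking hooks that already
>> exist in regmap_config (illustrative only: the tree is created by the
>> cache after the regmap exists, hence the abstraction violation):
>> 
>> static void regmap_maple_lock(void *arg)
>> {
>> 	struct maple_tree *mt = arg;
>> 
>> 	spin_lock(&mt->ma_lock);
>> }
>> 
>> static void regmap_maple_unlock(void *arg)
>> {
>> 	struct maple_tree *mt = arg;
>> 
>> 	spin_unlock(&mt->ma_lock);
>> }
>> 
>> /* .lock = regmap_maple_lock, .unlock = regmap_maple_unlock,
>>  * .lock_arg = mt, which the core would have to install itself
>>  * once the maple cache has been initialised. */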
>
>Thanks, Mark, for the detailed feedback on this!
>
>Regards,
>Cristian
>
