[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240315122527.344534-1-pl.dourneau@klervi.com>
Date: Fri, 15 Mar 2024 13:25:27 +0100
From: Pierre-Louis Dourneau <pl.dourneau@...rvi.com>
To: Ludovic.Desroches@...rochip.com
Cc: Pierre-Louis Dourneau <pl.dourneau@...rvi.com>,
sam@...nborg.org,
bbrezillon@...nel.org,
maarten.lankhorst@...ux.intel.com,
mripard@...nel.org,
tzimmermann@...e.de,
airlied@...il.com,
daniel@...ll.ch,
Nicolas.Ferre@...rochip.com,
alexandre.belloni@...tlin.com,
claudiu.beznea@...on.dev,
dri-devel@...ts.freedesktop.org,
linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org,
Manikandan.M@...rochip.com,
b.alcaina@...rvi.com,
a.lahache@...rvi.com,
n.georges@...rvi.com
Subject: Re: [PATCH] drm/atmel-hlcdc: Release CRTC commit when destroying plane state
On 3/12/24, Pierre-Louis Dourneau <pl.dourneau@...rvi.com> wrote:
> On 3/8/24, Ludovic.Desroches@...rochip.com <Ludovic.Desroches@...rochip.com> wrote:
> > This patch fixes the memory leak but introduces a crash on my side when
> > exiting a graphics app using the Microchip graphics library.
>
> We've tried to reproduce your crash with 6.1.22-linux4microchip-2023.04,
> to no avail. We'll try to upgrade to 6.1.55-linux4microchip-2023.10 (your
> version) and test again.
I was able to test a few more recent kernel versions[0] with the patch
applied. None yielded any crash, be it running Microchip's EGT samples[1]
or libdrm's modetest. Although, what I did manage to reproduce was a
refcount underflow similar to the one you had:
# modetest -M atmel-hlcdc -s 32:#0 -P 33@47:800x400@...4 -a
setting mode 1024x600-65.48Hz on connectors 32, crtc 47
testing 800x400@...4 on plane 33, crtc 47
[ 75.736699] ------------[ cut here ]------------
[ 75.741353] WARNING: CPU: 0 PID: 200 at lib/refcount.c:28 refcount_warn_saturate+0xf0/0x15c
[ 75.749731] refcount_t: underflow; use-after-free.
[ 75.754574] CPU: 0 PID: 200 Comm: modetest Not tainted 6.1.55-linux4microchip-2023.10 #4
[ 75.762915] Hardware name: Microchip SAM9X60
[ 75.767198] unwind_backtrace from show_stack+0x10/0x18
[ 75.772423] show_stack from dump_stack_lvl+0x28/0x34
[ 75.777479] dump_stack_lvl from __warn+0x8c/0xc0
[ 75.782187] __warn from warn_slowpath_fmt+0x74/0xa8
[ 75.787158] warn_slowpath_fmt from refcount_warn_saturate+0xf0/0x15c
[ 75.793611] refcount_warn_saturate from __drm_atomic_helper_plane_destroy_state+0xd0/0xd4
[ 75.801894] __drm_atomic_helper_plane_destroy_state from atmel_hlcdc_plane_atomic_destroy_state+0x38/0x48
[ 75.811573] atmel_hlcdc_plane_atomic_destroy_state from drm_atomic_state_default_clear+0x1c4/0x2fc
[ 75.820642] drm_atomic_state_default_clear from __drm_atomic_state_free+0x7c/0xb0
[ 75.828228] __drm_atomic_state_free from drm_mode_atomic_ioctl+0x868/0xb88
[ 75.835204] drm_mode_atomic_ioctl from drm_ioctl+0x200/0x3c4
[ 75.840960] drm_ioctl from sys_ioctl+0x240/0xb48
[ 75.845669] sys_ioctl from ret_fast_syscall+0x0/0x44
[ 75.850725] Exception stack(0xc8c91fa8 to 0xc8c91ff0)
[ 75.855794] 1fa0: 004b0ec0 00000003 00000003 c03864bc becd7bc8 becd7b98
[ 75.863992] 1fc0: 004b0ec0 00000003 becd7bc8 00000036 00000003 00000003 b6f22f80 00000400
[ 75.872183] 1fe0: b6f21e74 becd7a68 b6f07494 b6f61cc8
[ 75.877289] ---[ end trace 0000000000000000 ]---
Atomic Commit failed [1]
Same but without using the atomic uAPI (`-a` option removed):
# modetest -M atmel-hlcdc -s 32:#0 -P 33@47:800x400@...4
setting mode 1024x600-65.48Hz on connectors 32, crtc 47
testing 800x400@...4 overlay plane 33
failed to enable plane: Invalid argument
[ 98.542958] ------------[ cut here ]------------
[ 98.547547] WARNING: CPU: 0 PID: 28 at lib/refcount.c:28 refcount_warn_saturate+0xf0/0x15c
[ 98.555902] refcount_t: underflow; use-after-free.
[ 98.560698] CPU: 0 PID: 28 Comm: kworker/0:7 Tainted: G W 6.1.55-linux4microchip-2023.10 #6
[ 98.570695] Hardware name: Microchip SAM9X60
[ 98.574972] Workqueue: events drm_mode_rmfb_work_fn
[ 98.579859] unwind_backtrace from show_stack+0x10/0x18
[ 98.587615] show_stack from dump_stack_lvl+0x28/0x34
[ 98.595201] dump_stack_lvl from __warn+0x8c/0xc0
[ 98.602438] __warn from warn_slowpath_fmt+0x74/0xa8
[ 98.609937] warn_slowpath_fmt from refcount_warn_saturate+0xf0/0x15c
[ 98.618919] refcount_warn_saturate from __drm_atomic_helper_plane_destroy_state+0xd0/0xd4
[ 98.629740] __drm_atomic_helper_plane_destroy_state from atmel_hlcdc_plane_atomic_destroy_state+0x38/0x48
[ 98.641947] atmel_hlcdc_plane_atomic_destroy_state from drm_atomic_state_default_clear+0x1c4/0x2fc
[ 98.653545] drm_atomic_state_default_clear from __drm_atomic_state_free+0x7c/0xb0
[ 98.663660] __drm_atomic_state_free from drm_framebuffer_remove+0x48c/0x540
[ 98.673252] drm_framebuffer_remove from drm_mode_rmfb_work_fn+0x68/0x84
[ 98.682495] drm_mode_rmfb_work_fn from process_one_work+0x1b4/0x3f4
[ 98.691390] process_one_work from worker_thread+0x214/0x4e8
[ 98.699587] worker_thread from kthread+0xb4/0xd8
[ 98.706824] kthread from ret_from_fork+0x14/0x28
[ 98.714060] Exception stack(0xc88adfb0 to 0xc88adff8)
[ 98.719125] dfa0: 00000000 00000000 00000000 00000000
[ 98.727327] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 98.735520] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 98.742219] ---[ end trace 0000000000000000 ]---
The first one is not deterministic, you have to try a few times to trigger
it. The second one is a hit every time.
Same commands on a kernel without the patch don't report any underflow.
Note the commit in the first command also fails on a kernel without the
patch, which I guess is expected as plane 33 is the primary plane and I'm
trying set dimensions that do not match the size of the display. The commit
succeeds when invoking with the correct dimensions, but then I can't make
it produce an underflow. Same with the second command.
It seems to only trigger once per boot. Running the commands again does not
yield another underflow.
Looking at the disassembly, this is an underflow of the drm_crtc_commit's
refcount this time. In the warning you had, it was on a framebuffer object.
Anyway, I'll go back to the drawing board, study more closely the resource
release part of the driver. Thanks for having brought up the issues with
the patch.
[0]: Namely, linux4microchip-2023.10 (6.1.55), v6.8, and drm-next-2023-03-13
[1]: https://github.com/linux4sam/egt/tree/master/examples
Pierre-Louis
Powered by blists - more mailing lists