linux-kernel - dealock in drm_fb_helper_damage

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4Y+bJiZgT1W4JY+X=aZjbg8+X2fw7j2pxH_Hke_yn7R0Qnw@mail.gmail.com>
Date:   Sun, 13 Nov 2022 21:42:49 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Dave Airlie <airlied@...il.com>, Daniel Vetter <daniel@...ll.ch>,
        DRI <dri-devel@...ts.freedesktop.org>,
        Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
        Maxime Ripard <mripard@...nel.org>,
        Thomas Zimmermann <tzimmermann@...e.de>,
        LKML <linux-kernel@...r.kernel.org>
Subject: dealock in drm_fb_helper_damage_work

Hi,

I am getting the following deadlock on reservation_ww_class_mutex
while trying to boot next-20221111 kernel:

============================================
WARNING: possible recursive locking detected
6.1.0-rc4-next-20221111 #193 Not tainted
--------------------------------------------
kworker/4:1/81 is trying to acquire lock:
ffff88812ebe89a8 (reservation_ww_class_mutex){+.+.}-{3:3}, at:
dma_resv_lock_interruptible include/linux/dma-resv.h:372 [inline]
ffff88812ebe89a8 (reservation_ww_class_mutex){+.+.}-{3:3}, at:
ttm_bo_reserve include/drm/ttm/ttm_bo_driver.h:121 [inline]
ffff88812ebe89a8 (reservation_ww_class_mutex){+.+.}-{3:3}, at:
drm_gem_vram_vmap+0xa4/0x590 drivers/gpu/drm/drm_gem_vram_helper.c:436

but task is already holding lock:
ffff88812ebe89a8 (reservation_ww_class_mutex){+.+.}-{3:3}, at:
dma_resv_lock include/linux/dma-resv.h:345 [inline]
ffff88812ebe89a8 (reservation_ww_class_mutex){+.+.}-{3:3}, at:
drm_gem_vmap_unlocked+0x3f/0xa0 drivers/gpu/drm/drm_gem.c:1195

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(reservation_ww_class_mutex);
  lock(reservation_ww_class_mutex);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

4 locks held by kworker/4:1/81:
 #0: ffff888100078d38 ((wq_completion)events){+.+.}-{0:0}, at:
arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888100078d38 ((wq_completion)events){+.+.}-{0:0}, at:
arch_atomic_long_set include/linux/atomic/atomic-long.h:41 [inline]
 #0: ffff888100078d38 ((wq_completion)events){+.+.}-{0:0}, at:
atomic_long_set include/linux/atomic/atomic-instrumented.h:1280
[inline]
 #0: ffff888100078d38 ((wq_completion)events){+.+.}-{0:0}, at:
set_work_data kernel/workqueue.c:636 [inline]
 #0: ffff888100078d38 ((wq_completion)events){+.+.}-{0:0}, at:
set_work_pool_and_clear_pending kernel/workqueue.c:663 [inline]
 #0: ffff888100078d38 ((wq_completion)events){+.+.}-{0:0}, at:
process_one_work+0x8e4/0x1720 kernel/workqueue.c:2260
 #1: ffffc9000694fda0
((work_completion)(&helper->damage_work)){+.+.}-{0:0}, at:
process_one_work+0x918/0x1720 kernel/workqueue.c:2264
 #2: ffff88812ebe8278 (&helper->lock){+.+.}-{3:3}, at:
drm_fbdev_damage_blit drivers/gpu/drm/drm_fbdev_generic.c:312 [inline]
 #2: ffff88812ebe8278 (&helper->lock){+.+.}-{3:3}, at:
drm_fbdev_fb_dirty+0x30e/0xcd0 drivers/gpu/drm/drm_fbdev_generic.c:342
 #3: ffff88812ebe89a8 (reservation_ww_class_mutex){+.+.}-{3:3}, at:
dma_resv_lock include/linux/dma-resv.h:345 [inline]
 #3: ffff88812ebe89a8 (reservation_ww_class_mutex){+.+.}-{3:3}, at:
drm_gem_vmap_unlocked+0x3f/0xa0 drivers/gpu/drm/drm_gem.c:1195

stack backtrace:
CPU: 4 PID: 81 Comm: kworker/4:1 Not tainted 6.1.0-rc4-next-20221111 #193
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
Workqueue: events drm_fb_helper_damage_work
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x100/0x178 lib/dump_stack.c:106
 print_deadlock_bug kernel/locking/lockdep.c:2990 [inline]
 check_deadlock kernel/locking/lockdep.c:3033 [inline]
 validate_chain kernel/locking/lockdep.c:3818 [inline]
 __lock_acquire.cold+0x119/0x3b9 kernel/locking/lockdep.c:5055
 lock_acquire kernel/locking/lockdep.c:5668 [inline]
 lock_acquire+0x1e0/0x610 kernel/locking/lockdep.c:5633
 __mutex_lock_common kernel/locking/mutex.c:603 [inline]
 __ww_mutex_lock.constprop.0+0x1ba/0x2ee0 kernel/locking/mutex.c:754
 ww_mutex_lock_interruptible+0x37/0x140 kernel/locking/mutex.c:886
 dma_resv_lock_interruptible include/linux/dma-resv.h:372 [inline]
 ttm_bo_reserve include/drm/ttm/ttm_bo_driver.h:121 [inline]
 drm_gem_vram_vmap+0xa4/0x590 drivers/gpu/drm/drm_gem_vram_helper.c:436
 drm_gem_vmap+0xc5/0x1b0 drivers/gpu/drm/drm_gem.c:1166
 drm_gem_vmap_unlocked+0x4a/0xa0 drivers/gpu/drm/drm_gem.c:1196
 drm_client_buffer_vmap+0x45/0xd0 drivers/gpu/drm/drm_client.c:326
 drm_fbdev_damage_blit drivers/gpu/drm/drm_fbdev_generic.c:314 [inline]
 drm_fbdev_fb_dirty+0x31e/0xcd0 drivers/gpu/drm/drm_fbdev_generic.c:342
 drm_fb_helper_damage_work+0x27a/0x5d0 drivers/gpu/drm/drm_fb_helper.c:388
 process_one_work+0xa33/0x1720 kernel/workqueue.c:2289
 worker_thread+0x67d/0x10e0 kernel/workqueue.c:2436
 kthread+0x2e4/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
 </TASK>

The config is:
https://gist.githubusercontent.com/dvyukov/2897b21db809075a22db0370c495ed2d/raw/9b2535b2ba77bb57e4f1ba2b909ad4075b6e2c6a/gistfile1.txt

Qemu command line:
qemu-system-x86_64 -enable-kvm -machine q35,nvdimm -cpu
max,migratable=off -smp 18 \
-m 72G -hda buildroot-amd64-2021.08 -kernel arch/x86/boot/bzImage -nographic \
-net user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net nic,model=virtio-net-pci \
-append "console=ttyS0 root=/dev/sda1 earlyprintk=serial rodata=n \
oops=panic panic_on_warn=1 panic=86400 coredump_filter=0xffff"