lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 1 Feb 2022 15:40:19 +0800
From:   Jia-Ju Bai <baijiaju1990@...il.com>
To:     alexander.deucher@....com, christian.koenig@....com,
        Xinhui.Pan@....com, airlied@...ux.ie, daniel@...ll.ch
Cc:     amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: [BUG] gpu: drm: radeon: two possible deadlocks involving locking and
 waiting

Hello,

My static analysis tool reports a possible deadlock in the radeon driver 
in Linux 5.16:

#BUG 1
radeon_dpm_change_power_state_locked()
   mutex_lock(&rdev->ring_lock); --> Line 1133 (Lock A)
   radeon_fence_wait_empty()
     radeon_fence_wait_seq_timeout()
       wait_event_timeout(rdev->fence_queue, ...) --> Line 504 (Wait X)

radeon_ring_backup()
   mutex_lock(&rdev->ring_lock); --> Line 289(Lock A)
   radeon_fence_count_emitted()
     radeon_fence_process()
       wake_up_all(&rdev->fence_queue); --> Line 323 (Wake X)

When radeon_dpm_change_power_state_locked() is executed, "Wait X" is 
performed by holding "Lock A". If radeon_ring_backup() is executed at 
this time, "Wake X" cannot be performed to wake up "Wait X" in 
radeon_dpm_change_power_state_locked(), because "Lock A" has been 
already hold by radeon_dpm_change_power_state_locked(), causing a 
possible deadlock.
I find that "Wait X" is performed with a timeout MAX_SCHEDULE_TIMEOUT, 
to relieve the possible deadlock; but I think this timeout can cause 
inefficient execution.

#BUG 2
radeon_ring_lock()
   mutex_lock(&rdev->ring_lock); --> Line 147 (Lock A)
   radeon_ring_alloc()
     radeon_fence_wait_next()
       radeon_fence_wait_seq_timeout()
         wait_event_timeout(rdev->fence_queue, ...) --> Line 504 (Wait X)

radeon_ring_backup()
   mutex_lock(&rdev->ring_lock); --> Line 289(Lock A)
   radeon_fence_count_emitted()
     radeon_fence_process()
       wake_up_all(&rdev->fence_queue); --> Line 323 (Wake X)

When radeon_ring_lock() is executed, "Wait X" is performed by holding 
"Lock A". If radeon_ring_backup() is executed at this time, "Wake X" 
cannot be performed to wake up "Wait X" in radeon_ring_lock(), because 
"Lock A" has been already hold by radeon_ring_lock(), causing a possible 
deadlock.
I find that "Wait X" is performed with a timeout MAX_SCHEDULE_TIMEOUT, 
to relieve the possible deadlock; but I think this timeout can cause 
inefficient execution.

I am not quite sure whether these possible problems are real and how to 
fix them if they are real.
Any feedback would be appreciated, thanks :)


Best wishes,
Jia-Ju Bai

Powered by blists - more mailing lists