lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7cdc2d3f-df52-f7a9-15bf-fe4bc01d3c4f@amd.com>
Date:   Tue, 1 Feb 2022 08:56:17 +0100
From:   Christian König <christian.koenig@....com>
To:     Jia-Ju Bai <baijiaju1990@...il.com>, alexander.deucher@....com,
        Xinhui.Pan@....com, airlied@...ux.ie, daniel@...ll.ch
Cc:     amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [BUG] gpu: drm: radeon: two possible deadlocks involving locking
 and waiting

Hi Jia-Ju,

interesting that you have found those issues with an automated tool.

And yes that is a well design flaw within the radeon driver which can 
happen on hardware faults, e.g. when radeon_ring_backup() needs to be 
called.

But that happens so rarely and the driver is not developed further that 
we decided to not address this any more.

Regards,
Christian.

Am 01.02.22 um 08:40 schrieb Jia-Ju Bai:
> Hello,
>
> My static analysis tool reports a possible deadlock in the radeon 
> driver in Linux 5.16:
>
> #BUG 1
> radeon_dpm_change_power_state_locked()
>   mutex_lock(&rdev->ring_lock); --> Line 1133 (Lock A)
>   radeon_fence_wait_empty()
>     radeon_fence_wait_seq_timeout()
>       wait_event_timeout(rdev->fence_queue, ...) --> Line 504 (Wait X)
>
> radeon_ring_backup()
>   mutex_lock(&rdev->ring_lock); --> Line 289(Lock A)
>   radeon_fence_count_emitted()
>     radeon_fence_process()
>       wake_up_all(&rdev->fence_queue); --> Line 323 (Wake X)
>
> When radeon_dpm_change_power_state_locked() is executed, "Wait X" is 
> performed by holding "Lock A". If radeon_ring_backup() is executed at 
> this time, "Wake X" cannot be performed to wake up "Wait X" in 
> radeon_dpm_change_power_state_locked(), because "Lock A" has been 
> already hold by radeon_dpm_change_power_state_locked(), causing a 
> possible deadlock.
> I find that "Wait X" is performed with a timeout MAX_SCHEDULE_TIMEOUT, 
> to relieve the possible deadlock; but I think this timeout can cause 
> inefficient execution.
>
> #BUG 2
> radeon_ring_lock()
>   mutex_lock(&rdev->ring_lock); --> Line 147 (Lock A)
>   radeon_ring_alloc()
>     radeon_fence_wait_next()
>       radeon_fence_wait_seq_timeout()
>         wait_event_timeout(rdev->fence_queue, ...) --> Line 504 (Wait X)
>
> radeon_ring_backup()
>   mutex_lock(&rdev->ring_lock); --> Line 289(Lock A)
>   radeon_fence_count_emitted()
>     radeon_fence_process()
>       wake_up_all(&rdev->fence_queue); --> Line 323 (Wake X)
>
> When radeon_ring_lock() is executed, "Wait X" is performed by holding 
> "Lock A". If radeon_ring_backup() is executed at this time, "Wake X" 
> cannot be performed to wake up "Wait X" in radeon_ring_lock(), because 
> "Lock A" has been already hold by radeon_ring_lock(), causing a 
> possible deadlock.
> I find that "Wait X" is performed with a timeout MAX_SCHEDULE_TIMEOUT, 
> to relieve the possible deadlock; but I think this timeout can cause 
> inefficient execution.
>
> I am not quite sure whether these possible problems are real and how 
> to fix them if they are real.
> Any feedback would be appreciated, thanks :)
>
>
> Best wishes,
> Jia-Ju Bai
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ