linux-kernel - Re: [PATCH] scsi: ufs: Fix a possible dead lock in clock scaling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <644dcd92-25ae-e951-d9f3-607306a02370@acm.org>
Date:   Wed, 29 Sep 2021 11:15:18 -0700
From:   Bart Van Assche <bvanassche@....org>
To:     Can Guo <cang@...eaurora.org>
Cc:     asutoshd@...eaurora.org, nguyenb@...eaurora.org,
        hongwus@...eaurora.org, ziqichen@...eaurora.org,
        linux-scsi@...r.kernel.org, kernel-team@...roid.com,
        Alim Akhtar <alim.akhtar@...sung.com>,
        Avri Altman <avri.altman@....com>,
        "James E.J. Bottomley" <jejb@...ux.ibm.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Bean Huo <beanhuo@...ron.com>,
        Stanley Chu <stanley.chu@...iatek.com>,
        Jaegeuk Kim <jaegeuk@...nel.org>,
        Adrian Hunter <adrian.hunter@...el.com>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] scsi: ufs: Fix a possible dead lock in clock scaling

On 9/28/21 8:31 PM, Can Guo wrote:
> On 2021-09-18 01:27, Bart Van Assche wrote:
>> On 9/16/21 6:51 PM, Can Guo wrote:
>>> Assume a scenario where task A and B call ufshcd_devfreq_scale()
>>> simultaneously. After task B calls downgrade_write() [1], but before it
>>> calls down_read() [3], if task A calls down_write() [2], when task B calls
>>> down_read() [3], it will lead to dead lock.
>>
>> Something is wrong with the above description. The downgrade_write() call is
>> not followed by down_read() but by up_read(). Additionally, I don't see how
>> concurrent calls of ufshcd_devfreq_scale() could lead to a deadlock.
> 
> As mentioned in the commit msg, the down_read() [3] is from ufshcd_wb_ctrl().
> 
> Task A -
> down_write [2]
> ufshcd_clock_scaling_prepare
> ufshcd_devfreq_scale
> ufshcd_clkscale_enable_store
> 
> Task B -
> down_read [3]
> ufshcd_exec_dev_cmd
> ufshcd_query_flag
> ufshcd_wb_ctrl
> downgrade_write [1]
> ufshcd_devfreq_scale
> ufshcd_devfreq_target
> devfreq_set_target
> update_devfreq
> devfreq_performance_handler
> governor_store
> 
> 
>> If one thread calls downgrade_write() and another thread calls down_write()
>> immediately, that down_write() call will block until the other thread has called up_read()
>> without triggering a deadlock.
> 
> Since the down_write() caller is blocked, the down_read() caller, which comes after
> down_write(), is blocked too, no? downgrade_write() keeps lock owner as it is, but
> it does not change the fact that readers and writers can be blocked by each other.

Please use the upstream function names when posting upstream patches. I think that
ufshcd_wb_ctrl() has been renamed into ufshcd_wb_toggle().

So the deadlock is caused by nested locking - one task holding a reader lock, another
task calling down_write() and next the first task grabbing the reader lock recursively?
I prefer one of the following two solutions above the patch that has been posted since
I expect that both alternatives will result in easier to maintain UFS code:
- Fix the down_read() implementation. Making down_read() wait in case of nested locking
   seems wrong to me.
- Modify the UFS driver such that it does not lock hba->clk_scaling_lock recursively.

Thanks,

Bart.