linux-kernel - Re: [BUG] bus: mhi: possible ABBA deadlock in mhi_pm_m0_transition() and mhi_send

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20210716064512.GE3323@workstation>
Date:   Fri, 16 Jul 2021 12:15:12 +0530
From:   Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
To:     Bhaumik Bhatt <bbhatt@...eaurora.org>
Cc:     Jia-Ju Bai <baijiaju1990@...il.com>, hemantk@...eaurora.org,
        linux-arm-msm@...r.kernel.org,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [BUG] bus: mhi: possible ABBA deadlock in mhi_pm_m0_transition()
 and mhi_send_cmd()

On Thu, Jul 15, 2021 at 09:57:22AM -0700, Bhaumik Bhatt wrote:
> On 2021-07-15 02:45 AM, Jia-Ju Bai wrote:
> > Hello,
> > 
> > I find there is a possible ABBA deadlock in the MHI driver in Linux
> > 5.10:
> > 
> > In mhi_pm_m0_transition():
> > 262:     read_lock_bh(&mhi_cntrl->pm_lock);
> > 281:     spin_lock_irq(&mhi_cmd->lock);
> > 
> > In mhi_send_cmd():
> > 1181:   spin_lock_bh(&mhi_cmd->lock);
> > 1207:   read_lock_bh(&mhi_cntrl->pm_lock);
> > 
> > When mhi_pm_m0_transition() and mhi_send_cmd() are concurrently
> > executed, the deadlock can occur.
> > 
> > I check the code and find a possible case of such concurrent execution:
> > 
> > #CPU1:
> > mhi_poll (mhi_event->process_event(...))
> >   mhi_process_ctrl_ev_ring
> >     mhi_pm_m0_transition
> > 
> > #CPU2:
> > mhi_prepare_for_transfer
> >   mhi_prepare_channel
> >     mhi_send_cmd
> > 
> > Note that mhi_poll() and mhi_prepare_for_transfer() are both exported
> > by EXPORT_SYMBOL.
> > Thus, I guess these two functions could be concurrently called by a MHI
> > driver.
> > 
> > I am not quite sure whether this possible deadlock is real and how to
> > fix it if it is real.
> > Any feedback would be appreciated, thanks :)
> > 
> > 
> > Best wishes,
> > Jia-Ju Bai
> 
> Few pointers from your example:
> 
> 1. mhi_poll() is currently not used by any client upstream yet.

Then this shouldn't be added in first place... :/

> 2. Polling is not to be used for single event ring (shared control + data)
> cases
> since it is meant to be for client drivers with dedicated data packets only.
> 3. mhi_send_cmd() will always be called after an mhi_pm_m0_transition() has
> completed by design since we wait for the device to be held in M0 prior to
> it.
> 

But client can be unloaded during M0 event!

Anyway, I don't think the deadlock scenario is valid because of the usage
of "read_lock_bh()". So if "mhi_send_cmd()" has acquired
"spin_lock_bh(&mhi_cmd->lock)", it can always acquire
"read_lock_bh(&mhi_cntrl->pm_lock)" as multiple readers can acquire the
read lock.

Deadlock would only occur if one of the functions take write lock.

Thanks for auditing.

Regards,
Mani

> Would like to know what Mani and Hemant have to say. I don't think we can
> run in
> to the scenario from your example so we should be safe.
> 
> Thanks,
> Bhaumik
> ---
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project