Message-ID: <20220207064504.GC1905@thinkpad>
Date: Mon, 7 Feb 2022 12:15:04 +0530
From: Manivannan Sadhasivam <mani@...nel.org>
To: Daniel Thompson <daniel.thompson@...aro.org>
Cc: Jia-Ju Bai <baijiaju1990@...il.com>, hemantk@...eaurora.org,
bbhatt@...eaurora.org, loic.poulain@...aro.org,
jhugo@...eaurora.org, linux-arm-msm@...r.kernel.org,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [BUG] bus: mhi: possible deadlock in mhi_pm_disable_transition() and mhi_async_power_up()

On Tue, Feb 01, 2022 at 05:15:40PM +0000, Daniel Thompson wrote:
> On Sat, Jan 29, 2022 at 10:56:30AM +0800, Jia-Ju Bai wrote:
> > Hello,
> >
> > My static analysis tool reports a possible deadlock in the mhi driver in
> > Linux 5.10:
> >
> > mhi_async_power_up()
> >   mutex_lock(&mhi_cntrl->pm_mutex); --> Line 933 (Lock A)
> >   wait_event_timeout(mhi_cntrl->state_event, ...) --> Line 985 (Wait X)
> >   mutex_unlock(&mhi_cntrl->pm_mutex); --> Line 1040 (Unlock A)
> >
> > mhi_pm_disable_transition()
> >   mutex_lock(&mhi_cntrl->pm_mutex); --> Line 463 (Lock A)
> >   wake_up_all(&mhi_cntrl->state_event); --> Line 474 (Wake X)
> >   mutex_unlock(&mhi_cntrl->pm_mutex); --> Line 524 (Unlock A)
> >   wake_up_all(&mhi_cntrl->state_event); --> Line 526 (Wake X)
> >
> > When mhi_async_power_up() is executed, "Wait X" is performed by holding
> > "Lock A". If mhi_pm_disable_transition() is concurrently executed at this
> > time, "Wake X" cannot be performed to wake up "Wait X" in
> > mhi_async_power_up(), because "Lock A" is already held by
> > mhi_async_power_up(), causing a possible deadlock.
> > I find that "Wait X" is performed with a timeout, to relieve the possible
> > deadlock; but I think this timeout can cause inefficient execution.
> >
> > I am not quite sure whether this possible problem is real and how to fix it
> > if it is real.
> > Any feedback would be appreciated, thanks :)
>
> Interesting find but I think it would be better to run your tool
> against more recent kernels to confirm any problem reports. In this
> case the code you mention looks like it was removed in v5.17-rc1
> (and should eventually make its way to the stable kernels too).
>
Hmm, looks like the commit didn't apply cleanly to 5.10:
https://www.spinics.net/lists/stable/msg526754.html
Let me send the fixed-up version.
Thanks,
Mani
>
> Daniel.