Message-ID: <e8c049ce-07e1-8b34-678d-41b3d6d41983@broadcom.com>
Date: Sun, 26 May 2019 20:42:21 +0200
From: Arend Van Spriel <arend.vanspriel@...adcom.com>
To: Brian Masney <masneyb@...tation.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Franky Lin <franky.lin@...adcom.com>,
Hante Meuleman <hante.meuleman@...adcom.com>,
Chi-Hsien Lin <chi-hsien.lin@...ress.com>,
Wright Feng <wright.feng@...ress.com>
Cc: ulf.hansson@...aro.org, faiz_abbas@...com,
linux-mmc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-arm-msm@...r.kernel.org, Kalle Valo <kvalo@...eaurora.org>,
linux-wireless@...r.kernel.org,
brcm80211-dev-list.pdl@...adcom.com,
brcm80211-dev-list@...ress.com, netdev@...r.kernel.org
Subject: Re: Issue with Broadcom wireless in 5.2rc1 (was Re: [PATCH] mmc:
sdhci: queue work after sdhci_defer_done())

On 5/26/2019 2:21 PM, Brian Masney wrote:
> + Broadcom wireless maintainers
>
> On Fri, May 24, 2019 at 11:49:58AM -0400, Brian Masney wrote:
>> On Fri, May 24, 2019 at 03:17:13PM +0300, Adrian Hunter wrote:
>>> On 24/05/19 2:10 PM, Brian Masney wrote:
>>>> WiFi stopped working on the LG Nexus 5 phone and the issue was bisected
>>>> to the commit c07a48c26519 ("mmc: sdhci: Remove finish_tasklet") that
>>>> moved from using a tasklet to a work queue. That patch also changed
>>>> sdhci_irq() to return IRQ_WAKE_THREAD instead of finishing the work when
>>>> sdhci_defer_done() is true. Change it to queue work to the complete work
>>>> queue if sdhci_defer_done() is true so that the functionality is
>>>> equivalent to what was there when the finish_tasklet was present. This
>>>> corrects the WiFi breakage on the Nexus 5 phone.
>>>>
>>>> Signed-off-by: Brian Masney <masneyb@...tation.org>
>>>> Fixes: c07a48c26519 ("mmc: sdhci: Remove finish_tasklet")
>>>> ---
>>>> [ ... ]
>>>>
>>>> drivers/mmc/host/sdhci.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
>>>> index 97158344b862..3563c3bc57c9 100644
>>>> --- a/drivers/mmc/host/sdhci.c
>>>> +++ b/drivers/mmc/host/sdhci.c
>>>> @@ -3115,7 +3115,7 @@ static irqreturn_t sdhci_irq(int irq, void *dev_id)
>>>> continue;
>>>>
>>>> if (sdhci_defer_done(host, mrq)) {
>>>> - result = IRQ_WAKE_THREAD;
>>>> + queue_work(host->complete_wq, &host->complete_work);
>>>
>>> The IRQ thread has a lot less latency than the work queue, which is why it
>>> is done that way.
>>>
>>> I am not sure why you say this change is equivalent to what was there
>>> before, nor why it fixes your problem.
>>>
>>> Can you explain some more?
>>
>> [ ... ]
>>
>> drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c calls
>> sdio_claim_host() and it appears to never return.
>
> When the brcmfmac driver is loaded, the firmware is requested from disk,
> and that's when the deadlock occurs in 5.2rc1. Specifically:
>
> 1) brcmf_sdio_download_firmware() in
> drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c calls
> sdio_claim_host()
>
> 2) brcmf_sdio_firmware_callback() is called and brcmf_sdiod_ramrw()
> tries to claim the host, but has to wait since it's already claimed
> in #1 and the deadlock occurs.

This does not make any sense to me. brcmf_sdio_download_firmware() is
called from brcmf_sdio_firmware_callback(), so they are in the same
context. So #2 is not waiting for #1 but for something else, I would say.
Also, #2 calls sdio_claim_host() after brcmf_sdio_download_firmware() has
completed, so it is definitely not waiting for #1.

> I tried to release the host before the firmware is requested, however
> parts of brcmf_chip_set_active() need the host to be claimed, and a
> similar deadlock occurs in brcmf_sdiod_ramrw() if I claim the host
> before calling brcmf_chip_set_active().
>
> I started to look at moving the sdio_{claim,release}_host() calls out of
> brcmf_sdiod_ramrw() but there's a fair number of callers, so I'd like to
> get feedback about the best course of action here.

Long ago Franky reworked the sdio critical sections requiring sdio
claim/release and I am pretty sure they are correct.

Could you try with a lockdep-enabled kernel and see if that brings any
more information? In the meantime I will update my dev branch to 5.2-rc1
and see if I can find any clues.
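
For the lockdep run, something along these lines in the kernel config
should be enough. This is only a sketch of the debug options I would
start with, not a verified config for the Nexus 5. The hung-task
detector is my own addition here, on the assumption that a task stuck
in sdio_claim_host() would show up there with its stack even if lockdep
itself stays quiet.

```
# Lock dependency validator (pulls in the lockdep core)
CONFIG_PROVE_LOCKING=y
# Warn when code sleeps in atomic context
CONFIG_DEBUG_ATOMIC_SLEEP=y
# Assumption: dump the stack of tasks blocked in uninterruptible sleep
CONFIG_DETECT_HUNG_TASK=y
```

With that built in, the dmesg output from around the time brcmfmac
loads its firmware is the part I would look at first.
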
Regards,
Arend