Date:   Fri, 28 May 2021 01:47:51 +0300
From:   Dmitry Osipenko <digetx@...il.com>
To:     Arend van Spriel <arend.vanspriel@...adcom.com>,
        Franky Lin <franky.lin@...adcom.com>,
        Hante Meuleman <hante.meuleman@...adcom.com>,
        Chi-Hsien Lin <chi-hsien.lin@...ress.com>,
        Wright Feng <wright.feng@...ress.com>,
        Kalle Valo <kvalo@...eaurora.org>
Cc:     "linux-wireless@...r.kernel.org" <linux-wireless@...r.kernel.org>,
        "brcm80211-dev-list.pdl@...adcom.com" 
        <brcm80211-dev-list.pdl@...adcom.com>,
        "brcm80211-dev-list@...ress.com" <brcm80211-dev-list@...ress.com>,
        netdev <netdev@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [BUG] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout (WiFi
 dies)

27.05.2021 19:42, Arend van Spriel wrote:
> On 5/26/2021 5:10 PM, Dmitry Osipenko wrote:
>> Hello,
>>
>> After updating to Ubuntu 21.04 I found two problems related to the
>> BRCMF_C_GET_ASSOCLIST using an older BCM4329 SDIO WiFi.
>>
>> 1. The kernel is spammed with:
>>
>>   ieee80211 phy0: brcmf_cfg80211_dump_station: BRCMF_C_GET_ASSOCLIST
>> unsupported, err=-52
>>   ieee80211 phy0: brcmf_cfg80211_dump_station: BRCMF_C_GET_ASSOCLIST
>> unsupported, err=-52
>>   ieee80211 phy0: brcmf_cfg80211_dump_station: BRCMF_C_GET_ASSOCLIST
>> unsupported, err=-52
>>
>> Which happens apparently due to a newer NetworkManager version that
>> pokes dump_station() periodically. I sent [1] that fixes this noise.
>>
>> [1]
>> https://patchwork.kernel.org/project/linux-wireless/list/?series=480715
> 
> Right. I noticed this one and did not have anything to add to the
> review/suggestion.

Please feel free to add your r-b to the patches if they look good to you.

>> 2. The other much worse problem is that WiFi eventually dies now with
>> these errors:
>>
>> ...
>>   ieee80211 phy0: brcmf_cfg80211_dump_station: BRCMF_C_GET_ASSOCLIST
>> unsupported, err=-52
>>   brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
>>   ieee80211 phy0: brcmf_cfg80211_dump_station: BRCMF_C_GET_ASSOCLIST
>> unsupported, err=-110
>>   ieee80211 phy0: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg
>> failed w/status -110
>>
>>  From this point all firmware calls start to fail with err=-110 and
>> WiFi doesn't work anymore. This problem is reproducible with 5.13-rc
>> and current -next, I haven't checked older kernel versions. Somehow
>> it's worse using a recent -next, WiFi dies quicker.
>>
>> What's interesting is that I see that there is always a pending signal
>> in brcmf_sdio_dcmd_resp_wait() when timeout happens. It looks like the
>> timeout happens when there is access to a swap partition, which stalls
>> system for a second or two, but this is not 100%. Increasing
>> DCMD_RESP_TIMEOUT doesn't help.
> 
> The timeout error (-110) can have two root causes that I am aware of.
> Either the firmware died or the SDIO layer has gone haywire. Not sure if
> that swap partition is on eMMC device, but if so it could be related.
> You could try generating device coredump. If that also gives -110 errors
> we know it is the SDIO layer.

A coredump is a good idea, thank you. The swap partition is on an external
SD card; everything else is on eMMC.

>> Please let me know if you have any ideas of how to fix this trouble
>> properly or if you need any more info.
>>
>> Removing BRCMF_C_GET_ASSOCLIST firmware call entirely from the driver
>> fixes the problem.
> 
> My guess is that reducing interaction with firmware is what is avoiding
> the issue and not so much this specific firmware command. As always it
> is good to know the conditions in which the issue occurs. What is the
> hardware platform you are running Ubuntu on? Stuff like that.

That's an older Acer A500 NVIDIA Tegra20 tablet device [1]. I may also
try to reproduce the problem on a Tegra30 Nexus 7 with BCM4330.

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/tegra20-acer-a500-picasso.dts

Thank you very much for the suggestions. I will try to collect more info
and come back with a report.
