Message-ID: <CAOiHx=n7EwK2B9CnBR07FVA=sEzFagb8TkS4XC_qBNq8OwcYUg@mail.gmail.com>
Date:   Wed, 22 Mar 2023 21:52:27 +0100
From:   Jonas Gorski <jonas.gorski@...il.com>
To:     Larry Finger <Larry.Finger@...inger.net>
Cc:     Hyeonggon Yoo <42.hyeyoo@...il.com>, netdev@...r.kernel.org,
        linux-wireless@...r.kernel.org, Ping-Ke Shih <pkshih@...ltek.com>
Subject: Re: [BUG v6.2.7] Hitting BUG_ON() on rtw89 wireless driver startup

On Wed, 22 Mar 2023 at 18:03, Larry Finger <Larry.Finger@...inger.net> wrote:
>
> On 3/22/23 10:54, Hyeonggon Yoo wrote:
> >
> > Hello folks,
> > I've just encountered a weird bug when booting Linux v6.2.7.
> >
> > config: attached
> > dmesg: attached
> >
> > I'm not sure exactly how to trigger this issue yet because it's not
> > reliably reproducible (I have only hit it at random when logging in).
> >
> > At a quick look, it seems to be related to the rtw89 wireless driver or
> > the network subsystem.
>
> Your bug is weird indeed, and it does come from rtw89_8852be. My distro has not
> yet released kernel 6.2.7, but I have not seen this problem with mainline
> kernels throughout the 6.2 or 6.3 development series.

Looking at the rtw89 driver's probe function, the bug is probably a
simple race condition:

int rtw89_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
    ...
    ret = rtw89_core_register(rtwdev); /* calls ieee80211_register_hw() */
    ...
    rtw89_core_napi_init(rtwdev);
    ...
}

So it registers the wifi device first, making it visible to userspace,
and only then initializes NAPI.

That leaves a window in which a fast userspace may already try to
interact with the device before the driver has gotten around to
initializing the NAPI parts, and then it explodes. At least that is my
theory for the issue.

Switching the order of these two functions should avoid it in theory,
as long as rtw89_core_napi_init() doesn't depend on anything
rtw89_core_register() does.
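
To make the ordering argument concrete, here is a small userspace toy
model in plain C. It is only a sketch of the theory above, not rtw89 or
mac80211 code: struct toy_dev, toy_register(), toy_napi_init() and the
"userspace" hook are all hypothetical names, and the hook firing
synchronously inside registration is an assumption standing in for a
fast userspace that wins the race.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for driver state; NOT the real struct rtw89_dev. */
struct toy_dev {
    bool napi_ready;  /* set once the (toy) NAPI init has run */
    bool registered;  /* set once the device is visible to "userspace" */
};

/* Hypothetical callback modelling userspace touching the device. */
typedef bool (*toy_user_hook)(const struct toy_dev *dev);

/* Returns false if userspace would hit uninitialized NAPI state. */
bool toy_userspace_open(const struct toy_dev *dev)
{
    return dev->napi_ready;
}

/*
 * Toy registration: the device becomes visible, and we assume the
 * worst case where userspace reacts immediately.
 */
bool toy_register(struct toy_dev *dev, toy_user_hook hook)
{
    assert(dev != NULL);
    dev->registered = true;
    return hook ? hook(dev) : true;
}

void toy_napi_init(struct toy_dev *dev)
{
    dev->napi_ready = true;
}

/* Order as in the probe function quoted above: register, then NAPI. */
bool toy_probe_register_first(struct toy_dev *dev)
{
    bool ok = toy_register(dev, toy_userspace_open);

    toy_napi_init(dev);
    return ok;
}

/* Suggested order: NAPI first, so registration exposes a ready device. */
bool toy_probe_napi_first(struct toy_dev *dev)
{
    toy_napi_init(dev);
    return toy_register(dev, toy_userspace_open);
}
```

In this model toy_probe_register_first() reports a failed access while
toy_probe_napi_first() does not, even though both end up with a fully
initialized device. Whether the real driver can be reordered this way
still depends on the caveat above about what rtw89_core_napi_init()
needs from rtw89_core_register().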

FWIW, registering the irq handler only after registering the device
also seems suspect, and should probably also happen before that.

Regards
Jonas
