lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 24 Apr 2023 22:11:58 +0300
From:   Fedor Pchelkin <pchelkin@...ras.ru>
To:     Toke Høiland-Jørgensen <toke@...e.dk>,
        Kalle Valo <kvalo@...nel.org>
Cc:     Kalle Valo <kvalo@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Senthil Balasubramanian <senthilkumar@...eros.com>,
        "John W. Linville" <linville@...driver.com>,
        Vasanthakumar Thiagarajan <vasanth@...eros.com>,
        Sujith <Sujith.Manoharan@...eros.com>,
        linux-wireless@...r.kernel.org, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Alexey Khoroshilov <khoroshilov@...ras.ru>,
        lvc-project@...uxtesting.org,
        syzbot+df61b36319e045c00a08@...kaller.appspotmail.com
Subject: Re: [PATCH 2/3] wifi: ath9k: fix races between ath9k_wmi_cmd and
 ath9k_wmi_ctrl_rx

This problem is realy subtle, I suppose. In the v2 commit info, which I'll
send in the next mail, the race condition is described which can lead to
invalid behaviour.

Couldn't reproduce that particular problem on real hardware, but if
force timeouts to wmi cmd completions, local KMSan catches some uninit
values.

The synchronization between ath9k_wmi_cmd and ath9k_wmi_ctrl_rx on
timeouts is good, especially after 8a2f35b98306 ("wifi: ath9k: Fix
potential stack-out-of-bounds write in ath9k_wmi_rsp_callback()").

And I think the only place where the fuzzer can provoke failure is when
wmi->last_seq_id in callback is checked before it is assigned zero inside
ath9k_wmi_cmd() during timeout exit. This scenario is more thoroughly
described in patch v2.

Well, the issue seems to be rare and I don't know how to properly test it
on real hardware.

I've made some checks on a basic driver workflow, and there weren't any
stalls or explicit failures, and the patch seems to close that tiny race
condition window. But, anyway, it requires more discussion.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ