Message-ID: <CA+ASDXMR-Aa9322QjUTxiD2zwXDUig1eyG7GAAJJDvuUg1AXdA@mail.gmail.com>
Date:   Tue, 2 Jun 2020 14:01:12 -0700
From:   Brian Norris <briannorris@...omium.org>
To:     Luis Chamberlain <mcgrof@...nel.org>
Cc:     jeyu@...nel.org, "David S. Miller" <davem@...emloft.net>,
        kuba@...nel.org, linux-wireless <linux-wireless@...r.kernel.org>,
        aquini@...hat.com, linux-doc@...r.kernel.org, peterz@...radead.org,
        Daniel Vetter <daniel.vetter@...ll.ch>,
        linux@...inikbrodowski.net,
        Linux Kernel <linux-kernel@...r.kernel.org>,
        Masahiro Yamada <yamada.masahiro@...ionext.com>,
        glider@...gle.com, GR-everest-linux-l2@...vell.com,
        mchehab+samsung@...nel.org, will@...nel.org,
        michael.chan@...adcom.com, Rob Herring <robh@...nel.org>,
        paulmck@...nel.org, bhe@...hat.com, corbet@....net,
        mchehab+huawei@...nel.org, ath10k <ath10k@...ts.infradead.org>,
        derosier@...il.com, Takashi Iwai <tiwai@...e.de>, mingo@...hat.com,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Sami Tolvanen <samitolvanen@...gle.com>, yzaikin@...gle.com,
        dyoung@...hat.com, pmladek@...e.com, elver@...gle.com,
        sburla@...vell.com, aelior@...vell.com,
        Kees Cook <keescook@...omium.org>,
        Arnd Bergmann <arnd@...db.de>, sfr@...b.auug.org.au,
        gpiccoli@...onical.com, Steven Rostedt <rostedt@...dmis.org>,
        fmanlunas@...vell.com, cai@....pw, tglx@...utronix.de,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Johannes Berg <johannes@...solutions.net>,
        Kalle Valo <kvalo@...eaurora.org>,
        "<netdev@...r.kernel.org>" <netdev@...r.kernel.org>,
        rdunlap@...radead.org, schlad@...e.de,
        Doug Anderson <dianders@...omium.org>, vkoul@...nel.org,
        mhiramat@...nel.org, Andrew Morton <akpm@...ux-foundation.org>,
        dchickles@...vell.com, bauerman@...ux.ibm.com
Subject: Re: [PATCH v3 5/8] ath10k: use new taint_firmware_crashed()

On Tue, May 26, 2020 at 7:58 AM Luis Chamberlain <mcgrof@...nel.org> wrote:
>
> This makes use of the new taint_firmware_crashed() to help
> annotate when firmware for device drivers crashes. When firmware
> crashes, devices can sometimes become unresponsive, and recovery
> sometimes requires a driver unload / reload and in the worst cases
> a reboot.

Just for the record, the underlying problem you seem to be complaining
about does not appear to be a firmware crash at all. It does happen to
result in a firmware crash report much later on (because when the PCIe
endpoint is this hosed, sooner or later the driver thinks the firmware
is dead), but it's not likely the root cause. More below.

> Using a taint flag allows us to annotate when this happens clearly.
>
> I have run into this situation with this driver with the latest
> firmware as of today, May 21, 2020 using v5.6.0, leaving me at
> a state at which my only option is to reboot. Driver removal and
> addition does not fix the situation. This is reported on kernel.org
> bugzilla korg#207851 [0].

I took a look, and replied there:
https://bugzilla.kernel.org/show_bug.cgi?id=207851#c2

Per the above, it seems more likely you have a PCI or power management
problem, not an ath10k or ath10k-firmware problem.

> But this isn't the first firmware crash reported,
> others have been filed before and none of these bugs have yet been
> addressed [1] [2] [3].  Including my own I see these firmware crash
> reports:

Yes, firmware does crash. Sometimes repeatedly. It also happens to be
closed source, so it's nearly impossible for the average Linux dev to
debug. But FWIW, those 3 all appear to be recoverable -- and then they
crash again a few minutes later. So just as claimed on prior
iterations of this patchset, ath10k is doing fine at recovery [*] --
it's "only" the firmware that's a problem. (And, if a WiFi firmware
doesn't like something in the RF environment...it's totally
understandable that the crash will happen more than once. Of course
that sucks, but it's not unexpected.) Crucially, rebooting won't
really do anything to help these people, AIUI.

Maybe what you really want is to taint the kernel every time a
non-free firmware is loaded ;)

I'd also note that those 3 reports are 3 years old. There have been
many ath10k-firmware updates since then, so it's not necessarily fair
to dig those back up. Also, bugzilla.kernel.org is totally ignored by
many linux-wireless@ folks. But I digress...

All in all, I have no interest in this proposal, for many of the
reasons already mentioned on previous iterations. It's way too coarse
and won't be useful in understanding what's going on in a system, IMO,
at least for ath10k. But it's also easy enough to ignore, so if it
makes somebody happy to claim a taint, then so be it.

Regards,
Brian

[*] Although, at least one of those doesn't appear to be as "clean" of
a recovery attempt as typical. Maybe there are some lurking driver
bugs in there too.


>   * korg#207851 [0]
>   * korg#197013 [1]
>   * korg#201237 [2]
>   * korg#195987 [3]
>
> [0] https://bugzilla.kernel.org/show_bug.cgi?id=207851
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=197013
> [2] https://bugzilla.kernel.org/show_bug.cgi?id=201237
> [3] https://bugzilla.kernel.org/show_bug.cgi?id=195987
