[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <e3962839410ba396a21edf8a6c481ec42ada1bdc.camel@gmail.com>
Date: Fri, 27 Nov 2020 03:31:05 +0900
From: Tsuchiya Yuto <kitakar@...il.com>
To: Brian Norris <briannorris@...omium.org>
Cc: Amitkumar Karwar <amitkarwar@...il.com>,
Ganapathi Bhat <ganapathi.bhat@....com>,
Xinming Hu <huxinming820@...il.com>,
Kalle Valo <kvalo@...eaurora.org>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
linux-wireless <linux-wireless@...r.kernel.org>,
"<netdev@...r.kernel.org>" <netdev@...r.kernel.org>,
Linux Kernel <linux-kernel@...r.kernel.org>,
Maximilian Luz <luzmaximilian@...il.com>,
Andy Shevchenko <andriy.shevchenko@...el.com>, verdre@...d.nl
Subject: Re: [PATCH] mwifiex: pcie: add enable_device_dump module parameter
On Fri, 2020-11-20 at 12:53 -0800, Brian Norris wrote:
> (Sorry if anything's a bit slow here. I don't really have time to
> write out full proposals myself.)
Don’t worry, I (and other owners) am already glad to hear from upstream
devs, including you :)
(and I'm also sorry for the late reply from me. It was difficult to find
spare time these days.)
> On Fri, Oct 30, 2020 at 3:30 AM Tsuchiya Yuto <kitakar@...il.com> wrote:
> > Let me know if splitting this patch like this works. 1) The first patch
> > is to add this module parameter but don't change the default behavior.
>
> That *could* be OK with me, although it's already been said that there
> are many people who dislike extra module parameters. I also don't see
> why this needs to be a module parameter at all. If you do #2 right,
> you don't really need this, as there are already several standard ways
> of doing this (e.g., via Kconfig, or via nl80211 on a per-device
> basis).
>
> > 2) The second patch is to change the parameter value depending on the
> > DMI matching or something so that it doesn't break the existing users.
>
> Point 2 sounds good, and this is the key point. Note that you can do
> point 2 without making it a module parameter. Just keep a flag in the
> driver-private structures.
I chose to also provide the module parameter because I thought it would
be a little bit complicated for users to re-enable this dump feature if
I chose not to provide this parameter.
If I don't provide the parameter but still want to allow users to
re-enable this dump feature, we can provide a way to change the bit flags
of quirks (via a new debugfs entry or another module parameter). That
said, is there a way to easily change the quirk flags only for this dump
feature?
For example, assume that the following three quirks are enabled by default:
/* quirks */
#define QUIRK_FW_RST_D3COLD BIT(0)
#define QUIRK_NO_DEVICE_DUMP BIT(1)
#define QUIRK_FOO BIT(2)
.driver_data = (void *)(QUIRK_FW_RST_D3COLD |
QUIRK_NO_DEVICE_DUMP |
QUIRK_FOO),
card->quirks = (uintptr_t)dmi_id->driver_data;
/* and assume that card->quirks can be changed by users by a file named
* "quirks" under debugfs.
*/
So, the card->quirks will be "7" in decimal by default. Then, if users
want to re-enable the dump feature, as far as I know, users need to know
the default value "7", then need to know that device_dump is controlled
by bit 1. Finally, users can set the new quirk flags "5" in decimal (101
in binary).
echo 5 > /sys/kernel/debug/mwifiex/mlan0/quirks
I'm glad if there is another nice way to control only the device_dump
quirk flag, if we only provide a way to change quirk flags.
But at the same time I also think that if someone dare to want to
re-enable this feature, maybe the person won't feel it's complicated haha.
So, I'll drop the device_dump module parameter and switch to use the quirk
framework, adding a way to change the quirk flags value by users.
That said, we may even drop disabling the dump. Take a look at my comment
I wrote below.
> > But what I want to say here as well is that, if the firmware can be fixed,
> > we don't need a patch like this.
>
> Sure. That's also where we don't necessarily need more ways to control
> this from user space (e.g., module parameters), but just better
> detection of currently broken systems (in the driver). And if firmware
> ever gets fixed, we can undo the "broken device" detection.
There are two types of firmware crashes we observes:
1) cmd_timedout (other than ps_mode-related)
2) Firmware wakeup failed (ps_mode-related)
The #2 is observed when we enabled ps_mode. The #1 is observed for the
other causes. And hopefully, verdre (in Cc) found a "fix" [1] for the
#1 fw crash. We are trying the fix now.
The pull req (still WIP) basically addresses fw crashing by using
"non-posted" register writes and uninterruptible wait queue for PCI
operations in the kernel driver side.
We still don't have any ideas to "fix" the #2 fw crash, but now we don't
see the #1 crash anymore so far.
I'd like to see where the attempt goes before I start working on this
patch here again.
Thank you, everyone.
[1] https://github.com/linux-surface/kernel/pull/70
[WIP] Properly fix wifi firmware crashes by jonas2515 · Pull Request #70 · linux-surface/kernel
> Brian
Powered by blists - more mailing lists