Message-Id: <20200526145815.6415-6-mcgrof@kernel.org>
Date: Tue, 26 May 2020 14:58:12 +0000
From: Luis Chamberlain <mcgrof@...nel.org>
To: jeyu@...nel.org, davem@...emloft.net, kuba@...nel.org
Cc: michael.chan@...adcom.com, dchickles@...vell.com,
sburla@...vell.com, fmanlunas@...vell.com, aelior@...vell.com,
GR-everest-linux-l2@...vell.com, kvalo@...eaurora.org,
johannes@...solutions.net, akpm@...ux-foundation.org,
arnd@...db.de, rostedt@...dmis.org, mingo@...hat.com,
aquini@...hat.com, cai@....pw, dyoung@...hat.com, bhe@...hat.com,
peterz@...radead.org, tglx@...utronix.de, gpiccoli@...onical.com,
pmladek@...e.com, tiwai@...e.de, schlad@...e.de,
andriy.shevchenko@...ux.intel.com, derosier@...il.com,
keescook@...omium.org, daniel.vetter@...ll.ch, will@...nel.org,
mchehab+samsung@...nel.org, vkoul@...nel.org,
mchehab+huawei@...nel.org, robh@...nel.org, mhiramat@...nel.org,
sfr@...b.auug.org.au, linux@...inikbrodowski.net,
glider@...gle.com, paulmck@...nel.org, elver@...gle.com,
bauerman@...ux.ibm.com, yamada.masahiro@...ionext.com,
samitolvanen@...gle.com, yzaikin@...gle.com, dvyukov@...gle.com,
rdunlap@...radead.org, corbet@....net, dianders@...omium.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, Luis Chamberlain <mcgrof@...nel.org>,
linux-wireless@...r.kernel.org, ath10k@...ts.infradead.org
Subject: [PATCH v3 5/8] ath10k: use new taint_firmware_crashed()
This makes use of the new taint_firmware_crashed() to annotate when
the firmware for a device driver crashes. When firmware crashes,
devices can become unresponsive; recovery sometimes requires a
driver unload / reload and, in the worst cases, a reboot.

Using a taint flag lets us clearly annotate when this happens.

I have run into this situation with this driver, using v5.6.0 and the
latest firmware as of today, May 21, 2020, leaving me in a state in
which my only option is to reboot. Removing and re-adding the driver
does not fix the situation. This is reported on kernel.org bugzilla as
korg#207851 [0]. This isn't the first firmware crash reported; others
have been filed before, and none of these bugs have been addressed
yet [1] [2] [3]. Including my own, I see these firmware crash
reports:
* korg#207851 [0]
* korg#197013 [1]
* korg#201237 [2]
* korg#195987 [3]
[0] https://bugzilla.kernel.org/show_bug.cgi?id=207851
[1] https://bugzilla.kernel.org/show_bug.cgi?id=197013
[2] https://bugzilla.kernel.org/show_bug.cgi?id=201237
[3] https://bugzilla.kernel.org/show_bug.cgi?id=195987
Cc: linux-wireless@...r.kernel.org
Cc: ath10k@...ts.infradead.org
Cc: Kalle Valo <kvalo@...eaurora.org>
Acked-by: Rafael Aquini <aquini@...hat.com>
Signed-off-by: Luis Chamberlain <mcgrof@...nel.org>
---
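For readers who want to see the idea in isolation, here is a minimal
userspace sketch of what a taint flag buys us. It is only a model: the
bit position and every model_* name below are hypothetical stand-ins
chosen for illustration, not the kernel's definitions, and the real
taint_firmware_crashed() comes from an earlier patch in this series. The
sketch assumes only that a taint is a bit in a global mask that is set
once and never cleared until reboot.

```c
/*
 * Userspace model only. The flag value and all model_* helpers are
 * hypothetical and exist purely to illustrate the pattern.
 */
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical bit position for the firmware-crash taint. */
#define MODEL_TAINT_FIRMWARE_CRASH 0

/* Global taint mask; bits are only ever set in this model. */
static unsigned long model_tainted_mask;

/* Set one taint bit. Setting it again is harmless. */
static void model_add_taint(unsigned int flag)
{
	model_tainted_mask |= 1UL << flag;
}

/* Report whether a given taint bit has been set. */
static bool model_test_taint(unsigned int flag)
{
	return (model_tainted_mask >> flag) & 1UL;
}

/* Stand-in for the taint_firmware_crashed() call used in this patch. */
static void model_taint_firmware_crashed(void)
{
	model_add_taint(MODEL_TAINT_FIRMWARE_CRASH);
}

/* Models a driver crash path: log the crash, then record the taint. */
static void model_fw_crash_handler(const char *guid)
{
	fprintf(stderr, "firmware crashed! (guid %s)\n", guid);
	model_taint_firmware_crashed();
}
```

The point of the model is that the flag outlives the log message: even
if the crash message has scrolled away, a later bug report still
carries the information that firmware crashed at some point since boot.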
drivers/net/wireless/ath/ath10k/pci.c | 2 ++
drivers/net/wireless/ath/ath10k/sdio.c | 2 ++
drivers/net/wireless/ath/ath10k/snoc.c | 1 +
3 files changed, 5 insertions(+)
diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
index 1d941d53fdc9..818c3acc2468 100644
--- a/drivers/net/wireless/ath/ath10k/pci.c
+++ b/drivers/net/wireless/ath/ath10k/pci.c
@@ -1767,6 +1767,7 @@ static void ath10k_pci_fw_dump_work(struct work_struct *work)
scnprintf(guid, sizeof(guid), "n/a");
ath10k_err(ar, "firmware crashed! (guid %s)\n", guid);
+ taint_firmware_crashed();
ath10k_print_driver_info(ar);
ath10k_pci_dump_registers(ar, crash_data);
ath10k_ce_dump_registers(ar, crash_data);
@@ -2837,6 +2838,7 @@ static int ath10k_pci_hif_power_up(struct ath10k *ar,
if (ret) {
if (ath10k_pci_has_fw_crashed(ar)) {
ath10k_warn(ar, "firmware crashed during chip reset\n");
+ taint_firmware_crashed();
ath10k_pci_fw_crashed_clear(ar);
ath10k_pci_fw_crashed_dump(ar);
}
diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c
index e2aff2254a40..8b2fc0b89be4 100644
--- a/drivers/net/wireless/ath/ath10k/sdio.c
+++ b/drivers/net/wireless/ath/ath10k/sdio.c
@@ -794,6 +794,7 @@ static int ath10k_sdio_mbox_proc_dbg_intr(struct ath10k *ar)
/* TODO: Add firmware crash handling */
ath10k_warn(ar, "firmware crashed\n");
+ taint_firmware_crashed();
/* read counter to clear the interrupt, the debug error interrupt is
* counter 0.
@@ -915,6 +916,7 @@ static int ath10k_sdio_mbox_proc_cpu_intr(struct ath10k *ar)
if (cpu_int_status & MBOX_CPU_STATUS_ENABLE_ASSERT_MASK) {
ath10k_err(ar, "firmware crashed!\n");
queue_work(ar->workqueue, &ar->restart_work);
+ taint_firmware_crashed();
}
return ret;
}
diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
index 354d49b1cd45..071ee7607a4c 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.c
+++ b/drivers/net/wireless/ath/ath10k/snoc.c
@@ -1451,6 +1451,7 @@ void ath10k_snoc_fw_crashed_dump(struct ath10k *ar)
scnprintf(guid, sizeof(guid), "n/a");
ath10k_err(ar, "firmware crashed! (guid %s)\n", guid);
+ taint_firmware_crashed();
ath10k_print_driver_info(ar);
ath10k_msa_dump_memory(ar, crash_data);
mutex_unlock(&ar->dump_mutex);
--
2.26.2