Message-ID: <714c239d-5813-5333-9267-9684ec1b0f4d@quicinc.com>
Date: Mon, 3 Apr 2023 21:55:56 +0530
From: Mukesh Ojha <quic_mojha@...cinc.com>
To: <agross@...nel.org>, <andersson@...nel.org>,
<konrad.dybcio@...aro.org>, <corbet@....net>,
<keescook@...omium.org>, <tony.luck@...el.com>,
<gpiccoli@...lia.com>, <catalin.marinas@....com>, <will@...nel.org>
CC: <linux-arm-msm@...r.kernel.org>,
<linux-remoteproc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<linux-hardening@...r.kernel.org>,
<linux-arm-kernel@...ts.infradead.org>, <linux-doc@...r.kernel.org>
Subject: Re: [PATCH v2 0/6] Add basic Minidump kernel driver support
Gentle ping;
-Mukesh
On 3/22/2023 7:00 PM, Mukesh Ojha wrote:
> Minidump is a best-effort mechanism to collect useful, predefined data
> for first-level debugging on end-user devices running on Qualcomm SoCs.
> It is built on the premise that the System on Chip (SoC), or a subsystem
> that is part of the SoC, crashes due to a range of hardware and software
> bugs. Hence, the ability to collect accurate data is only best effort: the
> data collected could be invalid or corrupted, data collection itself could
> fail, and so on.
>
> Qualcomm devices in engineering mode provide a mechanism for generating
> full system ramdumps for post-mortem debugging. In some cases, however,
> it is not feasible to capture the entire content of RAM. The minidump
> mechanism provides the means for selecting which snippets should be
> included in the ramdump.
>
> The core of the minidump feature is part of Qualcomm's boot firmware code.
> It initializes shared memory (SMEM), which is a part of DDR, and
> allocates a small section of SMEM to the minidump table, also called the
> global table of contents (G-ToC). Each subsystem (APSS, ADSP, ...) has
> its own table of segments to be included in the minidump, and all of them
> are referenced from the G-ToC. Each segment/region has details such as
> name, physical address and size, and it can be located anywhere in
> the DDR.
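
To make the layout described above concrete, here is a minimal userspace sketch of a global table that holds one table of regions per subsystem. All names (md_region, md_subsystem, md_global_toc, md_add_region, md_find_region) and all size limits are hypothetical and chosen for illustration; the real firmware-defined tables live in SMEM and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MD_NAME_LEN     16  /* assumed maximum region-name length */
#define MD_MAX_REGIONS  8   /* assumed per-subsystem capacity */
#define MD_MAX_SS       4   /* assumed number of subsystems */

/* One segment/region: name, physical address and size. */
struct md_region {
	char     name[MD_NAME_LEN];
	uint64_t phys_addr;
	uint64_t size;
};

/* Per-subsystem table of segments (APSS, ADSP, ...). */
struct md_subsystem {
	uint32_t         region_count;
	struct md_region regions[MD_MAX_REGIONS];
};

/* Global table of contents: one entry per subsystem (hypothetical layout). */
struct md_global_toc {
	struct md_subsystem subsystems[MD_MAX_SS];
};

/* Append a region to a subsystem table; returns 0 on success, -1 on error. */
int md_add_region(struct md_global_toc *toc, unsigned int ss,
		  const char *name, uint64_t phys_addr, uint64_t size)
{
	struct md_subsystem *sub;
	struct md_region *r;

	if (!toc || !name || ss >= MD_MAX_SS || size == 0)
		return -1;
	if (strlen(name) >= MD_NAME_LEN)
		return -1;

	sub = &toc->subsystems[ss];
	if (sub->region_count >= MD_MAX_REGIONS)
		return -1;

	r = &sub->regions[sub->region_count++];
	memset(r, 0, sizeof(*r));
	strcpy(r->name, name);	/* length was checked above */
	r->phys_addr = phys_addr;
	r->size = size;
	return 0;
}

/* Look a region up by name; returns NULL when it is not registered. */
const struct md_region *md_find_region(const struct md_global_toc *toc,
				       unsigned int ss, const char *name)
{
	uint32_t i;

	assert(toc != NULL && name != NULL);
	if (ss >= MD_MAX_SS)
		return NULL;
	for (i = 0; i < toc->subsystems[ss].region_count; i++) {
		if (strcmp(toc->subsystems[ss].regions[i].name, name) == 0)
			return &toc->subsystems[ss].regions[i];
	}
	return NULL;
}
```

The sketch embeds the per-subsystem tables directly in the global table to stay self-contained; a table that only stores a reference to each subsystem table, as the description above suggests, would work the same way from the caller's point of view.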
>
> The existing upstream Qualcomm remoteproc driver [1] already supports the
> minidump feature for remoteproc instances like ADSP, MODEM, ..., where
> predefined, selected segments of the subsystem region can be dumped as part
> of coredump collection. This generates smaller artifacts than a complete
> coredump of the subsystem on crash.
>
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/remoteproc/qcom_common.c#n142
>
> In addition to managing and querying the APSS minidump description,
> the Linux driver maintains an ELF header in a segment. This segment
> gets updated with a section/program header whenever a new entry gets
> registered.
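
As a rough illustration of the ELF bookkeeping described above, the following userspace sketch appends one PT_LOAD program header per registered region. It uses the standard Elf64_Ehdr and Elf64_Phdr types from <elf.h>; the md_elf structure, the md_elf_init() and md_elf_add_region() helpers, the MD_MAX_PHDRS limit, and the choice of an AArch64 ET_CORE image are assumptions made for this example, not the driver's actual code. Section headers and the string table are left out.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MD_MAX_PHDRS 8  /* assumed capacity of the header segment */

/* ELF header followed by its program header table, in one buffer. */
struct md_elf {
	Elf64_Ehdr ehdr;
	Elf64_Phdr phdr[MD_MAX_PHDRS];
};

/* Initialise an empty ELF core header (little-endian AArch64 assumed). */
void md_elf_init(struct md_elf *elf)
{
	assert(elf != NULL);
	memset(elf, 0, sizeof(*elf));
	memcpy(elf->ehdr.e_ident, ELFMAG, SELFMAG);
	elf->ehdr.e_ident[EI_CLASS] = ELFCLASS64;
	elf->ehdr.e_ident[EI_DATA] = ELFDATA2LSB;
	elf->ehdr.e_ident[EI_VERSION] = EV_CURRENT;
	elf->ehdr.e_type = ET_CORE;
	elf->ehdr.e_machine = EM_AARCH64;
	elf->ehdr.e_version = EV_CURRENT;
	elf->ehdr.e_ehsize = sizeof(Elf64_Ehdr);
	elf->ehdr.e_phoff = offsetof(struct md_elf, phdr);
	elf->ehdr.e_phentsize = sizeof(Elf64_Phdr);
}

/*
 * Record a newly registered region as a PT_LOAD program header.
 * p_offset is left at 0 here; whoever assembles the final dump file
 * would have to fill it in. Returns 0 on success, -1 if the table is full.
 */
int md_elf_add_region(struct md_elf *elf, uint64_t vaddr, uint64_t paddr,
		      uint64_t size)
{
	Elf64_Phdr *ph;

	assert(elf != NULL);
	if (elf->ehdr.e_phnum >= MD_MAX_PHDRS)
		return -1;

	ph = &elf->phdr[elf->ehdr.e_phnum];
	ph->p_type = PT_LOAD;
	ph->p_flags = PF_R | PF_W;
	ph->p_vaddr = vaddr;
	ph->p_paddr = paddr;
	ph->p_filesz = size;
	ph->p_memsz = size;
	elf->ehdr.e_phnum++;	/* publish the new entry */
	return 0;
}
```

Keeping the header and the program header table in one contiguous buffer means the whole description can itself be exported as a single region, which is the point of maintaining it "in a segment".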
>
> Patch 1/6 is a trivial change.
> Patch 2/6 moves the minidump-specific data structures and macros to
> qcom_minidump.h so that the minidump driver (4/6) can use them.
> Patch 3/6 documents the Qualcomm minidump guide for users.
> Patch 4/6 implements the Qualcomm minidump kernel driver and exports
> symbol(s) that other minidump kernel clients can use.
> Patch 5/6 enables the Qualcomm minidump driver.
> Patch 6/6 uses the exported symbol from the minidump driver in qcom_common
> to query the minidump descriptor for a subsystem.
>
> The patches were tested on an sm8450 target with the help of
> out-of-tree changes that set the download mode and storage
> type and warm reset the device.
>
> The download mode setting patches are posted here:
> https://lore.kernel.org/lkml/1679070482-8391-1-git-send-email-quic_mojha@quicinc.com/
>
> The default storage type is set to USB, so the minidump is
> downloaded with the help of an x86_64 machine running PCAT, attached
> to a Qualcomm device whose boot firmware has minidump
> support (more details can be found in patch 3/6).
>
> The patch below [1] warm resets a Qualcomm device that has upstream qcom
> watchdog driver support.
>
> After applying all patches, we can boot the device and execute the
> following commands:
>
> echo mini > /sys/module/qcom_scm/parameters/download_mode
> echo c > /proc/sysrq-trigger
>
> This makes the device enter download mode and collect the
> minidump onto the attached x86 machine running the Qualcomm
> PCAT tool.
>
> We will see a bunch of predefined registered regions as binary
> blobs whose names start with md_*. A sample client that dumps a Linux
> region is given in 3/6.
>
> [1]
> --------------------------->8-------------------------------------
>
> commit f1124ccebd47550b4c9627aa162d9cdceba2b76f
> Author: Mukesh Ojha <quic_mojha@...cinc.com>
> Date: Thu Mar 16 14:08:35 2023 +0530
>
> do not merge: watchdog bite on panic
>
> Signed-off-by: Mukesh Ojha <quic_mojha@...cinc.com>
>
> diff --git a/drivers/watchdog/qcom-wdt.c b/drivers/watchdog/qcom-wdt.c
> index 0d2209c..767e84a 100644
> --- a/drivers/watchdog/qcom-wdt.c
> +++ b/drivers/watchdog/qcom-wdt.c
> @@ -12,6 +12,7 @@
> #include <linux/platform_device.h>
> #include <linux/watchdog.h>
> #include <linux/of_device.h>
> +#include <linux/panic.h>
>
> enum wdt_reg {
> WDT_RST,
> @@ -114,12 +115,28 @@ static int qcom_wdt_set_pretimeout(struct watchdog_device *wdd,
> return qcom_wdt_start(wdd);
> }
>
> +static void qcom_wdt_bite_on_panic(struct qcom_wdt *wdt)
> +{
> + writel(0, wdt_addr(wdt, WDT_EN));
> + writel(1, wdt_addr(wdt, WDT_BITE_TIME));
> + writel(1, wdt_addr(wdt, WDT_RST));
> + writel(QCOM_WDT_ENABLE, wdt_addr(wdt, WDT_EN));
> +
> + wmb();
> +
> + while(1)
> + udelay(1);
> +}
> +
> static int qcom_wdt_restart(struct watchdog_device *wdd, unsigned long action,
> void *data)
> {
> struct qcom_wdt *wdt = to_qcom_wdt(wdd);
> u32 timeout;
>
> + if (in_panic)
> + qcom_wdt_bite_on_panic(wdt);
> +
> /*
> * Trigger watchdog bite:
> * Setup BITE_TIME to be 128ms, and enable WDT.
> diff --git a/include/linux/panic.h b/include/linux/panic.h
> index 979b776..f913629 100644
> --- a/include/linux/panic.h
> +++ b/include/linux/panic.h
> @@ -22,6 +22,7 @@ extern int panic_on_oops;
> extern int panic_on_unrecovered_nmi;
> extern int panic_on_io_nmi;
> extern int panic_on_warn;
> +extern bool in_panic;
>
> extern unsigned long panic_on_taint;
> extern bool panic_on_taint_nousertaint;
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 487f5b0..714f7f4 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -65,6 +65,8 @@ static unsigned int warn_limit __read_mostly;
>
> int panic_timeout = CONFIG_PANIC_TIMEOUT;
> EXPORT_SYMBOL_GPL(panic_timeout);
> +bool in_panic = false;
> +EXPORT_SYMBOL_GPL(in_panic);
>
> #define PANIC_PRINT_TASK_INFO 0x00000001
> #define PANIC_PRINT_MEM_INFO 0x00000002
> @@ -261,6 +263,7 @@ void panic(const char *fmt, ...)
> int old_cpu, this_cpu;
> bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers;
>
> + in_panic = true;
> if (panic_on_warn) {
> /*
> * This thread may hit another WARN() in the panic path.
> --------------------------------------------------------------------------
>
> Changes in v2:
> - Addressed review comments made by [quic_tsoni/bmasney] to add documentation.
> - Addressed comments made by [srinivas.kandagatla].
> - Dropped pstore 6/6 from the last series until I reach a conclusion on how
> to get the pstore region into minidump.
> - Fixed an issue reported by the kernel test robot.
>
>
> Changes in v1: https://lore.kernel.org/lkml/1676978713-7394-1-git-send-email-quic_mojha@quicinc.com/
>
> Mukesh Ojha (6):
> remoteproc: qcom: Expand MD_* as MINIDUMP_*
> remoteproc: qcom: Move minidump specific data to qcom_minidump.h
> docs: qcom: Add qualcomm minidump guide
> soc: qcom: Add Qualcomm minidump kernel driver
> arm64: defconfig: Enable Qualcomm minidump driver
> remoterproc: qcom: refactor to leverage exported minidump symbol
>
> Documentation/admin-guide/qcom_minidump.rst | 240 +++++++++++++
> arch/arm64/configs/defconfig | 1 +
> drivers/remoteproc/qcom_common.c | 75 +---
> drivers/soc/qcom/Kconfig | 14 +
> drivers/soc/qcom/Makefile | 1 +
> drivers/soc/qcom/qcom_minidump.c | 537 ++++++++++++++++++++++++++++
> include/soc/qcom/minidump.h | 40 +++
> include/soc/qcom/qcom_minidump.h | 88 +++++
> 8 files changed, 927 insertions(+), 69 deletions(-)
> create mode 100644 Documentation/admin-guide/qcom_minidump.rst
> create mode 100644 drivers/soc/qcom/qcom_minidump.c
> create mode 100644 include/soc/qcom/minidump.h
> create mode 100644 include/soc/qcom/qcom_minidump.h
>