lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <714c239d-5813-5333-9267-9684ec1b0f4d@quicinc.com>
Date:   Mon, 3 Apr 2023 21:55:56 +0530
From:   Mukesh Ojha <quic_mojha@...cinc.com>
To:     <agross@...nel.org>, <andersson@...nel.org>,
        <konrad.dybcio@...aro.org>, <corbet@....net>,
        <keescook@...omium.org>, <tony.luck@...el.com>,
        <gpiccoli@...lia.com>, <catalin.marinas@....com>, <will@...nel.org>
CC:     <linux-arm-msm@...r.kernel.org>,
        <linux-remoteproc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <linux-hardening@...r.kernel.org>,
        <linux-arm-kernel@...ts.infradead.org>, <linux-doc@...r.kernel.org>
Subject: Re: [PATCH v2 0/6] Add basic Minidump kernel driver support

Gentle ping;

-Mukesh

On 3/22/2023 7:00 PM, Mukesh Ojha wrote:
> Minidump is a best effort mechanism to collect useful and predefined data
> for first level of debugging on end user devices running on Qualcomm SoCs.
> It is built on the premise that System on Chip (SoC) or subsystem part of
> SoC crashes, due to a range of hardware and software bugs. Hence, the
> ability to collect accurate data is only a best-effort. The data collected
> could be invalid or corrupted, data collection itself could fail, and so on.
> 
> Qualcomm devices in engineering mode provides a mechanism for generating
> full system ramdumps for post mortem debugging. But in some cases it's
> however not feasible to capture the entire content of RAM. The minidump
> mechanism provides the means for selecting which snippets should be
> included in the ramdump.
> 
> The core of minidump feature is part of Qualcomm's boot firmware code.
> It initializes shared memory (SMEM), which is a part of DDR and
> allocates a small section of SMEM to minidump table i.e also called
> global table of content (G-ToC). Each subsystem (APSS, ADSP, ...) has
> their own table of segments to be included in the minidump and all get
> their reference from G-ToC. Each segment/region has some details like
> name, physical address and it's size etc. and it could be anywhere
> scattered in the DDR.
> 
> Existing upstream Qualcomm remoteproc driver[1] already supports minidump
> feature for remoteproc instances like ADSP, MODEM, ... where predefined
> selective segments of subsystem region can be dumped as part of
> coredump collection which generates smaller size artifacts compared to
> complete coredump of subsystem on crash.
> 
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/remoteproc/qcom_common.c#n142
> 
> In addition to managing and querying the APSS minidump description,
> the Linux driver maintains a ELF header in a segment. This segment
> gets updated with section/program header whenever a new entry gets
> registered.
> 
> Patch 1/6 is very trivial change.
> Patch 2/6 moves the minidump specific data structure and macro to
>   qcom_minidump.h so that (4/6) minidump driver can use.
> Patch 3/6 documents qualcomm minidump guide for users.
> Patch 4/6 implements qualcomm minidump kernel driver and exports
>   symbol which other minidump kernel client can use.
> Patch 5/6 enables the qualcomm minidump driver.
> Patch 6/6 Use the exported symbol from minidump driver in qcom_common
>   for querying minidump descriptor for a subsystem.
> 
> Testing of the patches has been done on sm8450 target with the help
> of out of tree patch which helps to set the download mode and storage
> type and to warm reset the device.
> 
> Download mode setting patches are floating here,
> https://lore.kernel.org/lkml/1679070482-8391-1-git-send-email-quic_mojha@quicinc.com/
> 
> Default storage type is set to via USB, so minidump would be
> downloaded with the help of x86_64 machine running PCAT attached
> to Qualcomm device which has backed minidump boot firmware
> support(more can be found patch 3/6)
> 
> Below patch [1] is to warm reset Qualcomm device which has upstream qcom
> watchdog driver support.
> 
> After applying all patches, we can boot the device and can execute
> following command.
> 
> echo mini > /sys/module/qcom_scm/parameters/download_mode
> echo c > /proc/sysrq-trigger
> 
> This will make the device go to download mode and collect the
> minidump on to the attached x86 machine running the Qualcomm
> PCAT tool.
> 
> We will see a bunch of predefined registered region as binary
> blobs starts with md_*. A sample client example to dump a linux
> region has been given in 3/6.
> 
> [1]
> --------------------------->8-------------------------------------
> 
> commit f1124ccebd47550b4c9627aa162d9cdceba2b76f
> Author: Mukesh Ojha <quic_mojha@...cinc.com>
> Date:   Thu Mar 16 14:08:35 2023 +0530
> 
>      do not merge: watchdog bite on panic
> 
>      Signed-off-by: Mukesh Ojha <quic_mojha@...cinc.com>
> 
> diff --git a/drivers/watchdog/qcom-wdt.c b/drivers/watchdog/qcom-wdt.c
> index 0d2209c..767e84a 100644
> --- a/drivers/watchdog/qcom-wdt.c
> +++ b/drivers/watchdog/qcom-wdt.c
> @@ -12,6 +12,7 @@
>   #include <linux/platform_device.h>
>   #include <linux/watchdog.h>
>   #include <linux/of_device.h>
> +#include <linux/panic.h>
> 
>   enum wdt_reg {
>          WDT_RST,
> @@ -114,12 +115,28 @@ static int qcom_wdt_set_pretimeout(struct watchdog_device *wdd,
>          return qcom_wdt_start(wdd);
>   }
> 
> +static void qcom_wdt_bite_on_panic(struct qcom_wdt *wdt)
> +{
> +       writel(0, wdt_addr(wdt, WDT_EN));
> +       writel(1, wdt_addr(wdt, WDT_BITE_TIME));
> +       writel(1, wdt_addr(wdt, WDT_RST));
> +       writel(QCOM_WDT_ENABLE, wdt_addr(wdt, WDT_EN));
> +
> +       wmb();
> +
> +       while(1)
> +               udelay(1);
> +}
> +
>   static int qcom_wdt_restart(struct watchdog_device *wdd, unsigned long action,
>                              void *data)
>   {
>          struct qcom_wdt *wdt = to_qcom_wdt(wdd);
>          u32 timeout;
> 
> +       if (in_panic)
> +               qcom_wdt_bite_on_panic(wdt);
> +
>          /*
>           * Trigger watchdog bite:
>           *    Setup BITE_TIME to be 128ms, and enable WDT.
> diff --git a/include/linux/panic.h b/include/linux/panic.h
> index 979b776..f913629 100644
> --- a/include/linux/panic.h
> +++ b/include/linux/panic.h
> @@ -22,6 +22,7 @@ extern int panic_on_oops;
>   extern int panic_on_unrecovered_nmi;
>   extern int panic_on_io_nmi;
>   extern int panic_on_warn;
> +extern bool in_panic;
> 
>   extern unsigned long panic_on_taint;
>   extern bool panic_on_taint_nousertaint;
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 487f5b0..714f7f4 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -65,6 +65,8 @@ static unsigned int warn_limit __read_mostly;
> 
>   int panic_timeout = CONFIG_PANIC_TIMEOUT;
>   EXPORT_SYMBOL_GPL(panic_timeout);
> +bool in_panic = false;
> +EXPORT_SYMBOL_GPL(in_panic);
> 
>   #define PANIC_PRINT_TASK_INFO          0x00000001
>   #define PANIC_PRINT_MEM_INFO           0x00000002
> @@ -261,6 +263,7 @@ void panic(const char *fmt, ...)
>          int old_cpu, this_cpu;
>          bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers;
> 
> +       in_panic = true;
>          if (panic_on_warn) {
>                  /*
>                   * This thread may hit another WARN() in the panic path.
> --------------------------------------------------------------------------
> 
> Changes in v2:
>   - Addressed review comment made by [quic_tsoni/bmasney] to add documentation.
>   - Addressed comments made by [srinivas.kandagatla]
>   - Dropped pstore 6/6 from the last series, till i get conclusion to get pstore
>     region in minidump.
>   - Fixed issue reported by kernel test robot.
> 
> 
> Changes in v1: https://lore.kernel.org/lkml/1676978713-7394-1-git-send-email-quic_mojha@quicinc.com/
> 
> Mukesh Ojha (6):
>    remoteproc: qcom: Expand MD_* as MINIDUMP_*
>    remoteproc: qcom: Move minidump specific data to qcom_minidump.h
>    docs: qcom: Add qualcomm minidump guide
>    soc: qcom: Add Qualcomm minidump kernel driver
>    arm64: defconfig: Enable Qualcomm minidump driver
>    remoterproc: qcom: refactor to leverage exported minidump symbol
> 
>   Documentation/admin-guide/qcom_minidump.rst | 240 +++++++++++++
>   arch/arm64/configs/defconfig                |   1 +
>   drivers/remoteproc/qcom_common.c            |  75 +---
>   drivers/soc/qcom/Kconfig                    |  14 +
>   drivers/soc/qcom/Makefile                   |   1 +
>   drivers/soc/qcom/qcom_minidump.c            | 537 ++++++++++++++++++++++++++++
>   include/soc/qcom/minidump.h                 |  40 +++
>   include/soc/qcom/qcom_minidump.h            |  88 +++++
>   8 files changed, 927 insertions(+), 69 deletions(-)
>   create mode 100644 Documentation/admin-guide/qcom_minidump.rst
>   create mode 100644 drivers/soc/qcom/qcom_minidump.c
>   create mode 100644 include/soc/qcom/minidump.h
>   create mode 100644 include/soc/qcom/qcom_minidump.h
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ