Message-ID: <18a4c59b-0dca-fceb-5a39-3abc3a5b611c@quicinc.com>
Date: Thu, 25 May 2023 21:29:06 +0530
From: Mukesh Ojha <quic_mojha@...cinc.com>
To: Randy Dunlap <rdunlap@...radead.org>, <agross@...nel.org>,
<andersson@...nel.org>, <konrad.dybcio@...aro.org>,
<corbet@....net>, <keescook@...omium.org>, <tony.luck@...el.com>,
<gpiccoli@...lia.com>, <catalin.marinas@....com>,
<will@...nel.org>, <krzysztof.kozlowski+dt@...aro.org>,
<robh+dt@...nel.org>, <linus.walleij@...aro.org>,
<linux-gpio@...r.kernel.org>, <srinivas.kandagatla@...aro.org>
CC: <linux-arm-msm@...r.kernel.org>,
<linux-remoteproc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<linux-hardening@...r.kernel.org>,
<linux-arm-kernel@...ts.infradead.org>, <linux-doc@...r.kernel.org>
Subject: Re: [PATCH v3 03/18] docs: qcom: Add qualcomm minidump guide

On 5/14/2023 12:16 AM, Randy Dunlap wrote:
>
>
> On 5/3/23 10:02, Mukesh Ojha wrote:
>> Add the qualcomm minidump guide for the users which
>> tries to cover the dependency and the way to test
>> and collect minidump on Qualcomm supported platforms.
>>
>> Signed-off-by: Mukesh Ojha <quic_mojha@...cinc.com>
>> ---
>> Documentation/admin-guide/qcom_minidump.rst | 246 ++++++++++++++++++++++++++++
>> 1 file changed, 246 insertions(+)
>> create mode 100644 Documentation/admin-guide/qcom_minidump.rst
>>
>> diff --git a/Documentation/admin-guide/qcom_minidump.rst b/Documentation/admin-guide/qcom_minidump.rst
>> new file mode 100644
>> index 0000000..062c797
>> --- /dev/null
>> +++ b/Documentation/admin-guide/qcom_minidump.rst
>> @@ -0,0 +1,246 @@
>> +Qualcomm Minidump Feature
>> +=========================
>> +
>> +Introduction
>> +------------
>> +
>> +Minidump is a best effort mechanism to collect useful and predefined
>> +data for first level of debugging on end user devices running on
>> +Qualcomm SoCs. It is built on the premise that System on Chip (SoC)
>> +or subsystem part of SoC crashes, due to a range of hardware and
>> +software bugs. Hence, the ability to collect accurate data is only
>> +a best-effort. The data collected could be invalid or corrupted, data
>> +collection itself could fail, and so on.
>> +
>> +Qualcomm devices in engineering mode provides a mechanism for generating
>> +full system ramdumps for post mortem debugging. But in some cases it's
>
> RAM dumps for {post-mortem or postmortem} debugging.
>
>
>> +however not feasible to capture the entire content of RAM. The minidump
>> +mechanism provides the means for selecting region should be included in
>> +the ramdump.
>> +
>> +::
>> +
>> + +-----------------------------------------------+
>> + | DDR +-------------+ |
>> + | | SS0-ToC| |
>> + | +----------------+ +----------------+ | |
>> + | |Shared memory | | SS1-ToC| | |
>> + | |(SMEM) | | | | |
>> + | | | +-->|--------+ | | |
>> + | |G-ToC | | | SS-ToC \ | | |
>> + | |+-------------+ | | | +-----------+ | | |
>> + | ||-------------| | | | |-----------| | | |
>> + | || SS0-ToC | | | +-|<|SS1 region1| | | |
>> + | ||-------------| | | | | |-----------| | | |
>> + | || SS1-ToC |-|>+ | | |SS1 region2| | | |
>> + | ||-------------| | | | |-----------| | | |
>> + | || SS2-ToC | | | | | ... | | | |
>> + | ||-------------| | | | |-----------| | | |
>> + | || ... | | |-|<|SS1 regionN| | | |
>> + | ||-------------| | | | |-----------| | | |
>> + | || SSn-ToC | | | | +-----------+ | | |
>> + | |+-------------+ | | | | | |
>> + | | | | |----------------| | |
>> + | | | +>| regionN | | |
>> + | | | | |----------------| | |
>> + | +----------------+ | | | | |
>> + | | |----------------| | |
>> + | +>| region1 | | |
>> + | |----------------| | |
>> + | | | | |
>> + | |----------------|-+ |
>> + | | region5 | |
>> + | |----------------| |
>> + | | | |
>> + | Region information +----------------+ |
>> + | +---------------+ |
>> + | |region name | |
>> + | |---------------| |
>> + | |region address | |
>> + | |---------------| |
>> + | |region size | |
>> + | +---------------+ |
>> + +-----------------------------------------------+
>> + G-ToC: Global table of content
>
> contents
> ?
>
>> + SS-ToC: Subsystem table of content
>
> contents
> ?
>
>> + SS0-SSn: Subsystem numbered from 0 to n
>> +
>> +The core of minidump feature is part of Qualcomm's boot firmware code.
>> +It initializes shared memory(SMEM), which is a part of DDR and
>
> memory (SMEM),
>
>> +allocates a small section of it to minidump table i.e also called
>
> table, i.e.
>
>> +global table of content (G-ToC). Each subsystem (APSS, ADSP, ...) has
>
> contents
>
>> +their own table of segments to be included in the minidump, all
>
> its own table
>
>> +references from a descriptor in SMEM (G-ToC). Each segment/region has
>> +some details like name, physical address and it's size etc. and it
>
> its
>
>> +could be anywhere scattered in the DDR.
>> +
>> +Minidump kernel driver concept
>> +------------------------------
>> +
>> +Qualcomm minidump kernel driver adds the capability to add linux region
>
> Linux
>
>> +to be dumped as part of ram dump collection. At the moment, shared memory
>
> RAM
>
>> +driver creates plaform device for minidump driver and give a means to
>
> platform
>
>> +APSS minidump to initialize itself on probe.
>> +
>> +This driver provides ``qcom_apss_minidump_region_register`` and
>> +``qcom_apss_minidump_region_unregister`` API's to register and unregister
>> +apss minidump region. It also gives a mechanism to update physical/virtual
>
> APSS
>
>> +address for the client whose addresses keeps on changing e.g Current stack
>
> changing, e.g., current stack
>
>> +address of task keep on changing on context switch for each core. So these
>
> keeps
>
>> +clients can update their addresses with ``qcom_apss_minidump_update_region``
>> +API.
>> +
>> +The driver also supports registration for the clients who came before
>> +minidump driver was initialized. It maintains pending list of clients
>> +who came before minidump and once minidump is initialized it registers
>> +them in one go.
>> +
>> +To simplify post mortem debugging, driver creates and maintain an ELF
>
> choose one: postmortem or post-mortem
>
>> +header as first region that gets updated each time a new region gets
>> +registered.
>> +
>> +The solution supports extracting the ramdump/minidump produced either
>
> RAM dump/minidump
>
>> +over USB or stored to an attached storage device.
>> +
>> +Dependency of minidump kernel driver
>> +------------------------------------
>> +
>> +It is to note that whole of minidump thing depends on Qualcomm boot
>
> s/thing //
>
>> +firmware whether it supports minidump or not. So, if the minidump
>> +smem id is present in shared memory, it indicates that minidump
>
> SMEM ID
>
>> +is supported from boot firmware and it is possible to dump linux
>
> Linux
>
>> +(APSS) region as part of minidump collection.
>> +
>> +How a kernel client driver can register region with minidump
>> +------------------------------------------------------------
>> +
>> +Client driver can use ``qcom_apss_minidump_region_register`` API's to
>> +register and ``qcom_apss_minidump_region_unregister`` to unregister
>> +their region from minidump driver.
>> +
>> +Client need to fill their region by filling qcom_apss_minidump_region
>
> needs
>
>> +structure object which consist of the region name, region's
>
> consists
>
>> +virtual and physical address and its size.
>> +
>> +Below is one sample client driver snippet which try to allocate
>
> tries
>
>> +a region from kernel heap of certain size and it writes a certain
>> +known pattern (that can help in verification after collection
>> +that we got the exact pattern, what we wrote) and registers it with
>> +minidump.
>> +
>> + .. code-block:: c
>> +
>> + #include <soc/qcom/qcom_minidump.h>
>> + [...]
>> +
>> +
>> + [... inside a function ...]
>> + struct qcom_apss_minidump_region region;
>> +
>> + [...]
>> +
>> + client_mem_region = kzalloc(region_size, GFP_KERNEL);
>> + if (!client_mem_region)
>> + return -ENOMEM;
>> +
>> + [... Just write a pattern ...]
>> + memset(client_mem_region, 0xAB, region_size);
>> +
>> + [... Fill up the region object ...]
>> + strlcpy(region.name, "REGION_A", sizeof(region.name));
>> + region.virt_addr = client_mem_region;
>> + region.phys_addr = virt_to_phys(client_mem_region);
>> + region.size = region_size;
>> +
>> + ret = qcom_apss_minidump_region_register(&region);
>> + if (ret < 0) {
>> + pr_err("failed to add region in minidump: err: %d\n", ret);
>> + return ret;
>> + }
>> +
>> + [...]
>> +
>> +
>> +Test
>> +----
>> +
>> +Existing Qualcomm devices already supports entire ddr dump (also called
>
> DDR
>
>> +full dump) by writing appropriate value to Qualcomm's top control and
>> +status register(tcsr) in driver/firmware/qcom_scm.c .
>
> register (tcsr)
>
>> +
>> +SCM device Tree bindings required to support download mode
>> +For example (sm8450) ::
>> +
>> + / {
>> +
>> + [...]
>> +
>> + firmware {
>> + scm: scm {
>> + compatible = "qcom,scm-sm8450", "qcom,scm";
>> + [... tcsr register ... ]
>> + qcom,dload-mode = <&tcsr 0x13000>;
>> +
>> + [...]
>> + };
>> + };
>> +
>> + [...]
>> +
>> + soc: soc@0 {
>> +
>> + [...]
>> +
>> + tcsr: syscon@...0000 {
>> + compatible = "qcom,sm8450-tcsr", "syscon";
>> + reg = <0x0 0x1fc0000 0x0 0x30000>;
>> + };
>> +
>> + [...]
>> + };
>> + [...]
>> +
>> + };
>> +
>> +User of minidump can pass qcom_scm.download_mode="mini" to kernel
>> +commandline to set the current download mode to minidump.
>> +Similarly, "full" is passed to set the download mode to full dump
>> +where entire ddr dump will be collected while setting it "full,mini"
>
> DDR
>
>> +will collect minidump along with fulldump.
>> +
>> +Writing to sysfs node can also be used to set the mode to minidump.
>> +
>> +::
>> + echo "mini" > /sys/module/qcom_scm/parameter/download_mode
>> +
>> +Once the download mode is set, any kind of crash will make the device collect
>> +respective dump as per set download mode.
>> +
>> +Dump collection
>> +---------------
>> +
>> +The solution supports extracting the minidump produced either over USB or
>> +stored to an attached storage device.
>> +
>> +By default, dumps are downloaded via USB to the attached x86_64 machine
>> +running PCAT (Qualcomm tool) software. Upon download, we will see
>> +a set of binary blobs starts with name md_* in PCAT configured directory
>
> starting
>
>> +in x86_64 machine, so for above example from the client it will be
>> +md_REGION_A.BIN. This binary blob depends on region content to determine
>> +whether it needs external parser support to get the content of the region,
>> +so for simple plain ASCII text we don't need any parsing and the content
>> +can be seen just opening the binary file.
>> +
>> +To collect the dump to attached storage type, one need to write appropriate
>
> needs
>
>> +value to IMEM register, in that case dumps are collected in rawdump
>> +partition on the target device itself.
>> +
>> +One need to read the entire rawdump partition and pull out content to
>
> needs
>
>> +save it onto the attached x86_64 machine over USB. Later, this rawdump
>> +can be pass it to another tool dexter.exe(Qualcomm tool) which converts
>
> passed dexter.exe (Qualcomm tool)
>
>> +this into the similar binary blobs which we have got it when download type
>> +was set to USB i.e a set of registered region as blobs and their name
>
> USB, i.e. regions
>
>
>> +starts with md_*.
>> +
>> +Replacing the dexter.exe with some open source tool can be added as future
>> +scope of this document.
>
Thanks for the review; I have applied the changes for the next version.
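
For anyone trying the sample client quoted above, the "verify the known
pattern after collection" step could be sketched in user space roughly as
below. This is only an illustration: the helper name
md_blob_first_mismatch() and its return convention are hypothetical and not
part of this patch, and it assumes the collected md_REGION_A.BIN holds only
the raw region bytes with no extra header.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of the minidump patch.
 *
 * Scan a collected blob (e.g. md_REGION_A.BIN) and return:
 *   >= 0  offset of the first byte that differs from 'pattern'
 *   -1    every byte matches 'pattern'
 *   -2    the file could not be read, or it is empty
 */
long md_blob_first_mismatch(const char *path, unsigned char pattern)
{
	unsigned char buf[4096];
	long offset = 0;
	size_t n, i;
	FILE *f;

	assert(path != NULL);

	f = fopen(path, "rb");
	if (!f)
		return -2;

	/* Read the blob in chunks and compare each byte with the pattern. */
	while ((n = fread(buf, 1, sizeof(buf), f)) > 0) {
		for (i = 0; i < n; i++) {
			if (buf[i] != pattern) {
				fclose(f);
				return offset + (long)i;
			}
		}
		offset += (long)n;
	}

	/* Distinguish a read error from a clean end of file. */
	if (ferror(f)) {
		fclose(f);
		return -2;
	}

	fclose(f);
	return offset > 0 ? -1 : -2;
}
```

With the sample client above, md_blob_first_mismatch("md_REGION_A.BIN", 0xAB)
would be expected to return -1 when the region was collected intact; a
non-negative value gives the offset at which the dump first diverges from
what the client wrote.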
-- Mukesh