linux-kernel - Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cc30660f-dd72-aade-6346-a93c6ad4b695@quicinc.com>
Date:   Fri, 30 Jun 2023 21:34:08 +0530
From:   Mukesh Ojha <quic_mojha@...cinc.com>
To:     Rob Herring <robh+dt@...nel.org>,
        Greg KH <gregkh@...uxfoundation.org>
CC:     <corbet@....net>, <agross@...nel.org>, <andersson@...nel.org>,
        <konrad.dybcio@...aro.org>, <krzysztof.kozlowski+dt@...aro.org>,
        <conor+dt@...nel.org>, <keescook@...omium.org>,
        <tony.luck@...el.com>, <gpiccoli@...lia.com>,
        <mathieu.poirier@...aro.org>, <catalin.marinas@....com>,
        <will@...nel.org>, <linus.walleij@...aro.org>,
        <andy.shevchenko@...il.com>, <linux-doc@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <linux-arm-msm@...r.kernel.org>,
        <devicetree@...r.kernel.org>, <linux-hardening@...r.kernel.org>,
        <linux-remoteproc@...r.kernel.org>,
        <linux-arm-kernel@...ts.infradead.org>,
        <linux-gpio@...r.kernel.org>
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related
 support



On 6/29/2023 4:42 AM, Rob Herring wrote:
> On Wed, Jun 28, 2023 at 9:45 AM Greg KH <gregkh@...uxfoundation.org> wrote:
>>
>> On Wed, Jun 28, 2023 at 06:04:27PM +0530, Mukesh Ojha wrote:
>>> Minidump is a best effort mechanism to collect useful and predefined data
>>> for first level of debugging on end user devices running on Qualcomm SoCs.
>>> It is built on the premise that System on Chip (SoC) or subsystem part of
>>> SoC crashes, due to a range of hardware and software bugs. Hence, the
>>> ability to collect accurate data is only a best-effort. The data collected
>>> could be invalid or corrupted, data collection itself could fail, and so on.
>>>
>>> Qualcomm devices in engineering mode provides a mechanism for generating
>>> full system ramdumps for post mortem debugging. But in some cases it's
>>> however not feasible to capture the entire content of RAM. The minidump
>>> mechanism provides the means for selecting which snippets should be
>>> included in the ramdump.
>>>
>>> Minidump kernel driver implementation is divided into two parts for
>>> simplicity, one is minidump core which can also be called minidump
>>> frontend(As API gets exported from this driver for registration with
>>> backend) and the other part is minidump backend i.e, where the underlying
>>> implementation of minidump will be there. There could be different way
>>> how the backend is implemented like Shared memory, Memory mapped IO
>>> or Resource manager(gunyah) based where the guest region information is
>>> passed to hypervisor via hypercalls.
>>>
>>>      Minidump Client-1     Client-2      Client-5    Client-n
>>>               |               |              |             |
>>>               |               |    ...       |   ...       |
>>>               |               |              |             |
>>>               |               |              |             |
>>>               |               |              |             |
>>>               |               |              |             |
>>>               |               |              |             |
>>>               |               |              |             |
>>>               |           +---+--------------+----+        |
>>>               +-----------+  qcom_minidump(core)  +--------+
>>>                           |                       |
>>>                           +------+-----+------+---+
>>>                                  |     |      |
>>>                                  |     |      |
>>>                  +---------------+     |      +--------------------+
>>>                  |                     |                           |
>>>                  |                     |                           |
>>>                  |                     |                           |
>>>                  v                     v                           v
>>>       +-------------------+      +-------------------+     +------------------+
>>>       |qcom_minidump_smem |      |qcom_minidump_mmio |     | qcom_minidump_rm |
>>>       |                   |      |                   |     |                  |
>>>       +-------------------+      +-------------------+     +------------------+
>>>         Shared memory              Memory mapped IO           Resource manager
>>>          (backend)                   (backend)                   (backend)
>>>
>>>
>>> Here, we will be giving all analogy of backend with SMEM as it is the
>>> only implemented backend at present but general idea remains the same.
>>
>> If you only have one "backend" then you don't need the extra compexity
>> here at all, just remove that whole middle layer please and make this
>> much simpler and smaller and easier to review and possibly accept.
> 
> pstore already supports backends. Why aren't the above backends just
> pstore backends rather than having an intermediate pstore backend in
> RAM which then somehow gets moved into these minidump backends.

It can't be another pstore backend since, pstore backend(ram) is for
the system where there is a guarantees of fixed ram range content 
persist across boot, but that is not true with minidump, there is no
pstorefs kind of support with minidump.

Instead, the whole idea of backend/front-end of minidump here, is
entirely related to minidump and here minidump backend could be
anything like could be Shared memory in DDR which is shared across
multiple subsystem (CPUSS) is one of them or it could be something
where you want to collect minidump for a guest vm which guest
does not have access to backend instead it may be the hypervisor
do the register the region for the guest.

Pstore(ram) is one of the clients of minidump where, we want to collect
the console/pmsg/ftrace/dmesg logs for production devices where
DDR dump is difficult and the reason of doing this to not re-invent
the wheel and it just need physical address/size .

  +---------+     +---------+   +--------+     +---------+
  | console |     | pmsg    |   | ftrace |     | dmesg   |
  +---------+     +---------+   +--------+     +---------+
        |             |             |              |
        |             |             |              |
        +------------------------------------------+
                           |
                          \ /
                   +----------------+
             (1)   |pstore frontends|
                   +----------------+
                           |
                          \ /
                  +------------------- +
             (2)  | pstore backend(ram)|
                  +--------------------+
                           |
                          \ /
                  +--------------------+
             (3)  |qcom_pstore_minidump|
                  +--------------------+

-Mukesh
> 
>> We don't add layers when they are not needed, and never when there is no
>> actual user.  If you need the extra "complexity" later, then add it
>> later when it is needed as who knows when that will ever be.
>>
>> Please redo this series based on that, thanks.
> 
> My bigger issue with this whole series is what would this all look
> like if every SoC vendor upstreamed their own custom dumping
> mechanism. That would be a mess. (I have similar opinions on the
> $soc-vendor hypervisors.)
> 
> Rob