Message-ID: <20230515065212.GA522679@hu-pkondeti-hyd.qualcomm.com>
Date:   Mon, 15 May 2023 12:22:12 +0530
From:   Pavan Kondeti <quic_pkondeti@...cinc.com>
To:     Mukesh Ojha <quic_mojha@...cinc.com>
CC:     <keescook@...omium.org>, <gregkh@...uxfoundation.org>,
        <tony.luck@...el.com>, <gpiccoli@...lia.com>,
        <johannes@...solutions.net>, <rafael@...nel.org>,
        <linux-hardening@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <quic_shashim@...cinc.com>, <quic_pkondeti@...cinc.com>
Subject: Re: RFC design of device coredump collection on panic in Pstore

On Mon, May 08, 2023 at 09:21:00PM +0530, Mukesh Ojha wrote:
> Hi All,
> 
> We are exploring a way for device drivers (those that register interest; this could
> optionally be a callback) to dump their data (a list of buffers or hardware registers) for
> post-mortem debugging of a crash by writing the content to the pstore region. Examples of
> such data are clock dumps, regulator state, etc.
> 
> This data should already be part of the full RAM dump, but in some cases it is not
> feasible to capture the entire content of RAM, so we were thinking the pstore region could
> be used to capture the above information, either as ELF or as raw binary. How pstore
> can present this in human-readable format can be discussed further.
> 
> Also, the existing devcoredump driver does not cover panic use cases, so we thought of
> something like the RFC design below, where:
> 
> 1. Device_coredump allocates some configurable contiguous memory, whose size can be
>    controlled via CONFIG or bootargs, and later registers for panic notifiers.
> 2. Notifier gets added.
> 3. Pstore adds device_coredump as its front-end via dumper registration, similar
>    to how kmsg is dumped today.
> 4. Successful registration of dumper.
> 5. A device driver (A-Z) can register its buffer to be dumped as part of panic.
> 6. Buffer gets added to the dump list.
> 7. Panic occurs.
> 8. Iterate over registered drivers and copy their dump lists to device_coredump's own
>    memory; if the data exceeds the device coredump memory, log an error and stop iterating.
> 9. Similar to devcore_dump() inline with kmsg_dump()
> 10. Copy the content to the pstore region; this could be ELF or raw binary, which can be
>    discussed.
> 
> 
> The device coredump memory (1) size could be passed from pstore and should be the same as
> the devcoredump frontend size given in DT or specified some other way.
> 
> Let me know your concerns and views on this.
>                     
> 
>                     pstore               device_coredump      deviceA..Z          panic
>                                                                   ┼                  │
>                        │                      │       panic_notifier(1)              │
>                        │                      ├───────────────────┬─────────────────►│
>                        │                      │ notifier added    │ (2)              │
>                        │                      │◄──────────────────┼──────────────────┤
>       pstore can give  │       (3)            │                   │                  │
>       its region for   │ dumper_registration  │                   │                  │
>       dump             ├─────────────────────►│                   │                  │
>                        │       (4)            │                   │                  │
>                        │◄─────────────────────┤                   │                  │
>                        │  register_dumper     │                   │                  │
>                        │                      │                   │                  │
>                        │                      │                   │                  │
>                        │                      │                   │ init             │
>                        │                      │                ───┼──                │
>                        │                      │       (5)         │                  │
>                        │                      │devcoredump_register(dev,buf));       │
>                        │                      │◄──────────────────┐                  │
>                        │                      │       (6)         │                  │
>                        │                      ├──────────────────►│                  │
>                        │                      │device/buf gets add│d                 │
>                        │                      │ to the list       │                  │panic
>                        │                      │                   │                ──┼───
>                        │                      │                   │                  │
>                        │                      │(7) panic_notifier │     call         │
>                        │                      │◄──────────────────┼──────────────────┤
>                        │                      │ prepare coredump  │for the deviceA-Z │
>                        │                      ├──────────────────►├────┐             │
>                        │                      │   (8)             │    │             │
>                        │                      │                   │    │             │
>                        │                      │                   │    │             │
>                        │                      │                   │◄───┘             │
>              dumper will                      │                   │            (9)   │dev_coredump()
>              write all the device dump to     │◄──────────────────┼──────────────────┤
>               pstore   ┌◄───┬─────────────────┤                   │                  │
>               region   │    │    (10)         │                   │                  │
>                        │    │                 │                   │                  │
>                        │    │                 │                   │                  │
>                        │◄───┘                 │                   │                  │
>                        │                      │                   │                  │
>                        │                      │                   
> 
> 

I really like the idea of providing an interface for device drivers to
dump specific data for offline use. Currently, in the MSM/QCOM downstream
kernel, we register for panic notifiers and dump relevant data for
later parsing/analysis. In some cases, this data gets printed to the console
so that it is also available in pstore / extracted dmesg. However,
having a separate record/buffer for individual drivers would help.

I believe dev_coredump() is just an analogy here, as it is meant for
taking a dump of the device when the driver thinks its device/firmware
is hung/crashed etc., AFAICT. IIUC, what we are looking for here is a different
interface/semantics, i.e. querying devices to dump any information when
the *system* has crashed/panicked. Maybe something like dev_panicdump()
and a separate method in device_driver.
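
To make the suggestion a bit more concrete, here is a rough user-space
sketch in C of what such semantics could look like. It is not kernel code
and not a proposed API: struct panicdump_driver, panicdump_register(),
panicdump_collect() and the clk_panicdump() example driver are all
hypothetical names made up for illustration. The idea is only that each
driver supplies a callback, and on a system panic the core queries every
registered driver until the reserved region is full.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-driver panic dump hook, modelled in user space. */
struct panicdump_driver {
    const char *name;
    /* Writes at most len bytes into buf; returns the bytes written. */
    size_t (*panicdump)(void *buf, size_t len);
};

#define MAX_PANICDUMP_DRIVERS 8

static struct panicdump_driver *registered[MAX_PANICDUMP_DRIVERS];
static size_t nr_registered;

/* Add a driver to the list; returns 0 on success, -1 on failure. */
int panicdump_register(struct panicdump_driver *drv)
{
    if (!drv || !drv->panicdump || nr_registered >= MAX_PANICDUMP_DRIVERS)
        return -1;
    registered[nr_registered++] = drv;
    return 0;
}

/*
 * Stand-in for the panic path: query each registered driver in turn
 * and stop once the region is full. Returns the bytes used.
 */
size_t panicdump_collect(char *region, size_t size)
{
    size_t used = 0;

    for (size_t i = 0; i < nr_registered && used < size; i++) {
        size_t n = registered[i]->panicdump(region + used, size - used);

        if (n > size - used)    /* do not trust the callback's count */
            n = size - used;
        used += n;
    }
    return used;
}

/* Example driver callback: dumps a made-up clock state string. */
size_t clk_panicdump(void *buf, size_t len)
{
    static const char state[] = "clk0=on";
    size_t n = sizeof(state) - 1;

    if (n > len)
        n = len;
    memcpy(buf, state, n);
    return n;
}
```

A real implementation would have to respect the usual panic-context
restrictions (no sleeping, no allocation), which this sketch ignores.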

Also, are there any use cases for this dev_panicdump() outside of
pstore? For example, it would still be helpful to gather all the relevant
data of a device and put it in a separate buffer. The list of such buffers
can be parsed offline via standard ramdump analysis.
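
As a rough illustration of the "list of buffers parsed offline" idea, the
user-space C sketch below tags each device's data with a small header so
an offline tool can walk the region and pick out one device's record. The
layout (struct rec_hdr, REC_MAGIC, a 16-byte name field) and the helper
names rec_append() and rec_find() are assumptions made up for this
example, not an existing format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REC_MAGIC 0x50444d50u   /* arbitrary tag chosen for this sketch */

struct rec_hdr {
    uint32_t magic;
    uint32_t len;       /* payload bytes following the header */
    char name[16];      /* NUL-terminated device name */
};

/* Append one record at off; returns the new offset, or 0 if it does not fit. */
size_t rec_append(uint8_t *region, size_t size, size_t off,
                  const char *name, const void *data, uint32_t len)
{
    struct rec_hdr h = { .magic = REC_MAGIC, .len = len };

    if (off > size || size - off < sizeof(h) || size - off - sizeof(h) < len)
        return 0;
    strncpy(h.name, name, sizeof(h.name) - 1);  /* last byte stays NUL */
    memcpy(region + off, &h, sizeof(h));
    memcpy(region + off + sizeof(h), data, len);
    return off + sizeof(h) + len;
}

/* Offline side: walk the records and return the named device's payload. */
const uint8_t *rec_find(const uint8_t *region, size_t size,
                        const char *name, uint32_t *len_out)
{
    size_t off = 0;
    struct rec_hdr h;

    while (size - off >= sizeof(h)) {
        memcpy(&h, region + off, sizeof(h));
        /* Stop at the first invalid or truncated record. */
        if (h.magic != REC_MAGIC || size - off - sizeof(h) < h.len)
            break;
        if (strncmp(h.name, name, sizeof(h.name)) == 0) {
            *len_out = h.len;
            return region + off + sizeof(h);
        }
        off += sizeof(h) + h.len;
    }
    return NULL;
}
```

With per-device records like this, a ramdump parser would not need to
know each driver's internals to locate its data, only the record name.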

Thanks,
Pavan
