lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250217101706.2104498-1-eugen.hristev@linaro.org>
Date: Mon, 17 Feb 2025 12:16:56 +0200
From: Eugen Hristev <eugen.hristev@...aro.org>
To: linux-arm-msm@...r.kernel.org,
	linux-hardening@...r.kernel.org,
	kees@...nel.org
Cc: linux-kernel@...r.kernel.org,
	johannes@...solutions.net,
	gregkh@...uxfoundation.org,
	rafael@...nel.org,
	dakr@...nel.org,
	andersson@...nel.org,
	konradybcio@...nel.org,
	tony.luck@...el.com,
	gpiccoli@...lia.com,
	pmladek@...e.com,
	rostedt@...dmis.org,
	john.ogness@...utronix.de,
	senozhatsky@...omium.org,
	quic_mojha@...cinc.com,
	linux-arm-kernel@...ts.infradead.org,
	kernel@...cinc.com,
	Eugen Hristev <eugen.hristev@...aro.org>
Subject: [RFC][PATCH 00/10] pstore: directly mapped regions

Hello,

This series comes as an RFC proposed solution to enhance pstore and
devcoredump with the following functionality:
It is interesting to mark specific memory regions in the kernel
as core areas, such that in the even of a cataclysmic event like
panic, deadlock, hardware fault, those areas can be retrieved by
an external agent, that can be JTAG, firmware, additional debug
processor, etc.
Even though pstore panic handlers attempt to copy data into the
persistent storage, this is not always working, because the kernel
could be in a deadlock, have irqs disabled, panic handler could
crash itself, or even some memory areas could be corrupted badly
enough that the recovery mechanism doesn't work.
This patch series attempts to solve this by reusing existing
infrastructure in pstore and devcoredump, and provide a copy-free
possibility to have core kernel data exposed for creating debug
information like a coredump or list of binary elements with data.

How this works:
Drivers and key kernel infrastructure, like for example the dmesg
buffer (ftrace buffer is still work in progress and not in the series),
would register themselves into pstore, with a simple function
call taking a name, a pointer to the area and size.
That would be all, and the patch labeled "EXAMPLE" shows a driver
doing exactly that, registering it's own dev struct with string markers
that can help identifying the data later once it's being retrieved.
Further, pstore and pstore/zone would keep track of all registered
areas inside the intermediary back-end. There would not be any copy of
the data.
Pstore would require a new type of back-end, named directly mapped
(because there is no copy, and each zone directly maps the incoming
buffer). If this back-end is available, then pstore further calls
the back-end to register the area.
The back-end itself is then free to do whatever it wishes with the
received area. It can (possibly) do a copy to store data, it can
have a timer to check the area every x time periods, or store the
pointers inside a table of contents (which is what the qcom smem is
doing).
The qcom smem driver registers itself into pstore as a new type of
back-end called shared memory. It's very similar with the pstore/blk
but much simplified, to cope with just a shared memory storage.
The areas metadata are stored in a shared memory that is being managed
by the qcom_smem driver. It's called shared memory because it's being
shared between the main processor running the kernel and additional
secondary remote processors running firmware.
In this qcom smem backend case, the firmware runs when there is a
panic event, of watchdog timeout, and uses the table of contents to
retrieve useful kernel debug data.

The advantages of this proposed idea are: no kernel panic handler
intervention. Everything is being marked and prepared ahead of time,
before there is an actual crash. There is no overhead associated,
simply the areas are being marked and metadata created, no buffers
associated, no memory areas reserved for it. Memory areas being marked,
there is no possibility of getting older useless data, that was copied
before the actual events leading to a crash would occur, so the data
would be the real actual data the kernel was working on.
Of course, this would not work if there is no external agent to parse
the areas and get the useful data. However as the hardware becomes
more and more advanced, in the cases where this is available, kernel
debugging had no way of using this advantage. As the hardware evolves,
so is the kernel.

There is another aspect to underline: the directly mapped zones
change the semantics of pstore using them. Pstore relies on user being
able to read the data from a previous kernel run once the system
has rebooted. This is not valid with the directly mapped zones unless
the JTAG/firmware would copy this data into a persistent area and make
it available upon reboot (which is not impossible).

I would like to thank everyone reviewing and chiping in on the series.

The back-end implementation for qcom_smem is based on the minidump
patch series and driver written by Mukesh Ojha, thanks:
https://lore.kernel.org/lkml/20240131110837.14218-1-quic_mojha@quicinc.com/

Eugen

Eugen Hristev (10):
  pstore/zone: move pstore_device_info into zone header
  pstore/smem: add new pstore/smem type of pstore
  pstore/zone: introduce directly mapped zones
  qcom: smem: add pstore smem backend
  pstore: implement core area registration
  qcom: smem: enable smem pstore backend
  printk: export symbols for buffer address and length functions
  pstore: register kmsg into directly mapped zones if available
  devcoredump: add devcd_{un}register_core_area API
  rng: qcom_rng: EXAMPLE: registering dev structure

 drivers/base/devcoredump.c     |  13 ++
 drivers/crypto/qcom-rng.c      |  11 ++
 drivers/soc/qcom/Kconfig       |   9 +
 drivers/soc/qcom/Makefile      |   6 +-
 drivers/soc/qcom/smem.c        |  10 ++
 drivers/soc/qcom/smem_md.c     | 306 +++++++++++++++++++++++++++++++++
 drivers/soc/qcom/smem_pstore.c | 112 ++++++++++++
 fs/pstore/Kconfig              |  13 ++
 fs/pstore/Makefile             |   3 +
 fs/pstore/platform.c           |  84 +++++++++
 fs/pstore/smem.c               | 115 +++++++++++++
 fs/pstore/zone.c               | 122 +++++++++++--
 include/linux/devcoredump.h    |   3 +
 include/linux/pstore.h         |  20 +++
 include/linux/pstore_blk.h     |  14 --
 include/linux/pstore_smem.h    |   9 +
 include/linux/pstore_zone.h    |  17 ++
 include/linux/soc/qcom/smem.h  |  43 +++++
 kernel/printk/printk.c         |   2 +
 19 files changed, 885 insertions(+), 27 deletions(-)
 create mode 100644 drivers/soc/qcom/smem_md.c
 create mode 100644 drivers/soc/qcom/smem_pstore.c
 create mode 100644 fs/pstore/smem.c
 create mode 100644 include/linux/pstore_smem.h

-- 
2.43.0


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ