lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ae93899e-372e-425a-ae23-deb4bbab3eeb@rivosinc.com>
Date: Fri, 19 Sep 2025 09:17:50 +0200
From: Clément Léger <cleger@...osinc.com>
To: Yunhui Cui <cuiyunhui@...edance.com>, paul.walmsley@...ive.com,
 palmer@...belt.com, aou@...s.berkeley.edu, alex@...ti.fr, conor@...nel.org,
 atishp@...osinc.com, ajones@...tanamicro.com, apatel@...tanamicro.com,
 mchitale@...tanamicro.com, linux-riscv@...ts.infradead.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC 1/1] drivers: firmware: riscv: add unknown NMI support



On 19/09/2025 09:00, Yunhui Cui wrote:
> Unknown NMI can force the kernel to respond (e.g., panic) when the
> system encounters unrecognized critical hardware events, aiding in
> troubleshooting system faults. This is implemented via the Supervisor
> Software Events (SSE) framework.
> 
> Signed-off-by: Yunhui Cui <cuiyunhui@...edance.com>
> ---
>  arch/riscv/include/asm/sbi.h     |  1 +
>  drivers/firmware/riscv/Kconfig   | 10 +++++
>  drivers/firmware/riscv/Makefile  |  1 +
>  drivers/firmware/riscv/sse_nmi.c | 77 ++++++++++++++++++++++++++++++++
>  4 files changed, 89 insertions(+)
>  create mode 100644 drivers/firmware/riscv/sse_nmi.c
> 
> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> index 874cc1d7603a5..5801f90a88f62 100644
> --- a/arch/riscv/include/asm/sbi.h
> +++ b/arch/riscv/include/asm/sbi.h
> @@ -481,6 +481,7 @@ enum sbi_sse_attr_id {
>  
>  #define SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS	0x00000000
>  #define SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP		0x00000001
> +#define SBI_SSE_EVENT_LOCAL_UNKNOWN_NMI			0x00000002

Was this submitted to the PRS WG ? This a specification modification so
it should go through the usual process.

>  #define SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS	0x00008000
>  #define SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW	0x00010000
>  #define SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS	0x00100000
> diff --git a/drivers/firmware/riscv/Kconfig b/drivers/firmware/riscv/Kconfig
> index ed5b663ac5f91..746bac862ac46 100644
> --- a/drivers/firmware/riscv/Kconfig
> +++ b/drivers/firmware/riscv/Kconfig
> @@ -12,4 +12,14 @@ config RISCV_SBI_SSE
>  	  this option provides support to register callbacks on specific SSE
>  	  events.
>  
> +config RISCV_SSE_UNKNOWN_NMI
> +	bool "Enable SBI Supervisor Software Events unknown NMI support"
> +	depends on RISCV_SBI_SSE
> +	default y
> +	help
> +	  This option enables support for delivering unknown Non-Maskable Interrupt (NMI)
> +	  notifications via the Supervisor Software Events (SSE) framework. When enabled,
> +	  unknown NMIs can trigger kernel responses (e.g., panic) for unrecognized critical
> +	  hardware events, aiding in system fault diagnosis.
> +
>  endmenu
> diff --git a/drivers/firmware/riscv/Makefile b/drivers/firmware/riscv/Makefile
> index c8795d4bbb2ea..9242c6cd5e3e9 100644
> --- a/drivers/firmware/riscv/Makefile
> +++ b/drivers/firmware/riscv/Makefile
> @@ -1,3 +1,4 @@
>  # SPDX-License-Identifier: GPL-2.0
>  
>  obj-$(CONFIG_RISCV_SBI_SSE)		+= riscv_sbi_sse.o
> +obj-$(CONFIG_RISCV_SSE_UNKNOWN_NMI)	+= sse_nmi.o
> diff --git a/drivers/firmware/riscv/sse_nmi.c b/drivers/firmware/riscv/sse_nmi.c
> new file mode 100644
> index 0000000000000..43063f42efff0
> --- /dev/null
> +++ b/drivers/firmware/riscv/sse_nmi.c
> @@ -0,0 +1,77 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +
> +#include <linux/mm.h>
> +#include <linux/nmi.h>
> +#include <linux/riscv_sbi_sse.h>
> +#include <linux/sched/debug.h>
> +#include <linux/sysctl.h>
> +
> +#include <asm/irq_regs.h>
> +#include <asm/sbi.h>
> +
> +int panic_on_unknown_nmi = 1;
> +struct sse_event *evt;
> +static struct ctl_table_header *unknown_nmi_sysctl_header;
> +
> +const struct ctl_table unknown_nmi_table[] = {
> +	{
> +		.procname	= "panic_enable",
> +		.data		= &panic_on_unknown_nmi,
> +		.maxlen		= sizeof(int),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec_minmax,
> +		.extra1		= SYSCTL_ZERO,
> +		.extra2		= SYSCTL_ONE,
> +	},
> +};
> +
> +static void nmi_handler(struct pt_regs *regs)
> +{
> +	pr_emerg("NMI received for unknown on CPU %d.\n", smp_processor_id());
> +
> +	if (panic_on_unknown_nmi)
> +		nmi_panic(regs, "NMI: Not continuing");
> +
> +	pr_emerg("Dazed and confused, but trying to continue\n");
> +}

I'm dazed and confused as well ;) What's the point of this except
interrupting the kernel with a panic ? It seems like it's a better idea
to let the firmware handle that properly and display whatever
information are needed. Was your idea to actually force the kernel to
enter in some debug mode ?

Thanks,

Clément

> +
> +static int nmi_sse_handler(u32 evt, void *arg, struct pt_regs *regs)
> +{
> +	nmi_handler(regs);
> +
> +	return 0;
> +}
> +
> +static int sse_nmi_init(void)
> +{
> +	int ret;
> +
> +	evt = sse_event_register(SBI_SSE_EVENT_LOCAL_UNKNOWN_NMI, 0,
> +				 nmi_sse_handler, NULL);
> +	if (IS_ERR(evt))
> +		return PTR_ERR(evt);
> +
> +	ret = sse_event_enable(evt);
> +	if (ret) {
> +		sse_event_unregister(evt);
> +		return ret;
> +	}
> +
> +	unknown_nmi_sysctl_header = register_sysctl("kernel", unknown_nmi_table);
> +	if (!unknown_nmi_sysctl_header) {
> +		sse_event_disable(evt);
> +		sse_event_unregister(evt);
> +		return -ENOMEM;
> +	}
> +
> +	pr_info("Using SSE for NMI event delivery\n");
> +
> +	return 0;
> +}
> +
> +static int __init unknow_nmi_init(void)
> +{
> +	return sse_nmi_init();
> +}
> +
> +late_initcall(unknow_nmi_init);


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ