linux-kernel - RE: [PATCH 2/2] x86/MCE: Add command line option to extend MCE Records pool

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <SJ1PR11MB6083B3511D18787BE823AB2CFC482@SJ1PR11MB6083.namprd11.prod.outlook.com>
Date: Mon, 12 Feb 2024 17:29:31 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Borislav Petkov <bp@...en8.de>, "Naik, Avadhut" <avadnaik@....com>
CC: "Mehta, Sohil" <sohil.mehta@...el.com>, "x86@...nel.org" <x86@...nel.org>,
	"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"yazen.ghannam@....com" <yazen.ghannam@....com>, Avadhut Naik
	<avadhut.naik@....com>
Subject: RE: [PATCH 2/2] x86/MCE: Add command line option to extend MCE
 Records pool

> And here's the simplest scheme: you don't extend the buffer. On
> overflow, you say "Buffer full, here's the MCE" and you dump the error
> long into dmesg. Problem solved.
>
> A slicker deduplication scheme would be even better, tho. Maybe struct
> mce.count which gets incremented instead of adding the error record to
> the buffer again. And so on...

Walking the structures already allocated from the genpool in the #MC
handler may be possible, but what is the criteria for "duplicates"?
Do we avoid entering duplicates into the pool altogether? Or when the pool
is full overwrite a duplicate?

How about compile time allocation of extra space. Algorithm below for
illustrative purposes only. May need some more thought about how
to scale up.

-Tony

[Diff pasted into Outlook, chances that it will automatically apply = 0%]

diff --git a/arch/x86/kernel/cpu/mce/genpool.c b/arch/x86/kernel/cpu/mce/genpool.c
index fbe8b61c3413..0fc2925c0839 100644
--- a/arch/x86/kernel/cpu/mce/genpool.c
+++ b/arch/x86/kernel/cpu/mce/genpool.c
@@ -16,10 +16,15 @@
  * used to save error information organized in a lock-less list.
  *
  * This memory pool is only to be used to save MCE records in MCE context.
- * MCE events are rare, so a fixed size memory pool should be enough. Use
- * 2 pages to save MCE events for now (~80 MCE records at most).
+ * MCE events are rare, so a fixed size memory pool should be enough.
+ * Low CPU count systems allocate 2 pages (enough for ~64 "struct mce"
+ * records). Large systems scale up the allocation based on CPU count.
  */
+#if CONFIG_NR_CPUS < 128
 #define MCE_POOLSZ     (2 * PAGE_SIZE)
+#else
+#define MCE_POOLSZ     (CONFIG_NR_CPUS / 64 * PAGE_SIZE)
+#endif

 static struct gen_pool *mce_evt_pool;
 static LLIST_HEAD(mce_event_llist);
[agluck@...uck-desk3 mywork]$ vi arch/x86/kernel/cpu/mce/genpool.c
[agluck@...uck-desk3 mywork]$ git diff
diff --git a/arch/x86/kernel/cpu/mce/genpool.c b/arch/x86/kernel/cpu/mce/genpool.c
index fbe8b61c3413..47bf677578ca 100644
--- a/arch/x86/kernel/cpu/mce/genpool.c
+++ b/arch/x86/kernel/cpu/mce/genpool.c
@@ -16,10 +16,15 @@
  * used to save error information organized in a lock-less list.
  *
  * This memory pool is only to be used to save MCE records in MCE context.
- * MCE events are rare, so a fixed size memory pool should be enough. Use
- * 2 pages to save MCE events for now (~80 MCE records at most).
+ * MCE events are rare, but scale up with CPU count.  Low CPU count
+ * systems allocate 2 pages (enough for ~64 "struct mce" records). Large
+ * systems scale up the allocation based on CPU count.
  */
+#if CONFIG_NR_CPUS < 128
 #define MCE_POOLSZ     (2 * PAGE_SIZE)
+#else
+#define MCE_POOLSZ     (CONFIG_NR_CPUS / 64 * PAGE_SIZE)
+#endif

 static struct gen_pool *mce_evt_pool;
 static LLIST_HEAD(mce_event_llist);