linux-kernel - Re: [PATCH] x86/mce: Dynamically size space for machine check records

Message-ID: <ZeC-YGnnYAMh5kPn@agluck-desk3>
Date: Thu, 29 Feb 2024 09:26:56 -0800
From: Tony Luck <tony.luck@...el.com>
To: "Naik, Avadhut" <avadnaik@....com>
Cc: Borislav Petkov <bp@...en8.de>, "Mehta, Sohil" <sohil.mehta@...el.com>,
	"x86@...nel.org" <x86@...nel.org>,
	"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"yazen.ghannam@....com" <yazen.ghannam@....com>,
	Avadhut Naik <avadhut.naik@....com>
Subject: Re: [PATCH] x86/mce: Dynamically size space for machine check records

On Thu, Feb 29, 2024 at 12:42:38AM -0600, Naik, Avadhut wrote:
> Hi,
> 
> On 2/28/2024 17:14, Tony Luck wrote:
> > Systems with a large number of CPUs may generate a large
> > number of machine check records when things go seriously
> > wrong. But Linux has a fixed buffer that can only capture
> > a few dozen errors.
> > 
> > Allocate space based on the number of CPUs (with a minimum
> > value based on the historical fixed buffer that could store
> > 80 records).
> > 
> > Signed-off-by: Tony Luck <tony.luck@...el.com>
> > ---
> > 
> > Discussion earlier concluded with the realization that it is
> > safe to dynamically allocate the mce_evt_pool at boot time.
> > So here's a patch to do that. Scaling algorithm here is a
> > simple linear "4 records per possible CPU" with a minimum
> > of 80 to match the legacy behavior. I'm open to other
> > suggestions.
> > 
> > Note that I threw in a "+1" to the return from ilog2() when
> > calling gen_pool_create(). From reading code, and running
> > some tests, it appears that the min_alloc_order argument
> > needs to be large enough to allocate one of the mce_evt_llist
> > structures.
> > 
> > Some other gen_pool users in Linux may also need this "+1".
> > 
> 
> Somewhat confused here. Weren't we also exploring ways to avoid
> duplicate records from being added to the genpool? Has something
> changed in that regard?

I'm going to cover this in the reply to Boris.

> > +	mce_numrecords = max(80, num_possible_cpus() * 4);
> > +	mce_poolsz = mce_numrecords * (1 << order);
> > +	mce_pool = kmalloc(mce_poolsz, GFP_KERNEL);
> 
> To err on the side of caution, wouldn't kzalloc() be a safer choice here?

Seems like too much caution. When mce_gen_pool_add() allocates
an entry from the pool it does:

	memcpy(&node->mce, mce, sizeof(*mce));
	llist_add(&node->llnode, &mce_event_llist);

Between them, those two lines write every field in struct
mce_evt_llist: the memcpy() fills node->mce (including any holes in
the struct mce part of the structure), and llist_add() sets
node->llnode.
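
The argument above can be checked with a small userspace sketch. It is
not the kernel code: struct fake_mce, struct evt_node, list_push(),
event_add() and count_poison() are hypothetical stand-ins, malloc()
replaces the gen_pool allocation, and list_push() is a single-threaded
substitute for the lock-free llist_add(). The node is deliberately
poisoned before the two writes so that any untouched byte would show up.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define POISON 0xAA	/* marks bytes that were never written */

/* Hypothetical stand-in for struct llist_node. */
struct list_node {
	struct list_node *next;
};

/* Hypothetical stand-in for struct mce, with a hole after 'bank'. */
struct fake_mce {
	uint64_t status;
	uint8_t  bank;
	uint64_t addr;
};

/* Hypothetical stand-in for struct mce_evt_llist. */
struct evt_node {
	struct list_node llnode;
	struct fake_mce  mce;
};

/* The whole-node check below assumes no padding outside the two members. */
_Static_assert(sizeof(struct evt_node) ==
	       sizeof(struct list_node) + sizeof(struct fake_mce),
	       "unexpected padding in struct evt_node");

static struct list_node *event_list;

/* Single-threaded push; the kernel's llist_add() is lock-free. */
static void list_push(struct list_node *n, struct list_node **head)
{
	n->next = *head;
	*head = n;
}

/* Allocate an uninitialized (here: poisoned) node, then do the two writes. */
static struct evt_node *event_add(const struct fake_mce *mce)
{
	struct evt_node *node;

	assert(mce != NULL);
	node = malloc(sizeof(*node));
	if (!node)
		return NULL;
	memset(node, POISON, sizeof(*node));	/* simulate stale memory */

	memcpy(&node->mce, mce, sizeof(*mce));	/* copies holes as well */
	list_push(&node->llnode, &event_list);	/* writes llnode.next */
	return node;
}

/* Count the bytes of the node that still hold the poison value. */
static size_t count_poison(const struct evt_node *node)
{
	const unsigned char *p = (const unsigned char *)node;
	size_t i, n = 0;

	for (i = 0; i < sizeof(*node); i++)
		if (p[i] == POISON)
			n++;
	return n;
}
```

Adding a zero-initialized record to an empty list and then calling
count_poison() on the returned node should report no leftover poison
bytes, provided the source record itself contains no 0xAA bytes.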

-Tony
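
For reference, the sizing rule from the quoted patch ("4 records per
possible CPU" with a minimum of 80, each record occupying 1 << order
bytes) can be sketched in userspace as below. This is an illustration
only. It assumes order is ilog2() of the record size plus one, as the
"+1" note in the patch description suggests. The helper names and the
136-byte record size used in the usage note are hypothetical;
ilog2_floor() stands in for the kernel's ilog2().

```c
#include <assert.h>
#include <stddef.h>

#define MCE_MIN_RECORDS		80	/* capacity of the legacy fixed buffer */
#define MCE_RECS_PER_CPU	4	/* linear scaling factor from the patch */

/* Floor of log2(v) for v > 0: userspace stand-in for the kernel's ilog2(). */
static unsigned int ilog2_floor(size_t v)
{
	unsigned int r = 0;

	assert(v > 0);
	while (v >>= 1)
		r++;
	return r;
}

/* max(80, possible_cpus * 4) */
static size_t mce_num_records(unsigned int possible_cpus)
{
	size_t scaled = (size_t)possible_cpus * MCE_RECS_PER_CPU;

	return scaled > MCE_MIN_RECORDS ? scaled : MCE_MIN_RECORDS;
}

/*
 * Allocation order with the "+1": for a size that is not a power of
 * two, 1 << ilog2(size) is smaller than one record, so round up.
 */
static unsigned int mce_pool_order(size_t record_size)
{
	return ilog2_floor(record_size) + 1;
}

/* Total pool bytes: number of records times (1 << order). */
static size_t mce_pool_size(unsigned int possible_cpus, size_t record_size)
{
	return mce_num_records(possible_cpus) << mce_pool_order(record_size);
}
```

With these assumptions, any system with 20 or fewer possible CPUs stays
at the 80-record minimum, while a 512-CPU system gets 2048 records. For
a hypothetical 136-byte record, ilog2() alone gives order 7 (128 bytes,
too small for one record); the "+1" gives order 8 (256 bytes).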