Message-ID: <ZeC-YGnnYAMh5kPn@agluck-desk3>
Date: Thu, 29 Feb 2024 09:26:56 -0800
From: Tony Luck <tony.luck@...el.com>
To: "Naik, Avadhut" <avadnaik@....com>
Cc: Borislav Petkov <bp@...en8.de>, "Mehta, Sohil" <sohil.mehta@...el.com>,
"x86@...nel.org" <x86@...nel.org>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"yazen.ghannam@....com" <yazen.ghannam@....com>,
Avadhut Naik <avadhut.naik@....com>
Subject: Re: [PATCH] x86/mce: Dynamically size space for machine check records

On Thu, Feb 29, 2024 at 12:42:38AM -0600, Naik, Avadhut wrote:
> Hi,
>
> On 2/28/2024 17:14, Tony Luck wrote:
> > Systems with a large number of CPUs may generate a large
> > number of machine check records when things go seriously
> > wrong. But Linux has a fixed buffer that can only capture
> > a few dozen errors.
> >
> > Allocate space based on the number of CPUs (with a minimum
> > value based on the historical fixed buffer that could store
> > 80 records).
> >
> > Signed-off-by: Tony Luck <tony.luck@...el.com>
> > ---
> >
> > Discussion earlier concluded with the realization that it is
> > safe to dynamically allocate the mce_evt_pool at boot time.
> > So here's a patch to do that. Scaling algorithm here is a
> > simple linear "4 records per possible CPU" with a minimum
> > of 80 to match the legacy behavior. I'm open to other
> > suggestions.
> >
> > Note that I threw in a "+1" to the return from ilog2() when
> > calling gen_pool_create(). From reading code, and running
> > some tests, it appears that the min_alloc_order argument
> > needs to be large enough to allocate one of the mce_evt_llist
> > structures.
> >
> > Some other gen_pool users in Linux may also need this "+1".
> >
>
> Somewhat confused here. Weren't we also exploring ways to avoid
> duplicate records from being added to the genpool? Has something
> changed in that regard?

I'm going to cover this in the reply to Boris.

> > + mce_numrecords = max(80, num_possible_cpus() * 4);
> > + mce_poolsz = mce_numrecords * (1 << order);
> > + mce_pool = kmalloc(mce_poolsz, GFP_KERNEL);
>
> To err on the side of caution, wouldn't kzalloc() be a safer choice here?

Seems like too much caution. When mce_gen_pool_add() allocates
an entry from the pool it does:

	memcpy(&node->mce, mce, sizeof(*mce));
	llist_add(&node->llnode, &mce_event_llist);

Between them, those two lines write every field in the struct
mce_evt_llist (including any holes in the struct mce part of the
structure), so no uninitialized pool memory is left in the entry.
-Tony
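
For illustration only, the argument above can be checked with a small
userspace sketch. Everything in it is a simplified stand-in and an
assumption, not the kernel code: struct mce_sketch and struct
mce_evt_sketch are hypothetical cut-down versions of struct mce and
struct mce_evt_llist, fill_node() mimics only what the memcpy() and
llist_add() pair does to the new node, and malloc() plus a poison
pattern stands in for unzeroed kmalloc() memory. It also assumes a
64-bit (LP64) build so that the two members cover the whole struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define POISON 0xAA	/* marks bytes that were never written */

/* Simplified stand-ins (assumptions), not the kernel definitions. */
struct llist_node {
	struct llist_node *next;
};

struct mce_sketch {
	uint8_t  bank;		/* a padding hole follows before 'status' */
	uint64_t status;
	uint64_t addr;
};

struct mce_evt_sketch {
	struct llist_node llnode;
	struct mce_sketch mce;
};

/* On LP64 the two members cover the whole struct, with no bytes outside them. */
_Static_assert(sizeof(struct mce_evt_sketch) ==
	       sizeof(struct llist_node) + sizeof(struct mce_sketch),
	       "unexpected padding between or after members");

/* Mimic the two writes: copy the record, then link the node at the head. */
static void fill_node(struct mce_evt_sketch *node, const struct mce_sketch *m,
		      struct llist_node **head)
{
	memcpy(&node->mce, m, sizeof(*m));	/* copies the holes as well */
	node->llnode.next = *head;		/* the write llist_add() makes to the new node */
	*head = &node->llnode;
}

/* Return how many bytes of a poisoned node still hold the poison pattern. */
static size_t poison_left(void)
{
	struct llist_node *head = NULL;
	struct mce_sketch m;
	unsigned char *raw = malloc(sizeof(struct mce_evt_sketch));
	size_t i, left = 0;

	assert(raw != NULL);
	/* Poisoned memory stands in for what an unzeroed allocation may hold. */
	memset(raw, POISON, sizeof(struct mce_evt_sketch));

	/* Zero the source, holes included, then set a few fields. */
	memset(&m, 0, sizeof(m));
	m.bank = 1;
	m.status = 0x1234;
	m.addr = 0x1000;

	fill_node((struct mce_evt_sketch *)raw, &m, &head);

	for (i = 0; i < sizeof(struct mce_evt_sketch); i++)
		if (raw[i] == POISON)
			left++;

	free(raw);
	return left;
}
```

Calling poison_left() should return 0 under these assumptions: the
memcpy() overwrites the whole record including its padding, and the
list linkage overwrites the only remaining member, so zeroing the
pool up front would not change what a consumer of the entry sees.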