Message-ID: <20231214165846.GOZXs0RqkbpovdQWKH@fat_crate.local>
Date:   Thu, 14 Dec 2023 17:58:46 +0100
From:   Borislav Petkov <bp@...en8.de>
To:     Tony Luck <tony.luck@...el.com>
Cc:     Yazen Ghannam <yazen.ghannam@....com>,
        Smita.KoralahalliChannabasappa@....com,
        dave.hansen@...ux.intel.com, x86@...nel.org,
        linux-edac@...r.kernel.org, linux-kernel@...r.kernel.org,
        patches@...ts.linux.dev
Subject: Re: [PATCH v9 2/3] x86/mce: Add per-bank CMCI storm mitigation

On Mon, Nov 27, 2023 at 04:42:02PM -0800, Tony Luck wrote:
> On Mon, Nov 27, 2023 at 12:14:28PM -0800, Tony Luck wrote:
> > On Mon, Nov 27, 2023 at 11:50:26AM -0800, Tony Luck wrote:
> > > On Tue, Nov 21, 2023 at 12:54:48PM +0100, Borislav Petkov wrote:
> > > > On Tue, Nov 14, 2023 at 02:04:46PM -0800, Tony Luck wrote:
> > > But it isn't doing the same thing.  The timer calls:
> > > 
> > > 	machine_check_poll(0, this_cpu_ptr(&mce_poll_banks));
> > > 
> > > and cmci_mc_poll_banks() calls:
> > > 
> > > 	machine_check_poll(0, this_cpu_ptr(&mce_poll_banks));
> 
>         machine_check_poll(0, this_cpu_ptr(&mce_banks_owned));

Hmm, so I applied your v10 and this call with mce_banks_owned is done
only in cmci_recheck(), which is on some init path.

The thresholding interrupt calls it too.

The timer ends up calling mc_poll_banks_default() which does

	machine_check_poll(0, this_cpu_ptr(&mce_poll_banks));

I presume we don't do:

        if (!cmci_supported(&banks)) {
                mc_poll_banks = cmci_mc_poll_banks;
                return;
        }

usually on Intel. And even if we did, cmci_mc_poll_banks() calls

	machine_check_poll(0, this_cpu_ptr(&mce_poll_banks));

too.

So regardless of what machine you have, you do call the mc_poll_banks
pointer, which in both cases does

	machine_check_poll(0, this_cpu_ptr(&mce_poll_banks));

The *thresholding* interrupt does

	machine_check_poll(0, this_cpu_ptr(&mce_banks_owned));

and you're saying

	mce_poll_banks and mce_banks_owned

are disjoint.

That's what you mean, right?

Because if so, yes, that makes sense. If the sets of MCA banks polled
and handled in the thresholding interrupt are disjoint, we should be ok.
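
To make the disjointness argument concrete, here is a minimal userspace
sketch. It is not kernel code: struct cpu_banks, bank_mask_t and the two
helpers are hypothetical names, and it assumes that a bank claimed for
CMCI is dropped from the polled set at the same time, so the two masks
never share a bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per MCA bank, for a single CPU. */
typedef uint64_t bank_mask_t;

struct cpu_banks {
	bank_mask_t poll_banks;		/* stands in for mce_poll_banks */
	bank_mask_t banks_owned;	/* stands in for mce_banks_owned */
};

/* True when no bank is both timer-polled and handled by the interrupt. */
static bool banks_disjoint(const struct cpu_banks *b)
{
	return (b->poll_banks & b->banks_owned) == 0;
}

/*
 * Assumed handover: move a bank from the polled set to the owned set.
 * Clearing and setting together keeps the two sets disjoint.
 * Caller must pass bank < 64.
 */
static void claim_bank_for_cmci(struct cpu_banks *b, unsigned int bank)
{
	bank_mask_t bit = (bank_mask_t)1 << bank;

	b->poll_banks &= ~bit;
	b->banks_owned |= bit;
}
```

With the sets kept like this, the timer path and the thresholding
interrupt path never look at the same bank, which is the property being
asked about above.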

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
