lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230623144542.GBZJWwFvA+1uXC1A1g@fat_crate.local>
Date:   Fri, 23 Jun 2023 16:45:42 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     Smita.KoralahalliChannabasappa@....com
Cc:     Tony Luck <tony.luck@...el.com>,
        Yazen Ghannam <yazen.ghannam@....com>,
        dave.hansen@...ux.intel.com, x86@...nel.org,
        linux-edac@...r.kernel.org, linux-kernel@...r.kernel.org,
        patches@...ts.linux.dev
Subject: Re: [PATCH v6 3/4] x86/mce: Handle AMD threshold interrupt storms

On Fri, Jun 16, 2023 at 11:27:43AM -0700, Tony Luck wrote:
> +static void _reset_block(struct threshold_block *block)
> +{
> +	struct thresh_restart tr;
> +
> +	memset(&tr, 0, sizeof(tr));
> +	tr.b = block;
> +	threshold_restart_bank(&tr);
> +}

> +
> +static void toggle_interrupt_reset_block(struct threshold_block *block, bool on)
> +{
> +	if (!block)
> +		return;
> +
> +	block->interrupt_enable = !!on;
> +	_reset_block(block);
> +}
> +
> +void mce_amd_handle_storm(int bank, bool on)
> +{
> +	struct threshold_block *first_block = NULL, *block = NULL, *tmp = NULL;
> +	struct threshold_bank **bp = this_cpu_read(threshold_banks);
> +	unsigned long flags;
> +
> +	if (!bp)
> +		return;
> +
> +	local_irq_save(flags);
> +
> +	first_block = bp[bank]->blocks;
> +	if (!first_block)
> +		goto end;
> +
> +	toggle_interrupt_reset_block(first_block, on);
> +
> +	list_for_each_entry_safe(block, tmp, &first_block->miscj, miscj)
> +		toggle_interrupt_reset_block(block, on);
> +end:
> +	local_irq_restore(flags);
> +}

There's already other code which does this threshold block control. Pls
refactor and unify it instead of adding almost redundant similar functions.

>  static void mce_threshold_block_init(struct threshold_block *b, int offset)
>  {
>  	struct thresh_restart tr = {
> @@ -868,6 +909,7 @@ static void amd_threshold_interrupt(void)
>  	struct threshold_block *first_block = NULL, *block = NULL, *tmp = NULL;
>  	struct threshold_bank **bp = this_cpu_read(threshold_banks);
>  	unsigned int bank, cpu = smp_processor_id();
> +	u64 status;
>  
>  	/*
>  	 * Validate that the threshold bank has been initialized already. The
> @@ -881,6 +923,13 @@ static void amd_threshold_interrupt(void)
>  		if (!(per_cpu(bank_map, cpu) & BIT_ULL(bank)))
>  			continue;
>  
> +		rdmsrl(mca_msr_reg(bank, MCA_STATUS), status);
> +		track_cmci_storm(bank, status);

So this is called from interrupt context.

There's another track_cmci_storm() from machine_check_poll() which can
happen in process context.

And there's no sync (locking) between the two. Not good.

Why are even two calls needed on AMD?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ