lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sun, 21 Apr 2024 12:07:30 +0200
From: Borislav Petkov <bp@...en8.de>
To: Serge Semin <fancer.lancer@...il.com>
Cc: Michal Simek <michal.simek@....com>,
	Alexander Stein <alexander.stein@...tq-group.com>,
	Tony Luck <tony.luck@...el.com>, James Morse <james.morse@....com>,
	Mauro Carvalho Chehab <mchehab@...nel.org>,
	Robert Richter <rric@...nel.org>, Dinh Nguyen <dinguyen@...nel.org>,
	Punnaiah Choudary Kalluri <punnaiah.choudary.kalluri@...inx.com>,
	Arnd Bergmann <arnd@...db.de>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	linux-arm-kernel@...ts.infradead.org, linux-edac@...r.kernel.org,
	linux-kernel@...r.kernel.org, Sherry Sun <sherry.sun@....com>,
	Borislav Petkov <bp@...e.de>
Subject: Re: [PATCH v5 01/20] EDAC/synopsys: Fix ECC status data and IRQ
 disable race condition

On Tue, Apr 16, 2024 at 01:06:11PM +0300, Serge Semin wrote:
> It looks indeed crazy because the method is called enable_intr() and
> is called in the IRQ handler. Right, re-enabling the IRQ in the handler
> doesn't look good. But under the hood it was just a way to fix the
> problem described in the commit you cited. enable_intr() just gets
> back the IRQ Enable flags cleared a bit before in the
> zynqmp_get_error_info() method.
> 
> The root cause of the problem is that the IRQ status/clear flags:
> ECCCLR.ecc_corrected_err_clr	(R/W1C)
> ECCCLR.ecc_uncorrected_err_clr	(R/W1C)
> ECCCLR.ecc_corr_err_cnt_clr	(R/W1C)
> ECCCLR.ecc_uncorr_err_cnt_clr	(R/W1C)
> etc
> 
> and the IRQ enable/disable flags (since v3.10a):
> ECCCLR.ecc_corrected_err_intr_en	(R/W)
> ECCCLR.ecc_uncorrected_err_intr_en	(R/W)
> 
> reside in a single register - ECCCLR (Synopsys has renamed it to
> ECCCTL since v3.10a due to adding the IRQ En/Dis flags).
> 
> Thus any concurrent access to that CSR like "Clear IRQ
> status/counters" and "IRQ disable/enable" need to be protected from
> the race condition.

Ok, let's pick this apart one-by-one. I'll return to the rest you're
explaining as needed.

So, can writes to the status/counter bits while writing the *same* bit
to the IRQ enable/disable bit prevent any race conditions?

Meaning, you only change the status and counter bits and you preserve
the same value in the IRQ disable/enable bit?

IOW, I'm thinking of shadowing that ECCCTL in software so that we update
it from the shadowed value.

Because, AFAIU, the spinlock won't help if you grab it, clear the
status/counter bits and disable the interrupt in the process. You want
to only clear the status/counter bits and leave the interrupt enabled.

Right?

IOW, in one single write you do:

ECCCLR.ecc_corrected_err_clr=1
ECCCLR.ecc_uncorrected_err_clr=1
ECCCLR.ecc_corr_err_cnt_clr=1
ECCCLR.ecc_uncorr_err_cnt_clr=1
ECCCLR.ecc_corrected_err_intr_en=1
ECCCLR.ecc_uncorrected_err_intr_en=1

?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ