Date:   Mon, 19 Apr 2021 14:28:28 -0700
From:   Jue Wang <juew@...gle.com>
To:     tony.luck@...el.com
Cc:     bp@...en8.de, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        luto@...nel.org, naoya.horiguchi@....com, x86@...nel.org,
        yaoaili@...gsoft.com
Subject: Re: [PATCH 4/4] x86/mce: Avoid infinite loop for copy from user recovery

On Thu, 25 Mar 2021 17:02:35 -0700, Tony Luck wrote:
...

> But there are places in the kernel where the code assumes that this
> EFAULT return was simply because of a page fault. The code takes some
> action to fix that, and then retries the access. This results in a second
> machine check.

What about returning EHWPOISON instead of EFAULT and updating the callers
to handle EHWPOISON explicitly, i.e., not retry but give up on the page?
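
To illustrate the caller-side change, here is a rough userspace sketch, not
actual kernel code. The names fake_page, fake_copy, fake_fault_in and
copy_with_retry are hypothetical stand-ins invented for this example; only
the EFAULT and EHWPOISON errno values are real. The point is that a caller
keeps its fix-up-and-retry path for EFAULT but returns immediately on
EHWPOISON, so the poisoned page is touched only once.

```c
#include <assert.h>
#include <errno.h>

#ifndef EHWPOISON
/* Fallback for non-Linux builds; 133 is the asm-generic Linux value. */
#define EHWPOISON 133
#endif

/* Hypothetical page states, used only by this simulation. */
enum page_state { PAGE_OK, PAGE_NOT_PRESENT, PAGE_POISONED };

struct fake_page {
	enum page_state state;
	int accesses;	/* how many times the page was touched */
};

/* Stand-in for a copy-from-user primitive that distinguishes the two errors. */
static int fake_copy(struct fake_page *pg)
{
	pg->accesses++;
	switch (pg->state) {
	case PAGE_OK:
		return 0;
	case PAGE_NOT_PRESENT:
		return -EFAULT;		/* ordinary fault, fixable by the caller */
	default:
		return -EHWPOISON;	/* hardware error, retrying cannot help */
	}
}

/* Stand-in for the caller's fix-up step (e.g. faulting the page in). */
static void fake_fault_in(struct fake_page *pg)
{
	if (pg->state == PAGE_NOT_PRESENT)
		pg->state = PAGE_OK;
}

/* Retry only on -EFAULT; any other result, including -EHWPOISON, ends the loop. */
static int copy_with_retry(struct fake_page *pg, int max_retries)
{
	assert(max_retries >= 0);
	for (int attempt = 0; ; attempt++) {
		int ret = fake_copy(pg);

		if (ret != -EFAULT || attempt == max_retries)
			return ret;
		fake_fault_in(pg);
	}
}
```

With a page that is merely not present, copy_with_retry() fixes it up and
succeeds on the second access. With a poisoned page it returns -EHWPOISON
after a single access, which is the "give up on the page" behavior.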

My main concern is that the strong assumption that the kernel can't hit more
than a fixed number of poisoned cache lines before returning to user space
may simply not be true.

When a DIMM goes bad, it can easily affect an entire bank or an entire RAM
device chip. Even with memory interleaving, it's possible that a kernel
control path touches lots of poisoned cache lines in the buffer it is working
through.
