linux-kernel - RE: [PATCH 4/4] x86/mce: Avoid infinite loop for copy from user recovery

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <89a1b424a211459ab522c0d2c3e8fc98@intel.com>
Date:   Thu, 8 Apr 2021 16:06:10 +0000
From:   "Luck, Tony" <tony.luck@...el.com>
To:     Borislav Petkov <bp@...en8.de>
CC:     "x86@...nel.org" <x86@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        Andy Lutomirski <luto@...nel.org>,
        Aili Yao <yaoaili@...gsoft.com>,
        HORIGUCHI NAOYA( 堀口　直也) 
        <naoya.horiguchi@....com>
Subject: RE: [PATCH 4/4] x86/mce: Avoid infinite loop for copy from user
 recovery

> What I'm still unclear on, does this new version address that
> "mysterious" hang or panic which the validation team triggered or you
> haven't checked yet?

No :-(

They are triggering some case where multiple threads in a process hit the same
poison, and somehow memory_failure() fails to complete offlining the page. At this
point any other threads that hit that page get the early return from memory_failure
(because the page flags say it is poisoned) ... and so we loop.

But the "recover from cases where multiple machine checks happen
simultaneously" case is orthogonal to the "do the right thing to recover
when the kernel touches poison at a user address". So I think we can
tackle them separately

-Tony