Date:   Fri, 21 Oct 2022 04:41:04 +0000
From:   "Luck, Tony" <tony.luck@...el.com>
To:     David Laight <David.Laight@...LAB.COM>,
        Shuai Xue <xueshuai@...ux.alibaba.com>
CC:     Naoya Horiguchi <naoya.horiguchi@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Miaohe Lin <linmiaohe@...wei.com>,
        "Matthew Wilcox" <willy@...radead.org>,
        "Williams, Dan J" <dan.j.williams@...el.com>,
        Michael Ellerman <mpe@...erman.id.au>,
        Nicholas Piggin <npiggin@...il.com>,
        Christophe Leroy <christophe.leroy@...roup.eu>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>
Subject: RE: [PATCH v2] mm, hwpoison: Try to recover from copy-on write faults

>> When we do return to user mode the task is going to be busy servicing
>> a SIGBUS ... so shouldn't try to touch the poison page before the
>> memory_failure() called by the worker thread cleans things up.
>
> What about an RT process on a busy system?
> The worker threads are pretty low priority.

Most tasks don't have a SIGBUS handler ... so they just die without any possibility of accessing the poison.

If this task DOES have a SIGBUS handler, and that handler for some bizarre reason just does a "return"
so the task jumps back to the instruction that caused the COW, then there is a 63/64
likelihood that it is touching a different cache line from the poisoned one.
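
The 63/64 figure presumably comes from counting cache lines per page. A minimal sketch of that arithmetic, assuming 4 KiB pages and 64-byte cache lines (typical for x86-64, but not stated in this thread); the macro and function names are hypothetical, for illustration only:

```c
#include <stdint.h>

/* Assumed sizes: neither value is given in the thread. */
#define ASSUMED_PAGE_SIZE  4096u
#define ASSUMED_LINE_SIZE  64u

/* Number of cache lines in one page: 4096 / 64 = 64. */
static unsigned lines_per_page(void)
{
	return ASSUMED_PAGE_SIZE / ASSUMED_LINE_SIZE;
}

/* Index (0..63) of the cache line within its page that holds addr. */
static unsigned line_index(uintptr_t addr)
{
	return (unsigned)((addr % ASSUMED_PAGE_SIZE) / ASSUMED_LINE_SIZE);
}

/*
 * For two addresses in the same page: nonzero if the retried access
 * lands on the same cache line as the poisoned one.
 */
static int hits_poisoned_line(uintptr_t access, uintptr_t poison)
{
	return line_index(access) == line_index(poison);
}
```

With one poisoned line out of 64, a retried access that is equally likely to land on any line of the page misses the poison 63 times out of 64. That uniformity is also an assumption of this sketch.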

In the 1/64 case ... it's probably a simple store (since there was a COW, we know it was trying to
modify the page) ... so it won't generate another machine check (those only happen for reads).

But maybe it is some RMW instruction ... then, if none of the above cases applies ... we
could get another machine check from the same address. But then we just follow the usual
recovery path.

-Tony
