lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 12 Jan 2021 09:00:14 -0800
From:   Andy Lutomirski <luto@...nel.org>
To:     "Luck, Tony" <tony.luck@...el.com>
Cc:     Borislav Petkov <bp@...en8.de>, X86 ML <x86@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Darren Hart <dvhart@...radead.org>,
        Andy Lutomirski <luto@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-edac <linux-edac@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>
Subject: Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

> On Jan 11, 2021, at 2:21 PM, Luck, Tony <tony.luck@...el.com> wrote:
>
> On Mon, Jan 11, 2021 at 02:11:56PM -0800, Andy Lutomirski wrote:
>>
>>>> On Jan 11, 2021, at 1:45 PM, Tony Luck <tony.luck@...el.com> wrote:
>>>
>>> Recovery action when get_user() triggers a machine check uses the fixup
>>> path to make get_user() return -EFAULT.  Also queue_task_work() sets up
>>> so that kill_me_maybe() will be called on return to user mode to send a
>>> SIGBUS to the current process.
>>>
>>> But there are places in the kernel where the code assumes that this
>>> EFAULT return was simply because of a page fault. The code takes some
>>> action to fix that, and then retries the access. This results in a second
>>> machine check.
>>>
>>> While processing this second machine check queue_task_work() is called
>>> again. But since this uses the same callback_head structure that
>>> was used in the first call, the net result is an entry on the
>>> current->task_works list that points to itself.
>>
>> Is this happening in pagefault_disable context or normal sleepable fault context?  If the latter, maybe we should reconsider finding a way for the machine check code to do its work inline instead of deferring it.
>
> The first machine check is in pagefault_disable() context.
>
> static int get_futex_value_locked(u32 *dest, u32 __user *from)
> {
>        int ret;
>
>        pagefault_disable();
>        ret = __get_user(*dest, from);

I have very mixed feelings as to whether we should even try to recover
from the first failure like this.  If we actually want to try to
recover, perhaps we should instead arrange for the second MCE to
recover successfully instead of panicking.

--Andy


>        pagefault_enable();
>
>        return (ret == -ENXIO) ? ret : ret ? -EFAULT : 0;
> }
>
> -Tony

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ