Date:   Mon, 20 Jun 2022 09:53:26 +0800
From:   Tong Tiangen <tongtiangen@...wei.com>
To:     Mark Rutland <mark.rutland@....com>
CC:     James Morse <james.morse@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        "Ingo Molnar" <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Robin Murphy <robin.murphy@....com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "Catalin Marinas" <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        "Alexander Viro" <viro@...iv.linux.org.uk>,
        Michael Ellerman <mpe@...erman.id.au>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>, <x86@...nel.org>,
        "H . Peter Anvin" <hpa@...or.com>, <linuxppc-dev@...ts.ozlabs.org>,
        <linux-arm-kernel@...ts.infradead.org>,
        <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
        Kefeng Wang <wangkefeng.wang@...wei.com>,
        Xie XiuQi <xiexiuqi@...wei.com>,
        Guohanjun <guohanjun@...wei.com>
Subject: Re: [PATCH -next v5 6/8] arm64: add support for machine check error
 safe



On 2022/6/18 20:52, Mark Rutland wrote:
> On Sat, Jun 18, 2022 at 05:18:55PM +0800, Tong Tiangen wrote:
>> On 2022/6/17 16:55, Mark Rutland wrote:
>>> On Sat, May 28, 2022 at 06:50:54AM +0000, Tong Tiangen wrote:
>>>> +static bool arm64_do_kernel_sea(unsigned long addr, unsigned int esr,
>>>> +				     struct pt_regs *regs, int sig, int code)
>>>> +{
>>>> +	if (!IS_ENABLED(CONFIG_ARCH_HAS_COPY_MC))
>>>> +		return false;
>>>> +
>>>> +	if (user_mode(regs) || !current->mm)
>>>> +		return false;
>>>
>>> What's the `!current->mm` check for?
>>
>> At first, I considered that only user processes have the opportunity to
>> recover when they trigger a memory error.
>>
>> But this restriction seems unreasonable. When a kernel thread triggers a
>> memory error, it can also be recovered. For instance:
>>
>> https://lore.kernel.org/linux-mm/20220527190731.322722-1-jiaqiyan@google.com/
>>
>> And I think if(!current->mm) should be added below:
>>
>> if(!current->mm) {
>> 	set_thread_esr(0, esr);
>> 	arm64_force_sig_fault(...);
>> }
>> return true;
> 
> Why does 'current->mm' have anything to do with this, though?

Sorry, typo, my original logic was:
if(current->mm) {
	[...]
}

> 
> There can be kernel threads with `current->mm` set in unusual circumstances
> (and there's a lot of kernel code out there which handles that wrong), so if
> you want to treat user tasks differently, we should be doing something like
> checking PF_KTHREAD, or adding something like an is_user_task() helper.
> 

OK, I do want to treat user tasks differently here, and I had not taken
into account what you said. This will be fixed in the next version
according to your suggestion.

As follows:
if (!(current->flags & PF_KTHREAD)) {
   set_thread_esr(0, esr);
   arm64_force_sig_fault(...);
}
return true;
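
To illustrate why the PF_KTHREAD check is more robust than testing
current->mm, here is a minimal userspace sketch. It is not kernel code:
struct task_struct and struct mm_struct are cut-down stand-ins, the
PF_KTHREAD value is copied only for illustration, and is_user_task() is the
hypothetical helper Mark mentions, not an existing function in this series.
The interesting case is a kernel thread that has temporarily adopted an mm
(for example via kthread_use_mm()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types; only the fields used here. */
#define PF_KTHREAD 0x00200000u	/* illustrative copy of the kernel flag */

struct mm_struct { int placeholder; };

struct task_struct {
	unsigned int flags;
	struct mm_struct *mm;
};

/* Fragile test: a kernel thread that borrowed an mm passes it. */
static bool has_mm(const struct task_struct *t)
{
	return t->mm != NULL;
}

/* Hypothetical helper; assumed for illustration only. */
static bool is_user_task(const struct task_struct *t)
{
	return !(t->flags & PF_KTHREAD);
}

/* Exercise both tests on three kinds of task. */
static void check_task_classification(void)
{
	struct mm_struct mm = { 0 };
	struct task_struct user = { .flags = 0, .mm = &mm };
	struct task_struct kthread = { .flags = PF_KTHREAD, .mm = NULL };
	struct task_struct kthread_with_mm = { .flags = PF_KTHREAD, .mm = &mm };

	assert(is_user_task(&user) && has_mm(&user));
	assert(!is_user_task(&kthread) && !has_mm(&kthread));
	/* The two tests disagree only here. */
	assert(!is_user_task(&kthread_with_mm) && has_mm(&kthread_with_mm));
}
```

Calling check_task_classification() from a test program passes silently.
The last assertion is the case where an mm-based check would wrongly send a
signal to a kernel thread.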


> [...]
> 
>>>> +
>>>> +	if (apei_claim_sea(regs) < 0)
>>>> +		return false;
>>>> +
>>>> +	if (!fixup_exception_mc(regs))
>>>> +		return false;
>>>
>>> I thought we still wanted to signal the task in this case? Or do you expect to
>>> add that into `fixup_exception_mc()` ?
>>
>> Yeah, returning false here means the task is signalled in do_sea() ->
>> arm64_notify_die().
> 
> I mean when we do the fixup.
> 
> I thought the idea was to apply the fixup (to stop the kernel from crashing),
> but still to deliver a fatal signal to the user task since we can't do what the
> user task asked us to.
> 

Yes, that's what I mean. :)
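
The control flow agreed on above can be sketched in userspace as follows.
This is only a model under stated assumptions, not the patch: struct sea_ctx
and sketch_do_kernel_sea() are hypothetical names, each field stands in for
the kernel call named in its comment, the CONFIG_ARCH_HAS_COPY_MC check is
left out, and "signalled" merely records where set_thread_esr() and
arm64_force_sig_fault() would be called.

```c
#include <assert.h>
#include <stdbool.h>

#define PF_KTHREAD 0x00200000u	/* illustrative copy of the kernel flag */

/* Hypothetical container for what the real handler reads from the kernel. */
struct sea_ctx {
	bool user_mode;		/* stands in for user_mode(regs) */
	int apei_result;	/* stands in for apei_claim_sea(regs) */
	bool has_mc_fixup;	/* stands in for fixup_exception_mc(regs) */
	unsigned int task_flags;	/* stands in for current->flags */
	bool signalled;		/* output: a signal would be forced */
};

/* Returns true when the fault was handled and the kernel may continue. */
static bool sketch_do_kernel_sea(struct sea_ctx *c)
{
	c->signalled = false;

	if (c->user_mode)
		return false;	/* not a kernel-mode access */
	if (c->apei_result < 0)
		return false;	/* firmware did not claim the error */
	if (!c->has_mc_fixup)
		return false;	/* no fixup: caller falls back to die/notify */

	/* Fixup applied; a user task still gets a fatal signal. */
	if (!(c->task_flags & PF_KTHREAD))
		c->signalled = true;

	return true;
}
```

In this model a user task hitting a fixable error yields "handled" plus a
signal, a kernel thread yields "handled" without one, and every early
return leaves the decision to the existing do_sea() path.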

>>>> +
>>>> +	set_thread_esr(0, esr);
>>>
>>> Why are we not setting the address? Is that deliberate, or an oversight?
>>
>> Here fault_address is set to 0; I followed the logic of arm64_notify_die().
>>
>> void arm64_notify_die(...)
>> {
>>           if (user_mode(regs)) {
>>                   WARN_ON(regs != current_pt_regs());
>>                   current->thread.fault_address = 0;
>>                   current->thread.fault_code = err;
>>
>>                   arm64_force_sig_fault(signo, sicode, far, str);
>>           } else {
>>                   die(str, regs, err);
>>           }
>> }
>>
>> I don't know exactly why. Do you know why arm64_notify_die() does this? :)
> 
> To be honest, I don't know, and that looks equally suspicious to me.
> 
> Looking at the git history, that was added in commit:
> 
>    9141300a5884b57c ("arm64: Provide read/write fault information in compat signal handlers")
> 
> ... so maybe Catalin recalls why.
> 
> Perhaps the assumption is just that this will be fatal and so unimportant? ...
> but in that case the same logic would apply to the ESR value, so it's not clear
> to me.

OK, let's keep setting it to 0 for now. If this changes later, both
places should be changed together.

Thanks,
Tong.

> 
> Mark.
> 
> .
