Date:   Wed, 12 Apr 2023 15:21:38 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Liu Shixin <liushixin2@...wei.com>,
        Naoya Horiguchi <naoya.horiguchi@....com>,
        Tony Luck <tony.luck@...el.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Muchun Song <muchun.song@...ux.dev>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH -next] mm: hwpoison: support recovery from HugePage
 copy-on-write faults

On 04/12/23 14:57, Andrew Morton wrote:
> On Wed, 12 Apr 2023 11:13:50 -0700 Mike Kravetz <mike.kravetz@...cle.com> wrote:
> 
> > On 04/11/23 17:27, Liu Shixin wrote:
> > > Patch a873dfe1032a ("mm, hwpoison: try to recover from copy-on write faults")
> > > introduced a new copy_user_highpage_mc() function, and fix the kernel crash
> > > when the kernel is copying a normal page as the result of a copy-on-write
> > > fault and runs into an uncorrectable error. But it doesn't work for HugeTLB.
> > 
> > Andrew asked about user-visible effects.  Perhaps, a better way of
> > stating this in the commit message might be:
> > 
> > Commit a873dfe1032a ("mm, hwpoison: try to recover from copy-on write
> > faults") introduced the routine copy_user_highpage_mc() to gracefully
> > handle copying of user pages with uncorrectable errors.  Previously,
> > such copies would result in a kernel crash.  hugetlb has separate code
> > paths for copy-on-write and does not benefit from the changes made in
> > commit a873dfe1032a.

I was just going to suggest adding the line,

Hence, copy-on-write of hugetlb user pages with uncorrectable errors
will result in a kernel crash as was the case with 'normal' pages before
commit a873dfe1032a.

However, I'm guessing it might be more clear if we start with the
runtime effects.  Something like:

Copy-on-write of hugetlb user pages with uncorrectable errors will result
in a kernel crash.  This is because the copy is performed in kernel mode
and in general we cannot handle accessing memory with such errors while
in kernel mode.  Commit a873dfe1032a ("mm, hwpoison: try to recover from
copy-on write faults") introduced the routine copy_mc_user_highpage() to
gracefully handle copying of user pages with uncorrectable errors.  However,
the separate hugetlb copy-on-write code paths were not modified as part
of commit a873dfe1032a.

> > 
> > Modify hugetlb copy-on-write code paths to use copy_mc_user_highpage()
> > so that they can also gracefully handle uncorrectable errors in user
> > pages.  This involves changing the hugetlb specific routine
> > "copy_user_folio()" from type void to int so that it can return an error.
> > Modify the hugetlb userfaultfd code in the same way so that it can return
> > -EHWPOISON if it encounters an uncorrectable error.
> 
> Thanks, but...  what are the runtime effects?  What does hugetlb
> presently do when encountering these uncorrectable errors?
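
To make the void-to-int change described in the quoted text concrete, here
is a minimal userspace sketch of the pattern.  It is not the patch and does
not use kernel APIs: every fake_* name, the page layout, and the poisoned[]
flag are hypothetical stand-ins, and the only thing taken from the
discussion is that the copy routine returns an error which the fault path
propagates as -EHWPOISON instead of crashing.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#ifndef EHWPOISON
#define EHWPOISON 133 /* Linux value; fallback for platforms without it */
#endif

/* Hypothetical sizes, chosen only to keep the sketch small. */
#define FAKE_SUBPAGE_SIZE 64
#define FAKE_NR_SUBPAGES  8

/* Hypothetical model of a huge page made of several subpages. */
struct fake_huge_page {
	unsigned char data[FAKE_NR_SUBPAGES][FAKE_SUBPAGE_SIZE];
	bool poisoned[FAKE_NR_SUBPAGES]; /* simulates an uncorrectable error */
};

/*
 * Stand-in for a machine-check-safe subpage copy: returns 0 on success
 * and nonzero if the source subpage is marked poisoned.
 */
static int fake_copy_mc_subpage(struct fake_huge_page *dst,
				const struct fake_huge_page *src, size_t idx)
{
	assert(idx < FAKE_NR_SUBPAGES);
	if (src->poisoned[idx])
		return -1;
	memcpy(dst->data[idx], src->data[idx], FAKE_SUBPAGE_SIZE);
	return 0;
}

/*
 * The routine that used to be "void" in this model: it now returns int
 * so that a failed subpage copy can be reported to the caller.
 */
static int fake_copy_huge_page(struct fake_huge_page *dst,
			       const struct fake_huge_page *src)
{
	for (size_t i = 0; i < FAKE_NR_SUBPAGES; i++) {
		if (fake_copy_mc_subpage(dst, src, i))
			return -EHWPOISON;
	}
	return 0;
}

/*
 * Simulated copy-on-write fault: on error the new page is simply
 * discarded and the error is handed back instead of taking the
 * whole system down.
 */
static int fake_cow_fault(const struct fake_huge_page *src)
{
	struct fake_huge_page new_page;

	memset(&new_page, 0, sizeof(new_page));
	return fake_copy_huge_page(&new_page, src);
}
```

Calling fake_cow_fault() on a clean source page returns 0; marking any
subpage as poisoned makes it return -EHWPOISON.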

-- 
Mike Kravetz
