Message-ID: <20230412233719.GC4759@monkey>
Date: Wed, 12 Apr 2023 16:37:19 -0700
From: Mike Kravetz <mike.kravetz@...cle.com>
To: Andrew Morton <akpm@...ux-foundation.org>,
Tony Luck <tony.luck@...el.com>
Cc: Liu Shixin <liushixin2@...wei.com>,
Naoya Horiguchi <naoya.horiguchi@....com>,
Miaohe Lin <linmiaohe@...wei.com>,
Muchun Song <muchun.song@...ux.dev>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH -next] mm: hwpoison: support recovery from HugePage copy-on-write faults

On 04/12/23 15:56, Andrew Morton wrote:
> On Wed, 12 Apr 2023 15:21:38 -0700 Mike Kravetz <mike.kravetz@...cle.com> wrote:
>
> > > > Commit a873dfe1032a ("mm, hwpoison: try to recover from copy-on write
> > > > faults") introduced the routine copy_user_highpage_mc() to gracefully
> > > > handle copying of user pages with uncorrectable errors. Previously,
> > > > such copies would result in a kernel crash. hugetlb has separate code
> > > > paths for copy-on-write and does not benefit from the changes made in
> > > > commit a873dfe1032a.
> >
> > I was just going to suggest adding the line,
> >
> > Hence, copy-on-write of hugetlb user pages with uncorrectable errors
> > will result in a kernel crash as was the case with 'normal' pages before
> > commit a873dfe1032a.
> >
> > However, I'm guessing it might be more clear if we start with the
> > runtime effects. Something like:
> >
> > copy-on-write of hugetlb user pages with uncorrectable errors will result
> > in a kernel crash. This is because the copy is performed in kernel mode
> > and in general we can not handle accessing memory with such errors while
> > in kernel mode. Commit a873dfe1032a ("mm, hwpoison: try to recover from
> > copy-on write faults") introduced the routine copy_user_highpage_mc() to
> > gracefully handle copying of user pages with uncorrectable errors. However,
> > the separate hugetlb copy-on-write code paths were not modified as part
> > of commit a873dfe1032a.
>
> Sounds good. So I assume cc:stable is desirable.

I do not think cc:stable is necessary/desirable. Why?

a873dfe1032a was an enhancement to better handle copying pages with memory
errors in the kernel. IIUC, we never handled that situation in the past.
I would not call the fact that it did not take hugetlb into account a bug,
although some might argue that it should have addressed all callers of
copy_user_highpage(), which would have included hugetlb. IMO, there would be
little to gain by backporting to 6.1, as the issue of copying pages with
errors has existed forever. Perhaps Tony will comment, as I was not involved
in a873dfe1032a.

> I can't actually get the patch to apply to anything. Can we please
> have a redo against current -linus?
--
Mike Kravetz