[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LSU.2.11.2103102214170.7159@eggly.anvils>
Date: Wed, 10 Mar 2021 23:22:18 -0800 (PST)
From: Hugh Dickins <hughd@...gle.com>
To: Naoya Horiguchi <nao.horiguchi@...il.com>
cc: Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...nel.org>,
Oscar Salvador <osalvador@...e.de>,
Tony Luck <tony.luck@...el.com>,
Matthew Wilcox <willy@...radead.org>,
"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
Naoya Horiguchi <naoya.horiguchi@....com>,
Jue Wang <juew@...gle.com>, Greg Thelen <gthelen@...gle.com>,
Hugh Dickins <hughd@...gle.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [PATCH v1] mm, hwpoison: enable error handling on shmem thp
On Tue, 9 Feb 2021, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <naoya.horiguchi@....com>
>
> Currently hwpoison code checks PageAnon() for thp and refuses to handle
> errors on non-anonymous thps (just for historical reason). We now
> support non-anonymou thp like shmem one, so this patch suggests to enable
> to handle shmem thps. Fortunately, we already have can_split_huge_page()
> to check if a give thp is splittable, so this patch relies on it.
Fortunately? I don't understand. Why call can_split_huge_page()
at all, instead of simply trying split_huge_page() directly?
And could it do better than -EBUSY when split_huge_page() fails?
>
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@....com>
Thanks for trying to add shmem+file THP support, but I think this
does not work as intended - Andrew, if Naoya agrees, please drop from
mmotm for now, the fixup needed will be more than a line or two.
I'm not much into memory-failure myself, but Jue discovered that the
SIGBUS never arrives: because split_huge_page() on a shmem or file
THP unmaps all its pmds and ptes, and (unlike with anon) leaves them
unmapped - in normal circumstances, to be faulted back on demand.
So the page_mapped() check in hwpoison_user_mappings() fails,
and the intended SIGBUS is not delivered.
(Or, is it acceptable that the SIGBUS is not delivered to those who
have the huge page mapped: should it get delivered later, to anyone
who faults back in the bad 4k?)
We believe the tokill list has to be set up earlier, before
split_huge_page() is called, then passed in to hwpoison_user_mappings().
Sorry, we don't have a proper patch for that right now, but I expect
you can see what needs to be done. But something we found on the way,
we do have a patch for: add_to_kill() uses page_address_in_vma(), but
that has not been used on file THP tails before - fix appended at the
end below, so as not to waste your time on that bit.
> ---
> mm/memory-failure.c | 34 +++++++++-------------------------
> 1 file changed, 9 insertions(+), 25 deletions(-)
>
> diff --git v5.11-rc6/mm/memory-failure.c v5.11-rc6_patched/mm/memory-failure.c
> index e9481632fcd1..8c1575de0507 100644
> --- v5.11-rc6/mm/memory-failure.c
> +++ v5.11-rc6_patched/mm/memory-failure.c
> @@ -956,20 +956,6 @@ static int __get_hwpoison_page(struct page *page)
> {
> struct page *head = compound_head(page);
>
> - if (!PageHuge(head) && PageTransHuge(head)) {
> - /*
> - * Non anonymous thp exists only in allocation/free time. We
> - * can't handle such a case correctly, so let's give it up.
> - * This should be better than triggering BUG_ON when kernel
> - * tries to touch the "partially handled" page.
> - */
> - if (!PageAnon(head)) {
> - pr_err("Memory failure: %#lx: non anonymous thp\n",
> - page_to_pfn(page));
> - return 0;
> - }
> - }
> -
> if (get_page_unless_zero(head)) {
> if (head == compound_head(page))
> return 1;
> @@ -1197,21 +1183,19 @@ static int identify_page_state(unsigned long pfn, struct page *p,
>
> static int try_to_split_thp_page(struct page *page, const char *msg)
> {
> - lock_page(page);
> - if (!PageAnon(page) || unlikely(split_huge_page(page))) {
> - unsigned long pfn = page_to_pfn(page);
> + struct page *head;
>
> + lock_page(page);
> + head = compound_head(page);
> + if (PageTransHuge(head) && can_split_huge_page(head, NULL) &&
> + !split_huge_page(page)) {
> unlock_page(page);
> - if (!PageAnon(page))
> - pr_info("%s: %#lx: non anonymous thp\n", msg, pfn);
> - else
> - pr_info("%s: %#lx: thp split failed\n", msg, pfn);
> - put_page(page);
> - return -EBUSY;
> + return 0;
> }
> + pr_info("%s: %#lx: thp split failed\n", msg, page_to_pfn(page));
> unlock_page(page);
> -
> - return 0;
> + put_page(page);
> + return -EBUSY;
> }
>
> static int memory_failure_hugetlb(unsigned long pfn, int flags)
> --
> 2.25.1
[PATCH] mm: fix page_address_in_vma() on file THP tails
From: Jue Wang <juew@...gle.com>
Anon THP tails were already supported, but memory-failure now needs to use
page_address_in_vma() on file THP tails, which its page->mapping check did
not permit: fix it.
Signed-off-by: Jue Wang <juew@...gle.com>
Signed-off-by: Hugh Dickins <hughd@...gle.com>
---
mm/rmap.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--- 5.12-rc2/mm/rmap.c 2021-02-28 16:58:57.950450151 -0800
+++ linux/mm/rmap.c 2021-03-10 20:29:21.591475177 -0800
@@ -717,11 +717,11 @@ unsigned long page_address_in_vma(struct
if (!vma->anon_vma || !page__anon_vma ||
vma->anon_vma->root != page__anon_vma->root)
return -EFAULT;
- } else if (page->mapping) {
- if (!vma->vm_file || vma->vm_file->f_mapping != page->mapping)
- return -EFAULT;
- } else
+ } else if (!vma->vm_file) {
+ return -EFAULT;
+ } else if (vma->vm_file->f_mapping != compound_head(page)->mapping) {
return -EFAULT;
+ }
address = __vma_address(page, vma);
if (unlikely(address < vma->vm_start || address >= vma->vm_end))
return -EFAULT;
Powered by blists - more mailing lists