Message-ID: <20220608061635.GA1413099@hori.linux.bs1.fc.nec.co.jp>
Date: Wed, 8 Jun 2022 06:16:36 +0000
From: HORIGUCHI NAOYA(堀口 直也)
<naoya.horiguchi@....com>
To: Miaohe Lin <linmiaohe@...wei.com>
CC: Naoya Horiguchi <naoya.horiguchi@...ux.dev>,
Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...hat.com>,
Mike Kravetz <mike.kravetz@...cle.com>,
Liu Shixin <liushixin2@...wei.com>,
Yang Shi <shy828301@...il.com>,
Oscar Salvador <osalvador@...e.de>,
Muchun Song <songmuchun@...edance.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>
Subject: Re: [PATCH v1 5/5] mm, hwpoison: enable memory error handling on 1GB
hugepage
On Tue, Jun 07, 2022 at 10:11:24PM +0800, Miaohe Lin wrote:
> On 2022/6/2 13:06, Naoya Horiguchi wrote:
> > From: Naoya Horiguchi <naoya.horiguchi@....com>
> >
> > Now error handling code is prepared, so remove the blocking code and
> > enable memory error handling on 1GB hugepage.
> >
>
> I'm nervous about this change. It seems there are many code paths that are not aware of pud swap entries.
> I browsed some of them:
> apply_to_pud_range called from apply_to_page_range:
>
> apply_to_pud_range:
>     next = pud_addr_end(addr, end);
>     if (pud_none(*pud) && !create)
>         continue;
>     if (WARN_ON_ONCE(pud_leaf(*pud)))
>         return -EINVAL;
>     if (!pud_none(*pud) && WARN_ON_ONCE(pud_bad(*pud))) {
>         if (!create)
>             continue;
>         pud_clear_bad(pud);
>     }
>     err = apply_to_pmd_range(mm, pud, addr, next,
>                              fn, data, create, mask);
>
> In the !pud_present case, it will mostly reach apply_to_pmd_range and call pmd_offset on it, and an
> invalid pointer will be dereferenced.

apply_to_pmd_range() has BUG_ON(pud_huge(*pud)) and apply_to_pte_range() has
BUG_ON(pmd_huge(*pmd)), so this page table walking code does not seem to
expect to handle pmd/pud-level mappings.
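
To illustrate the concern, here is a minimal userspace sketch (not kernel code).
The entry encoding and every toy_* name are hypothetical and exist only for this
example. It shows how a walker that checks only "none" and "leaf" would treat a
non-present, swap-like entry as a pointer to a lower-level table, and how an
explicit presence check separates the four cases.

```c
#include <assert.h>   /* used by the guard in toy_descend() */
#include <stdbool.h>
#include <stdint.h>

/* Toy entry encoding; this is NOT the kernel's real layout (assumption). */
#define TOY_PRESENT (UINT64_C(1) << 0)  /* entry is valid for the MMU */
#define TOY_LEAF    (UINT64_C(1) << 7)  /* entry maps a huge page directly */

typedef uint64_t toy_pud_t;

enum toy_pud_kind {
    TOY_PUD_NONE,        /* empty slot */
    TOY_PUD_TABLE,       /* present, points to a lower-level table */
    TOY_PUD_LEAF,        /* present, maps a huge page */
    TOY_PUD_NONPRESENT,  /* non-zero but not present (swap/hwpoison-like) */
};

static bool toy_pud_none(toy_pud_t pud)    { return pud == 0; }
static bool toy_pud_present(toy_pud_t pud) { return (pud & TOY_PRESENT) != 0; }

static bool toy_pud_leaf(toy_pud_t pud)
{
    return (pud & (TOY_PRESENT | TOY_LEAF)) == (TOY_PRESENT | TOY_LEAF);
}

/* Classify an entry, testing presence before anything else. */
static enum toy_pud_kind toy_classify_pud(toy_pud_t pud)
{
    if (toy_pud_none(pud))
        return TOY_PUD_NONE;
    if (!toy_pud_present(pud))
        return TOY_PUD_NONPRESENT;
    if (toy_pud_leaf(pud))
        return TOY_PUD_LEAF;
    return TOY_PUD_TABLE;
}

/* A walker that tests only none and leaf treats everything else as a table. */
static bool toy_naive_would_descend(toy_pud_t pud)
{
    return !toy_pud_none(pud) && !toy_pud_leaf(pud);
}

/* Descending is only safe when the entry really points to a table. */
static bool toy_safe_to_descend(toy_pud_t pud)
{
    return toy_classify_pud(pud) == TOY_PUD_TABLE;
}

/* Toy analogue of a BUG_ON-style guard placed before descending. */
static void toy_descend(toy_pud_t pud)
{
    assert(toy_safe_to_descend(pud));
    /* a real walker would compute the lower-level table address here */
}
```

In this model, an entry such as 0x1000 (non-zero, present bit clear) passes the
naive check but is classified as TOY_PUD_NONPRESENT, so the guarded walker does
not descend into it.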
>
> Another example might be copy_pud_range and so on. So I think it might not be prepared to enable the
> 1GB hugepage or all of these places should be fixed?

I think that most page table walkers for user address space should first
check is_vm_hugetlb_page() and call hugetlb-specific walking code for vmas
with VM_HUGETLB.

copy_page_range() is a good example. It calls copy_hugetlb_page_range()
for a vma with VM_HUGETLB, and that function should support hwpoison entries.
But I feel that I need to test this to confirm.
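
The dispatch pattern described above can be sketched as follows. This is a
userspace toy model, not the kernel implementation: the struct, the flag value
and all toy_* functions are hypothetical stand-ins, and the two copy helpers
only report which path was taken.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy flag value for illustration; not the kernel's VM_HUGETLB definition. */
#define TOY_VM_HUGETLB 0x1UL

/* Hypothetical, stripped-down stand-in for a vma. */
struct toy_vma {
    unsigned long vm_flags;
};

enum toy_copy_path {
    TOY_COPY_GENERIC,  /* regular pgd/p4d/pud/pmd/pte walk */
    TOY_COPY_HUGETLB,  /* hugetlb-specific walk */
};

/* Toy counterpart of is_vm_hugetlb_page(): test the hugetlb flag on the vma. */
static bool toy_is_vm_hugetlb_page(const struct toy_vma *vma)
{
    return (vma->vm_flags & TOY_VM_HUGETLB) != 0;
}

/* Stand-in for the hugetlb walker, which is where non-present
 * (e.g. hwpoison) entries would need to be handled. */
static enum toy_copy_path toy_copy_hugetlb_page_range(const struct toy_vma *vma)
{
    (void)vma;
    return TOY_COPY_HUGETLB;
}

/* Stand-in for the generic walker, which never sees hugetlb vmas here. */
static enum toy_copy_path toy_copy_generic_range(const struct toy_vma *vma)
{
    (void)vma;
    return TOY_COPY_GENERIC;
}

/* Check the vma type first and dispatch before any page table walking. */
static enum toy_copy_path toy_copy_page_range(const struct toy_vma *vma)
{
    if (toy_is_vm_hugetlb_page(vma))
        return toy_copy_hugetlb_page_range(vma);
    return toy_copy_generic_range(vma);
}
```

With this shape, only the hugetlb-specific path has to understand non-present
huge entries, and the generic walker is never handed a hugetlb vma.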

And I'm not sure that all the other walkers are prepared for non-present
pud mappings, so I'll need to do some code inspection and testing for each.

Thanks,
Naoya Horiguchi