lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHbLzkp8oO9qvDN66_ALOqNrUDrzHH7RZc3G5GQ1pxz8qXJjqw@mail.gmail.com>
Date:   Wed, 6 Oct 2021 16:57:38 -0700
From:   Yang Shi <shy828301@...il.com>
To:     Peter Xu <peterx@...hat.com>
Cc:     HORIGUCHI NAOYA(堀口 直也) 
        <naoya.horiguchi@....com>, Hugh Dickins <hughd@...gle.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Matthew Wilcox <willy@...radead.org>,
        Oscar Salvador <osalvador@...e.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux MM <linux-mm@...ck.org>,
        Linux FS-devel Mailing List <linux-fsdevel@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [v3 PATCH 2/5] mm: filemap: check if THP has hwpoisoned subpage
 for PMD page fault

On Wed, Oct 6, 2021 at 1:15 PM Peter Xu <peterx@...hat.com> wrote:
>
> On Thu, Sep 30, 2021 at 02:53:08PM -0700, Yang Shi wrote:
> > @@ -1148,8 +1148,12 @@ static int __get_hwpoison_page(struct page *page)
> >               return -EBUSY;
> >
> >       if (get_page_unless_zero(head)) {
> > -             if (head == compound_head(page))
> > +             if (head == compound_head(page)) {
> > +                     if (PageTransHuge(head))
> > +                             SetPageHasHWPoisoned(head);
> > +
> >                       return 1;
> > +             }
> >
> >               pr_info("Memory failure: %#lx cannot catch tail\n",
> >                       page_to_pfn(page));
>
> Sorry for the late comments.
>
> I'm wondering whether it's ideal to set this bit here, as get_hwpoison_page()
> sounds like a pure helper to get a refcount out of a sane hwpoisoned page.  I'm
> afraid there can be side effect that we set this without being noticed, so I'm
> also wondering we should keep it in memory_failure().
>
> Quotting comments for get_hwpoison_page():
>
>  * get_hwpoison_page() takes a page refcount of an error page to handle memory
>  * error on it, after checking that the error page is in a well-defined state
>  * (defined as a page-type we can successfully handle the memor error on it,
>  * such as LRU page and hugetlb page).
>
> For example, I see that both unpoison_memory() and soft_offline_page() will
> call it too, does it mean that we'll also set the bits e.g. even when we want
> to inject an unpoison event too?

unpoison_memory() should be not a problem since it will just bail out
once THP is met as the comment says:

/*
* unpoison_memory() can encounter thp only when the thp is being
* worked by memory_failure() and the page lock is not held yet.
* In such case, we yield to memory_failure() and make unpoison fail.
*/


And I think we should set the flag for soft offline too, right? The
soft offline does set the hwpoison flag for the corrupted sub page and
doesn't split file THP, so it should be captured by page fault as
well. And yes for poison injection.

But your comment reminds me that get_hwpoison_page() is just called
when !MF_COUNT_INCREASED, so it means MADV_HWPOISON still could
escape. This needs to be covered too.

BTW, I did the test with MADV_HWPOISON, but I didn't test this change
(moving flag set after get_page_unless_zero()) since I thought it was
just a trivial change and did overlook this case.

>
> Thanks,
>
> --
> Peter Xu
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ