lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 17 Jun 2023 15:59:27 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Jiaqi Yan <jiaqiyan@...gle.com>
Cc:     Naoya Horiguchi <naoya.horiguchi@...ux.dev>,
        HORIGUCHI NAOYA(堀口 直也) 
        <naoya.horiguchi@....com>,
        "songmuchun@...edance.com" <songmuchun@...edance.com>,
        "shy828301@...il.com" <shy828301@...il.com>,
        "linmiaohe@...wei.com" <linmiaohe@...wei.com>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "duenwen@...gle.com" <duenwen@...gle.com>,
        "axelrasmussen@...gle.com" <axelrasmussen@...gle.com>,
        "jthoughton@...gle.com" <jthoughton@...gle.com>
Subject: Re: [PATCH v1 1/3] mm/hwpoison: find subpage in hugetlb HWPOISON list

On 06/16/23 19:18, Jiaqi Yan wrote:
> On Fri, Jun 16, 2023 at 4:35 PM Mike Kravetz <mike.kravetz@...cle.com> wrote:
> > On 06/16/23 14:19, Jiaqi Yan wrote:
> > >
> > > Now looking again this, I think concurrent adding and deleting are
> > > fine with each other and with themselves, because raw_hwp_list is
> > > lock-less llist.
> >
> > Correct.
> >
> > > As for synchronizing traversal with adding and deleting, I wonder is
> > > it a good idea to make __update_and_free_hugetlb_folio hold
> > > hugetlb_lock before it folio_clear_hugetlb_hwpoison(which traverse +
> > > delete raw_hwp_list)? In hugetlb, get_huge_page_for_hwpoison already
> > > takes hugetlb_lock; it seems to me __update_and_free_hugetlb_folio is
> > > missing the lock.
> >
> > I do not think the lock is needed.  However, while looking more closely
> > at this I think I discovered another issue.
> > This is VERY subtle.
> > Perhaps Naoya can help verify if my reasoning below is correct.
> >
> > In __update_and_free_hugetlb_folio we are not operating on a hugetlb page.
> > Why is this?
> > Before calling update_and_free_hugetlb_folio we call remove_hugetlb_folio.
> > The purpose of remove_hugetlb_folio is to remove the huge page from the
> > list AND compound page destructor indicating this is a hugetlb page is changed.
> > This is all done while holding the hugetlb lock.  So, the test for
> > folio_test_hugetlb(folio) is false.
> >
> > We have technically a compound non-hugetlb page with a non-null raw_hwp_list.
> >
> > Important note: at this time we have not reallocated vmemmap pages if
> > hugetlb page was vmemmap optimized.  That is done later in
> > __update_and_free_hugetlb_folio.
> 
> 
> >
> > The 'good news' is that after this point get_huge_page_for_hwpoison will
> > not recognize this as a hugetlb page, so nothing will be added to the
> > list.  There is no need to worry about entries being added to the list
> > during traversal.
> >
> > The 'bad news' is that if we get a memory error at this time we will
> > treat it as a memory error on a regular compound page.  So,
> > TestSetPageHWPoison(p) in memory_failure() may try to write a read only
> > struct page. :(
> 
> At least I think this is an issue.
> 
> Would it help if dissolve_free_huge_page doesn't unlock hugetlb_lock
> until update_and_free_hugetlb_folio is done, or basically until
> dissolve_free_huge_page is done?

Unfortunately, update_and_free_hugetlb_folio is designed to be called
without locks held.  This is because we can not hold any locks while
allocating vmemmap pages.

I'll try to think of some way to restructure the code.  IIUC, this is a
potential general issue, not just isolated to memory error handling.
-- 
Mike Kravetz

> 
> TestSetPageHWPoison in memory_failure is called after
> try_memory_failure_hugetlb, and folio_test_hugetlb is tested within
> __get_huge_page_for_hwpoison, which is wrapped by the hugetlb_lock. So
> by the time dissolve_free_huge_page returns, subpages already go
> through hugetlb_vmemmap_restore and __destroy_compound_gigantic_folio
> and become non-compound raw pages (folios). Now
> folio_test_hugetlb(p)=false will be correct for memory_failure, and it
> can recover p as a dissolved non-hugetlb page.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ