Message-ID: <6048a36d-6bb2-2e74-1d20-228f0486d965@huawei.com>
Date: Sun, 28 Apr 2024 10:24:46 +0800
From: Miaohe Lin <linmiaohe@...wei.com>
To: Jane Chu <jane.chu@...cle.com>
CC: <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
	<akpm@...ux-foundation.org>, <nao.horiguchi@...il.com>, <osalvador@...e.de>,
	Matthew Wilcox <willy@...radead.org>, Sidhartha Kumar
	<sidhartha.kumar@...cle.com>
Subject: Re: [PATCH] mm/memory-failure: remove shake_page()

On 2024/4/27 4:33, Jane Chu wrote:
> My apologies for the garbled message earlier.
> 
> On 4/26/2024 12:52 PM, Jane Chu wrote:
>> On 4/26/2024 12:05 PM, Matthew Wilcox wrote:
>> [..]
>>>> That would be unsafe, the safe way would be if we moved page_folio() after
>>>> the call to __get_hw_poison() in get_any_page() and there would still be one
>>>> remaining user of shake_page() that we can't convert. A safe version of this
>>>> patch would result in a removal of one use of PageHuge() and two uses of
>>>> put_page(), would that be worth submitting?
>>>>
>>>> get_any_page()
>>>>     if(__get_hwpoison_page())
>>>>         folio = page_folio() /* folio_try_get() returned 1, safe */
>>> I think we should convert __get_hwpoison_page() to return either the folio
>>> or an ERR_PTR or NULL.  Also, I think we should delete the "cannot catch
>>> tail" part and just loop in __get_hwpoison_page() until we do catch it.
>>> See try_get_folio() in mm/gup.c for inspiration (although you can't use
>>> it exactly because that code knows that the page is mapped into a page
>>> table, so has a refcount).
>>>
>>> But that's just an immediate assessment; you might find a reason that
>>> doesn't work.
>>
> Besides, in a possible hugetlb demote scenario, it seems to me that we should retry
> get_hwpoison_hugetlb_folio() instead of falling thru to folio_try_get().
> 
> static int __get_hwpoison_page(struct page *page, unsigned long flags)
> {
>         struct folio *folio = page_folio(page);
>         int ret = 0;
>         bool hugetlb = false;
> 
>         ret = get_hwpoison_hugetlb_folio(folio, &hugetlb, false);
>         if (hugetlb) {
>                 /* Make sure hugetlb demotion did not happen from under us. */
>                 if (folio == page_folio(page))
>                         return ret;
>                 if (ret > 0) {      <===== still hugetlb, don't fall thru, retry

I tend to agree that we should retry get_hwpoison_hugetlb_folio() here, because the folio is still hugetlb in this case.
The folio_try_get() call below won't do the right thing for it. This is on my TODO list, but since you brought it up,
please feel free to submit the corresponding patch.
Thanks.
.
