linux-kernel - Re: [PATCH 4/8] mm/memory-failure.c: remove unneeded orig

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <9e6c4031-18eb-c4fb-d30c-3fa5767f07f8@huawei.com>
Date:   Tue, 15 Feb 2022 17:28:32 +0800
From:   Miaohe Lin <linmiaohe@...wei.com>
To:     HORIGUCHI NAOYA(堀口 直也) 
        <naoya.horiguchi@....com>
CC:     Naoya Horiguchi <naoya.horiguchi@...ux.dev>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 4/8] mm/memory-failure.c: remove unneeded orig_head

On 2022/2/15 16:48, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Tue, Feb 15, 2022 at 11:14:07AM +0800, Miaohe Lin wrote:
>> On 2022/2/14 22:50, Naoya Horiguchi wrote:
>>> On Thu, Feb 10, 2022 at 10:17:29PM +0800, Miaohe Lin wrote:
>>>> orig_head is used to check whether the page have changed compound pages
>>>> during the locking. But it's always equal to hpage. So we can use hpage
>>>> directly and remove this redundant one.
>>>>
>>>> Signed-off-by: Miaohe Lin <linmiaohe@...wei.com>
>>>> ---
>>>>  mm/memory-failure.c | 5 ++---
>>>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>>> index 2dd7f35ee65a..4370c2f407c5 100644
>>>> --- a/mm/memory-failure.c
>>>> +++ b/mm/memory-failure.c
>>>> @@ -1691,7 +1691,6 @@ int memory_failure(unsigned long pfn, int flags)
>>>>  {
>>>>  	struct page *p;
>>>>  	struct page *hpage;
>>>> -	struct page *orig_head;
>>>>  	struct dev_pagemap *pgmap;
>>>>  	int res = 0;
>>>>  	unsigned long page_flags;
>>>> @@ -1737,7 +1736,7 @@ int memory_failure(unsigned long pfn, int flags)
>>>>  		goto unlock_mutex;
>>>>  	}
>>>>
>>>> -	orig_head = hpage = compound_head(p);
>>>> +	hpage = compound_head(p);
>>>>  	num_poisoned_pages_inc();
>>>>
>>>>  	/*
>>>> @@ -1821,7 +1820,7 @@ int memory_failure(unsigned long pfn, int flags)
>>>>  	 * The page could have changed compound pages during the locking.
>>>>  	 * If this happens just bail out.
>>>>  	 */
>>>> -	if (PageCompound(p) && compound_head(p) != orig_head) {
>>>> +	if (PageCompound(p) && compound_head(p) != hpage) {
>>>
>>> I think that this if-check was intended to detect the case that page p
>>> belongs to a thp when memory_failure() is called and belongs to a compound
>>> page in different size (like slab or some driver page) after the thp is
>>> split.  But your suggestion makes me aware that the page p could be embedded
>>> on a thp again after thp split.  I think this might be rare, but if it
>>
>> IIUC, this page can't be embedded on a thp again after thp split because memory_failure hold
>> an __extra__ page refcnt. I think there exist below race windows:
>>
>> memory_failure
>>   orig_head = hpage = compound_head(p); -- page is Non-compound yet
>>   < -- Page becomes compound page, like thp, slab, some driver page and even hugetlb page -- >
>>   get_hwpoison_page
>>   failed to split thp page, as hpage is Non-compound ...
>>   lock_page
>>
>>> happens the current if-check (or suggested one) cannot detect it.
>>> So I feel that simply dropping compound_head() check might be better?
>>>
>>> -	if (PageCompound(p) && compound_head(p) != orig_head) {
>>> +	if (PageCompound(p)) {
>>
>> However this change could also catch the above race correctly. In fact, we can't handle
>> compound page here. But is it enough to just return -EBUSY here as it's really rare or
>> we should do more things (like split thp, retry if in PageHuge case)?
> 
> Hmm, both could make sense and hard to judge to me, so it's upto you.
> We already have goto label "try_again" so retrying might not be so surprising.
> 

try_again sounds a good idea. Will send a V2. Many thanks.

> Thanks,
> Naoya Horiguchi
>