linux-kernel - Re: [PATCH v9 04/10] mm: thp: Support allocation of anonymous multi-size THP

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <8e2a5b4c-ba3a-4dcd-8aae-e5d3170d048a@arm.com>
Date:   Thu, 14 Dec 2023 12:12:05 +0000
From:   Ryan Roberts <ryan.roberts@....com>
To:     Dan Carpenter <dan.carpenter@...aro.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Matthew Wilcox <willy@...radead.org>,
        Yin Fengwei <fengwei.yin@...el.com>,
        David Hildenbrand <david@...hat.com>,
        Yu Zhao <yuzhao@...gle.com>,
        Catalin Marinas <catalin.marinas@....com>,
        Anshuman Khandual <anshuman.khandual@....com>,
        Yang Shi <shy828301@...il.com>,
        "Huang, Ying" <ying.huang@...el.com>, Zi Yan <ziy@...dia.com>,
        Luis Chamberlain <mcgrof@...nel.org>,
        Itaru Kitayama <itaru.kitayama@...il.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        John Hubbard <jhubbard@...dia.com>,
        David Rientjes <rientjes@...gle.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Hugh Dickins <hughd@...gle.com>,
        Kefeng Wang <wangkefeng.wang@...wei.com>,
        Barry Song <21cnbao@...il.com>,
        Alistair Popple <apopple@...dia.com>, linux-mm@...ck.org,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v9 04/10] mm: thp: Support allocation of anonymous
 multi-size THP

On 14/12/2023 11:30, Dan Carpenter wrote:
> On Thu, Dec 14, 2023 at 10:54:19AM +0000, Ryan Roberts wrote:
>> On 13/12/2023 07:21, Dan Carpenter wrote:
>>> On Thu, Dec 07, 2023 at 04:12:05PM +0000, Ryan Roberts wrote:
>>>> @@ -4176,10 +4260,15 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>>>>  	/* Allocate our own private page. */
>>>>  	if (unlikely(anon_vma_prepare(vma)))
>>>>  		goto oom;
>>>> -	folio = vma_alloc_zeroed_movable_folio(vma, vmf->address);
>>>> +	folio = alloc_anon_folio(vmf);
>>>> +	if (IS_ERR(folio))
>>>> +		return 0;
>>>>  	if (!folio)
>>>>  		goto oom;
>>>
>>> Returning zero is weird.  I think it should be a vm_fault_t code.
>>
>> It's the same pattern that the existing code a little further down this function
>> already implements:
>>
>> 	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl);
>> 	if (!vmf->pte)
>> 		goto release;
>>
>> If we fail to map/lock the pte (due to a race), then we return 0 to allow user
>> space to rerun the faulting instruction and cause the fault to happen again. The
>> above code ends up calling "return ret;" and ret is 0.
>>
> 
> Ah, okay.  Thanks!
> 
>>>
>>> This mixing of error pointers and NULL is going to cause problems.
>>> Normally when we have a mix of error pointers and NULL then the NULL is
>>> not an error but instead means that the feature has been deliberately
>>> turned off.  I'm unable to figure out what the meaning is here.
>>
>> There are 3 conditions that the function can return:
>>
>>  - folio successfully allocated
>>  - folio failed to be allocated due to OOM
>>  - fault needs to be tried again due to losing race
>>
>> Previously only the first 2 conditions were possible and they were indicated by
>> NULL/not-NULL. The new 3rd condition is only possible when THP is compile-time
>> enabled. So it keeps the logic simpler to keep the NULL/not-NULL distinction for
>> the first 2, and use the error code for the final one.
>>
>> There are IS_ERR() and IS_ERR_OR_NULL() variants so I assume a pattern where you
>> can have pointer, error or NULL is somewhat common already?
> 
> People are confused by this a lot so I have written a blog about it:
> 
> https://staticthinking.wordpress.com/2022/08/01/mixing-error-pointers-and-null/

Nice; thanks for the pointer :)

> 
> The IS_ERR_OR_NULL() function should be used like this:
> 
> int blink_leds()
> {
> 	led = get_leds();
> 	if (IS_ERR_OR_NULL(led))
> 		return PTR_ERR(led);  <-- NULL means zero/success
> 	return led->blink();
> }
> 
> In the case of alloc_anon_folio(), I would be tempted to create a
> wrapper around it where NULL becomes ERR_PTR(-ENOMEM).  But this is
> obviously fast path code and I haven't benchmarked it.
> 
> Adding a comment is the other option.

I'll add a comment; as you say this is a fast path, and I'm actively being
burned in similar places (on another series I'm working on) where an additional
check is regressing performance significantly so not keen on risking it here.

Andrew, I'll fold in the David's suggested ifdef improvement at the same time.
Would you prefer an additional patch to squash in, or a whole new version of the
series to swap out with the existing patches in mm-unstable?

> 
> regards,
> dan carpenter
>