lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <8b4603f3-19c7-4979-ae4a-ef99690562d1@linux.alibaba.com>
Date: Fri, 4 Jul 2025 10:35:43 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: David Hildenbrand <david@...hat.com>, akpm@...ux-foundation.org,
 hughd@...gle.com
Cc: ziy@...dia.com, lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com,
 npache@...hat.com, ryan.roberts@....com, dev.jain@....com,
 baohua@...nel.org, vbabka@...e.cz, rppt@...nel.org, surenb@...gle.com,
 mhocko@...e.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: support large mapping building for tmpfs



On 2025/7/2 19:55, David Hildenbrand wrote:
> On 02.07.25 13:38, David Hildenbrand wrote:
>>
>>>> So by mapping more in a single page fault, you end up increasing "RSS".
>>>> But I wouldn't
>>>> call that "expected". I rather suspect that nobody will really care :)
>>>
>>> But tmpfs is a little special here. It uses the 'huge=' option to
>>> control large folio allocation. So, I think users should know they want
>>> to use large folios and build the whole mapping for the large folios.
>>> That is why I call it 'expected'.
>>
>> Well, if your distribution decides to set huge= on /tmp or something
>> like that, your application might have very little saying in that, 
>> right? :)
>>
>> Again, I assume it's fine, but we might find surprises on the way.
>>
>>>>
>>>> The thing is, when you *allocate* a new folio, it must adhere at 
>>>> least to
>>>> pagecache alignment (e.g., cannot place an order-2 folio at pgoff 1) --
>>>
>>> Yes, agree.
>>>
>>>> that is what
>>>> thp_vma_suitable_order() checks. Otherwise you cannot add it to the
>>>> pagecache.
>>>
>>> But this alignment is not done by thp_vma_suitable_order().
>>>
>>> For tmpfs, it will check the alignment in shmem_suitable_orders() via:
>>> "
>>>     if (!xa_find(&mapping->i_pages, &aligned_index,
>>>             aligned_index + pages - 1, XA_PRESENT))
>>> "
>>
>> That's not really alignment check, that's just checking whether a
>> suitable folio order spans already-present entries, no?
>>
>> Finding suitable orders is still up to other code IIUC.
>>
>>>
>>> For other fs systems, it will check the alignment in
>>> __filemap_get_folio() via:
>>> "
>>>     /* If we're not aligned, allocate a smaller folio */
>>>     if (index & ((1UL << order) - 1))
>>>         order = __ffs(index);
>>> "
>>>
>>>> But once you *obtain* a folio from the pagecache and are supposed to 
>>>> map it
>>>> into the page tables, that must already hold true.
>>>>
>>>> So you should be able to just blindly map whatever is given to you here
>>>> AFAIKS.
>>>>
>>>> If you would get a pagecache folio that violates the linear page offset
>>>> requirement
>>>> at that point, something else would have messed up the pagecache.
>>>
>>> Yes. But the comments from thp_vma_suitable_order() is not about the
>>> pagecache alignment, it says "the order-aligned addresses in the VMA map
>>> to order-aligned offsets within the file",
>>
>> Let's dig, it's confusing.
>>
>> The code in question is:
>>
>> if (!IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
>>         hpage_size >> PAGE_SHIFT))
>>
>> So yes, I think this tells us: if we would have a PMD THP in the
>> pagecache, would we be able to map it with a PMD. If not, then don't
>> bother with allocating a PMD THP.
>>
>> Of course, this also applies to other orders, but for PMD THPs it's
>> probably most relevant: if we cannot even map it through a PMD, then
>> probably it could be a wasted THP.
>>
>> So yes, I agree: if we are both no missing something, then this
>> primarily relevant for the PMD case.
>>
>> And it's more about "optimization" than "correctness" I guess?
> 
> Correction: only if a caller doesn't assume that this is an implicit 
> pagecache alignment check. Not sure if that might be the case for shmem 
> when it calls thp_vma_suitable_order() with a VMA ...

I am sure shmem will not use thp_vma_suitable_order() for pagecache 
alignment checks, because shmem has explicit code for pagecache 
alignment checks.

Adding thp_vma_suitable_order() in shmem is more about following the 
allocation logic of anonymous pages. I also think the 'IS_ALIGNED()' 
check in thp_vma_suitable_order() for shmem is more about 'optimization' 
rather than 'correction'. Anyway, I will take another look at shmem's 
checking logic.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ