Message-ID: <58895de6-58e5-4cb5-b2b3-4a66283908a8@linux.alibaba.com>
Date: Mon, 22 Sep 2025 11:09:53 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Vernon Yang <vernon2gm@...il.com>
Cc: hughd@...gle.com, akpm@...ux-foundation.org, da.gomez@...sung.com,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 Vernon Yang <yanglincheng@...inos.cn>
Subject: Re: [PATCH] mm: shmem: fix too little space for tmpfs only fallback
 4KB



On 2025/9/22 10:51, Vernon Yang wrote:
> On Mon, Sep 22, 2025 at 09:46:53AM +0800, Baolin Wang wrote:
>>
>>
>> On 2025/9/9 20:29, Vernon Yang wrote:
>>>
>>>
>>>> On Sep 9, 2025, at 13:58, Baolin Wang <baolin.wang@...ux.alibaba.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2025/9/8 20:31, Vernon Yang wrote:
>>>>> From: Vernon Yang <yanglincheng@...inos.cn>
>>>>> When system memory is sufficient, memory allocation always succeeds,
>>>>> but when the tmpfs size is small (e.g. 1MB), the allocation falls back
>>>>> directly from 2MB to 4KB, and the intermediate granularities
>>>>> (8KB ~ 1024KB) are never tried.
>>>>> Therefore, add a check of whether the remaining tmpfs space is
>>>>> sufficient for the allocation. If too little space is left, try a
>>>>> smaller large folio.
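>>>>> A rough sketch of the idea (hypothetical pseudocode; the
>>>>> shmem_free_blocks() helper is made up for illustration and is not
>>>>> part of the actual patch):
>>>>>
>>>>>     /* before trying an order, skip any order that cannot fit
>>>>>        in the remaining tmpfs space */
>>>>>     while (orders && (1UL << order) > shmem_free_blocks(inode))
>>>>>         order = next_order(&orders, order);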
>>>>
>>>> I don't think so.
>>>>
>>>> For a tmpfs mount with 'huge=within_size' and 'size=1M', if you try to
>>>> write 1M of data, it will allocate an order-8 large folio and will not
>>>> fall back to order 0.
>>>>
>>>> For a tmpfs mount with 'huge=always' and 'size=1M', if you try to write
>>>> 1M of data, it will not completely fall back to order 0 either; instead,
>>>> it will still allocate some order-1 to order-7 large folios.
>>>>
>>>> I'm not sure if this is your actual user scenario. If your files are small and you are concerned about not getting large folio allocations, I recommend using the 'huge=within_size' mount option.
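>>>>
>>>> For example (the mount point is hypothetical; assumes a kernel with
>>>> tmpfs large folio support):
>>>>
>>>> $ mount -t tmpfs -o size=1M,huge=within_size tmpfs /mnt/test
>>>> $ dd if=/dev/zero of=/mnt/test/file bs=1M count=1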
>>>>
>>>
>>> No, this is not my user scenario.
>>>
>>> Based on your previous patch [1], this scenario can be easily reproduced as
>>> follows.
>>>
>>> $ mount -t tmpfs -o size=1024K,huge=always tmpfs /xxx/test
>>> $ echo hello > /xxx/test/README
>>> $ df -h
>>> tmpfs            1.0M  4.0K 1020K   1% /xxx/test
>>>
>>> The code logic is as follows:
>>>
>>> shmem_get_folio_gfp()
>>>       orders = shmem_allowable_huge_orders()
>>>       shmem_alloc_and_add_folio(orders) return -ENOSPC;
>>>           shmem_alloc_folio() alloc 2MB
>>>           shmem_inode_acct_blocks()
>>>               percpu_counter_limited_add() goto unacct;
>>>           filemap_remove_folio()
>>>       shmem_alloc_and_add_folio(order = 0)
>>>
>>>
>>> As long as the remaining tmpfs space is too small and the system can
>>> allocate a 2MB folio, the above path is triggered.
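>>>
>>> A simplified sketch of that path (paraphrased, not the exact
>>> mm/shmem.c code): the per-order loop only retries on allocation
>>> failure, while an -ENOSPC from block accounting is returned directly,
>>> so the caller falls straight back to order 0.
>>>
>>>     while (orders) {
>>>         folio = shmem_alloc_folio(gfp, order, info, index);
>>>         if (folio)
>>>             break;      /* only allocation failure tries a lower order */
>>>         order = next_order(&orders, order);
>>>     }
>>>     ...
>>>     if (shmem_inode_acct_blocks(inode, pages))
>>>         goto unacct;    /* -ENOSPC: no retry at a smaller order */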
>>
>> In your scenario, wouldn't allocating 4K be more reasonable? Using a 1M
>> large folio would waste memory. Moreover, if you want to use a large folio,
>> I think you could increase the 'size' mount option. To me, this doesn't seem
>> like a real-world usage scenario; instead, it looks more like a contrived
>> test case for a specific situation.
> 
> The previous example is just an easy way to reproduce the issue; if
> someone hits this case in the real world, the best fix is of course to
> increase the 'size'.
> 
> But the scenario I want to describe here is that once the tmpfs free
> space is *consumed* down to less than 2MB, only 4KB folios will be
> allocated. Imagine a tmpfs that is constantly being consumed while
> someone else is reclaiming or freeing space, so that the free space
> often stays in the range [0, 2MB); such a tmpfs will then only ever
> allocate 4KB folios.
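>
> For example (a contrived demo; paths and sizes are hypothetical):
>
> $ mount -t tmpfs -o size=100M,huge=always tmpfs /mnt/test
> $ fallocate -l 99M /mnt/test/big    # free space drops below 2MB
> $ echo hello > /mnt/test/small      # only 4KB is tried, not 8KB~1MB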

Please increase your 'size' mount option for testing. I don't see why we 
need to add more such logic without a solid reason.

Andrew, please drop this patch.
