Message-ID: <e7b276eb-960a-4e05-9f84-6152de9ac2ea@linux.alibaba.com>
Date: Fri, 7 Feb 2025 15:23:54 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Lance Yang <ioworker0@...il.com>
Cc: "Alex Xu (Hello71)" <alex_y_xu@...oo.ca>, linux-mm@...ck.org,
Daniel Gomez <da.gomez@...sung.com>, Barry Song <baohua@...nel.org>,
David Hildenbrand <david@...hat.com>, Hugh Dickins <hughd@...gle.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>,
Matthew Wilcox <willy@...radead.org>, Ryan Roberts <ryan.roberts@....com>,
linux-kernel@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: Hang when swapping huge=within_size tmpfs from zram

On 2025/2/5 22:39, Lance Yang wrote:
> On Wed, Feb 5, 2025 at 2:38 PM Baolin Wang
> <baolin.wang@...ux.alibaba.com> wrote:
>>
>>
>>
>> On 2025/2/5 09:55, Baolin Wang wrote:
>>> Hi Alex,
>>>
>>> On 2025/2/5 09:23, Alex Xu (Hello71) wrote:
>>>> Hi all,
>>>>
>>>> On 6.14-rc1, I found that creating a lot of files in tmpfs and then
>>>> deleting them reliably hangs when tmpfs is mounted with huge=within_size
>>>> and it is swapped out to zram (zstd/zsmalloc/no backing dev). I bisected
>>>> this to acd7ccb284b "mm: shmem: add large folio support for tmpfs".
>>>>
>>>> When the issue occurs, rm uses 100% CPU, cannot be killed, and has no
>>>> output in /proc/pid/stack or wchan. Eventually, an RCU stall is
>>>> detected:
>>>
>>> Thanks for your report. Let me try to reproduce the issue locally and
>>> see what happens.
>>>
>>>> rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
>>>> rcu: Tasks blocked on level-0 rcu_node (CPUs 0-11): P25160
>>>> rcu: (detected by 10, t=2102 jiffies, g=532677, q=4997 ncpus=12)
>>>> task:rm state:R running task stack:0 pid:25160
>>>> tgid:25160 ppid:24309 task_flags:0x400000 flags:0x00004004
>>>> Call Trace:
>>>> <TASK>
>>>> ? __schedule+0x388/0x1000
>>>> ? kmem_cache_free.part.0+0x23d/0x280
>>>> ? sysvec_apic_timer_interrupt+0xa/0x80
>>>> ? asm_sysvec_apic_timer_interrupt+0x16/0x20
>>>> ? xas_load+0x12/0xc0
>>>> ? xas_load+0x8/0xc0
>>>> ? xas_find+0x144/0x190
>>>> ? find_lock_entries+0x75/0x260
>>>> ? shmem_undo_range+0xe6/0x5f0
>>>> ? shmem_evict_inode+0xe4/0x230
>>>> ? mtree_erase+0x7e/0xe0
>>>> ? inode_set_ctime_current+0x2e/0x1f0
>>>> ? evict+0xe9/0x260
>>>> ? _atomic_dec_and_lock+0x31/0x50
>>>> ? do_unlinkat+0x270/0x2b0
>>>> ? __x64_sys_unlinkat+0x30/0x50
>>>> ? do_syscall_64+0x37/0xe0
>>>> ? entry_SYSCALL_64_after_hwframe+0x50/0x58
>>>> </TASK>
>>>>
>>>> Let me know what information is needed to further troubleshoot this
>>>> issue.
>>
>> Sorry, I can't reproduce this issue, and my testing process is as follows:
>> 1. Mount tmpfs with huge=within_size
>> 2. Create and write a tmpfs file
>> 3. Swap out the large folios of the tmpfs file to zram
>> 4. Execute 'rm' command to remove the tmpfs file
>
> I’m unable to reproduce the issue either, following steps similar to
> Baolin's:
>
> 1) Mount tmpfs with the huge=within_size option and enable swap (using
> zstd/zsmalloc without a backing device).
> 2) Create and write over 10,000 files in the tmpfs.
> 3) Swap out the large folios of these tmpfs files to zram.
> 4) Use the rm command to delete all the files from the tmpfs.
>
> I tested with both 2MiB and 64KiB large folio sizes, and with
> shmem_enabled=within_size; everything works as expected.

Thanks, Lance, for confirming again.

Alex, could you give more hints on how to reproduce this issue?
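
For reference, the rough shape of my local test is below. This is only a
sketch: the zram size, mount point, file count/sizes and the cgroup name
are illustrative placeholders rather than my exact values, and the
swap-out step assumes the writer runs inside a cgroup v2 group so that
memory.reclaim can push the shmem folios out to swap.

  # zram swap: zstd + zsmalloc, no backing device
  modprobe zram
  echo zstd > /sys/block/zram0/comp_algorithm
  echo 8G > /sys/block/zram0/disksize
  mkswap /dev/zram0 && swapon /dev/zram0

  # tmpfs mounted with huge=within_size
  mount -t tmpfs -o huge=within_size tmpfs /mnt/test

  # create and write many files, large enough to get large folios
  for i in $(seq 1 10000); do
      dd if=/dev/urandom of=/mnt/test/f$i bs=64k count=32 status=none
  done

  # force the tmpfs large folios out to zram via cgroup v2 memory.reclaim
  echo 4G > /sys/fs/cgroup/test/memory.reclaim

  # remove the files; this is where you see rm spin at 100% CPU
  rm -rf /mnt/test/*

If your setup differs from the above (for example different file sizes,
or relying on general memory pressure rather than explicit reclaim for
the swap-out), that difference may be what is needed to trigger the hang.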