Message-ID: <aNKJ6b7kmT_u0A4c@li-2b55cdcc-350b-11b2-a85c-a78bff51fc11.ibm.com>
Date: Tue, 23 Sep 2025 13:52:09 +0200
From: Sumanth Korikkar <sumanthk@...ux.ibm.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Jonathan Corbet <corbet@....net>, Matthew Wilcox <willy@...radead.org>,
Guo Ren <guoren@...nel.org>,
Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>,
"David S . Miller" <davem@...emloft.net>,
Andreas Larsson <andreas@...sler.com>, Arnd Bergmann <arnd@...db.de>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Dan Williams <dan.j.williams@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>,
Dave Jiang <dave.jiang@...el.com>, Nicolas Pitre <nico@...xnic.net>,
Muchun Song <muchun.song@...ux.dev>,
Oscar Salvador <osalvador@...e.de>,
David Hildenbrand <david@...hat.com>,
Konstantin Komarov <almaz.alexandrovich@...agon-software.com>,
Baoquan He <bhe@...hat.com>, Vivek Goyal <vgoyal@...hat.com>,
Dave Young <dyoung@...hat.com>, Tony Luck <tony.luck@...el.com>,
Reinette Chatre <reinette.chatre@...el.com>,
Dave Martin <Dave.Martin@....com>, James Morse <james.morse@....com>,
Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Hugh Dickins <hughd@...gle.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
Uladzislau Rezki <urezki@...il.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Andrey Konovalov <andreyknvl@...il.com>, Jann Horn <jannh@...gle.com>,
Pedro Falcato <pfalcato@...e.de>, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-csky@...r.kernel.org, linux-mips@...r.kernel.org,
linux-s390@...r.kernel.org, sparclinux@...r.kernel.org,
nvdimm@...ts.linux.dev, linux-cxl@...r.kernel.org, linux-mm@...ck.org,
ntfs3@...ts.linux.dev, kexec@...ts.infradead.org,
kasan-dev@...glegroups.com, Jason Gunthorpe <jgg@...dia.com>,
iommu@...ts.linux.dev, Kevin Tian <kevin.tian@...el.com>,
Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>
Subject: Re: [PATCH v4 11/14] mm/hugetlbfs: update hugetlbfs to use
mmap_prepare
On Wed, Sep 17, 2025 at 08:11:13PM +0100, Lorenzo Stoakes wrote:
> Since we can now perform actions after the VMA is established via
> mmap_prepare, use desc->action_success_hook to set up the hugetlb lock
> once the VMA is setup.
>
> We also make changes throughout hugetlbfs to make this possible.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> Reviewed-by: Jason Gunthorpe <jgg@...dia.com>
> ---
> fs/hugetlbfs/inode.c | 36 ++++++++++------
> include/linux/hugetlb.h | 9 +++-
> include/linux/hugetlb_inline.h | 15 ++++---
> mm/hugetlb.c | 77 ++++++++++++++++++++--------------
> 4 files changed, 85 insertions(+), 52 deletions(-)
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index f42548ee9083..9e0625167517 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -96,8 +96,15 @@ static const struct fs_parameter_spec hugetlb_fs_parameters[] = {
> #define PGOFF_LOFFT_MAX \
> (((1UL << (PAGE_SHIFT + 1)) - 1) << (BITS_PER_LONG - (PAGE_SHIFT + 1)))
>
> -static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma)
> +static int hugetlb_file_mmap_prepare_success(const struct vm_area_struct *vma)
> {
> + /* Unfortunate we have to reassign vma->vm_private_data. */
> + return hugetlb_vma_lock_alloc((struct vm_area_struct *)vma);
> +}
Hi Lorenzo,
The following test case leaves tasks blocked in the kernel, which
suggests a lock-ordering issue. I was able to reproduce this in some,
but not all, test runs.
Test case:
git clone https://github.com/libhugetlbfs/libhugetlbfs.git
cd libhugetlbfs ; ./configure
make -j32
cd tests
echo 100 > /proc/sys/vm/nr_hugepages
mkdir -p /test-hugepages && mount -t hugetlbfs nodev /test-hugepages
./run_tests.py <in a loop>
...
shm-fork 10 100 (1024K: 64): PASS
set shmmax limit to 104857600
shm-getraw 100 /dev/full (1024K: 32):
shm-getraw 100 /dev/full (1024K: 64): PASS
fallocate_stress.sh (1024K: 64): <blocked>
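For anyone trying to reproduce this, the "in a loop" step could be automated roughly as follows. This is only a sketch: run_until_hang is a hypothetical helper (not part of libhugetlbfs), the time limit and run count are arbitrary, and it assumes timeout(1) from coreutils is available. It treats a run that exceeds the time limit as a hang.

```shell
#!/bin/sh
# Sketch only: rerun a command until one run exceeds a time limit.
# run_until_hang is a hypothetical helper, not part of libhugetlbfs.
# Usage: run_until_hang <seconds-per-run> <max-runs> <command> [args...]
run_until_hang() {
    limit=$1
    max=$2
    shift 2
    i=1
    while [ "$i" -le "$max" ]; do
        # timeout(1) exits with status 124 when the time limit is hit.
        timeout "$limit" "$@" >/dev/null 2>&1
        if [ "$?" -eq 124 ]; then
            echo "run $i exceeded ${limit}s"
            return 124
        fi
        i=$((i + 1))
    done
    echo "no hang in $max runs"
    return 0
}

# Example (as root, from libhugetlbfs/tests); the limits are arbitrary:
#   run_until_hang 1800 50 ./run_tests.py ||
#       { echo w > /proc/sysrq-trigger; dmesg | tail -n 200; }
```

If sysrq is enabled, writing "w" to /proc/sysrq-trigger is one way to dump tasks in uninterruptible (blocked) state to the kernel log once a run has timed out.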
Blocked task state below:
task:fallocate_stres state:D stack:0 pid:5106 tgid:5106 ppid:5103
task_flags:0x400000 flags:0x00000001
Call Trace:
[<00000255adc646f0>] __schedule+0x370/0x7f0
[<00000255adc64bb0>] schedule+0x40/0xc0
[<00000255adc64d32>] schedule_preempt_disabled+0x22/0x30
[<00000255adc68492>] rwsem_down_write_slowpath+0x232/0x610
[<00000255adc68922>] down_write_killable+0x52/0x80
[<00000255ad12c980>] vm_mmap_pgoff+0xc0/0x1f0
[<00000255ad164bbe>] ksys_mmap_pgoff+0x17e/0x220
[<00000255ad164d3c>] __s390x_sys_old_mmap+0x7c/0xa0
[<00000255adc60e4e>] __do_syscall+0x12e/0x350
[<00000255adc6cfee>] system_call+0x6e/0x90
task:fallocate_stres state:D stack:0 pid:5109 tgid:5106 ppid:5103
task_flags:0x400040 flags:0x00000001
Call Trace:
[<00000255adc646f0>] __schedule+0x370/0x7f0
[<00000255adc64bb0>] schedule+0x40/0xc0
[<00000255adc64d32>] schedule_preempt_disabled+0x22/0x30
[<00000255adc68492>] rwsem_down_write_slowpath+0x232/0x610
[<00000255adc688be>] down_write+0x4e/0x60
[<00000255ad1c11ec>] __hugetlb_zap_begin+0x3c/0x70
[<00000255ad158b9c>] unmap_vmas+0x10c/0x1a0
[<00000255ad180844>] vms_complete_munmap_vmas+0x134/0x2e0
[<00000255ad1811be>] do_vmi_align_munmap+0x13e/0x170
[<00000255ad1812ae>] do_vmi_munmap+0xbe/0x140
[<00000255ad183f86>] __vm_munmap+0xe6/0x190
[<00000255ad166832>] __s390x_sys_munmap+0x32/0x40
[<00000255adc60e4e>] __do_syscall+0x12e/0x350
[<00000255adc6cfee>] system_call+0x6e/0x90
Thanks,
Sumanth