Message-ID: <e2033095-9bf1-4d9c-9a5b-01148eaffc30@redhat.com>
Date: Thu, 4 Dec 2025 19:16:54 +0100
From: Cédric Le Goater <clg@...hat.com>
To: Peter Xu <peterx@...hat.com>, kvm@...r.kernel.org, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org
Cc: Jason Gunthorpe <jgg@...dia.com>, Nico Pache <npache@...hat.com>,
 Zi Yan <ziy@...dia.com>, Alex Mastro <amastro@...com>,
 David Hildenbrand <david@...hat.com>, Alex Williamson <alex@...zbot.org>,
 Zhi Wang <zhiw@...dia.com>, David Laight <david.laight.linux@...il.com>,
 Yi Liu <yi.l.liu@...el.com>, Ankit Agrawal <ankita@...dia.com>,
 Kevin Tian <kevin.tian@...el.com>, Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2 0/4] mm/vfio: huge pfnmaps with !MAP_FIXED mappings

On 12/4/25 16:09, Peter Xu wrote:
> This series is based on v6.18.  It allows mmap(!MAP_FIXED) to work with
> huge pfnmaps with best effort.  Meanwhile, it enables it for vfio-pci as
> the first user.
> 
> v1: https://lore.kernel.org/r/20250613134111.469884-1-peterx@redhat.com
> 
> A changelog would not be meaningful because all the patches were rewritten
> on top of a new interface introduced in this v2, so it is omitted.
> 
> In this version, a new file operation, get_mapping_order(), is introduced
> (based on discussion with Jason on v1) to minimize the code needed for
> drivers to implement this.  It also helps avoid exporting any mm functions.
> One can refer to the discussion in v1 for more information.
> 
> Currently, the get_mapping_order() API is defined as:
> 
>    int (*get_mapping_order)(struct file *file, unsigned long pgoff, size_t len);
> 
> The first argument is the file pointer; the second and third are the pgoff
> and len specified in the mmap() request.  A driver can use this interface
> to opt in to providing mapping order hints to core mm for VA allocations
> covering the specified range of the file.  I kept the interface simple for
> now: core mm always aligns the VA against pgoff, assuming that always
> works.  The driver can only report the order for the pgoff+len range,
> which is then used for the alignment.
> 
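
As a rough illustration of the interface described above, here is a minimal userspace model of how a driver might derive an order hint from pgoff and len. It is a sketch under stated assumptions, not the code from this series: the names model_region and model_get_mapping_order are hypothetical, the PMD/PUD orders assume x86-64 with 4 KiB pages, and the policy (largest huge order that fits within len) is a guess at a reasonable driver behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed x86-64 geometry with 4 KiB base pages; other setups differ. */
#define PAGE_SHIFT 12
#define PMD_ORDER  9    /* 2 MiB mapping, in base pages */
#define PUD_ORDER  18   /* 1 GiB mapping, in base pages */

/* Hypothetical region descriptor, standing in for one device BAR. */
struct model_region {
    unsigned long pgoff;   /* first page offset of the region in the file */
    unsigned long pages;   /* region size in base pages */
};

/* Return an order hint for [pgoff, pgoff + len), or 0 for "no hint". */
static int model_get_mapping_order(const struct model_region *r,
                                   unsigned long pgoff, size_t len)
{
    unsigned long pages = len >> PAGE_SHIFT;

    /* No hint for ranges that are not fully inside the region. */
    if (pgoff < r->pgoff || pages > r->pages ||
        pgoff - r->pgoff > r->pages - pages)
        return 0;

    if (pages >= (1UL << PUD_ORDER))
        return PUD_ORDER;
    if (pages >= (1UL << PMD_ORDER))
        return PMD_ORDER;
    return 0;
}

/* Usage: a 32 GiB region placed at page offset 0 of the file. */
static void model_self_check(void)
{
    struct model_region bar = { .pgoff = 0, .pages = 1UL << 23 };

    assert(model_get_mapping_order(&bar, 0, (size_t)1 << 30) == PUD_ORDER);
    assert(model_get_mapping_order(&bar, 0, (size_t)128 << 20) == PMD_ORDER);
    assert(model_get_mapping_order(&bar, 0, 4096) == 0);
}
```

The model only looks at the length and the region bounds, mirroring the statement that the driver reports an order and core mm does the pgoff-based alignment.
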
> Before this series, a userspace application in most cases needs to be
> modified to benefit from huge mappings, by providing a huge-size-aligned VA
> using MAP_FIXED.  After this series, the application can benefit from huge
> pfnmaps automatically after a kernel upgrade, with no userspace changes.
> 
> It's still best-effort, because the auto-alignment requires a larger VA
> range to be allocated via the per-arch allocator; if the huge-mapping
> aligned VA cannot be allocated, it still falls back to small mappings
> like before.  That is the theory, though: in practice I don't yet know
> when it would fail, especially on a 64-bit system.
> 
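
The "larger VA range" requirement can be made concrete with a small userspace model of the padding trick: request len + size bytes, then shift the start so that the virtual address and the file offset agree modulo the huge size. This is a sketch of the general technique, assuming size is a power of two; place_in_padded_range is a hypothetical name and the code is not taken from the series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Given the start of a free range that is at least len + size bytes long,
 * return an address inside it that is congruent to the file offset `off`
 * modulo `size`. The result lies in [base, base + size), so a mapping of
 * len bytes starting there still fits in the padded range.
 */
static uint64_t place_in_padded_range(uint64_t base, uint64_t off,
                                      uint64_t size)
{
    /* size must be a non-zero power of two for the mask to be valid. */
    assert(size != 0 && (size & (size - 1)) == 0);

    return base + ((off - base) & (size - 1));
}
```

If the padded request cannot be satisfied, the caller would simply retry with the original length, which is the fallback to small mappings mentioned above.
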
> So far, only vfio-pci is supported, but the logic should be applicable to
> all drivers that support, or will support, huge pfnmaps.  I've also copied
> a few more people on this version for the hardware perspective.
> 
> Testing:
> 
> - checkpatch.pl
> - cross build harness
> - unit test that I got from Alex [1], checking mmap() alignments on a QEMU
>    instance with a 128MB BAR.
> 
> With mmap(!MAP_FIXED), the alignments all look sane and huge mappings are
> properly installed.  I didn't observe anything wrong.
> 
> I currently lack larger BARs to test PUD sizes.  Please kindly report if
> you can run this with 1G+ BARs and hit issues.

LGTM, tested with a 32G BAR:

Using device 0000:02:00.0 in IOMMU group 27
Device 0000:02:00.0 supports 9 regions, 5 irqs
[BAR0]: size 0x1000000, order 24, offset 0x0, flags 0xf
Testing BAR0, require at least 21 bit alignment
[PASS] Minimum alignment 21
Testing random offset
[PASS] Random offset
Testing random size
[PASS] Random size
[BAR1]: size 0x800000000, order 35, offset 0x10000000000, flags 0x7
Testing BAR1, require at least 30 bit alignment
[PASS] Minimum alignment 31
Testing random offset
[PASS] Random offset
Testing random size
[PASS] Random size
[BAR3]: size 0x2000000, order 25, offset 0x30000000000, flags 0x7
Testing BAR3, require at least 21 bit alignment
[PASS] Minimum alignment 21
Testing random offset
[PASS] Random offset
Testing random size
[PASS] Random size
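
For readers interpreting the "Minimum alignment" figures, one plausible reading is the number of trailing zero bits in the address returned by mmap(). The helper below is a sketch under that assumption; alignment_bits is a hypothetical name and this is not the code of the test in [1].

```c
#include <assert.h>
#include <stdint.h>

/* Count trailing zero bits of an address; 64 for a zero address. */
static int alignment_bits(uint64_t addr)
{
    int bits = 0;

    while (bits < 64 && !((addr >> bits) & 1))
        bits++;
    return bits;
}

/* Usage: a 2 MiB-aligned and a 2 GiB-aligned example address. */
static void alignment_self_check(void)
{
    assert(alignment_bits(0x7f1234600000ULL) == 21);
    assert(alignment_bits(0x7f0080000000ULL) == 31);
}
```

Under this reading, a result of 31 on a BAR that requires at least 30 bits means the returned address happened to be 2 GiB aligned, which satisfies the 1 GiB requirement.
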


C.

> 
> Alex Mastro: thanks for the testing offered in v1, but since this series
> was rewritten, a re-test will be needed.  I hence didn't collect the T-b.
> 
> Comments welcomed, thanks.
> 
> [1] https://github.com/awilliam/tests/blob/vfio-pci-device-map-alignment/vfio-pci-device-map-alignment.c
> 
> Peter Xu (4):
>    mm/thp: Allow thp_get_unmapped_area_vmflags() to take alignment
>    mm: Add file_operations.get_mapping_order()
>    vfio: Introduce vfio_device_ops.get_mapping_order hook
>    vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
> 
>   Documentation/filesystems/vfs.rst |  4 +++
>   drivers/vfio/pci/vfio_pci.c       |  1 +
>   drivers/vfio/pci/vfio_pci_core.c  | 49 ++++++++++++++++++++++++++
>   drivers/vfio/vfio_main.c          | 14 ++++++++
>   include/linux/fs.h                |  1 +
>   include/linux/huge_mm.h           |  5 +--
>   include/linux/vfio.h              |  5 +++
>   include/linux/vfio_pci_core.h     |  2 ++
>   mm/huge_memory.c                  |  7 ++--
>   mm/mmap.c                         | 58 +++++++++++++++++++++++++++----
>   10 files changed, 135 insertions(+), 11 deletions(-)
> 