Message-ID: <aFsPwLB41_3VDvtY@x1.local>
Date: Tue, 24 Jun 2025 16:51:12 -0400
From: Peter Xu <peterx@...hat.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: "Liam R. Howlett" <Liam.Howlett@...cle.com>,
	Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	kvm@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
	Alex Williamson <alex.williamson@...hat.com>,
	Zi Yan <ziy@...dia.com>, Alex Mastro <amastro@...com>,
	David Hildenbrand <david@...hat.com>,
	Nico Pache <npache@...hat.com>
Subject: Re: [PATCH 5/5] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED
 mappings

On Tue, Jun 24, 2025 at 04:37:26PM -0400, Peter Xu wrote:
> On Thu, Jun 19, 2025 at 03:40:41PM -0300, Jason Gunthorpe wrote:
> > Even with this new version you have to decide to return PUD_SIZE or
> > bar_size in pci and your same reasoning that PUD_SIZE make sense
> > applies (though I would probably return bar_size and just let the core
> > code cap it to PUD_SIZE)
> 
> Yes.
> 
> Today I went back to look at this.  I was trying to introduce this hook in
> file_operations:
> 
> 	int (*get_mapping_order)(struct file *, unsigned long, size_t);
> 
> It looks almost good, except that it so far has no way to return the
> physical address for further calculation on the alignment.
> 
> For THP, the VA alignment is always calculated against pgoff, not the
> physical address.  I think that's OK for THP, because every 2M THP folio is
> naturally 2M aligned in physical address space, so it works when e.g.
> pgoff=0 in the calculation of thp_get_unmapped_area_vmflags().
> 
> Logically it should also work for vfio-pci, as long as VFIO keeps using
> the lower 40 bits of the device fd offset to represent the BAR offset,
> and as long as the PCIe spec requires PCI BARs to be mapped aligned to
> their sizes.
> 
> But from an API POV, get_mapping_order() logically should return something
> for further calculation of the alignment to get the VA.  pgoff may not
> always be the right thing to align the VA against: after all, a pgtable
> mapping is about VA -> PA, so the only reasonable and reliable way is to
> align the VA to the PA to be mapped, and as an API we shouldn't assume
> pgoff is always aligned with the PA address space.
> 
> Any thoughts?
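
To make the VA vs. PA alignment point concrete, here is a minimal userspace
sketch (not kernel code).  The helper names align_va_to() and can_map_huge()
are made up for illustration, and the 2M size is just an assumed huge mapping
size.  It shows that aligning the VA against the file offset only yields a
huge-mappable address when the file offset happens to be congruent to the PA.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed huge mapping size for this sketch: 2 MiB. */
#define SKETCH_PMD_SIZE (UINT64_C(1) << 21)

/*
 * Hypothetical helper: lowest address >= hint whose offset within a
 * size-aligned block equals that of target, i.e.
 * va % size == target % size.  size must be a power of two.
 */
static uint64_t align_va_to(uint64_t hint, uint64_t target, uint64_t size)
{
    assert(size && !(size & (size - 1)));
    return hint + ((target - hint) & (size - 1));
}

/*
 * Hypothetical check: a huge mapping of the given size is only possible
 * when VA and PA are congruent modulo that size.
 */
static int can_map_huge(uint64_t va, uint64_t pa, uint64_t size)
{
    assert(size && !(size & (size - 1)));
    return ((va ^ pa) & (size - 1)) == 0;
}
```

With a free-area hint of 0x7f0000001000 and file offset 0, aligning against
the file offset gives a 2M-aligned VA.  That VA is fine for a PA such as
0xe0000000, but not for a PA that sits 1M into a 2M block (0xe0100000);
aligning against the PA itself handles both cases.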

I should have listed the currently viable next steps.  We have at least these
options:

(a) Ignore this issue, keep the get_mapping_order() interface like above,
    as long as it works for vfio-pci

    I don't like this option.  I'd prefer the API (if we're going to
    introduce one) to be applicable no matter how pgoff is mapped to
    PAs.  I don't want the API to rely on a specific driver or a specific
    spec (in this case, PCI).

(b) I can make the new API like this instead:

    int (*get_mapping_order)(struct file *, unsigned long, unsigned long *, size_t);

    where the call returns the order to map in retval and also fills in
    *phys_pgoff.  But that's not pretty, if not outright ugly.

(c) Go back to what I did with the current v1, address the comments, and
    keep using get_unmapped_area() until we know a better way.

I'll vote for (c), but I'm open to suggestions.
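
For reference, a rough userspace mock of how option (b) might be consumed.
Everything here is an assumption for illustration only: struct mock_file
stands in for struct file plus driver state, the parameter meanings (pgoff,
*phys_pgoff, len) are my reading of the prototype above, and
mock_get_mapping_order() / mock_pick_va() are not existing kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PMD_ORDER  9   /* 2 MiB with 4 KiB pages; assumed for this sketch */

/* Hypothetical stand-in for struct file plus driver state. */
struct mock_file {
    uint64_t bar_phys;   /* physical base address of the BAR */
    uint64_t bar_size;   /* BAR size in bytes */
};

/*
 * Mock driver hook with the option (b) shape: the return value is the
 * mapping order, and *phys_pgoff receives the physical page frame that
 * backs file offset pgoff.
 */
static int mock_get_mapping_order(struct mock_file *f, unsigned long pgoff,
                                  unsigned long *phys_pgoff, size_t len)
{
    uint64_t off = (uint64_t)pgoff << PAGE_SHIFT;

    assert(f && phys_pgoff && off < f->bar_size);
    *phys_pgoff = (unsigned long)((f->bar_phys + off) >> PAGE_SHIFT);
    return len >= ((size_t)1 << (PMD_ORDER + PAGE_SHIFT)) ? PMD_ORDER : 0;
}

/*
 * Mock core helper: lowest VA >= hint that is congruent to the physical
 * address modulo the mapping size, so the core never has to assume
 * anything about how pgoff relates to the PA.
 */
static uint64_t mock_pick_va(uint64_t hint, int order,
                             unsigned long phys_pgoff)
{
    uint64_t size = UINT64_C(1) << (order + PAGE_SHIFT);
    uint64_t pa = (uint64_t)phys_pgoff << PAGE_SHIFT;

    return hint + ((pa - hint) & (size - 1));
}
```

The extra out-parameter is what makes (b) feel clunky, but it does let the
core align the VA against the PA directly instead of against pgoff.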

Thanks,

-- 
Peter Xu

