Message-ID: <4cd8929d6ef45f62e9eb6bb905f28ada62600c23.camel@linux.ibm.com>
Date: Wed, 26 Feb 2025 09:28:13 +0100
From: Niklas Schnelle <schnelle@...ux.ibm.com>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: Christoph Hellwig <hch@....de>, Alexandra Winter
<wintera@...ux.ibm.com>,
Alex Williamson <alex.williamson@...hat.com>,
Gerd Bayer <gbayer@...ux.ibm.com>,
Matthew Rosato
<mjrosato@...ux.ibm.com>,
Jason Gunthorpe <jgg@...pe.ca>,
Thorsten Winkler
<twinkler@...ux.ibm.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Julian Ruess
<julianr@...ux.ibm.com>,
Halil Pasic <pasic@...ux.ibm.com>,
Christian
Borntraeger <borntraeger@...ux.ibm.com>,
Sven Schnelle
<svens@...ux.ibm.com>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>, linux-s390@...r.kernel.org,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
linux-pci@...r.kernel.org
Subject: Re: [PATCH v6 0/3] vfio/pci: s390: Fix issues preventing
VFIO_PCI_MMAP=y for s390 and enable it
On Tue, 2025-02-25 at 14:35 -0600, Bjorn Helgaas wrote:
> On Tue, Feb 25, 2025 at 09:59:13AM +0100, Niklas Schnelle wrote:
> > On Mon, 2025-02-24 at 14:53 -0600, Bjorn Helgaas wrote:
> > > On Fri, Feb 14, 2025 at 02:10:51PM +0100, Niklas Schnelle wrote:
> > > > With the introduction of memory I/O (MIO) instructions enabled in commit
> > > > 71ba41c9b1d9 ("s390/pci: provide support for MIO instructions") s390
> > > > gained support for direct user-space access to mapped PCI resources.
> > > > Even without those however user-space can access mapped PCI resources
> > > > via the s390 specific MMIO syscalls. There is thus nothing fundamentally
> > > > preventing s390 from supporting VFIO_PCI_MMAP, allowing user-space
> > > > drivers to access PCI resources without going through the pread()
> > > > interface. To actually enable VFIO_PCI_MMAP a few issues need fixing
> > > > however.
> > > >
> > > > Firstly the s390 MMIO syscalls do not cause a page fault when
> > > > follow_pte() fails due to the page not being present. This breaks
> > > > vfio-pci's mmap() handling which lazily maps on first access.
> > > >
> > > > Secondly on s390 there is a virtual PCI device called ISM which has
> > > > a few oddities. For one it claims to have a 256 TiB PCI BAR (not a typo)
> > > > which causes any attempt to mmap() it to fail with the following message:
> > > >
> > > > vmap allocation for size 281474976714752 failed: use vmalloc=<size> to increase size
> > > >
> > > > Even if one tried to map this BAR only partially the mapping would not
> > > > be usable on systems with MIO support enabled. So just block mapping
> > > > BARs which don't fit between IOREMAP_START and IOREMAP_END. Solve this
> > > > by keeping the vfio-pci mmap() blocking behavior around for this
> > > > specific device via a PCI quirk and new pdev->non_mappable_bars
> > > > flag.
> > > >
> > > > As noted by Alex Williamson, with mmap() enabled in vfio-pci it makes
> > > > sense to also enable HAVE_PCI_MMAP with the same restriction for
> > > > pdev->non_mappable_bars. So this is added in patch 3 and I tested this
> > > > with another small test program.
> > > >
> > > > Note:
> > > > For your convenience the code is also available in the tagged
> > > > b4/vfio_pci_mmap branch on my git.kernel.org site below:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/niks/linux.git/
> > > >
> > > > Thanks,
> > > > Niklas
> > > >
> > > > Link: https://lore.kernel.org/all/c5ba134a1d4f4465b5956027e6a4ea6f6beff969.camel@linux.ibm.com/
> > > > Signed-off-by: Niklas Schnelle <schnelle@...ux.ibm.com>
> > > > ---
> > > > Changes in v6:
> > > > - Add a patch to also enable PCI resource mmap() via sysfs and proc
> > > > excluding pdev->non_mappable_bars devices (Alex Williamson)
> > > > - Added Acks
> > > > - Link to v5: https://lore.kernel.org/r/20250212-vfio_pci_mmap-v5-0-633ca5e056da@linux.ibm.com
> > >
> > > I think the series would be more readable if patch 2/3 included all
> > > the core changes (adding pci_dev.non_mappable_bars, the 3/3
> > > pci-sysfs.c and proc.c changes to test it, and I suppose the similar
> > > vfio_pci_core.c change), and we moved all the s390 content from 2/3 to
> > > 3/3.
> >
> > Maybe we could do the following:
> >
> > 1/3: As is
> >
> > 2/3: Introduces pdev->non_mappable_bars and the checks in vfio and
> > proc.c/pci-sysfs.c. To make the flag handle the vfio case with
> > > VFIO_PCI_MMAP gone, a one-line change in s390 will set
> > > pdev->non_mappable_bars = 1 for all PCI devices.
>
> What if you moved the vfio_pci_core.c change to patch 3? Then I think
> patch 2 would do nothing at all (since there's nothing that sets
> non_mappable_bars), and all the s390 stuff would be in patch 3?
>
> Not sure if that's possible, but I think it's a little confusing to
> have the s390 changes split across patch 2 and 3.

I'm not really a fan of having a completely unused flag, even in an
intermediate commit. I edited the commits yesterday, and with this
approach the s390-specific part of 2/3 really is just the hunk below:

diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 88f72745fa59..d14b8605a32c 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -590,6 +590,7 @@ int pcibios_device_add(struct pci_dev *pdev)
 	zpci_zdev_get(zdev);
 	if (pdev->is_physfn)
 		pdev->no_vf_scan = 1;
+	pdev->non_mappable_bars = 1;
 	zpci_map_resources(pdev);

That added line then gets deleted again in 3/3. I think this makes the
progression pretty logical: with patch 2/3 we set the flag for all PCI
devices, which keeps the existing behavior, with pdev->non_mappable_bars
replacing the "y if S390" of VFIO_PCI_MMAP. Then 3/3 narrows it down to
just the ISM device.
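
To make the two stages concrete, here is a rough userspace model of the
idea. This is only a sketch and not the code from the series: struct
model_pci_dev, the is_ism field, every model_* function and the -EINVAL
return value are made-up stand-ins for illustration. The only name taken
from the series is non_mappable_bars.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of struct pci_dev. */
struct model_pci_dev {
	bool is_ism;			/* assumed stand-in for an ISM device ID match */
	unsigned int non_mappable_bars:1;
};

/* Stage of patch 2/3: the arch hook sets the flag for every device. */
void model_device_add_2of3(struct model_pci_dev *pdev)
{
	pdev->non_mappable_bars = 1;
}

/* Stage of patch 3/3: only the ISM quirk sets the flag. */
void model_device_add_3of3(struct model_pci_dev *pdev)
{
	if (pdev->is_ism)
		pdev->non_mappable_bars = 1;
}

/*
 * Generic check as the mmap() paths might do it. The error code is an
 * assumption made for this sketch.
 */
int model_mmap_check(const struct model_pci_dev *pdev)
{
	return pdev->non_mappable_bars ? -EINVAL : 0;
}

/* Walk both stages for an ordinary device and for an ISM device. */
void model_demo(void)
{
	struct model_pci_dev nic = { .is_ism = false };
	struct model_pci_dev ism = { .is_ism = true };

	/* After 2/3: nothing is mappable, same as today. */
	model_device_add_2of3(&nic);
	model_device_add_2of3(&ism);
	assert(model_mmap_check(&nic) == -EINVAL);
	assert(model_mmap_check(&ism) == -EINVAL);

	/* After 3/3: only ISM stays blocked. */
	nic.non_mappable_bars = 0;
	ism.non_mappable_bars = 0;
	model_device_add_3of3(&nic);
	model_device_add_3of3(&ism);
	assert(model_mmap_check(&nic) == 0);
	assert(model_mmap_check(&ism) == -EINVAL);
}
```

The point of the model is that the generic check never changes between
the two patches, only who sets the flag does.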
Thanks,
Niklas