linux-kernel - Re: [PATCH] vfio/pci: Virtualize Maximum Payload Size

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <c3bbb645-50a9-a1b4-c8f2-8973f0478a2a@redhat.com>
Date:   Tue, 19 Sep 2017 19:50:37 +0200
From:   Auger Eric <eric.auger@...hat.com>
To:     Alex Williamson <alex.williamson@...hat.com>, kvm@...r.kernel.org
Cc:     linux-kernel@...r.kernel.org, jcm@...hat.com
Subject: Re: [PATCH] vfio/pci: Virtualize Maximum Payload Size

Hi Alex,

On 19/09/2017 18:58, Alex Williamson wrote:
> With virtual PCI-Express chipsets, we now see userspace/guest drivers
> trying to match the physical MPS setting to a virtual downstream port.
> Of course a lone physical device surrounded by virtual interconnects
> cannot make a correct decision for a proper MPS setting.  Instead,
> let's virtualize the MPS control register so that writes through to
> hardware are disallowed.  Userspace drivers like QEMU assume they can
> write anything to the device and we'll filter out anything dangerous.
> Since mismatched MPS can lead to AER and other faults, let's add it
> to the kernel side rather than relying on userspace virtualization to
> handle it.
> 
> Signed-off-by: Alex Williamson <alex.williamson@...hat.com>
> ---
> 
> Do we have any reason to suspect that a userspace driver has any
> dependencies on the physical MPS setting or is this only tuning the
> protocol layer and it's transparent to the driver?  Note that per the
> PCI spec, a device supporting only 128B MPS can hardwire the control
> register to 000b, but it doesn't seem PCIe compliant to hardwire it to
> any given value, such as would be the appearance if we exposed this as
> a read-only register rather than virtualizing it.  QEMU would then be
> responsible for virtualizing it, which makes coordinating the upgrade
> troublesome.
> 
>  drivers/vfio/pci/vfio_pci_config.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
> index 5628fe114347..91335e6de88a 100644
> --- a/drivers/vfio/pci/vfio_pci_config.c
> +++ b/drivers/vfio/pci/vfio_pci_config.c
> @@ -849,11 +849,13 @@ static int __init init_pci_cap_exp_perm(struct perm_bits *perm)
>  
>  	/*
>  	 * Allow writes to device control fields, except devctl_phantom,
> -	 * which could confuse IOMMU, and the ARI bit in devctl2, which
> +	 * which could confuse IOMMU, MPS, which can break communication
> +	 * with other physical devices, and the ARI bit in devctl2, which
>  	 * is set at probe time.  FLR gets virtualized via our writefn.
>  	 */
>  	p_setw(perm, PCI_EXP_DEVCTL,
> -	       PCI_EXP_DEVCTL_BCR_FLR, ~PCI_EXP_DEVCTL_PHANTOM);
> +	       PCI_EXP_DEVCTL_BCR_FLR | PCI_EXP_DEVCTL_PAYLOAD,
> +	       ~PCI_EXP_DEVCTL_PHANTOM);
>  	p_setw(perm, PCI_EXP_DEVCTL2, NO_VIRT, ~PCI_EXP_DEVCTL2_ARI);
Is it correct that the read value still will be the one written by the
guest?

I see the MMRS can take the read MPS value in some pcie_bus_config
values. So a consequence could be that the applied MMRS (which is not
virtualized) is lower than what is set by host, due to a guest pcie root
port MPSS for instance.

So if the above is not totally wrong, shouldn't we virtualize MMRS as well?

Thanks

Eric

>  	return 0;
>  }
>