Date:   Thu, 10 Dec 2020 10:51:23 -0500
From:   Matthew Rosato <mjrosato@...ux.ibm.com>
To:     Cornelia Huck <cohuck@...hat.com>
Cc:     alex.williamson@...hat.com, schnelle@...ux.ibm.com,
        pmorel@...ux.ibm.com, borntraeger@...ibm.com, hca@...ux.ibm.com,
        gor@...ux.ibm.com, gerald.schaefer@...ux.ibm.com,
        linux-s390@...r.kernel.org, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC 0/4] vfio-pci/zdev: Fixing s390 vfio-pci ISM support

On 12/10/20 7:33 AM, Cornelia Huck wrote:
> On Wed,  9 Dec 2020 15:27:46 -0500
> Matthew Rosato <mjrosato@...ux.ibm.com> wrote:
> 
>> Today, ISM devices are completely disallowed for vfio-pci passthrough as
>> QEMU will reject the device due to an (inappropriate) MSI-X check.
>> However, in an effort to enable ISM device passthrough, I realized that the
>> manner in which ISM performs block write operations is highly incompatible
>> with the way that QEMU s390 PCI instruction interception and
>> vfio_pci_bar_rw break up I/O operations into 8B and 4B operations -- ISM
>> devices have particular requirements in regards to the alignment, size and
>> order of writes performed.  Furthermore, they require that legacy/non-MIO
>> s390 PCI instructions are used, which is also not guaranteed when the I/O
>> is passed through the typical userspace channels.
> 
> The part about the non-MIO instructions confuses me. How can MIO
> instructions be generated with the current code, and why does changing

So to be clear, they are not being generated at all in the guest as the 
necessary facility is reported as unavailable.

Let's talk about Linux in LPAR / the host kernel:  When hardware that
supports MIO instructions is available, all userspace I/O traffic is
routed through the MIO variants of the s390 PCI instructions.  This
works well for other device types, but does not work for ISM, which
does not support these variants.  However, the ISM driver does not
invoke the kernel's userspace I/O routines; it invokes the s390 PCI
layer directly, which in turn ensures the proper PCI instructions are
used.  This approach falls apart when the guest ISM driver invokes
those routines in the guest -- we (qemu) pass those non-MIO
instructions from the guest as memory operations through vfio-pci,
traversing the vfio I/O layer in the host (vfio_pci_bar_rw and
friends), where we then arrive in the host s390 PCI layer -- where the
MIO variant is used because the facility is available.

Per conversations with Niklas (on CC), it's not trivial to decide by the 
time we reach the s390 PCI I/O layer to switch gears and use the non-MIO 
instruction set.
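
To make the mismatch concrete, here is a toy model of it.  This is only
a sketch: the enum, struct and function names are invented for this
mail and are not the actual s390 PCI code; the only thing it encodes is
"the host picks the MIO variant whenever the facility is available, and
ISM cannot handle that variant".

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum insn_variant { INSN_LEGACY, INSN_MIO };

struct dev_model {
	bool supports_mio;	/* false for an ISM-like device */
};

/* The host I/O layer chooses based on the facility, not on the device
 * and not on what the guest originally issued. */
enum insn_variant host_pick_variant(bool host_has_mio_facility)
{
	return host_has_mio_facility ? INSN_MIO : INSN_LEGACY;
}

/* A legacy access always works; an MIO access only works if the
 * device supports MIO. */
bool access_ok(const struct dev_model *dev, enum insn_variant v)
{
	return v == INSN_LEGACY || dev->supports_mio;
}
```

With supports_mio = false and the facility available on the host, the
access the guest issued as non-MIO ends up as INSN_MIO and access_ok()
fails, which is the situation described above.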

> the write pattern help?

The write pattern is a separate issue from the non-MIO instruction
requirement...  Certain address spaces require specific instructions to
be used, so PCISTG cannot be substituted for PCISTB.  That substitution
happens by default for any write coming into the host s390 PCI layer
that is <=8B, and they all are once the PCISTB is broken up into 8B
memory operations that travel through vfio_pci_bar_rw, which further
breaks those up into 4B operations.  Some writes also require that the
data, if broken up, be written in a certain order so that events are
triggered properly. :(  Passing the entire PCISTB payload instead of
breaking it into 8B chunks is also significantly faster.
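
As a rough illustration of the splitting described here (a sketch only:
struct access and split_payload() are hypothetical names, not the QEMU
or vfio code, and the 128-byte payload in the note below is just an
example size):

```c
#include <assert.h>
#include <stddef.h>

/* One resulting access: offset into the payload and its length. */
struct access {
	size_t off;
	size_t len;
};

/*
 * Split a payload of 'total' bytes into accesses of at most 'chunk'
 * bytes, in ascending offset order.  Up to 'max' entries are stored in
 * 'out'; the return value is the number of accesses needed.
 */
size_t split_payload(size_t total, size_t chunk,
		     struct access *out, size_t max)
{
	size_t n = 0;

	assert(chunk > 0);
	for (size_t off = 0; off < total; off += chunk) {
		size_t len = total - off < chunk ? total - off : chunk;

		if (n < max) {
			out[n].off = off;
			out[n].len = len;
		}
		n++;
	}
	return n;
}
```

For a 128-byte payload, split_payload(128, 8, ...) yields 16 accesses
and split_payload(128, 4, ...) yields 32, versus a single block store
when the payload is passed through whole.  Every one of those pieces is
<=8B, so none of them would be issued as a PCISTB on the host side.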

> 
>>
>> As a result, this patchset proposes a new VFIO region to allow a guest to
>> pass certain PCI instruction intercepts directly to the s390 host kernel
>> PCI layer for execution, pinning the guest buffer in memory briefly in
>> order to execute the requested PCI instruction.
>>
>> Matthew Rosato (4):
>>    s390/pci: track alignment/length strictness for zpci_dev
>>    vfio-pci/zdev: Pass the relaxed alignment flag
>>    s390/pci: Get hardware-reported max store block length
>>    vfio-pci/zdev: Introduce the zPCI I/O vfio region
>>
>>   arch/s390/include/asm/pci.h         |   4 +-
>>   arch/s390/include/asm/pci_clp.h     |   7 +-
>>   arch/s390/pci/pci_clp.c             |   2 +
>>   drivers/vfio/pci/vfio_pci.c         |   8 ++
>>   drivers/vfio/pci/vfio_pci_private.h |   6 ++
>>   drivers/vfio/pci/vfio_pci_zdev.c    | 160 ++++++++++++++++++++++++++++++++++++
>>   include/uapi/linux/vfio.h           |   4 +
>>   include/uapi/linux/vfio_zdev.h      |  33 ++++++++
>>   8 files changed, 221 insertions(+), 3 deletions(-)
>>
> 
