lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <MWHPR12MB1600255ACFCD3FB3C80EB8B6C88B0@MWHPR12MB1600.namprd12.prod.outlook.com>
Date:   Wed, 9 Aug 2017 12:33:34 +0000
From:   Casey Leedom <leedom@...lsio.com>
To:     Bjorn Helgaas <helgaas@...nel.org>,
        Ding Tianhong <dingtianhong@...wei.com>
CC:     "ashok.raj@...el.com" <ashok.raj@...el.com>,
        "bhelgaas@...gle.com" <bhelgaas@...gle.com>,
        Michael Werner <werner@...lsio.com>,
        Ganesh GR <ganeshgr@...lsio.com>,
        "asit.k.mallick@...el.com" <asit.k.mallick@...el.com>,
        "patrick.j.cramer@...el.com" <patrick.j.cramer@...el.com>,
        "Suravee.Suthikulpanit@....com" <Suravee.Suthikulpanit@....com>,
        "Bob.Shaw@....com" <Bob.Shaw@....com>,
        "l.stach@...gutronix.de" <l.stach@...gutronix.de>,
        "amira@...lanox.com" <amira@...lanox.com>,
        "gabriele.paoloni@...wei.com" <gabriele.paoloni@...wei.com>,
        "David.Laight@...lab.com" <David.Laight@...lab.com>,
        "jeffrey.t.kirsher@...el.com" <jeffrey.t.kirsher@...el.com>,
        "catalin.marinas@....com" <catalin.marinas@....com>,
        "will.deacon@....com" <will.deacon@....com>,
        "mark.rutland@....com" <mark.rutland@....com>,
        "robin.murphy@....com" <robin.murphy@....com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "alexander.duyck@...il.com" <alexander.duyck@...il.com>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linuxarm@...wei.com" <linuxarm@...wei.com>
Subject: Re: [PATCH v9 2/4] PCI: Disable PCIe Relaxed Ordering if unsupported

| From: Bjorn Helgaas <helgaas@...nel.org>
| Sent: Tuesday, August 8, 2017 7:22 PM
| ...
| and the caller should do something like this:
| 
|     if (pcie_relaxed_ordering_broken(pci_find_pcie_root_port(pdev)))
|         adapter->flags |= ROOT_NO_RELAXED_ORDERING;
| 
| That way it's obvious where the issue is, and it's obvious that the
| answer might be different for peer-to-peer transactions than it is for
| transactions to the root port, i.e., to coherent memory.
| ...

Which is back to something very close to what I initially suggested in my
first patch submission.  Because you're right, this isn't about broken
Source Devices, it's about broken Completing Devices.

Unfortunately, as Alexander Duyck noted, in a Virtual Machine we won't be
able to see our Root Port in order to make this determination.  (And in some
Hypervisor implementations I've seen, there's not even a synthetic Root Port
available to the VM at all, let alone read-only access to the real one.)

So the current scheme was developed of having the Hypervisor kernel traverse
down the PCIe Fabric when it finds a broken Root Port implementation (the
issue that we've mostly been primarily focused upon), and turning off the
PCIe Capability Device Control[Relaxed Ordering Enable].  This was to serve
two purposes:

 1. Turn that off in order to prevent sending any Transaction Layer
    Packets with the Relaxed Ordering Attribute to any Completer.
    Which unfortunately would also prevent Peer-to-Peer use of the
    Relaxed Ordering Attribute.

 2. Act as a message to Device Drivers for those downstream devices
    that the they were dealing with a broken Root Port implementation.
    And this would work even for a driver in a VM with an attached
    device since it would be able to see the PCIe Configuration Space
    for the attached device.

I haven't been excited about any of this because:

 A. While so far all of the examples we've talked about are broken
    Root Port Completers, it's perfectly possible that other devices
    could be broken -- say an NVMe Device which is not "Coherent
    Memory".  How would this fit into the framework APIs being
    described?

 B. I have yet to see a n example of how the currently proposed
    API framework would be used in a hybrid environment where
    TLPs to the Root Port would not use Relaxed Ordering, but TLPs
    to a Peer would use Relaxed Ordering.  So far its all been about
    using a "big hammer" to completely disable the use of Relaxed
    Ordering.

But the VM problem keeps cropping up over and over.  A driver in a VM
doesn't have access to the Root Port to determine if its on a "Black List"
and our only way of communicating with the driver in the VM is to leave the
device in a particular state (i.e. initialize the PCIe Capability Device
Control[Relaxed Ordering Enable] to "off").

Oh, and also, on the current patch submission's focus on broken Root Port
implementations: one could suggest that even if we're stuck with the "Device
attached to a VM Conundrum", that what we should really be thinking about is
if ANY device within a PCIe Fabric has broken Relaxed Ordering completion
problems, and, if so, "poisoning" the entire containing PCIe Fabric by
turning off Relaxed Ordering Enable for every device, up, down sideways --
including the Root Port itself.

| ...
| This associates the message with the Requester that may potentially
| use relaxed ordering. But there's nothing wrong or unusual about the
| Requester; the issue is with the *Completer*, so I think the message
| should be in the quirk where we set PCI_DEV_FLAGS_NO_RELAXED_ORDERING.
| Maybe it should be both places; I dunno.
| 
| This implementation assumes the device only initiates transactions to
| coherent memory, i.e., it assumes the device never does peer-to-peer
| DMA.  I guess we'll have to wait and see if we trip over any
| peer-to-peer issues, then figure out how to handle them.
| ...

Yes, as soon as we want to implement the hybrid use of Relaxed Ordering I
mentioned above.  And that the Intel document mentions itself.

Casey

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ