lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 14 Jul 2020 13:47:15 +0300
From:   Aya Levin <ayal@...lanox.com>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     David Miller <davem@...emloft.net>, kuba@...nel.org,
        saeedm@...lanox.com, mkubecek@...e.cz, linux-pci@...r.kernel.org,
        netdev@...r.kernel.org, tariqt@...lanox.com,
        alexander.h.duyck@...ux.intel.com, Jason Gunthorpe <jgg@...dia.com>
Subject: Re: [net-next 10/10] net/mlx5e: Add support for PCI relaxed ordering



On 7/9/2020 2:16 AM, Bjorn Helgaas wrote:
> On Sun, Jul 08, 2040 at 11:22:12AM +0300, Aya Levin wrote:
>> On 7/6/2020 10:49 PM, David Miller wrote:
>>> From: Aya Levin <ayal@...lanox.com>
>>> Date: Mon, 6 Jul 2020 16:00:59 +0300
>>>
>>>> Assuming the discussions with Bjorn will conclude in a well-trusted
>>>> API that ensures relaxed ordering in enabled, I'd still like a method
>>>> to turn off relaxed ordering for performance debugging sake.
>>>> Bjorn highlighted the fact that the PCIe sub system can only offer a
>>>> query method. Even if theoretically a set API will be provided, this
>>>> will not fit a netdev debugging - I wonder if CPU vendors even support
>>>> relaxed ordering set/unset...
>>>> On the driver's side relaxed ordering is an attribute of the mkey and
>>>> should be available for configuration (similar to number of CPU
>>>> vs. number of channels).
>>>> Based on the above, and binding the driver's default relaxed ordering
>>>> to the return value from pcie_relaxed_ordering_enabled(), may I
>>>> continue with previous direction of a private-flag to control the
>>>> client side (my driver) ?
>>>
>>> I don't like this situation at all.
>>>
>>> If RO is so dodgy that it potentially needs to be disabled, that is
>>> going to be an issue not just with networking devices but also with
>>> storage and other device types as well.
>>>
>>> Will every device type have a custom way to disable RO, thus
>>> inconsistently, in order to accomodate this?
>>>
>>> That makes no sense and is a terrible user experience.
>>>
>>> That's why the knob belongs generically in PCI or similar.
>>>
>> Hi Bjorn,
>>
>> Mellanox NIC supports relaxed ordering operation over DMA buffers.
>> However for debug prepossess we must have a chicken bit to disable
>> relaxed ordering on a specific system without effecting others in
>> run-time. In order to meet this requirement, I added a netdev
>> private-flag to ethtool for set RO API.
>>
>> Dave raised a concern regarding embedding relaxed ordering set API
>> per system (networking, storage and others). We need the ability to
>> manage relaxed ordering in a unify manner. Could you please define a
>> PCI sub-system solution to meet this requirement?
> 
> I agree, this is definitely a mess.  Let me just outline what I think
> we have today and what we're missing.
> 
>    - On the hardware side, device use of Relaxed Ordering is controlled
>      by the Enable Relaxed Ordering bit in the PCIe Device Control
>      register (or the PCI-X Command register).  If set, the device is
>      allowed but not required to set the Relaxed Ordering bit in
>      transactions it initiates (PCIe r5.0, sec 7.5.3.4; PCI-X 2.0, sec
>      7.2.3).
> 
>      I suspect there may be device-specific controls, too, because [1]
>      claims to enable/disable Relaxed Ordering but doesn't touch the
>      PCIe Device Control register.  Device-specific controls are
>      certainly allowed, but of course it would be up to the driver, and
>      the device cannot generate TLPs with Relaxed Ordering unless the
>      architected PCIe Enable Relaxed Ordering bit is *also* set.
> 
>    - Platform firmware can enable Relaxed Ordering for a device either
>      before handoff to the OS or via the _HPX ACPI method.
> 
>    - The PCI core never enables Relaxed Ordering itself except when
>      applying _HPX.
> 
>    - At enumeration-time, the PCI core disables Relaxed Ordering in
>      pci_configure_relaxed_ordering() if the device is below a Root
>      Port that has a quirk indicating an erratum.  This quirk currently
>      includes many Intel Root Ports, but not all, and is an ongoing
>      maintenance problem.
> 
>    - The PCI core provides pcie_relaxed_ordering_enabled() which tells
>      you whether Relaxed Ordering is enabled.  Only used by cxgb4 and
>      csio, which use that information to fill in Ingress Queue
>      Commands.
> 
>    - The PCI core does not provide a driver interface to enable or
>      disable Relaxed Ordering.
> 
>    - Some drivers disable Relaxed Ordering themselves: mtip32xx,
>      netup_unidvb, tg3, myri10ge (oddly, only if CONFIG_MYRI10GE_DCA),
>      tsi721, kp2000_pcie.
> 
>    - Some drivers enable Relaxed Ordering themselves: niu, tegra.
> 
> What are we missing and what should the PCI core do?
> 
>    - Currently the Enable Relaxed Ordering bit depends on what firmware
>      did.  Maybe the PCI core should always clear it during
>      enumeration?
> 
>    - The PCI core should probably have a driver interface like
>      pci_set_relaxed_ordering(dev, enable) that can set or clear the
>      architected PCI-X or PCIe Enable Relaxed Ordering bit.
> 
>    - Maybe there should be a kernel command-line parameter like
>      "pci=norelax" that disables Relaxed Ordering for every device and
>      prevents pci_set_relaxed_ordering() from enabling it.
> 
>      I'm mixed on this because these tend to become folklore about how
>      to "fix" problems and we end up with systems that don't work
>      unless you happen to find the option on the web.  For debugging
>      issues, it might be enough to disable Relaxed Ordering using
>      setpci, e.g., "setpci -s02:00.0 CAP_EXP+8.w=0"
> 
> [1] https://lore.kernel.org/netdev/20200623195229.26411-11-saeedm@mellanox.com/
> 
Hi Bjorn,

Thanks for the detailed reply. From initial testing I can say that 
turning off the relaxed ordering on the PCI (setpci -s02:00.0 
CAP_EXP+8.w=0) is the chicken bit I was looking for.
This lower the risk of depending on pcie_relaxed_ordering_enabled(). I 
will update my patch and resubmit.

Thanks,
Aya

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ