Date: Wed, 24 Jun 2020 10:34:40 +0300
From: Aya Levin <ayal@...lanox.com>
To: Saeed Mahameed <saeedm@...lanox.com>,
    "kuba@...nel.org" <kuba@...nel.org>,
    Bjorn Helgaas <helgaas@...nel.org>
Cc: "mkubecek@...e.cz" <mkubecek@...e.cz>,
    "davem@...emloft.net" <davem@...emloft.net>,
    "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
    Tariq Toukan <tariqt@...lanox.com>
Subject: Re: [net-next 10/10] net/mlx5e: Add support for PCI relaxed ordering

On 6/24/2020 9:56 AM, Saeed Mahameed wrote:
> On Tue, 2020-06-23 at 14:31 -0700, Jakub Kicinski wrote:
>> On Tue, 23 Jun 2020 12:52:29 -0700 Saeed Mahameed wrote:
>>> From: Aya Levin <ayal@...lanox.com>
>>>
>>> The concept of Relaxed Ordering in the PCI Express environment allows
>>> switches in the path between the Requester and Completer to reorder
>>> some transactions just received before others that were previously
>>> enqueued.
>>>
>>> In ETH driver, there is no question of write integrity since each
>>> memory segment is written only once per cycle. In addition, the
>>> driver doesn't access the memory shared with the hardware until the
>>> corresponding CQE arrives indicating all PCI transactions are done.
>>
>
> Hi Jakub, sorry i missed your comments on this patch.
>
>> Assuming the device sets the RO bits appropriately, right? Otherwise
>> CQE write could theoretically surpass the data write, no?
>>
>
> Yes HW guarantees correctness of correlated queues and transactions.
>
>>> With relaxed ordering set, traffic on the remote-numa is at the same
>>> level as when on the local numa.
>>
>> Same level of? Achievable bandwidth?
>>
>
> Yes, Bandwidth, according the below explanation, i see that the message
> needs improvements.
>
>>> Running TCP single stream over ConnectX-4 LX, ARM CPU on remote-numa
>>> has 300% improvement in the bandwidth.
>>> With relaxed ordering turned off: BW:10 [GB/s]
>>> With relaxed ordering turned on: BW:40 [GB/s]
>>>
>>> The driver turns relaxed ordering off by default. It exposes 2
>>> boolean private-flags in ethtool: pci_ro_read and pci_ro_write for
>>> user control.
>>>
>>> $ ethtool --show-priv-flags eth2
>>> Private flags for eth2:
>>> ...
>>> pci_ro_read : off
>>> pci_ro_write : off
>>>
>>> $ ethtool --set-priv-flags eth2 pci_ro_write on
>>> $ ethtool --set-priv-flags eth2 pci_ro_read on
>>
>> I think Michal will rightly complain that this does not belong in
>> private flags any more. As (/if?) ARM deployments take a foothold
>> in DC this will become a common setting for most NICs.
>
> Initially we used pcie_relaxed_ordering_enabled() to
> programmatically enable this on/off on boot but this seems to
> introduce some degradation on some Intel CPUs since the Intel Faulty
> CPUs list is not up to date. Aya is discussing this with Bjorn.

Adding Bjorn Helgaas

>
> So until we figure this out, will keep this off by default.
>
> for the private flags we want to keep them for performance analysis as
> we do with all other mlx5 special performance features and flags.
>
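
For readers who want to check the two flags from a script, here is a minimal sketch that parses `ethtool --show-priv-flags` output of the shape quoted in the patch description. The function name `parse_priv_flags` and the `SAMPLE` text are hypothetical and only for illustration; the flag names pci_ro_read and pci_ro_write are taken from the quoted commit message, and whether they exist on a given system depends on the driver in use.

```python
def parse_priv_flags(text):
    """Map each private flag name to True (on) or False (off).

    Assumes one "name : on|off" pair per line, as in the output quoted
    in the thread; any other line (header, "...") is skipped.
    """
    flags = {}
    for line in text.splitlines():
        # Split on the last colon so the header line yields an empty value.
        name, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value in ("on", "off"):
            flags[name.strip()] = (value == "on")
    return flags


# Hypothetical sample, copied from the shape shown in the commit message.
SAMPLE = """Private flags for eth2:
...
pci_ro_read : off
pci_ro_write : off
"""

if __name__ == "__main__":
    for flag, enabled in parse_priv_flags(SAMPLE).items():
        print(flag, "on" if enabled else "off")
```

On a live system the text would come from running the ethtool command shown in the quoted message rather than from a hard-coded string.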