Message-ID: <20200709182011.GQ23676@nvidia.com>
Date: Thu, 9 Jul 2020 15:20:11 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Jonathan Lemon <jonathan.lemon@...il.com>
CC: Bjorn Helgaas <helgaas@...nel.org>, Aya Levin <ayal@...lanox.com>,
"David Miller" <davem@...emloft.net>, <kuba@...nel.org>,
<saeedm@...lanox.com>, <mkubecek@...e.cz>,
<linux-pci@...r.kernel.org>, <netdev@...r.kernel.org>,
<tariqt@...lanox.com>, <alexander.h.duyck@...ux.intel.com>
Subject: Re: [net-next 10/10] net/mlx5e: Add support for PCI relaxed ordering
On Thu, Jul 09, 2020 at 10:35:50AM -0700, Jonathan Lemon wrote:
> On Wed, Jul 08, 2020 at 08:26:02PM -0300, Jason Gunthorpe wrote:
> > On Wed, Jul 08, 2020 at 06:16:30PM -0500, Bjorn Helgaas wrote:
> > > I suspect there may be device-specific controls, too, because [1]
> > > claims to enable/disable Relaxed Ordering but doesn't touch the
> > > PCIe Device Control register. Device-specific controls are
> > > certainly allowed, but of course it would be up to the driver, and
> > > the device cannot generate TLPs with Relaxed Ordering unless the
> > > architected PCIe Enable Relaxed Ordering bit is *also* set.
> >
> > Yes, at least on RDMA relaxed ordering can be set on a per transaction
> > basis and is something userspace can choose to use or not at a fine
> > granularity. This is because we have to support historical
> > applications that make assumptions that data arrives in certain
> > orders.
> >
> > I've been thinking of doing the same as this patch but for RDMA kernel
> > ULPs and just globally turn it on if the PCI CAP is enabled as none of
> > our in-kernel uses have the legacy data ordering problem.
>
> If I'm following this correctly - there are two different controls being
> discussed here:
>
> 1) having the driver request PCI relaxed ordering, which may or may
> not be granted, based on other system settings, and

Yes, this is what Bjorn was thinking about: some PCI layer function to
control the global config space bit.

> 2) having the driver set RO on the transactions it initiates, which
> are honored iff the PCI bit is set.
>
> It seems that in addition to the PCI core changes, there still is a need
> for driver controls? Unless the driver always enables RO if it's capable?

I think the PCI spec imagined that when the config space RO bit was
enabled the PCI device would just start using RO packets, in an
appropriate and device specific way.

So the fine grained control in #2 is something done extra by some
devices.

IMHO if the driver knows it is functionally correct with RO then it
should enable it fully on the device when the config space bit is set.

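As a rough illustration of the policy described above, here is a minimal
user-space sketch in C. It only models the decision, it is not driver code.
The function names and the driver_ro_safe flag are hypothetical; the bit
value 0x0010 is the Enable Relaxed Ordering bit of the PCIe Device Control
register (PCI_EXP_DEVCTL_RELAX_EN in the Linux headers).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Enable Relaxed Ordering bit in the PCIe Device Control register
 * (same value as PCI_EXP_DEVCTL_RELAX_EN in the Linux headers). */
#define DEVCTL_RELAX_EN 0x0010u

/* Hypothetical driver policy: turn on the device-specific RO control
 * only when the config space bit is set and the driver is known to be
 * functionally correct with RO. No user tunable is consulted. */
bool driver_enables_ro(uint16_t devctl, bool driver_ro_safe)
{
	return (devctl & DEVCTL_RELAX_EN) && driver_ro_safe;
}

/* A TLP carries the RO attribute only if the device asks for it
 * and the architected enable bit is also set. */
bool tlp_uses_ro(uint16_t devctl, bool device_requests_ro)
{
	return device_requests_ro && (devctl & DEVCTL_RELAX_EN);
}

/* Walk through the combinations discussed in the thread. */
void ro_policy_self_check(void)
{
	/* Config bit set, driver correct with RO: enable fully. */
	assert(driver_enables_ro(DEVCTL_RELAX_EN, true));
	/* Config bit set, but driver only stable without RO: leave off. */
	assert(!driver_enables_ro(DEVCTL_RELAX_EN, false));
	/* Config bit clear (e.g. platform quirk): nothing to enable. */
	assert(!driver_enables_ro(0x0000, true));
	/* Device-level request is ignored while the config bit is clear. */
	assert(!tlp_uses_ro(0x0000, true));
}
```

The point of the sketch is that both inputs come from the platform and the
driver, so there is no place where a per-user knob would fit.
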

I'm not sure there is a reason to allow users to finely tune RO; at
least I haven't heard of cases where RO is a degradation depending on
workload.

If some platform doesn't work when RO is turned on then it should be
globally blacklisted, as is already done in some cases.

If the device has bugs and uses RO wrong, or the driver has bugs and
is only stable with !RO and Intel, then the driver shouldn't turn it
on at all.

In all of these cases it is not a user tunable.

Development and testing needs, like 'is my crash from a RO bug?',
should be met by the device-global setpci, I think.

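For reference, toggling the global bit for such a debugging session is a
masked read-modify-write of the Device Control register (setpci accepts a
value:mask form for this kind of write). A small C sketch of the arithmetic
follows; devctl_apply is a hypothetical helper and the starting value 0x2830
is just an example, not taken from any particular device.

```c
#include <assert.h>
#include <stdint.h>

/* Enable Relaxed Ordering bit in the PCIe Device Control register. */
#define DEVCTL_RELAX_EN 0x0010u

/* Masked read-modify-write: only the bits selected by mask change,
 * every other bit of the old register value is preserved. */
uint16_t devctl_apply(uint16_t old, uint16_t value, uint16_t mask)
{
	return (uint16_t)((old & ~mask) | (value & mask));
}

/* Example: clear RO to test a suspected RO bug, then restore it. */
void devctl_self_check(void)
{
	uint16_t devctl = 0x2830;	/* example value with RO enabled */

	devctl = devctl_apply(devctl, 0x0000, DEVCTL_RELAX_EN);
	assert(devctl == 0x2820);	/* only bit 4 was cleared */

	devctl = devctl_apply(devctl, DEVCTL_RELAX_EN, DEVCTL_RELAX_EN);
	assert(devctl == 0x2830);	/* RO enabled again */
}
```
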
Jason