Message-ID: <20200822230511.GD1152540@nvidia.com>
Date: Sat, 22 Aug 2020 20:05:11 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: LKML <linux-kernel@...r.kernel.org>, <x86@...nel.org>,
Marc Zyngier <maz@...nel.org>, Megha Dey <megha.dey@...el.com>,
Dave Jiang <dave.jiang@...el.com>,
Alex Williamson <alex.williamson@...hat.com>,
"Jacob Pan" <jacob.jun.pan@...el.com>,
Baolu Lu <baolu.lu@...el.com>,
Kevin Tian <kevin.tian@...el.com>,
Dan Williams <dan.j.williams@...el.com>,
Joerg Roedel <joro@...tes.org>,
<iommu@...ts.linux-foundation.org>, <linux-hyperv@...r.kernel.org>,
Haiyang Zhang <haiyangz@...rosoft.com>,
"Jon Derrick" <jonathan.derrick@...el.com>,
Lu Baolu <baolu.lu@...ux.intel.com>,
Wei Liu <wei.liu@...nel.org>,
"K. Y. Srinivasan" <kys@...rosoft.com>,
"Stephen Hemminger" <sthemmin@...rosoft.com>,
Steve Wahl <steve.wahl@....com>,
"Dimitri Sivanich" <sivanich@....com>, Russ Anderson <rja@....com>,
<linux-pci@...r.kernel.org>, Bjorn Helgaas <bhelgaas@...gle.com>,
"Lorenzo Pieralisi" <lorenzo.pieralisi@....com>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
<xen-devel@...ts.xenproject.org>, Juergen Gross <jgross@...e.com>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
"Stefano Stabellini" <sstabellini@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Rafael J. Wysocki" <rafael@...nel.org>
Subject: Re: [patch RFC 38/38] irqchip: Add IMS array driver - NOT FOR MERGING
On Sat, Aug 22, 2020 at 03:34:45AM +0200, Thomas Gleixner wrote:
> >> One question is whether the device can see partial updates to that
> >> memory due to the async 'swap' of context from the device CPU.
> >
> > It is worse than just partial updates.. The device operation is much
> > more like you'd imagine a CPU cache. There could be copies of the RAM
> > in the device for long periods of time, dirty data in the device that
> > will flush back to CPU RAM overwriting CPU changes, etc.
>
> TBH, that's insane. You clearly want to think about this some
> more. If
I think this general design is around 15 years old, across a healthy
number of silicon generations, and rather a larger number of shipped
devices. People have thought about it :)
> you swap out device state and device control state then you definitely
> want to have regions which are read only from the device POV and never
> written back.
It is not as useful as you'd think - the issue with atomicity of
update still largely prevents doing much useful from the CPU, and to
make any CPU side changes visible a device command would still be
needed to synchronize the internal state to that modified memory.
So, CPU centric updates would cover a very limited number of
operations, and a device command is required anyhow. Little is
actually gained.
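To make the failure mode concrete, here is a rough userspace sketch of the
behavior described above. It is only a toy model under stated assumptions:
every name in it (host_ram, dev_copy, dev_cmd_set and so on) is hypothetical
and does not correspond to any real device or kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, all names hypothetical: one word of device context in CPU RAM. */
static uint32_t host_ram;   /* the copy that lives in CPU RAM */
static uint32_t dev_copy;   /* the copy held inside the device */
static bool dev_dirty;      /* device copy differs from CPU RAM */

/* Device swaps the context in from CPU RAM. */
static void dev_load(void)
{
    dev_copy = host_ram;
    dev_dirty = false;
}

/* Device changes its internal copy at some arbitrary time. */
static void dev_modify(uint32_t v)
{
    dev_copy = v;
    dev_dirty = true;
}

/* Device writes dirty state back, overwriting whatever is in CPU RAM. */
static void dev_flush(void)
{
    if (dev_dirty) {
        host_ram = dev_copy;
        dev_dirty = false;
    }
}

/* CPU writes the RAM directly, behind the device's back. */
static void cpu_poke(uint32_t v)
{
    host_ram = v;
}

/* The update goes through a device command instead, so the device applies
 * it to its own copy and both sides end up in agreement. */
static void dev_cmd_set(uint32_t v)
{
    dev_copy = v;
    dev_dirty = true;
    dev_flush();
}
```

In this model a cpu_poke() that lands while the device holds a dirty copy is
silently lost at the next dev_flush(), while dev_cmd_set() always sticks.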
> The MSI msg store clearly belongs into that category.
> But that's not restricted to the MSI msg store, there is certainly other
> stuff which never wants to be written back by the device.
To get a design where everything can run from a CPU atomic context
that can't trigger a WQ, new silicon would have to implement some
MSI-only 'cache' that can invalidate entries based on a simple MemWr
TLP.
Then the affinity update would write to the host memory, then send a
MemWr to the device to trigger invalidate.
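As a rough illustration of that flow, a userspace model could look like the
sketch below. This is not an existing device or kernel API: the structure
layout, the invalidate register write and all function names are assumptions
made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NVEC 4

/* Hypothetical MSI message layout used only by this model. */
struct msi_msg_model {
    uint64_t addr;
    uint32_t data;
};

/* Host memory: the authoritative store, written only by the CPU. */
static struct msi_msg_model host_store[MODEL_NVEC];

/* Device side: a read-only cache that is never written back. */
struct dev_cache_entry {
    struct msi_msg_model msg;
    bool valid;
};
static struct dev_cache_entry dev_cache[MODEL_NVEC];

/* Models the MemWr TLP landing on the device's invalidate register. */
static void dev_invalidate_write(unsigned int vec)
{
    assert(vec < MODEL_NVEC);
    dev_cache[vec].valid = false;
}

/* CPU side affinity update: plain stores only, nothing that sleeps.
 * A real driver would also need a write barrier between the two steps. */
static void cpu_set_affinity(unsigned int vec, uint64_t addr, uint32_t data)
{
    assert(vec < MODEL_NVEC);
    host_store[vec].addr = addr;
    host_store[vec].data = data;
    dev_invalidate_write(vec);
}

/* Device raising an interrupt: refetch from host memory on a cache miss. */
static struct msi_msg_model dev_raise_irq(unsigned int vec)
{
    assert(vec < MODEL_NVEC);
    if (!dev_cache[vec].valid) {
        dev_cache[vec].msg = host_store[vec];
        dev_cache[vec].valid = true;
    }
    return dev_cache[vec].msg;
}
```

The point of the model is that cpu_set_affinity() needs no command queue and
no completion wait, so it could be called from atomic context.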
As a silicon design it might work, but it means existing devices can't
be used with this dev_msi. It is also the sort of thing that would
need a standards document to have any hope of multiple vendors fitting
into it, e.g. at PCI-SIG or something.
> If you don't do that then you simply can't write to that space from the
> CPU and you have to transport this kind of information always via command
> queues.
Yes, exactly. This is part of the architectural design of the device,
has been for a long time. Has positives and negatives.
> > I suppose the core code could provide this as a service? Sort of a
> > variant of the other lazy things above?
>
> Kinda. That needs a lot of thought for the affinity setting stuff
> because it can be called from contexts which do not allow that. It's
> solvable though, but I clearly need to stare at the corner cases for a
> while.
If possible, this would be ideal, as we could use the dev_msi on a big
installed base of existing HW.
I suspect other HW can probably fit into this too as the basic
ingredients should be fairly widespread.
Even a restricted version for situations where affinity does not need
a device update would possibly be interesting (eg x86 IOMMU remap, ARM
GIC, etc)
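To sketch what such a lazy service might look like, here is a small userspace
model of deferring the device command out of atomic context. It is only an
illustration under assumptions: lazy_set_affinity(), lazy_sync_worker() and
dev_cmd_update() are hypothetical names, and the worker stands in for
whatever process-context mechanism (a WQ, for example) would be used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LAZY_NVEC 4

/* Hypothetical per-vector state for the deferred update model. */
struct lazy_vec {
    uint32_t pending_data;  /* latest value requested by the CPU */
    uint32_t device_data;   /* value the device is currently using */
    bool dirty;             /* a device command is still owed */
};

static struct lazy_vec lazy_vecs[LAZY_NVEC];
static unsigned int lazy_commands_sent;

/* Stand-in for a device command that may sleep until completion. */
static void dev_cmd_update(unsigned int vec, uint32_t data)
{
    lazy_vecs[vec].device_data = data;
    lazy_commands_sent++;
}

/* Callable from atomic context: only records the request. */
static void lazy_set_affinity(unsigned int vec, uint32_t data)
{
    assert(vec < LAZY_NVEC);
    lazy_vecs[vec].pending_data = data;
    lazy_vecs[vec].dirty = true;
}

/* Runs later in a context that is allowed to sleep and pushes each
 * outstanding request to the device with one command per vector. */
static void lazy_sync_worker(void)
{
    for (unsigned int i = 0; i < LAZY_NVEC; i++) {
        if (!lazy_vecs[i].dirty)
            continue;
        lazy_vecs[i].dirty = false;
        dev_cmd_update(i, lazy_vecs[i].pending_data);
    }
}
```

Back-to-back requests for the same vector collapse into a single device
command, and until the worker runs the device keeps using the old target,
which is one of the corner cases that would need care.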
> OTOH, in normal operation for MSI interrupts (edge type) masking is not
> used at all and just restricted to the startup teardown.
Yeah, at least this device doesn't need masking at runtime, just
startup/teardown and affinity update.
Thanks,
Jason