Message-ID: <87fsr54tw1.ffs@tglx>
Date: Mon, 06 Dec 2021 16:47:58 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Logan Gunthorpe <logang@...tatee.com>,
LKML <linux-kernel@...r.kernel.org>,
Bjorn Helgaas <helgaas@...nel.org>,
Marc Zyngier <maz@...nel.org>,
Alex Williamson <alex.williamson@...hat.com>,
Kevin Tian <kevin.tian@...el.com>,
Megha Dey <megha.dey@...el.com>,
Ashok Raj <ashok.raj@...el.com>, linux-pci@...r.kernel.org,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jon Mason <jdmason@...zu.us>,
Dave Jiang <dave.jiang@...el.com>,
Allen Hubbe <allenbh@...il.com>, linux-ntb@...glegroups.com,
linux-s390@...r.kernel.org, Heiko Carstens <hca@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ibm.com>, x86@...nel.org,
Joerg Roedel <jroedel@...e.de>,
iommu@...ts.linux-foundation.org, Kalle Valo <kvalo@...eaurora.org>
Subject: Re: [patch 21/32] NTB/msi: Convert to msi_on_each_desc()
Jason,
On Mon, Dec 06 2021 at 10:43, Jason Gunthorpe wrote:
> On Sun, Dec 05, 2021 at 03:16:40PM +0100, Thomas Gleixner wrote:
>> > That's not really a good idea because dev->irqdomain is a generic
>> > mechanism and not restricted to the PCI use case. Special casing it for
>> > PCI is just wrong. Special casing it for all use cases just to please
>> > PCI is equally wrong. There is a world outside of PCI and x86.
>>
>> That argument is actually only partially correct.
>
> I'm not sure I understood your reply? I think we are both agreeing
> that dev->irqdomain wants to be a generic mechanism?
Yes. I managed to confuse myself there by being too paranoid about how
to distinguish things on platforms which need to support both ways, i.e.
x86 when XEN is enabled.
> I'd say that today we've special cased it to handle PCI. IMHO that is
> exactly what pci_msi_create_irq_domain() is doing - it replaces the
> chip ops with ops that can *ONLY* do PCI MSI and so dev->irqdomain
> becomes PCI only and non-generic.
Right. See above. That's why I went back to my notes, did some more
research ...
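
For illustration, the usual arch-side pattern looks roughly like this
(the my_plat_* names are made up; the pci_msi_* helpers are the existing
PCI/MSI ones), which is why everything hanging off dev->irqdomain can
then only deal with PCI MSI/MSI-X messages:

#include <linux/irqdomain.h>
#include <linux/msi.h>

static struct irq_chip my_plat_pci_msi_chip = {
	.name			= "PCI-MSI",
	.irq_mask		= pci_msi_mask_irq,
	.irq_unmask		= pci_msi_unmask_irq,
	/* Message writes go straight to the PCI MSI/MSI-X capability */
	.irq_write_msi_msg	= pci_msi_domain_write_msg,
};

static struct msi_domain_info my_plat_pci_msi_info = {
	.flags	= MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
		  MSI_FLAG_PCI_MSIX,
	.chip	= &my_plat_pci_msi_chip,
};

static struct irq_domain *my_plat_create_pci_msi_domain(struct fwnode_handle *fwnode,
							 struct irq_domain *parent)
{
	return pci_msi_create_irq_domain(fwnode, &my_plat_pci_msi_info, parent);
}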
>> 2) Guest support is strictly opt-in
>>
>> The underlying architecture/subarchitecture specific irqdomain has
>> to detect at setup time (possibly as early as boot) whether the
>> underlying hypervisor supports it.
>>
>> The only reasonable way to support that is the availability of
>> interrupt remapping via vIOMMU, as we discussed before.
>
> This is talking about IMS specifically because of the legacy issue
> where the MSI addr/data pair inside a guest is often completely fake?
This is about IMS, right. PCI/MSI[x] is handled today because the writes
to the MSI/MSI-X message store can be trapped.
>> That does not work in all cases due to architecture and host
>> controller constraints, so we might end up with:
>>
>> VECTOR -> IOMMU -> SHIM -> PCI/[MSI/MSI-X/IMS] domains
>
> OK - I don't know enough about the architecture/controller details to
> imagine what SHIM is, but if it allows keeping the PCI code as purely
> PCI code, then great
Today it's part of the arch/subarch specific PCI/MSI domain and deals with
quirks above the IOMMU level. As we can't proliferate that into the new
endpoint domain, it needs to be done as a shim layer in between which
has no real functionality other than applying the quirks. Yes, it's all
pretty. Welcome to my wonderful world.
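
Roughly, as a sketch (the shim_* names are made up and the quirk
handling is just a stub):

#include <linux/irqdomain.h>

/* Placeholder for the actual arch/host controller quirk handling */
static void shim_apply_quirks(struct irq_domain *domain, unsigned int virq,
			      unsigned int nr_irqs)
{
}

static int shim_domain_alloc(struct irq_domain *domain, unsigned int virq,
			     unsigned int nr_irqs, void *arg)
{
	int ret;

	/* Let the parent (IOMMU/vector side) domain do the real allocation */
	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
	if (ret)
		return ret;

	/* The shim's only job: apply the quirks on top */
	shim_apply_quirks(domain, virq, nr_irqs);
	return 0;
}

static const struct irq_domain_ops shim_domain_ops = {
	.alloc	= shim_domain_alloc,
	.free	= irq_domain_free_irqs_parent,
};

static struct irq_domain *shim_create(struct irq_domain *parent,
				      struct fwnode_handle *fwnode)
{
	/* The PCI/[MSI/MSI-X/IMS] domains are then created with this as parent */
	return irq_domain_create_hierarchy(parent, 0, 0, fwnode,
					   &shim_domain_ops, NULL);
}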
>> - The irqchip callbacks which can be implemented by these top
>> level domains are going to be restricted.
>
> OK - I think it is great that the driver will see a special ops struct
> that is 'ops for device's MSI addr/data pair storage'. It makes it
> really clear what it is
It will need somewhat more than that, e.g. mask/unmask and, as we discussed
quite some time ago, something like the irq_bus_lock/unlock pair, so you
can handle updates to the state from thread context via a command queue
(IIRC).
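
A rough sketch of such a chip (all my_ims_* names are made up and the
command queue is stubbed): mask/unmask only record the desired state,
and irq_bus_sync_unlock() runs in sleepable context and pushes the
queued change out to the device.

#include <linux/irq.h>
#include <linux/mutex.h>

struct my_ims_data {
	struct mutex	lock;	/* serializes command submission */
	bool		masked;
	bool		dirty;
};

/* Stubs standing in for the real device command queue handling */
static void my_ims_queue_mask_cmd(struct my_ims_data *ims, bool masked) { }
static void my_ims_flush_cmds(struct my_ims_data *ims) { }

static void my_ims_irq_mask(struct irq_data *d)
{
	struct my_ims_data *ims = irq_data_get_irq_chip_data(d);

	/* Runs under the descriptor lock: just record the change */
	ims->masked = true;
	ims->dirty = true;
}

static void my_ims_irq_unmask(struct irq_data *d)
{
	struct my_ims_data *ims = irq_data_get_irq_chip_data(d);

	ims->masked = false;
	ims->dirty = true;
}

static void my_ims_irq_bus_lock(struct irq_data *d)
{
	struct my_ims_data *ims = irq_data_get_irq_chip_data(d);

	mutex_lock(&ims->lock);
}

static void my_ims_irq_bus_sync_unlock(struct irq_data *d)
{
	struct my_ims_data *ims = irq_data_get_irq_chip_data(d);

	/* Sleepable context: now actually talk to the device */
	if (ims->dirty) {
		my_ims_queue_mask_cmd(ims, ims->masked);
		my_ims_flush_cmds(ims);
		ims->dirty = false;
	}
	mutex_unlock(&ims->lock);
}

static struct irq_chip my_ims_irq_chip = {
	.name			= "my-ims",
	.irq_mask		= my_ims_irq_mask,
	.irq_unmask		= my_ims_irq_unmask,
	.irq_bus_lock		= my_ims_irq_bus_lock,
	.irq_bus_sync_unlock	= my_ims_irq_bus_sync_unlock,
};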
>> - For the irqchip callbacks which are allowed/required the rules
>> vs. following down the hierarchy need to be defined and
>> enforced.
>
> The driver should be the ultimate origin of the interrupt so it is
> always the end-point in the hierarchy, opposite the CPU?
>
> I would hope the driver doesn't have an exposure to hierarchy?
No.
> So we have a new concept: 'device MSI storage ops'
>
> Put them along with the xarray holding the msi_descs and you've got my
> msi_table :)
Hehe.
>> Sorry Jason, no tables for you. :)
>
> How does the driver select with 'device MSI storage ops' it is
> requesting a MSI for ?
Via some cookie, reference, whatever, as discussed in the other
mail. We'll bikeshed the naming once I get there :)
>> 1) I'm going to post parts 1-3 of the series once more with the fallout
>> and review comments addressed.
>
> OK, I didn't see anything in there that was making anything harder in
> this direction
It helps to keep the existing stuff, including the !irqdomain parts,
sufficiently self-contained so I can actually change the inner workings
of msi domains without going back to any of these places (hopefully).
>> 5) Implement an IMS user.
>>
>> The obvious candidate which should be halfway accessible is the
>> ath11k PCI driver, which falls into that category.
>
> Aiiee:
Yes.
> drivers/net/wireless/ath/ath11k/pci.c: ab_pci->msi_ep_base_data = msi_desc->msg.data;
That's only one part of it. Look how the address is retrieved.
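
Not the actual ath11k code, but a minimal sketch of what the other part
means (the my_dev names are made up): a driver which programs its own
endpoint registers with the MSI target has to capture the complete
address/data pair, not just the data word.

#include <linux/irq.h>
#include <linux/msi.h>

struct my_dev {
	u32	msi_addr_lo;
	u32	msi_addr_hi;
	u32	msi_data;
};

static void my_dev_cache_msi_target(struct my_dev *mdev, unsigned int irq)
{
	struct msi_desc *desc = irq_get_msi_desc(irq);

	if (!desc)
		return;

	/* The composed message contains both the address and the data */
	mdev->msi_data    = desc->msg.data;
	mdev->msi_addr_lo = desc->msg.address_lo;
	mdev->msi_addr_hi = desc->msg.address_hi;
}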
Thanks,
tglx