[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87a6l8qwql.wl-maz@kernel.org>
Date: Mon, 23 Aug 2021 11:30:26 +0100
From: Marc Zyngier <maz@...nel.org>
To: Barry Song <21cnbao@...il.com>
Cc: Bjorn Helgaas <helgaas@...nel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Jonathan Corbet <corbet@....net>, Jonathan.Cameron@...wei.com,
bilbao@...edu, Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
leon@...nel.org, LKML <linux-kernel@...r.kernel.org>,
linux-pci@...r.kernel.org, Linuxarm <linuxarm@...wei.com>,
luzmaximilian@...il.com, mchehab+huawei@...nel.org,
schnelle@...ux.ibm.com, Barry Song <song.bao.hua@...ilicon.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v2 1/2] PCI/MSI: Fix the confusing IRQ sysfs ABI for MSI-X
On Sat, 21 Aug 2021 23:14:35 +0100,
Barry Song <21cnbao@...il.com> wrote:
>
> On Sat, Aug 21, 2021 at 10:42 PM Marc Zyngier <maz@...nel.org> wrote:
> >
> > Hi Bjorn,
> >
> > On Sat, 21 Aug 2021 00:33:28 +0100,
> > Bjorn Helgaas <helgaas@...nel.org> wrote:
> > >
[...]
> > > In msix_setup_entries(), we get nvecs msi_entry structs, and we
> > > get a saved .default_irq in each one?
> >
> > That's a key point.
> >
> > Old-school PCI/MSI is represented by a single interrupt, and you
> > *could* somehow make it relatively easy for drivers that only
> > understand INTx to migrate to MSI if you replaced whatever is held in
> > dev->irq (which should only represent the INTx mapping) with the MSI
> > interrupt number. Which I guess is what the MSI code is doing.
> >
> > This is the 21st century, and nobody should ever rely on such horror,
> > but I'm sure we do have such drivers in the tree. Boo.
> >
> > However, this *cannot* hold true for Multi-MSI, nor MSI-X, because
> > there is a plurality of interrupts. Even worse, for MSI-X, there is
> > zero guarantee that the allocated interrupts will be in a contiguous
> > space.
> >
> > Given that, what is dev->irq good for? "Absolutely Nothing! (say it
> > again!)".
> >
>
> The only thing is that dev->irq is an sysfs ABI to userspace. Due to
> the inconsistency between legacy PCI INTx, MSI, MSI-X, this ABI
> should have been absolutely broken nowadays. This is actually what
> the patchset was originally aiming at to fix.
I do not think we should expose more of a broken abstraction to
userspace. We will have to carry on exposing the first MSI in this
field forever, but it doesn't mean we should have to do it for MSI-X.
> One more question from me is that does dev->irq actually hold any
> valid hardware INTx information while hardware is using MSI-X? At
> least in my hardware, sysfs ABI for PCI is all "0".
That's probably because nothing actually configured the interrupt, or
that there is no INTx implementation. I have that on systems with
pretty dodgy (or incomplete) firmware.
> root@...ntu:/sys/devices/pci0000:7c/0000:7c:00.0/0000:7d:00.3# cat irq
> 0
>
> root@...ntu:/sys/devices/pci0000:7c/0000:7c:00.0/0000:7d:00.3# ls -l msi_irqs/*
> -r--r--r-- 1 root root 4096 Aug 21 22:04 msi_irqs/499
> -r--r--r-- 1 root root 4096 Aug 21 22:04 msi_irqs/500
> -r--r--r-- 1 root root 4096 Aug 21 22:04 msi_irqs/501
> ...
> root@...ntu:/sys/devices/pci0000:7c/0000:7c:00.0/0000:7d:00.3# cat msi_irqs/499
> msix
>
> Not quite sure how it is going on different hardware platforms.
My D05 does that as well, and it doesn't expose any INTx support.
>
> > MSI-X is not something you can "accidentally" use. You have to
> > actively embrace it. In all honesty, this patch tries to move in the
> > wrong direction. If anything, we should kill this hack altogether and
> > fix the (handful of?) drivers that rely on it. That'd actually be a
> > good way to find whether they are still worth keeping in the tree. And
> > if it breaks too many of them, then at least we'll know where we
> > stand.
> >
> > I'd be tempted to leave the below patch simmer in -next for a few
> > weeks and see if how many people shout:
>
> This looks like a more proper direction to go.
> but here i am wondering how sysfs ABI document should follow the below change
> doc is patch 2/2:
> https://lore.kernel.org/lkml/20210820223744.8439-3-21cnbao@gmail.com/
>
> On the other hand, my feeling is that nobody should depend on sysfs
> irq entry nowadays.
Too late. It is there, and we need to preserve it. I just don't think
feeding it more erroneous information is the right thing to do.
My patch was only dealing with the kernel side of things, not the
userspace ABI. That ABI should be carried on unchanged.
> For example, userspace irqbalance is actually using
> /sys/devices/.../msi_irqs/ So probably we should set this ABI
> invisible when devices are using MSI or MSI-X?
Can it actually be made optional? I don't believe we can.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
Powered by blists - more mailing lists