Date:   Thu, 16 Jan 2020 09:58:29 +0200
From:   Ramon Fried <rfried.dev@...il.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     hkallweit1@...il.com, Bjorn Helgaas <bhelgaas@...gle.com>,
        maz@...nel.org, lorenzo.pieralisi@....com,
        linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: MSI irqchip configured as IRQCHIP_ONESHOT_SAFE causes spurious IRQs

On Thu, Jan 16, 2020 at 3:39 AM Thomas Gleixner <tglx@...utronix.de> wrote:
>
> Ramon,
>
> Ramon Fried <rfried.dev@...il.com> writes:
> > On Wed, Jan 15, 2020 at 12:54 AM Thomas Gleixner <tglx@...utronix.de> wrote:
> >> Ramon Fried <rfried.dev@...il.com> writes:
> >> Due to the semantics of MSI this is perfectly fine and aside from your
> >> problem this has worked perfectly fine so far and it's an actual
> >> performance win because it avoids fiddling with the MSI mask which is
> >> slow.
> >>
> > Fiddling with MSI masks is a configuration space write, which is
> > non-posted, so it does come with a price.
> > The question is whether a test was ever conducted to see if it's better
> > than spurious IRQs.
>
> The point is that there are no spurious interrupts in the sane cases and
> the tests we did showed real performance improvements in high
> frequency interrupt situations due to avoiding the config space access.
>
> Please stop claiming that this spurious interrupt problem is there by
> design. It's not. Read the MSI spec.
>
> Also boot your laptop/workstation with 'threadirqs' on the kernel
> command line and check how many spurious interrupts come in. On a test
> machine which has that command line parameter set I see exactly ONE with
> an uptime of several days and heavy MSI interrupt activity. The ONE is
> even there without 'threadirqs' on the command line, so I really can't
> be bothered to analyze that.
>
> >> You still have not told which driver/hardware is affected by this. Can
> >> you please provide that information so we can finally look at the actual
> >> hardware/driver combo?
> >>
> > Sure,
> > I'm writing an MSI IRQ controller, it's basically a MIPS GIC interrupt
> > line onto which several MSIs are multiplexed.
>
> I assume you write the driver, not the VHDL for the actual hardware,
> right? If so, you still did not tell which hardware that is and where we
> can find information about it.
There's no official information I can share, but I can explain how it works:
Basically, 32 MSI vectors are represented by a single GIC IRQ.
There's a status register in which every bit corresponds to an MSI vector, and
each MSI needs to be acked individually in that register. Whenever any bit is
asserted, the GIC IRQ level is high.
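
To make the status-register scheme described above concrete, here is a minimal user-space sketch in C. It is only a model, not the actual driver or hardware: the `msi_mux` structure and all function names are hypothetical, and acking each bit before dispatching it is an assumption about ordering, not something the hardware is confirmed to require.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSI_VECTORS 32u

/* Hypothetical model of the MSI receiver; every name here is illustrative. */
struct msi_mux {
	uint32_t status;                   /* bit n set => MSI vector n pending */
	unsigned int handled[MSI_VECTORS]; /* per-vector dispatch count */
};

/* Device side: an incoming MSI write latches the matching status bit. */
static void msi_mux_raise(struct msi_mux *m, unsigned int vector)
{
	assert(vector < MSI_VECTORS);
	m->status |= UINT32_C(1) << vector;
}

/* The single GIC line is level type: high while any status bit is set. */
static bool msi_mux_gic_level(const struct msi_mux *m)
{
	return m->status != 0;
}

/* Ack exactly one vector by clearing its bit in the status register. */
static void msi_mux_ack(struct msi_mux *m, unsigned int vector)
{
	assert(vector < MSI_VECTORS);
	m->status &= ~(UINT32_C(1) << vector);
}

/*
 * Demultiplex handler for the shared GIC interrupt: snapshot the status
 * register, then ack and dispatch every pending vector. Acking before
 * dispatch is an assumption of this sketch; it mimics edge semantics, so
 * an MSI arriving during handling would latch its bit again.
 * Returns the number of vectors handled in this pass.
 */
static unsigned int msi_mux_handle(struct msi_mux *m)
{
	uint32_t pending = m->status;
	unsigned int count = 0;

	for (unsigned int v = 0; v < MSI_VECTORS; v++) {
		if (!(pending & (UINT32_C(1) << v)))
			continue;
		msi_mux_ack(m, v);
		m->handled[v]++;
		count++;
	}
	return count;
}
```

In this model the GIC line only drops once every pending bit has been acked, which is why a per-vector ack alone does not guarantee the shared level line goes low.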

>
> I further assume that 'multiplexed' means that the hardware is something
> like an MSI receiver on the CPU/chipset which handles multiple MSI
> messages and forwards them to a single shared interrupt line on the MIPS
> GIC. Right?
Yes.
>
> Can you please provide a pointer to the hardware documentation?
There's no official documentation for that.
>
> > It's configured with handle_level_irq() as the GIC is level IRQ.
>
> Which is completely bonkers. MSI has edge semantics and sharing an
> interrupt line for edge type interrupts is broken by design, unless the
> hardware which handles the incoming MSIs and forwards them to the level
> type interrupt line is designed properly and the driver does the right
> thing.
Yes, the design of the HW is sort of broken. I concur.
>
> > The ack callback acks the GIC IRQ. The mask/unmask callbacks call
> > pci_msi_mask_irq() / pci_msi_unmask_irq().
>
> What? How is that supposed to work with multiple MSIs?
Acking is per MSI vector as I described above, so it should work.
>
> Either the hardware is a trainwreck or the driver or both.
>
> I can't tell as I can't find my crystal ball. Maybe I should replace it
> with a Mobileye :)
:)
>
> Thanks,
>
>         tglx
