Message-ID: <20240926-resilient-arrogant-limpet-98af37-mkl@pengutronix.de>
Date: Thu, 26 Sep 2024 11:43:13 +0200
From: Marc Kleine-Budde <mkl@...gutronix.de>
To: Matthias Schiffer <matthias.schiffer@...tq-group.com>
Cc: Markus Schneider-Pargmann <msp@...libre.com>, 
	Chandrasekar Ramakrishnan <rcsekar@...sung.com>, Vincent Mailhol <mailhol.vincent@...adoo.fr>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, 
	Martin Hundebøll <martin@...nix.com>, "Felipe Balbi (Intel)" <balbi@...nel.org>, 
	Raymond Tan <raymond.tan@...el.com>, Jarkko Nikula <jarkko.nikula@...ux.intel.com>, 
	linux-can@...r.kernel.org, netdev@...r.kernel.org, linux-kernel@...r.kernel.org, 
	linux@...tq-group.com
Subject: Re: [PATCH v3 2/2] can: m_can: fix missed interrupts with m_can_pci

On 26.09.2024 11:19:53, Matthias Schiffer wrote:
> On Tue, 2024-09-24 at 08:08 +0200, Markus Schneider-Pargmann wrote:
> > 
> > On Mon, Sep 23, 2024 at 05:32:16PM GMT, Matthias Schiffer wrote:
> > > The interrupt line of PCI devices is interpreted as edge-triggered,
> > > however the interrupt signal of the m_can controller integrated in Intel
> > > Elkhart Lake CPUs appears to be generated level-triggered.
> > > 
> > > Consider the following sequence of events:
> > > 
> > > - IR register is read, interrupt X is set
> > > - A new interrupt Y is triggered in the m_can controller
> > > - IR register is written to acknowledge interrupt X. Y remains set in IR
> > > 
> > > As there is no point in this sequence at which no interrupt flag is set
> > > in IR, the m_can interrupt line will never become deasserted, and no edge
> > > will ever be observed to trigger another run of the ISR. This was observed
> > > to result in the TX queue of the EHL m_can getting stuck under high load,
> > > because frames were queued to the hardware in m_can_start_xmit(), but
> > > m_can_finish_tx() was never run to account for their successful
> > > transmission.
> > > 
> > > To fix the issue, repeatedly read and acknowledge interrupts at the
> > > start of the ISR until no interrupt flags are set, so the next incoming
> > > interrupt will also result in an edge on the interrupt line.
> > > 
> > > Fixes: cab7ffc0324f ("can: m_can: add PCI glue driver for Intel Elkhart Lake")
> > > Signed-off-by: Matthias Schiffer <matthias.schiffer@...tq-group.com>
> > 
> > Just a few comment nitpicks below. Otherwise:
> > 
> > Reviewed-by: Markus Schneider-Pargmann <msp@...libre.com>
> 
> 
> We have received a report that while this patch fixes a stuck queue issue reproducible with cangen,
> the problem has not disappeared with our customer's application. I will hold off sending a new
> version of the patch while we're investigating whether there is a separate issue with the same
> symptoms or the patch is insufficient.
> 
> Patch 1/2 should be good to go and could be applied independently.

Can you post the reproducer here, too, so that we can add it to the
patch or at least reference it?
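
For reference, the sequence of events from the quoted commit message can be
modelled in a few lines. This is only a toy simulation, not driver code:
ToyController, isr_single_pass, isr_drain and run are hypothetical names,
and the flag values X and Y are arbitrary. It assumes the interrupt output
is asserted whenever any flag is set in IR and that the host only reacts to
rising edges.

```python
class ToyController:
    """Toy device: line is asserted while any IR flag is set (assumption)."""

    def __init__(self):
        self.ir = 0      # pending interrupt flags
        self.edges = 0   # rising edges seen by the edge-triggered host

    def raise_flag(self, bit):
        was_asserted = self.ir != 0
        self.ir |= bit
        if not was_asserted:
            self.edges += 1  # line went from deasserted to asserted

    def read_ir(self):
        return self.ir

    def ack(self, bits):
        self.ir &= ~bits


def isr_single_pass(dev, race=None):
    """Read IR once and acknowledge only what was read."""
    flags = dev.read_ir()
    if race:
        race()  # a new interrupt arrives between read and acknowledge
    dev.ack(flags)
    return flags


def isr_drain(dev, race=None):
    """Read and acknowledge repeatedly until no flag is left in IR."""
    handled = 0
    while True:
        flags = dev.read_ir()
        if flags == 0:
            break
        if race:
            race()
            race = None  # inject the racing interrupt only once
        dev.ack(flags)
        handled |= flags
    return handled


def run(isr):
    X, Y = 0x1, 0x2
    dev = ToyController()
    dev.raise_flag(X)                                  # first edge, ISR runs
    handled = isr(dev, race=lambda: dev.raise_flag(Y))
    still_asserted = dev.ir != 0
    dev.raise_flag(X)                                  # a later interrupt
    return handled, still_asserted, dev.edges


if __name__ == "__main__":
    print("single pass:", run(isr_single_pass))
    print("drain loop: ", run(isr_drain))
```

In this model the single-pass ISR leaves Y pending, so the line stays
asserted and the later interrupt produces no second edge. The drain loop
handles both flags, the line is deasserted, and the later interrupt is seen
as a new edge.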

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde          |
Embedded Linux                   | https://www.pengutronix.de |
Vertretung Nürnberg              | Phone: +49-5121-206917-129 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-9   |

