[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230711-refusing-derby-9a9d4d255d30-mkl@pengutronix.de>
Date: Tue, 11 Jul 2023 19:05:19 +0200
From: Marc Kleine-Budde <mkl@...gutronix.de>
To: Mike Looijmans <mike.looijmans@...ic.nl>
Cc: linux-can@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Manivannan Sadhasivam <mani@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Thomas Kopp <thomas.kopp@...rochip.com>,
Wolfgang Grandegger <wg@...ndegger.com>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH] can: mcp251xfd: Always stop on BUS_OFF and call
netif_stop_queue
On 11.07.2023 17:26:47, Mike Looijmans wrote:
> When there's an error attempting to store the BER counter, don't abort
> but continue shutting down the chip as required.
If you refer to an error by __mcp251xfd_get_berr_counter(), it's not a
store error, but a failure of regmap_read(priv->map_reg,
MCP251XFD_REG_TREC, &trec). By default the SPI transfers are CRC enabled
and no/wrong data from the chip will be detected and return an error
here (after 3 retires). In out of memory conditions or other kernel
errors might be possible here, too.
Have you seen a problem here? But as we shut down the chip here anyways,
we can ignore the error here.
> After disabling communications, also stop the TX queue with a call to
> netif_stop_queue.
can_bus_off() will call netif_carrier_off(), isn't this sufficient? Have
you enabled automatic restart in case of bus off? I think the netdev
watchdog will kick you, if the interface has a stooped queue for too
long (IIRC 5 seconds).
> When the interface restarts, mcp251xfd_set_mode will
> call netif_wake_queue and resume.
>
> This fixes a hangup in either send() or poll() from userspace. To
> reproduce: I ran "cansequence can0 -p" to flood the system with packets.
> While running that, I shorted the CAN signals, causing a bus error.
> Usually communications would resume after the CAN bus restarted, but
> sometimes the system got stuck and refused to send any more packets.
> The sending process would be in either poll() or send(), waiting for
> the queue to resume. To "unstuck" the process from send() it was
> sufficient to send any packet on the can interface from another
> process. To get it out of the poll() hang, only bringing the can
> interface down (and up) would work.
>
> After adding the netif_stop_queue call, I was unable to reproduce the
> problem.
The newly added netif_stop_queue() will cause the netif_wake_queue() in
the mcp251xfd_set_mode() to actually wake the queue. If you observe a
problem, I think it's a general problem, so all drivers would be
effected.
> Signed-off-by: Mike Looijmans <mike.looijmans@...ic.nl>
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung Nürnberg | Phone: +49-5121-206917-129 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-9 |
Download attachment "signature.asc" of type "application/pgp-signature" (489 bytes)
Powered by blists - more mailing lists