lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <134ddb8e-a231-922d-f554-ca77ce0c16af@pengutronix.de>
Date:   Wed, 9 Oct 2019 15:18:12 +0200
From:   Marc Kleine-Budde <mkl@...gutronix.de>
To:     Jeroen Hofstee <jhofstee@...tronenergy.com>,
        "linux-can@...r.kernel.org" <linux-can@...r.kernel.org>
Cc:     Wolfgang Grandegger <wg@...ndegger.com>,
        "David S. Miller" <davem@...emloft.net>,
        "open list:NETWORKING DRIVERS" <netdev@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/7] can: rx-offload: continue on error

Hello Jeroen,

I'm currently looking at the rx-offload error handling in detail.

TLDR: I've added the patch to linux-can.

Thanks,
Marc

For the record, the details:

On 9/24/19 8:45 PM, Jeroen Hofstee wrote:
> While can_rx_offload_offload_one will call mailbox_read to discard
> the mailbox in case of an error,

Yes.

can_rx_offload_offload_one() will read into the discard mailbox in case
of an error.

Currently there are two kinds of errors:
1) the rx-offoad skb queue (between the IRQ handler and the NAPI)
   is full
2) alloc_can_skb() fails to allocate a skb, due to OOM

> can_rx_offload_irq_offload_timestamp bails out in the error case.

Yes, in case of an error both can_rx_offload_irq_offload_timestamp() and
can_rx_offload_irq_offload_fifo() will stop reading mailboxes, add the
already filled skbs to the queue and schedule NAPI if needed.

Currently both can_rx_offload_irq_offload_timestamp() and
can_rx_offload_irq_offload_fifo() will return the number of queued
mailboxes.

This means in case of queue overflow or OOM, only one mailbox is
discaded, before can_rx_offload_irq_offload_*() returning the number of
successfully queued mailboxes so far.

Looking at the flexcan driver:

https://elixir.bootlin.com/linux/latest/source/drivers/net/can/flexcan.c#L867

> 		while ((reg_iflag = flexcan_read_reg_iflag_rx(priv))) {
> 			handled = IRQ_HANDLED;
> 			ret = can_rx_offload_irq_offload_timestamp(&priv->offload,
> 								   reg_iflag);
> 			if (!ret)
> 				break;
> 		}
[...]
> 		if (reg_iflag1 & FLEXCAN_IFLAG_RX_FIFO_AVAILABLE) {
> 			handled = IRQ_HANDLED;
> 			can_rx_offload_irq_offload_fifo(&priv->offload);
> 		}

This means for the timestamp mode, at least one mailbox is discarded or
if the error occurred after reading one or more mailboxes the while loop
will try again. If the error persists a second mailbox is discarded.

For the fifo mode, only one mailbox is discarded.

Then the flexcan's IRQ is exited. If there are messages in mailboxes are
pending another IRQ is triggered.... I doubt that this is a good idea.

On the other hand the ti_hecc driver:

> 		/* offload RX mailboxes and let NAPI deliver them */
> 		while ((rx_pending = hecc_read(priv, HECC_CANRMP))) {
> 			can_rx_offload_irq_offload_timestamp(&priv->offload,
> 							     rx_pending);
> 			hecc_write(priv, HECC_CANRMP, rx_pending);
> 		}

The error is ignored and the _all_ mailboxes are discarded (given the
error persists).

> Since it is typically called from a while loop, all message will
> eventually be discarded. So lets continue on error instead to discard
> them directly.

After reading my own code and writing it up, your patch totally makes sense.

If there is a shortage of resources, queue overrun or OOM, it makes no
sense to return from the IRQ handler, if a mailbox is still active as it
will trigger the IRQ again. Entering the IRQ handler again probably
doesn't give the system time to recover from the resource problem.

> Signed-off-by: Jeroen Hofstee <jhofstee@...tronenergy.com>
> ---
>  drivers/net/can/rx-offload.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/can/rx-offload.c b/drivers/net/can/rx-offload.c
> index e6a668ee7730..39df41280e2d 100644
> --- a/drivers/net/can/rx-offload.c
> +++ b/drivers/net/can/rx-offload.c
> @@ -158,7 +158,7 @@ int can_rx_offload_irq_offload_timestamp(struct can_rx_offload *offload, u64 pen
>  
>  		skb = can_rx_offload_offload_one(offload, i);
>  		if (!skb)
> -			break;
> +			continue;
>  
>  		__skb_queue_add_sort(&skb_queue, skb, can_rx_offload_compare);
>  	}
> 

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |



Download attachment "signature.asc" of type "application/pgp-signature" (489 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ