linux-kernel - Re: [PATCH] can: ti_hecc: use timestamp based rx-offloading

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <04bdda38-79fa-c266-2a3c-1229a1fd8229@pengutronix.de>
Date:   Fri, 26 Jul 2019 13:48:50 +0200
From:   Marc Kleine-Budde <mkl@...gutronix.de>
To:     Jeroen Hofstee <jhofstee@...tronenergy.com>,
        "linux-can@...r.kernel.org" <linux-can@...r.kernel.org>
Cc:     Anant Gole <anantgole@...com>, AnilKumar Ch <anilkumar@...com>,
        Wolfgang Grandegger <wg@...ndegger.com>,
        "David S. Miller" <davem@...emloft.net>,
        "open list:NETWORKING DRIVERS" <netdev@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] can: ti_hecc: use timestamp based rx-offloading

On 4/29/19 2:03 PM, Jeroen Hofstee wrote:
> As already mentioned in [1] and included in [2], there is an off by one
> issue since the high bank is already enabled when the _next_ mailbox to
> be read has index 12, so the mailbox being read was 13. The message can
> therefore go into mailbox 31 and the driver will be repolled until the
> mailbox 12 eventually receives a msg. Or the message might end up in the
> 12th mailbox, but then it would become disabled after reading it and only
> be enabled again in the next "round" after mailbox 13 was read, which can
> cause out of order messages, since the lower priority mailboxes can
> accept messages in the meantime.
> 
> As mentioned in [3] there is a hardware race condition when changing the
> CANME register while messages are being received. Even when including a
> busy poll on reception, like in [2] there are still overflows and out of
> order messages at times, but less then without the busy loop polling.
> Unlike what the patch suggests, the polling time is not in the microsecond
> range, but takes as long as a current CAN bus reception needs to finish,
> so typically more in the fraction of millisecond range. Since the timeout
> is in jiffies it won't timeout.
> 
> Even with these additional fixes the driver is still not able to provide a
> proper FIFO which doesn't drop packages. So change the driver to use
> rx-offload and base order on timestamp instead of message box numbers. As
> a side affect, this also fixes [4] and [5].
> 
> Before this change messages with a single byte counter were dropped /
> received out of order at a bitrate of 250kbit/s on an am3517. With this
> patch that no longer occurs up to and including 1Mbit/s.
> 
> [1] https://linux-can.vger.kernel.narkive.com/zgO9inVi/patch-can-ti-hecc-fix-rx-wrong-sequence-issue#post6
> [2] http://arago-project.org/git/projects/?p=linux-omap3.git;a=commit;h=02346892777f07245de4d5af692513ebd852dcb2
> [3] https://linux-can.vger.kernel.narkive.com/zgO9inVi/patch-can-ti-hecc-fix-rx-wrong-sequence-issue#post5
> [4] https://patchwork.ozlabs.org/patch/895956/
> [5] https://www.spinics.net/lists/netdev/msg494971.html
> 
> Cc: Anant Gole <anantgole@...com>
> Cc: AnilKumar Ch <anilkumar@...com>
> Signed-off-by: Jeroen Hofstee <jhofstee@...tronenergy.com>

[...]

> @@ -744,8 +652,8 @@ static irqreturn_t ti_hecc_interrupt(int irq, void *dev_id)
>  	struct net_device *ndev = (struct net_device *)dev_id;
>  	struct ti_hecc_priv *priv = netdev_priv(ndev);
>  	struct net_device_stats *stats = &ndev->stats;
> -	u32 mbxno, mbx_mask, int_status, err_status;
> -	unsigned long ack, flags;
> +	u32 mbxno, mbx_mask, int_status, err_status, stamp;
> +	unsigned long flags, rx_pending;
>  
>  	int_status = hecc_read(priv,
>  		(priv->use_hecc1int) ? HECC_CANGIF1 : HECC_CANGIF0);
> @@ -769,11 +677,11 @@ static irqreturn_t ti_hecc_interrupt(int irq, void *dev_id)
>  			spin_lock_irqsave(&priv->mbx_lock, flags);
>  			hecc_clear_bit(priv, HECC_CANME, mbx_mask);
>  			spin_unlock_irqrestore(&priv->mbx_lock, flags);
> -			stats->tx_bytes += hecc_read_mbx(priv, mbxno,
> -						HECC_CANMCF) & 0xF;
> +			stamp = hecc_read_stamp(priv, mbxno);
> +			stats->tx_bytes += can_rx_offload_get_echo_skb(&priv->offload,
> +									mbxno, stamp);
>  			stats->tx_packets++;
>  			can_led_event(ndev, CAN_LED_EVENT_TX);
> -			can_get_echo_skb(ndev, mbxno);
>  			--priv->tx_tail;
>  		}
>  
> @@ -784,12 +692,11 @@ static irqreturn_t ti_hecc_interrupt(int irq, void *dev_id)
>  		((priv->tx_head & HECC_TX_MASK) == HECC_TX_MASK)))
>  			netif_wake_queue(ndev);
>  
> -		/* Disable RX mailbox interrupts and let NAPI reenable them */
> -		if (hecc_read(priv, HECC_CANRMP)) {
> -			ack = hecc_read(priv, HECC_CANMIM);
> -			ack &= BIT(HECC_MAX_TX_MBOX) - 1;
> -			hecc_write(priv, HECC_CANMIM, ack);
> -			napi_schedule(&priv->napi);
> +		/* offload RX mailboxes and let NAPI deliver them */
> +		while ((rx_pending = hecc_read(priv, HECC_CANRMP))) {
> +			can_rx_offload_irq_offload_timestamp(&priv->offload,
> +							     rx_pending);
> +			hecc_write(priv, HECC_CANRMP, rx_pending);

Can prepare a patch to move the RMP writing into the mailbox_read()
function. This makes the mailbox available a bit earlier and works
better for memory pressure situations, where no skb can be allocated.

>  		}
>  	}

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |



Download attachment "signature.asc" of type "application/pgp-signature" (489 bytes)