[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Thu, 10 Oct 2019 18:04:45 +0000
From: Jeroen Hofstee <jhofstee@...tronenergy.com>
To: Marc Kleine-Budde <mkl@...gutronix.de>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
linux-can <linux-can@...r.kernel.org>
CC: Kurt Van Dijck <dev.kurt@...dijck-laurijssen.be>,
Martin Hundebøll <martin@...nix.com>,
"kernel@...gutronix.de" <kernel@...gutronix.de>,
"davem@...emloft.net" <davem@...emloft.net>
Subject: Re: [PATCH 24/29] can: ti_hecc: add fifo underflow error reporting
Attempt 2, now as plain text... (vger doesn't like html)
On 10/10/19 5:52 PM, Marc Kleine-Budde wrote:
> On 10/10/19 2:17 PM, Marc Kleine-Budde wrote:
>> From: Jeroen Hofstee <jhofstee@...tronenergy.com>
>>
>> When the rx fifo overflows the ti_hecc would silently drop them since
>> the overwrite protection is enabled for all mailboxes. So disable it
>> for the lowest priority mailbox and increment the rx_fifo_errors when
>> receive message lost is set. Drop the message itself in that case,
>> since it might be partially updated.
> Is that your observation or does the data sheet say anything to this
> situation?
I couldn't find in the data sheet, so I simply tested it, by allowing
the highest mailbox to be overwritten and send a stream alternating
with messages will all bits set and all cleared. That does end with
canids from one message combined with data from another.
>> Signed-off-by: Jeroen Hofstee <jhofstee@...tronenergy.com>
>> Signed-off-by: Marc Kleine-Budde <mkl@...gutronix.de>
>> ---
>> drivers/net/can/ti_hecc.c | 21 +++++++++++++++++----
>> 1 file changed, 17 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/can/ti_hecc.c b/drivers/net/can/ti_hecc.c
>> index 6ea29126c60b..c2d83ada203a 100644
>> --- a/drivers/net/can/ti_hecc.c
>> +++ b/drivers/net/can/ti_hecc.c
>> @@ -82,7 +82,7 @@ MODULE_VERSION(HECC_MODULE_VERSION);
>> #define HECC_CANTA 0x10 /* Transmission acknowledge */
>> #define HECC_CANAA 0x14 /* Abort acknowledge */
>> #define HECC_CANRMP 0x18 /* Receive message pending */
>> -#define HECC_CANRML 0x1C /* Remote message lost */
>> +#define HECC_CANRML 0x1C /* Receive message lost */
>> #define HECC_CANRFP 0x20 /* Remote frame pending */
>> #define HECC_CANGAM 0x24 /* SECC only:Global acceptance mask */
>> #define HECC_CANMC 0x28 /* Master control */
>> @@ -385,8 +385,17 @@ static void ti_hecc_start(struct net_device *ndev)
>> /* Enable tx interrupts */
>> hecc_set_bit(priv, HECC_CANMIM, BIT(HECC_MAX_TX_MBOX) - 1);
>>
>> - /* Prevent message over-write & Enable interrupts */
>> - hecc_write(priv, HECC_CANOPC, HECC_SET_REG);
>> + /* Prevent message over-write to create a rx fifo, but not for
>> + * the lowest priority mailbox, since that allows detecting
>> + * overflows instead of the hardware silently dropping the
>> + * messages. The lowest rx mailbox is one above the tx ones,
>> + * hence its mbxno is the number of tx mailboxes.
>> + */
>> + mbxno = HECC_MAX_TX_MBOX;
>> + mbx_mask = ~BIT(mbxno);
>> + hecc_write(priv, HECC_CANOPC, mbx_mask);
>> +
>> + /* Enable interrupts */
>> if (priv->use_hecc1int) {
>> hecc_write(priv, HECC_CANMIL, HECC_SET_REG);
>> hecc_write(priv, HECC_CANGIM, HECC_CANGIM_DEF_MASK |
>> @@ -531,6 +540,7 @@ static unsigned int ti_hecc_mailbox_read(struct can_rx_offload *offload,
>> {
>> struct ti_hecc_priv *priv = rx_offload_to_priv(offload);
>> u32 data, mbx_mask;
>> + int lost;
>>
>> mbx_mask = BIT(mbxno);
>> data = hecc_read_mbx(priv, mbxno, HECC_CANMID);
>> @@ -552,9 +562,12 @@ static unsigned int ti_hecc_mailbox_read(struct can_rx_offload *offload,
>> }
>>
>> *timestamp = hecc_read_stamp(priv, mbxno);
>> + lost = hecc_read(priv, HECC_CANRML) & mbx_mask;
>> + if (unlikely(lost))
>> + priv->offload.dev->stats.rx_fifo_errors++;
> In the flexcan and at91_can driver we're incrementing the following errors:
> dev->stats.rx_over_errors++;
> dev->stats.rx_errors++;
I understood it as follows, see[1] e.g.:
rx_errors -> link level errors, not really applicable to CAN
(perhaps in single shot mode or if you want (and can) report
retransmissions)
rx_over_errors -> the hardware itself cannot keep up.
Not applicable for CAN.
rx_fifo_errors -> the software driver cannot keep up.
So I picked that one.
rx_dropped -> software is dropping on purpose based on limits etc.
But I might be wrong.
> You can save the register access if you only check for overflows if
> reading from the lowest prio mailbox.
>
> If you're discarding the data if the mailbox is marked as overflow
> there's no need to read the data in the first place.
>
>> hecc_write(priv, HECC_CANRMP, mbx_mask);
>>
>> - return 1;
>> + return !lost;
>> }
>>
>> static int ti_hecc_error(struct net_device *ndev, int int_status,
>>
Mind it that you don't cause a race! The bit can become set
during reading of the data, it should be check _after_ we
have a copy of the mailbox. You can do a double check, one
before one after, but since there should be no fifo overflow
anyway, there is no reason to optimize for that path. (@250k
I cannot get more then 3 messages in the fifo...).
Regards,
Jeroen
[1]
https://community.mellanox.com/s/article/counters-troubleshooting-for-linux-driver
Powered by blists - more mailing lists