lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4e9d0d223af785f060103ad67c14272a@agner.ch>
Date:	Tue, 05 Aug 2014 14:38:20 +0200
From:	Stefan Agner <stefan@...er.ch>
To:	Marc Kleine-Budde <mkl@...gutronix.de>
Cc:	shawn.guo@...escale.com, kernel@...gutronix.de,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
	linux-can@...r.kernel.org
Subject: Re: [PATCH v3 4/4] can: flexcan: add vf610 support for FlexCAN

Am 2014-08-05 11:52, schrieb Marc Kleine-Budde:
> On 08/04/2014 06:01 PM, Stefan Agner wrote:
>> Am 2014-08-04 16:27, schrieb Marc Kleine-Budde:
>>> On 08/04/2014 03:43 PM, Stefan Agner wrote:
>>> [...]
>>>
>>>>> Thanks for the test, so far looks promising :) With this setup the other
>>>>> CAN node repeats the CAN frame until it's ACKed. Because there is no
>>>>> node with a compatible bitrate, there is no ACking CAN node.
>>>>>
>>>>> Can you add a third CAN node to the network. The second and third node
>>>>> should use the same bitrate, while your vf610 uses a different one. With
>>>>> the new setup it should take more than one frame until the vf610 goes
>>>>> into error warning and even more frames to error passive. This way we
>>>>> can see it the error warning interrupt is connected or not. The error
>>>>> counters should increase with each "wrong" bitrate frame it sees, you
>>>>> can check with:
>>>>>
>>>>>     ip -details link show can0
>>>>>
>>>>> The output looks like this:
>>>>>
>>>>>> 4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT qlen 10
>>>>>>     link/can
>>>>>>     can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
>>>>>                               ^^^^^^^^^^^^^^^^^^^^^^
>>>>>>     bitrate 1000000 sample-point 0.750
>>>>>>     tq 125 prop-seg 2 phase-seg1 3 phase-seg2 2 sjw 1
>>>>>>     sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
>>>>>>     clock 8000000
>>>>>
>>>>> When one of the berr-counter crosses 96 (and stays below 128) a warning
>>>>> interrupt should be generated.
>>>>
>>>> Ok, created this setup, could successfully communicate with all three
>>>> nodes. I then set the Vybrid to half of the bitrate. When I send a frame
>>>> from Vybrid, the berr-counter tx immediately ends up at 128 and the
>>>> device is in ERROR-PASSIVE:
>>>
>>> This is expected.
>>>
>>>> root@...ibri-vf:~# ip -details link show can1
>>>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>>>> mode DEFAULT group default qlen 10
>>>>     link/can  promiscuity 0
>>>>     can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
>>>>     bitrate 124990 sample-point 0.739
>>>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>>>     clock 83368421
>>>             ^^^^^^^^
>>>
>>> BTW: the can core has a really weird clock rate, have a look at the
>>> datasheet if you manage to route a 24 MHz clock (or another multiple of
>>> 8MHz) to the flexcan core. I had a quick glance at the datasheet, if I
>>> understand it correctly the Fast OSC clock runs with 24 MHz.
>>>
>>
>> This pinmux is actually part of the IPs CTRL register (see CLKSRC).
>> Currently this is set unconditionally to peripheral clock, which is on
>> my device 83.3 MHz (500MHz/6). This would be a bit bigger change. Just
>> thinking how to implement this.
> 
> Sigh! I've missed the fact, that this bit is _inside_ the CAN core
> (again). On the mx53 and imx6 this bit was obsolete and the clock was
> selected outside of the CAN core.
> 
>> Maybe we have to reference a third clock
>> "osc" and a device tree property fsl,can-clock-osc which one can choose
>> between "osc" and "per" what do you think?
> 
> Yes, but that's a different patch. I thought it would be as easy as on
> the mx53, where you just had to reconfigure the clock tree.
> 

Agreed.

>>>> root@...ibri-vf:~# cansend can1 1F334455#1122334455667788
>>>> interface = can1, family = 29, type = 3, proto = 1
>>>> root@...ibri-vf:~# [  818.886664] flexcan_irq, esr=00062242
>>>> [  818.890365] flexcan_irq, ctrl=1c3dac57
>>>>
>>>> root@...ibri-vf:~# ip -details link show can1
>>>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>>>> mode DEFAULT group default qlen 10
>>>>     link/can  promiscuity 0
>>>>     can state ERROR-PASSIVE (berr-counter tx 128 rx 0) restart-ms 0
>>>>     bitrate 124990 sample-point 0.739
>>>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>>>     clock 83368421
>>>>
>>>>
>>>> When I send the frames from another device on the bus, I can see the rx
>>>> count incrementing by one on each frame I send. As you expected, the
>>>> device changes to ERROR-WARNING when crossing the 96 frame boundary:
>>>
>>> This is correct
>>>
>>
>> Just found out when using too high bitrate (500000 vs 125000), the error
>> counter immediately goes up to 128 (or even beyond) and ends up in
>> ERROR-PASSIVE:
> 
> Maybe there is a off-by-one error in the flexcan core, so that a rx
> error of 127 doesn't trigger an error passive interrupt.
> 
>> # ip -details link show can1
>> [  292.164820] flexcan_get_berr_counter, esr=00000000
>> [  292.169715] flexcan_get_berr_counter, esr=00040190
>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen 10
>>     link/can  promiscuity 0
>>     can state ERROR-PASSIVE (berr-counter tx 0 rx 135) restart-ms 0
>>     bitrate 298811 sample-point 0.777
>>     tq 371 prop-seg 3 phase-seg1 3 phase-seg2 2 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
> 
> [...]
> 
>>>> However, once reaching 128, I don't get another interrupt and the device
>>>> stays in ERROR-WARNING:
>>>
>>> The contents of the esr reg would be interesting, especially the
>>> FLT_CONF part.
> 
>> root@...ridvf61-v11:~# [  222.221651] flexcan_irq, esr=0005050a
>> [  222.225349] flexcan_irq, ctrl=1c3dac57
> 
>> ip -details link show can1
>> [  223.791082] flexcan_get_berr_counter, esr=00000000
>> [  223.796246] flexcan_get_berr_counter, esr=00040180
>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen
>> 10
>>     link/can  promiscuity 0
>>     can state ERROR-WARNING (berr-counter tx 0 rx 96) restart-ms 0
>>     bitrate 124990 sample-point 0.739
>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
> 
>> root@...ridvf61-v11:~# ip -details link show can1
>> [  243.571175] flexcan_get_berr_counter, esr=00000000
>> [  243.576343] flexcan_get_berr_counter, esr=00040582
>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen
>> 10
>>     link/can  promiscuity 0
>>     can state ERROR-WARNING (berr-counter tx 0 rx 104) restart-ms 0
>>     bitrate 124990 sample-point 0.739
>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
>>
>> ...
>>
>> root@...ridvf61-v11:~# ip -details link show can1
>> [  299.831192] flexcan_get_berr_counter, esr=00000000
>> [  299.836358] flexcan_get_berr_counter, esr=00040582
>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen 10
>>     link/can  promiscuity 0
>>     can state ERROR-WARNING (berr-counter tx 0 rx 126) restart-ms 0
>>     bitrate 124990 sample-point 0.739
>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
>> root@...ridvf61-v11:~# ip -details link show can1
>> [  302.411025] flexcan_get_berr_counter, esr=00000000
>> [  302.416195] flexcan_get_berr_counter, esr=00040592
>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen 10
>>     link/can  promiscuity 0
>>     can state ERROR-WARNING (berr-counter tx 0 rx 128) restart-ms 0
>>     bitrate 124990 sample-point 0.739
>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
>>
>> The FLTCONF is in Error Passive, however the stack did not received that
>> information. I still have the printk in the interrupt, however I did not
>> get an interrupt here (while I get one when it switched to
>> ERROR-WARNING... But it looks like the ERRINT is still pending...
> 
> According to the datasheet the ERRINT indicates a bit error (bit 10...15
> is set), not the error passive state.
> 
> Marc

True, but fact is, I don't get a interrupt when the system switches from
ERROR-WARNING to ERROR-PASSIVE state. However, when I set berr-reporting
on, I get an interrupt (probably because of a change of bit 10...15),
but the state change is recognized:

ip -details link show can1
[ 2303.571656] flexcan_get_berr_counter, esr=00000000
[ 2303.576832] flexcan_get_berr_counter, esr=00040180
3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can <BERR-REPORTING> state ERROR-WARNING (berr-counter tx 0 rx 127)
restart-ms 0 
    bitrate 124990 sample-point 0.739 
    tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 83368421
root@...ridvf61-v11:~# [ 2307.023789] flexcan_irq, esr=0004051a
[ 2307.027490] flexcan_irq, ctrl=1c3dec57
ip -details link show can1
[ 2309.081840] flexcan_get_berr_counter, esr=00000000
[ 2309.087016] flexcan_get_berr_counter, esr=00040190
3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can <BERR-REPORTING> state ERROR-PASSIVE (berr-counter tx 0 rx 128)
restart-ms 0 
    bitrate 124990 sample-point 0.739 
    tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 83368421

It looks like, the FLTCONF state change itself doesn't lead to a
interrupt when the state changes to ERROR-PASSIVE. IMHO, this is also
not explicitly stated in the Vybrid RM (46.3.8).

What is the solution here? Force enabling ERR interrupt? The
FLEXCAN_HAS_BROKEN_ERR_STATE flag would be appropriate, however, this
was meant to work around the missing WRN_INT, which seems not to be the
case here...

--
Stefan

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ