Message-ID: <12d27b94-27c3-00fc-4030-0a0941a0b17f@opensynergy.com>
Date:   Tue, 23 May 2023 15:39:35 +0200
From:   Harald Mommer <harald.mommer@...nsynergy.com>
To:     Vincent Mailhol <vincent.mailhol@...il.com>
Cc:     Arnd Bergmann <arnd@...nel.org>,
        Mikhail Golubev <Mikhail.Golubev@...nsynergy.com>,
        Harald Mommer <hmo@...nsynergy.com>,
        virtio-dev@...ts.oasis-open.org, linux-can@...r.kernel.org,
        Netdev <netdev@...r.kernel.org>, linux-kernel@...r.kernel.org,
        Wolfgang Grandegger <wg@...ndegger.com>,
        Marc Kleine-Budde <mkl@...gutronix.de>,
        "David S . Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Dariusz Stojaczyk <Dariusz.Stojaczyk@...nsynergy.com>,
        Stratos Mailing List <stratos-dev@...lists.linaro.org>
Subject: Re: [virtio-dev] [RFC PATCH 1/1] can: virtio: Initial virtio CAN
 driver.

Hello Vincent,

On 15.05.23 07:58, Vincent Mailhol wrote:
> Hi Harald,
>
> On Fri. 12 May 2023 at 22:19, Harald Mommer
> <harald.mommer@...nsynergy.com>  wrote:
>> Hello Vincent,
>>
>> I searched for the old e-mail; this was one of those which slipped through.
>> Too many of those.
>>
>> On 05.11.22 10:21, Vincent Mailhol wrote:
>>> On Fri. 4 Nov. 2022 at 20:13, Arnd Bergmann <arnd@...nel.org> wrote:
>>>> On Thu, Nov 3, 2022, at 13:26, Harald Mommer wrote:
>>>>> On 25.08.22 20:21, Arnd Bergmann wrote:
>>>> ...
>>>>> The messages are not necessarily processed in sequence by the CAN stack.
>>>>> CAN is priority based. The lower the CAN ID the higher the priority. So
>>>>> a message with CAN ID 0x100 can surpass a message with ID 0x123 if the
>>>>> hardware is not just simple basic CAN controller using a single TX
>>>>> mailbox with a FIFO queue on top of it.
>>> Really? I acknowledge that it is priority based *on the bus*, i.e. if
>>> two devices A and B on the same bus try to send CAN ID 0x100 and 0x123
>>> at the same time, then device A will win the CAN arbitration.
>>> However, I am not aware of any devices which reorder their own stack
>>> according to the CAN IDs. If I first send CAN ID 0x123 and then ID
>>> 0x100 on the device stack, 0x123 would still go out first, right?
>> The CAN hardware may be a basic CAN hardware: Single mailbox only with a
>> TX FIFO on top of this.
>>
>> No reordering takes place, the CAN hardware will try to arbitrate the
>> CAN bus with a low priority CAN message (big CAN ID) while some high
>> priority CAN message (small CAN ID) is waiting in the FIFO. This is
>> called "internal priority inversion", a property of basic CAN hardware.
>> A basic CAN hardware does exactly what you describe.
>>
>> If the FIFO is in software, it's a bad idea to try to improve this by
>> doing some software sorting; the processing time needed is likely to
>> make things even worse. Therefore no software does this, or at least it's
>> not recommended to do this.
>>
>> But the hardware may also be a better one. No FIFO but a lot of TX
>> mailboxes. A full CAN hardware tries to arbitrate the bus using the
>> highest priority waiting CAN message considering all hardware TX
>> mailboxes. Such a better (full CAN) hardware does not cause "internal
>> priority inversion" but tries to arbitrate the bus in the correct order
>> given by the message IDs.
>>
>> At the level of our virtio CAN device we know neither which CAN hardware
>> is actually used nor how it's used. We are using SocketCAN, and no API
>> provides information about the properties of the underlying hardware.
>> It may be basic CAN using a FIFO and a single TX mailbox, or full CAN
>> using a lot of TX mailboxes in parallel.
>>
>> On the bus it's always guaranteed that the sender with the lowest CAN ID
>> wins regardless of which hardware is used; the only difference is whether
>> we have "internal priority inversion" or not.
>>
>> If I look at the CAN stack = Software + hardware (and not only software)
>> it's correct: The hardware device may re-order if it's a better (full
>> CAN) one and thus the actual sending on the bus is not done in the same
>> sequence as the messages were provided internally (e.g. at some socket).
> OK. Thanks for the clarification.
>
> So, you are using scatterlist to be able to interface with the
> different CAN mailboxes. But then, it means that all the heuristics to
> manage those mailboxes are done in the virtio host.

There is some heuristic when VIRTIO_CAN_F_LATE_TX_ACK is supported on 
the device side. The feature means that the host marks a TX message as 
done not at the moment when it's scheduled for sending but when it has 
been really sent on the bus.

To do that SocketCAN needs to be configured to receive its own sent
messages. On RX the device identifies the message which has been sent on
the bus. The heuristic goes through the list of pending messages,
checks CAN ID and payload and marks the respective message as done.
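
A minimal sketch of such a matching step, only to illustrate the idea. The
type struct pending_tx and the function late_tx_ack_match() are hypothetical
names made up for this example and are not taken from the actual device
implementation. It assumes the pending messages sit in a fixed array and that
the first matching slot is the one to complete:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bookkeeping entry, not taken from the virtio CAN device. */
struct pending_tx {
    bool     in_use;    /* slot holds a frame still waiting for its echo */
    uint32_t can_id;
    uint8_t  len;
    uint8_t  data[64];  /* large enough for a CAN FD payload */
};

/*
 * Called when the echo of an own sent frame is received.
 * Returns the index of the pending entry that was marked done,
 * or -1 if no pending entry matches CAN ID and payload.
 */
static int late_tx_ack_match(struct pending_tx *pend, size_t n,
                             uint32_t can_id, const uint8_t *data, uint8_t len)
{
    if (len > sizeof(pend[0].data))
        return -1;

    for (size_t i = 0; i < n; i++) {
        if (!pend[i].in_use)
            continue;
        if (pend[i].can_id != can_id || pend[i].len != len)
            continue;
        if (memcmp(pend[i].data, data, len) != 0)
            continue;
        pend[i].in_use = false;  /* a device would now return the buffer as used */
        return (int)i;
    }
    return -1;
}
```

Two pending frames with identical CAN ID and payload cannot be told apart by
this check; the sketch simply completes the first matching slot.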

Problem with SocketCAN: There is a load case (sending at full rate without
any delay in both directions) where it seems that own sent messages are
getting lost in the software stack. Thus we get into a state where the
list of pending messages fills up and TX gets stuck.

The feature flag is not offered in the open source device, it is only 
experimental in our proprietary device and normally disabled.

Without this feature there is no heuristic: just send to SocketCAN and
immediately mark the message as used (done). But for an AUTOSAR CAN driver
this means CanIf_TxConfirmation() comes too early: not late, when the
message "has been transmitted on the CAN network", but already earlier,
when the message is handed to SocketCAN and scheduled for transmission.

> Did you consider exposing the number of supported mailboxes as a
> configuration parameter and let the virtio guest manage these? In
> Linux, it is supported since below commit:
>
>    commit 038709071328 ("can: dev: enable multi-queue for SocketCAN devices")
>    Link: https://git.kernel.org/torvalds/c/038709071328
>
> Generally, from a design perspective, isn't it better to make the
> virtio host as dumb as possible and let the guest do the management?

I was not aware of this patch.

But I did think about different priorities: 2 priorities, low priority
for CAN messages which may go into some FIFO suffering priority inversion
and high priority for CAN messages going to mailboxes. The very first
draft specification had this, written without knowing about some
restrictions in the Linux environment. It had the number of places for
each priority (low: FIFO places, high: mailboxes) in the config space.
Everything went into a single TX queue but with some priority field. I got
the comment on the list to use a dedicated queue for high priority
messages instead of using a priority field in the message itself.
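
To illustrate why the two kinds of TX resources behave differently, here is
a simplified sketch. The names struct tx_frame, basic_can_next() and
full_can_next() are hypothetical and only model the selection of the next
frame; priority is reduced to a plain comparison of the numeric CAN ID,
ignoring details of real bus arbitration:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a frame waiting for transmission. */
struct tx_frame {
    uint32_t can_id;  /* lower CAN ID = higher priority on the bus */
};

/*
 * Basic CAN: a single TX mailbox fed by a FIFO. The oldest frame is
 * always tried next, whatever its CAN ID is, so a high priority frame
 * queued behind it has to wait ("internal priority inversion").
 */
static size_t basic_can_next(const struct tx_frame *fifo, size_t n)
{
    (void)fifo;
    (void)n;
    return 0;  /* head of the FIFO */
}

/*
 * Full CAN: all occupied TX mailboxes compete and the frame with the
 * lowest CAN ID is tried next. The caller must pass n > 0.
 */
static size_t full_can_next(const struct tx_frame *mbox, size_t n)
{
    size_t best = 0;

    for (size_t i = 1; i < n; i++)
        if (mbox[i].can_id < mbox[best].can_id)
            best = i;
    return best;
}
```

With 0x123 queued first and 0x100 queued second, the basic CAN model picks
index 0 (0x123) while the full CAN model picks index 1 (0x100).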

It would be easy to do if there was an AUTOSAR CAN driver used as back 
end, in this case you configure it and know the capabilities and 
configuration.

But looking into, for example, m_can.c, the information is not available.
I checked again now.

=> The information about underlying hardware properties is not available
outside the CAN driver.

And I also looked now into the patch you sent:

dev.h:

#define alloc_candev(sizeof_priv, echo_skb_max) \
     alloc_candev_mqs(sizeof_priv, echo_skb_max, 1, 1)
#define alloc_candev_mq(sizeof_priv, echo_skb_max, count) \
     alloc_candev_mqs(sizeof_priv, echo_skb_max, count, count)

=> Every single driver uses alloc_candev(), none uses alloc_candev_mq().
So the patch, which came in back in 2018, is still an offering which is
not used at all.

To have multiple priorities with queues we would need a way in user land

- to determine the number of queues (priorities)
- to address the queues
- to determine the number of resources behind each queue for flow 
control purposes
- to determine the nature of the queue (basic CAN FIFO with n places or 
full CAN queue with m mailboxes)

None of this is in place.

=> We are currently not in the position to support different priority 
queues in Linux.

BTW: Even then we would probably need the heuristic on the device side
when VIRTIO_CAN_F_LATE_TX_ACK is negotiated. I don't think it would be a
good idea to use 1 queue for basic CAN and m queues for m full CAN
mailboxes; it would probably be better to have a low priority queue and a
high priority queue. But as there is nothing in place currently besides
the patch you mentioned, this is an issue to think about in the future.

Regards
Harald
