Message-ID: <77b2b5a6-13a5-47de-a6a0-e5aaf4e91582@nvidia.com>
Date: Wed, 11 Dec 2024 18:36:56 +0100
From: Dragos Tatulea <dtatulea@...dia.com>
To: Alexandra Winter <wintera@...ux.ibm.com>,
Eric Dumazet <edumazet@...gle.com>
Cc: Rahul Rameshbabu <rrameshbabu@...dia.com>,
Saeed Mahameed <saeedm@...dia.com>, Tariq Toukan <tariqt@...dia.com>,
Leon Romanovsky <leon@...nel.org>, David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Andrew Lunn <andrew+netdev@...n.ch>, Nils Hoppmann <niho@...ux.ibm.com>,
netdev@...r.kernel.org, linux-s390@...r.kernel.org,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>,
Thorsten Winkler <twinkler@...ux.ibm.com>, Simon Horman <horms@...nel.org>,
Niklas Schnelle <schnelle@...ux.ibm.com>
Subject: Re: [PATCH net-next] net/mlx5e: Transmit small messages in linear skb
Hi Alexandra,
On 11.12.24 17:19, Alexandra Winter wrote:
>
>
> On 10.12.24 12:49, Dragos Tatulea wrote:
>>
>>
>> On 06.12.24 16:25, Alexandra Winter wrote:
>>>
>>>
>>> On 04.12.24 15:36, Eric Dumazet wrote:
>>>> I would suggest the opposite : copy the headers (typically less than
>>>> 128 bytes) on a piece of coherent memory.
>>>>
>>>> As a bonus, if skb->len is smaller than 256 bytes, copy the whole skb.
>>>>
>>>> include/net/tso.h and net/core/tso.c users do this.
>>>>
>>>> Sure, patch is going to be more invasive, but all arches will win.
>>>
>>>
>>> Thank you very much for the examples, I think I understand what you are proposing.
>>> I am not sure whether I'm able to map it to the mlx5 driver, but I could
>>> try to come up with a RFC. It may take some time though.
>>>
>>> NVidia people, any suggestions? Do you want to handle that yourselves?
>>>
>> Discussed with Saeed and he proposed another approach that is better for
>> us: copy the whole skb payload inline into the WQE if its size is below a
>> threshold. This threshold can be configured through the
>> tx-copybreak mechanism.
>>
>> Thanks,
>> Dragos
>
>
> Thank you very much Dragos and Saeed.
> I am not sure I understand the details of "inline into the WQE".
> The idea seems to be to use a premapped coherent array per WQ
> that is indexed by queue element index and can be used to copy headers and
> maybe small messages into.
> I think I see something similar to your proposal in mlx4 (?).
I think so, yes.
> To me the general concept seems to be similar to what Eric is proposing.
> Did I get it right?
>
AFAIU, it's not quite the same thing. With Eric's proposal we'd use an
additional premapped buffer, copy data there and submit WQEs with
pointers to this buffer. The inline proposal is to copy the data inline
in the WQE directly without the need of an additional buffer.
To understand if this is feasible, I still need to find out what the
actual max inline space is.
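To make the difference concrete, here is a rough user-space sketch of the
two variants. It is not mlx5 code: every name (sketch_wqe, sketch_txq,
sketch_post_bounce, sketch_post_inline) and every size is made up for
illustration, and a plain array stands in for the premapped coherent
buffer. A return value of -1 means "too big, fall back to the usual DMA
mapping path".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes, for illustration only; not real mlx5 limits. */
#define SKETCH_MAX_INLINE  128u  /* assumed inline capacity of one WQE */
#define SKETCH_BOUNCE_SLOT 256u  /* assumed slot size in the premapped buffer */
#define SKETCH_RING_SIZE   8u

enum sketch_seg_type { SEG_POINTER, SEG_INLINE };

/* Simplified work queue element: either points at data or carries it. */
struct sketch_wqe {
	enum sketch_seg_type type;
	uint32_t len;
	union {
		const void *addr;                /* SEG_POINTER: device fetches from here */
		uint8_t inl[SKETCH_MAX_INLINE];  /* SEG_INLINE: payload lives in the WQE */
	} u;
};

struct sketch_txq {
	struct sketch_wqe ring[SKETCH_RING_SIZE];
	/* Stand-in for a premapped coherent buffer, one slot per WQE index. */
	uint8_t bounce[SKETCH_RING_SIZE][SKETCH_BOUNCE_SLOT];
	uint32_t copybreak;  /* threshold, as tx-copybreak would set it */
};

/* Premapped-buffer variant: copy into the slot, WQE points at the slot. */
static int sketch_post_bounce(struct sketch_txq *q, unsigned int idx,
			      const void *data, uint32_t len)
{
	unsigned int slot = idx % SKETCH_RING_SIZE;
	struct sketch_wqe *wqe;

	assert(q && data);
	if (len > q->copybreak || len > SKETCH_BOUNCE_SLOT)
		return -1;

	wqe = &q->ring[slot];
	memcpy(q->bounce[slot], data, len);
	wqe->type = SEG_POINTER;
	wqe->len = len;
	wqe->u.addr = q->bounce[slot];
	return 0;
}

/* Inline variant: copy straight into the WQE, no additional buffer. */
static int sketch_post_inline(struct sketch_txq *q, unsigned int idx,
			      const void *data, uint32_t len)
{
	struct sketch_wqe *wqe;

	assert(q && data);
	if (len > q->copybreak || len > SKETCH_MAX_INLINE)
		return -1;

	wqe = &q->ring[idx % SKETCH_RING_SIZE];
	memcpy(wqe->u.inl, data, len);
	wqe->type = SEG_INLINE;
	wqe->len = len;
	return 0;
}
```

In this sketch the inline variant is bounded by both the copybreak
threshold and the WQE's inline capacity, which is why the real max inline
space matters for choosing a useful threshold.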
> I really like the idea to use tx-copybreak for threshold configuration.
>
That's good to know.
> As Eric mentioned, that is not a very small patch and may not be fit for
> backporting to older distro versions.
> What do you think of a two-step approach as described in the other sub-thread?
> A simple patch for mitigation that can be backported, and then the improvement
> as a replacement?
As stated in the previous mail, let's see what Tariq has to say about
doing some arch specific fix. I am not very optimistic though...
Thanks,
Dragos