Message-ID: <4a20e5d1-240e-0baf-9605-978f9a4aeb40@redhat.com>
Date: Wed, 8 Jun 2016 08:48:33 -0400
From: Doug Ledford <dledford@...hat.com>
To: Eran Ben Elisha <eranlinuxmellanox@...il.com>
Cc: Saeed Mahameed <saeedm@....mellanox.co.il>,
Linux Netdev List <netdev@...r.kernel.org>,
ophirm@...lanox.com, Eran Ben Elisha <eranbe@...lanox.com>
Subject: Re: mlx5 core/en oops in 4.6-rc6+
On 5/19/2016 1:13 PM, Eran Ben Elisha wrote:
> Hi Doug,
> Attaching here a response from Ophir Maor (from Mellanox community)
This conversation is a low-priority, spare-time thread for me, so it can
sometimes take me a while to respond ;-)
>>
>> Read your own guides ;-).
>>
>> I'm using this one for your switches:
>> https://community.mellanox.com/docs/DOC-1417
>>
>> And these to try and get the linux machines configured properly:
>> https://community.mellanox.com/docs/DOC-1414
>> https://community.mellanox.com/docs/DOC-1415
>> https://community.mellanox.com/docs/DOC-2311
>> https://community.mellanox.com/docs/DOC-2474
>> http://www.mellanox.com/related-docs/prod_software/RoCE_with_Priority_Flow_Control_Application_Guide.pdf
>>
>> The guides are helpful if your setup allows you to follow their exact
>> example. But they are short on information about how to modify the
>> examples to your specific situation. For instance, I have to use vlan
>> priority 5 as my no-drop priority for RoCE traffic. I can't reliably
>> tell which portions of the guide I must switch the 3s to 5s in order to
>> get the new priority, and which uses of 3s in the guides relate to other
>> things that could be mapped to 5. On a separate note, it's unclear to
>> me if your switches and cards support more than one no-drop priority
>> (other vendors' RoCE cards I'm using here don't; they only allow one
>> no-drop priority for RoCE traffic and it must be 5). If it does support
>> more than one, I'd actually like both 3 and 5 to be no-drop and for one
>> vlan to use 3 and another to use 5.
>
> There are two kinds of flows for which egress mapping must be configured:
>
> - flows that pass through the kernel: for these you need to use kernel
> commands (e.g. vconfig set_egress_map, or other commands) to make the
> kernel set the egress priority.
Yes. Done. Which actually has nothing to do with RoCE (I don't think
even kernel RoCE flows go through this since they don't use the kernel
net stack but use the card's firmware and RoCE work requests to send
data) and is just part of the Mellanox recommended "put all traffic on
this vlan on this priority even if it isn't all RoCE". I'm not sure I
agree with it, and explanations that specifically exclude it to make
things clearer would be nice.
> - flows that bypass the kernel, such as RoCE: for these you need to use
> tc_wrap to set the egress mapping.
tc_wrap is not an explanation, nor really a suitable answer to "how do I
do this" as it's out of date for the current upstream kernels last I
checked...
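For reference, the kernel-path egress mapping discussed above can be
sketched as follows. This is a minimal sketch, not taken from the guides:
the device name eth2.100 is hypothetical, the helper only prints the
vconfig commands instead of running them, and it assumes only skb
priorities 0-7 need to be mapped to the chosen VLAN priority.

```shell
#!/bin/sh
# Print (do not execute) the vconfig commands that map skb priorities 0-7
# on a VLAN device to one VLAN priority (PCP).
# $1 = VLAN device (hypothetical name, e.g. eth2.100)
# $2 = VLAN priority to stamp on egress (0-7)
egress_map_cmds() {
    dev=$1
    pcp=$2
    skb_prio=0
    while [ "$skb_prio" -le 7 ]; do
        # vconfig syntax: set_egress_map <vlan-device> <skb-priority> <vlan-qos>
        echo "vconfig set_egress_map $dev $skb_prio $pcp"
        skb_prio=$((skb_prio + 1))
    done
}

egress_map_cmds eth2.100 5
```

Reviewing the printed commands before piping them to a root shell makes
it easier to see which 5s are the VLAN priority and which numbers are
skb priorities.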
> This post explains it very nicely for ConnectX-4.
>
> https://community.mellanox.com/docs/DOC-2474
Yes, I read this post, and I downloaded tc_wrap from Mellanox, and I
dissected tc_wrap to figure out it was doing what I added to my
dispatcher file, namely this:
    tc qdisc add dev mlx4_roce root mqprio num_tc 8 \
        map 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 \
        queues 32@0 32@32 32@64 32@96 32@128 32@160 32@192 32@224
But even though I was able to pull that out of tc_wrap, the explanation
of how it works is completely missing. This appears to set a kernel
queue discipline, yet RoCE packets are never seen by the kernel and are
handled entirely by the card. How does setting it cause those packets to
be sent with a specific priority? What is the chain here? Does setting
the queue discipline here translate to a setting on the card, with some
magic in that setting that triggers the firmware to do the right thing
on RoCE packets? Does the driver read the queue discipline when setting
up address handles to use on the work requests and get the information
that way? How is this information actually making it to the packet
generation engine in the firmware? And given how recent upstream
kernels have changed the default queue discipline on these cards, it is
unclear how this command might need to be modified to keep working.
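
To make the structure of the mqprio command above easier to adapt, here
is a minimal sketch that rebuilds it from its parts. It only shows how
the command line is laid out (16 map slots, one per priority 0-15, each
naming a traffic class, and one count@offset queue range per traffic
class); it says nothing about how the setting reaches the card. The
helper name mqprio_cmd and its arguments are hypothetical, and the
command is printed rather than executed.

```shell
#!/bin/sh
# Print (do not execute) an mqprio command that sends every priority to
# one traffic class and gives each of 8 traffic classes an equal,
# contiguous block of queues.
# $1 = device, $2 = traffic class for all priorities, $3 = queues per class
mqprio_cmd() {
    dev=$1
    tc_id=$2
    per_tc=$3

    map=""
    i=0
    while [ "$i" -lt 16 ]; do          # mqprio's map has 16 slots
        map="$map $tc_id"
        i=$((i + 1))
    done

    queues=""
    i=0
    while [ "$i" -lt 8 ]; do           # one count@offset pair per class
        queues="$queues ${per_tc}@$((i * per_tc))"
        i=$((i + 1))
    done

    echo "tc qdisc add dev $dev root mqprio num_tc 8 map$map queues$queues"
}

mqprio_cmd mlx4_roce 5 32
```

Called with mlx4_roce, 5 and 32, it prints the same command that was
pulled out of tc_wrap; changing the second argument shows exactly which
fields move when a different traffic class is chosen.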