Message-ID: <87k06xjplj.fsf@nvidia.com>
Date: Wed, 24 Aug 2022 21:36:54 +0200
From: Petr Machata <petrm@...dia.com>
To: <Daniel.Machon@...rochip.com>
CC: <petrm@...dia.com>, <netdev@...r.kernel.org>, <kuba@...nel.org>,
<vinicius.gomes@...el.com>, <vladimir.oltean@....com>,
<thomas.petazzoni@...tlin.com>, <Allan.Nielsen@...rochip.com>,
<maxime.chevallier@...tlin.com>, <roopa@...dia.com>
Subject: Re: Basic PCP/DEI-based queue classification
<Daniel.Machon@...rochip.com> writes:
>> > As I hinted earlier, we could also add an entirely new PCP interface
>> > (like with maxrate), this will give us a bit more flexibility and will
>> > not clash with anything. This approach will not give us trust for DSCP,
>> > but maybe we can disregard this and go with a PCP solution initially?
>>
>> I would like to have a line of sight to how things will be done. Not
>> everything needs to be implemented at once, but we have to understand
>> how to get there when we need to. At least for issues that we can
>> already foresee now, such as the DSCP / PCP / default ordering.
>>
>> Adding the PCP rules as a new APP selector, and then expressing the
>> ordering as a "selector policy" or whatever, IMHO takes care of this
>> nicely.
>>
>> But OK, let's talk about the "flexibility" bit that you mention: what
>> does this approach make difficult or impossible?
>
> It was merely a concern of not changing too much on something that is
> already standard. Maybe I don't quite see how the APP interface can be
> extended to accommodate: pcp/dei, ingress/egress and trust. Let's
> try to break it down:
>
> - pcp/dei:
> this *could* be expressed in app->protocol and map 1:1 to the
> pcp table entries, so that 8*dei+pcp:priority. If I want to map
> pcp 3, with dei 1 to priority 2, it would be encoded 11:2.

Yep. In particular something like {sel=255, pid=11, prio=2}.

iproute2 "dcb" would obviously grow brains to let you configure this
stuff semantically, so e.g.:

    # dcb app replace dev X pcp-prio 3:3 3de:2 2:2 2de:1

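To make the encoding concrete, here is a minimal Python sketch of how a
tool could turn the tokens above into APP-like entries. Only the
8*dei+pcp encoding and the example selector value 255 come from this
thread; the function name, the dict layout and the range checks are
assumptions for illustration, not an existing iproute2 or kernel API.

```python
# Illustrative sketch only: names and layout are hypothetical.
PCP_SELECTOR = 255  # example selector value from the thread, not an allocated constant


def encode_pcp_rule(spec):
    """Turn a 'pcp[de]:prio' token such as '3de:2' into an APP-like entry."""
    key, prio_text = spec.split(":")
    dei = 1 if key.endswith("de") else 0
    pcp = int(key[:-2] if dei else key)
    prio = int(prio_text)
    if not 0 <= pcp <= 7:
        raise ValueError("PCP must be in 0..7: %r" % spec)
    if not 0 <= prio <= 7:
        raise ValueError("priority must be in 0..7: %r" % spec)
    # Protocol ID packs DEI and PCP as 8*dei + pcp, as proposed in the quote.
    return {"sel": PCP_SELECTOR, "pid": 8 * dei + pcp, "prio": prio}


rules = [encode_pcp_rule(token) for token in "3:3 3de:2 2:2 2de:1".split()]
```

With this sketch, "3de:2" becomes {sel=255, pid=11, prio=2}, matching
the entry spelled out above.
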
> - ingress/egress:
> I guess we need a selector for each? I notice that the mellanox
> driver uses the dcb_ieee_getapp_prio_dscp_mask_map and
> dcb_ieee_getapp_dscp_prio_mask_map for priority map and priority
> rewrite map, but these seem to be the same for both ingress and
> egress to me?

Ha, I was only thinking about prioritization, not about rewrite at all.

Yeah, mlxsw uses APP rules for rewrite as well. The logic is that if the
network behind port X uses DSCP value D to express priority P, then
packets with priority P leaving that port should have DSCP value of D.

Of course it doesn't work too well, because there are 8 priorities, but
64 DSCP values. So mlxsw arbitrarily chooses the highest DSCP value.

The situation is similar with PCP, where there are 16 PCP+DEI
combinations, but only 8 priorities.

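The ambiguity is easy to see in a small Python sketch that derives a
rewrite map by inverting the prioritization rules and keeping the
highest DSCP value per priority. This only illustrates the tie-break
described above; it is not the mlxsw code, and the function name and
sample values are made up.

```python
# Illustrative sketch only: not taken from any driver.
def derive_rewrite_map(dscp_to_prio):
    """Invert a DSCP->priority map, keeping the highest DSCP per priority."""
    rewrite = {}
    for dscp, prio in dscp_to_prio.items():
        # Several DSCP values can share one priority; keep the largest.
        if dscp > rewrite.get(prio, -1):
            rewrite[prio] = dscp
    return rewrite


# Hypothetical ingress rules: three DSCP values all map to priority 1.
ingress_rules = {10: 1, 12: 1, 14: 1, 46: 5}
egress_rewrite = derive_rewrite_map(ingress_rules)
```

Here priority 1 is rewritten to DSCP 14 even though 10 or 12 might have
been what the operator wanted, which is why a separately configurable
rewrite table is attractive.
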
So having a way to configure rewrite would be good. But then we are very
firmly in the extension territory. This would basically need a separate
APP-like object.

> So far only subtle changes. Now how do you see trust going in? Can you
> elaborate a little on the policy selector you mentioned?

Sure. In my mind the policy is an array that describes the order in which
APP rules are applied. "default" is implicitly last.

So "trust DSCP" has a policy of just [DSCP]. "Trust PCP" of [PCP].
"Trust DSCP, then PCP" of [DSCP, PCP]. "Trust port" (i.e. just default)
is simply []. Etc.

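As a rough Python sketch of those semantics: walk the policy in order,
use the first selector that yields a matching APP rule, and fall back
to the default priority. The fall-through when a selector has no
matching rule or the packet lacks the field is an assumption of this
sketch, as are the function name and the data layout.

```python
# Illustrative sketch only: names and data layout are hypothetical.
def classify(packet, policy, app_rules, default_prio=0):
    """Return the priority chosen by the first selector in policy that matches."""
    for selector in policy:
        value = packet.get(selector)
        if value is None:
            continue  # field absent, e.g. no VLAN tag means no PCP
        prio = app_rules.get(selector, {}).get(value)
        if prio is not None:
            return prio
    # "default" is implicitly last.
    return default_prio


app_rules = {"dscp": {46: 5}, "pcp": {3: 3, 11: 2}}
tagged_ip = {"dscp": 46, "pcp": 3}

trust_dscp_then_pcp = classify(tagged_ip, ["dscp", "pcp"], app_rules)
trust_pcp = classify(tagged_ip, ["pcp"], app_rules)
trust_port = classify(tagged_ip, [], app_rules)
```

The three calls correspond to "trust DSCP, then PCP", "trust PCP" and
"trust port" respectively.
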
Individual drivers validate whether their device can implement the
policy.

I expect most devices to really just support the DSCP and PCP parts, but
this is flexible in allowing more general configuration in devices that
allow it.

ABI-wise it is tempting to reuse APP to assign priority to selectors in
the same way that it currently assigns priority to field values:

    # dcb app replace dev X sel-prio dscp:2 pcp:1

But that feels like a hack. It will probably be better to have a
dedicated object for this:

    # dcb app-policy set dev X sel-order dscp pcp

This can be sliced in different ways that we can bikeshed to death
later. Does the above basically address your request?