Message-ID: <VI1PR0402MB38710DAD1F17B80F83403396E0B60@VI1PR0402MB3871.eurprd04.prod.outlook.com>
Date: Wed, 20 May 2020 20:24:43 +0000
From: Ioana Ciornei <ioana.ciornei@....com>
To: Jakub Kicinski <kuba@...nel.org>
CC: "davem@...emloft.net" <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
classes
> Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> classes
>
> On Wed, 20 May 2020 15:10:42 +0000 Ioana Ciornei wrote:
> > DPAA2 has frame queues for each Rx traffic class, and the decision of which
> > queue to pull frames from is made by the HW based on the queue priority
> > within a channel (there is one channel per CPU).
>
> IOW you're reading the descriptor from the device memory/iomem address and
> the HW will return the next descriptor based on configured priority?
That's the general idea, but the decision is not made on a frame-by-frame basis
but rather per dequeue operation, which can return at most 16 frame descriptors
at a time.
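To make that concrete, here is a rough user-space sketch of the model (the names,
e.g. pull_dequeue(), are made up for illustration and are not the actual DPIO API):
every pull returns a batch of up to 16 frame descriptors, all taken from whichever
traffic class the HW scheduler selected for that pull.

#include <stdio.h>

#define MAX_FDS_PER_PULL 16

struct fd {
	unsigned int tc;	/* traffic class the frame came from */
	unsigned int len;
};

/* Stand-in for the HW: fills 'fds' with up to MAX_FDS_PER_PULL entries from
 * the traffic class picked by the scheduler and returns the count. */
static int pull_dequeue(struct fd *fds)
{
	/* HW-specific; omitted in this sketch */
	return 0;
}

int main(void)
{
	struct fd fds[MAX_FDS_PER_PULL];
	int n, i;

	/* The priority decision happens per pull, not per frame: every frame
	 * in one batch belongs to the TC the scheduler selected. */
	while ((n = pull_dequeue(fds)) > 0)
		for (i = 0; i < n; i++)
			printf("frame of %u bytes from TC %u\n",
			       fds[i].len, fds[i].tc);
	return 0;
}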
> Presumably strict priority?
Only the two highest traffic classes are in strict priority, while the other 6 TCs
form two priority tiers - medium (4 TCs) and low (the last 2 TCs).
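Roughly, the grouping looks like the sketch below (the enum and function names
are illustrative only, and I am assuming here that a lower TC index means higher
priority - the actual index-to-tier mapping is a HW configuration detail):

/* Priority tiers described above: 2 strict, 4 medium, 2 low. */
enum tc_tier { TC_TIER_STRICT, TC_TIER_MEDIUM, TC_TIER_LOW };

static enum tc_tier tc_to_tier(unsigned int tc)
{
	if (tc < 2)
		return TC_TIER_STRICT;	/* two highest traffic classes */
	if (tc < 6)
		return TC_TIER_MEDIUM;	/* next four traffic classes */
	return TC_TIER_LOW;		/* last two traffic classes */
}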
>
> > If this should be modeled in software, then I assume there should be a
> > NAPI instance for each traffic class and the stack should know in
> > which order to call the poll() callbacks so that the priority is respected.
>
> Right, something like that. But IMHO not needed if HW can serve the right
> descriptor upon poll.
After thinking this through, I don't actually believe that multiple NAPI instances
would solve this in any circumstance:
- If you have hardware prioritization with full scheduling on dequeue, then the job
on the driver side is already done.
- If you only have hardware assist for prioritization (i.e. the hardware gives you
multiple rings but doesn't tell you which one to dequeue from), then you can still
use a single NAPI instance just fine and pick the highest-priority non-empty ring
on the fly (see the sketch below).
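For the second case, the poll() callback could look roughly like this
(illustrative C only, not driver code; ring_count()/ring_poll() are hypothetical
helpers and, again, I am assuming a lower TC index means higher priority):

#define NUM_TCS 8

struct ring;

extern int ring_count(struct ring *r);			/* frames pending in a ring */
extern int ring_poll(struct ring *r, int budget);	/* process up to 'budget' */

/* One poll() services all traffic classes, highest priority first, so no
 * per-TC NAPI instance (or stack-side ordering) is needed. */
static int prio_poll(struct ring *rings[NUM_TCS], int budget)
{
	int tc, done = 0;

	for (tc = 0; tc < NUM_TCS && done < budget; tc++) {
		if (!ring_count(rings[tc]))
			continue;
		done += ring_poll(rings[tc], budget - done);
	}

	return done;
}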
What I am having trouble understanding is how a fully software implementation
of this possible new Rx qdisc should work. Somehow skb->priority would have to be
taken into account while the skb is passing through the stack (i.e. a higher
priority skb should be able to overtake a previously received, lower priority skb
even though the latter arrived first, whenever the lower priority queue is
congested).
I don't have a very deep understanding of the stack, but I am thinking that the
enqueue_to_backlog()/process_backlog() area could be a candidate place for sorting
this out. If we do that, I don't see why a qdisc would be necessary at all, rather
than having everybody benefit from prioritization based on skb->priority.
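Just to illustrate the kind of sorting I have in mind, below is a conceptual
user-space sketch of a priority-ordered backlog instead of a pure FIFO. This is
only a model of the idea, not a proposal for the actual
enqueue_to_backlog()/process_backlog() code, and struct pkt simply plays the
role of an skb here:

struct pkt {
	unsigned int priority;	/* plays the role of skb->priority */
	struct pkt *next;
};

/* Enqueue keeping the list sorted, highest priority first; packets of equal
 * priority keep their arrival order (stable insert). */
static void backlog_enqueue(struct pkt **head, struct pkt *p)
{
	while (*head && (*head)->priority >= p->priority)
		head = &(*head)->next;
	p->next = *head;
	*head = p;
}

/* Dequeue always returns the highest priority (then oldest) packet, so a
 * high priority packet received later overtakes earlier, lower priority ones. */
static struct pkt *backlog_dequeue(struct pkt **head)
{
	struct pkt *p = *head;

	if (p)
		*head = p->next;
	return p;
}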
Ioana