Message-ID: <VI1PR0402MB38710DAD1F17B80F83403396E0B60@VI1PR0402MB3871.eurprd04.prod.outlook.com>
Date: Wed, 20 May 2020 20:24:43 +0000
From: Ioana Ciornei <ioana.ciornei@....com>
To: Jakub Kicinski <kuba@...nel.org>
CC: "davem@...emloft.net" <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
classes
> Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> classes
>
> On Wed, 20 May 2020 15:10:42 +0000 Ioana Ciornei wrote:
> > DPAA2 has frame queues for each Rx traffic class, and the decision of which
> > queue to pull frames from is made by the HW based on the queue priority
> > within a channel (there is one channel per CPU).
>
> IOW you're reading the descriptor from the device memory/iomem address and
> the HW will return the next descriptor based on configured priority?
That's the general idea, but the decision is not made on a frame-by-frame basis
but rather per dequeue operation, which can return at most 16 frame descriptors
at a time.
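To make that concrete, here is a rough user-space sketch of the model (the names,
e.g. pull_dequeue(), are made up for illustration and are not the actual DPIO API):
every pull returns a batch of up to 16 frame descriptors, all taken from whichever
traffic class the HW scheduler selected for that pull.

#include <stdio.h>

#define MAX_FDS_PER_PULL 16

struct fd {
	unsigned int tc;	/* traffic class the frame came from */
	unsigned int len;
};

/* Stand-in for the HW: fills 'fds' with up to MAX_FDS_PER_PULL entries from
 * the traffic class picked by the scheduler and returns the count. */
static int pull_dequeue(struct fd *fds)
{
	/* HW-specific; omitted in this sketch */
	return 0;
}

int main(void)
{
	struct fd fds[MAX_FDS_PER_PULL];
	int n, i;

	/* The priority decision happens per pull, not per frame: every frame
	 * in one batch belongs to the TC the scheduler selected. */
	while ((n = pull_dequeue(fds)) > 0)
		for (i = 0; i < n; i++)
			printf("frame of %u bytes from TC %u\n",
			       fds[i].len, fds[i].tc);
	return 0;
}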
> Presumably strict priority?
Only the two highest traffic classes are in strict priority, while the other 6 TCs
form two priority tiers - medium (4 TCs) and low (the last 2 TCs).
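Roughly, the grouping looks like the sketch below (the enum and function names
are illustrative only, and I am assuming here that a lower TC index means higher
priority - the actual index-to-tier mapping is a HW configuration detail):

/* Priority tiers described above: 2 strict, 4 medium, 2 low. */
enum tc_tier { TC_TIER_STRICT, TC_TIER_MEDIUM, TC_TIER_LOW };

static enum tc_tier tc_to_tier(unsigned int tc)
{
	if (tc < 2)
		return TC_TIER_STRICT;	/* two highest traffic classes */
	if (tc < 6)
		return TC_TIER_MEDIUM;	/* next four traffic classes */
	return TC_TIER_LOW;		/* last two traffic classes */
}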
>
> > If this should be modeled in software, then I assume there should be a
> > NAPI instance for each traffic class and the stack should know in
> > which order to call the poll() callbacks so that the priority is respected.
>
> Right, something like that. But IMHO not needed if HW can serve the right
> descriptor upon poll.
After thinking this through, I don't actually believe that multiple NAPI instances
would solve this in any circumstance:
- If you have hardware prioritization with full scheduling on dequeue, then the job
on the driver side is already done.
- If you only have hardware assist for prioritization (i.e. the hardware gives you
multiple rings but doesn't tell you which one to dequeue from), then you can still
use a single NAPI instance just fine and pick the highest-priority non-empty ring
on the fly (see the sketch below).
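For the second case, the poll() callback could look roughly like this
(illustrative C only, not driver code; ring_count()/ring_poll() are hypothetical
helpers and, again, I am assuming a lower TC index means higher priority):

#define NUM_TCS 8

struct ring;

extern int ring_count(struct ring *r);			/* frames pending in a ring */
extern int ring_poll(struct ring *r, int budget);	/* process up to 'budget' */

/* One poll() services all traffic classes, highest priority first, so no
 * per-TC NAPI instance (or stack-side ordering) is needed. */
static int prio_poll(struct ring *rings[NUM_TCS], int budget)
{
	int tc, done = 0;

	for (tc = 0; tc < NUM_TCS && done < budget; tc++) {
		if (!ring_count(rings[tc]))
			continue;
		done += ring_poll(rings[tc], budget - done);
	}

	return done;
}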
What I am having trouble understanding is how a fully software implementation
of this possible new Rx qdisc should work. Somehow skb->priority would have to be
taken into account while the skb is passing through the stack (i.e. a higher
priority skb should be able to overtake a previously received, lower priority skb
even though the latter arrived first, whenever the lower priority queue is
congested).
I don't have a very deep understanding of the stack, but I am thinking that the
enqueue_to_backlog()/process_backlog() area could be a candidate place for sorting
this out. If we do that, I don't see why a qdisc would be necessary at all, rather
than having everybody benefit from prioritization based on skb->priority.
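Just to illustrate the kind of sorting I have in mind, below is a conceptual
user-space sketch of a priority-ordered backlog instead of a pure FIFO. This is
only a model of the idea, not a proposal for the actual
enqueue_to_backlog()/process_backlog() code, and struct pkt simply plays the
role of an skb here:

struct pkt {
	unsigned int priority;	/* plays the role of skb->priority */
	struct pkt *next;
};

/* Enqueue keeping the list sorted, highest priority first; packets of equal
 * priority keep their arrival order (stable insert). */
static void backlog_enqueue(struct pkt **head, struct pkt *p)
{
	while (*head && (*head)->priority >= p->priority)
		head = &(*head)->next;
	p->next = *head;
	*head = p;
}

/* Dequeue always returns the highest priority (then oldest) packet, so a
 * high priority packet received later overtakes earlier, lower priority ones. */
static struct pkt *backlog_dequeue(struct pkt **head)
{
	struct pkt *p = *head;

	if (p)
		*head = p->next;
	return p;
}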
Ioana