Message-ID: <CALx6S35NaCEBPXAsM-8-wrYYQhDB2EVxAN1RaGiJM9yNncaHaQ@mail.gmail.com>
Date:   Sat, 27 Jun 2020 09:26:20 -0700
From:   Tom Herbert <tom@...bertland.com>
To:     "Samudrala, Sridhar" <sridhar.samudrala@...el.com>
Cc:     Maxim Mikityanskiy <maximmi@...lanox.com>,
        Amritha Nambiar <amritha.nambiar@...el.com>,
        Kiran Patil <kiran.patil@...el.com>,
        Alexander Duyck <alexander.h.duyck@...el.com>,
        Eric Dumazet <edumazet@...gle.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: ADQ - comparison to aRFS, clarifications on NAPI ID, binding with busy-polling

On Wed, Jun 24, 2020 at 1:21 PM Samudrala, Sridhar
<sridhar.samudrala@...el.com> wrote:
>
>
>
> On 6/17/2020 6:15 AM, Maxim Mikityanskiy wrote:
> > Hi,
> >
> > I discovered the Intel ADQ feature [1], which allows boosting
> > performance by picking dedicated queues for application traffic. We did
> > some research, and I have some level of understanding of how it works,
> > but I still have some questions, and I hope you could answer them.
> >
> > 1. SO_INCOMING_NAPI_ID usage. In my understanding, every connection has
> > a key (sk_napi_id) that is unique to the NAPI where this connection is
> > handled, and the application uses that key to choose a handler thread
> > from the thread pool. If we have a one-to-one relationship between
> > application threads and NAPI IDs of connections, each application thread
> > will handle only traffic from a single NAPI. Is my understanding correct?
>
> Yes. It is correct and recommended with the current implementation.
>
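For reference, the application side of this is just a getsockopt() on the
accepted socket. A minimal, untested sketch (pick_worker_for_napi() and
dispatch_connection() are hypothetical helpers for illustration, not part
of any kernel or ADQ API):

#include <stdio.h>
#include <sys/socket.h>

#ifndef SO_INCOMING_NAPI_ID
#define SO_INCOMING_NAPI_ID 56
#endif

/* Hypothetical policy: a trivial, stable NAPI-ID-to-worker mapping.  A
 * real application would keep an explicit napi_id -> thread table so
 * that each worker thread services exactly one NAPI. */
static int pick_worker_for_napi(unsigned int napi_id, int nr_workers)
{
	return napi_id % nr_workers;
}

static void dispatch_connection(int connfd, int nr_workers)
{
	unsigned int napi_id = 0;
	socklen_t len = sizeof(napi_id);

	/* sk_napi_id is 0 until the socket has actually received traffic. */
	if (getsockopt(connfd, SOL_SOCKET, SO_INCOMING_NAPI_ID,
		       &napi_id, &len) == 0 && napi_id != 0) {
		int worker = pick_worker_for_napi(napi_id, nr_workers);

		/* hand connfd over to that worker's epoll set (not shown) */
		printf("fd %d -> napi %u -> worker %d\n",
		       connfd, napi_id, worker);
	}
}
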
> >
> > 1.1. I wonder how the application thread gets scheduled on the same core
> > that NAPI runs on. It currently only works with busy_poll, so when the
> > application initiates busy polling (calls epoll), does the Linux
> > scheduler move the thread to the right CPU? Do we have to have a strict
> > one-to-one relationship between threads and NAPIs, or can one thread
> > handle multiple NAPIs? When the data arrives, does the scheduler run the
> > application thread on the same CPU that NAPI ran on?
>
> The app thread can do busy polling from any core, and there is no
> requirement that the scheduler move the thread to a specific CPU.
>
> If the NAPI processing happens via interrupts, the scheduler could move
> the app thread to the same CPU that NAPI ran on.
>
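To spell out the knobs involved: a socket opts into busy polling with
SO_BUSY_POLL, and for epoll()-driven busy polling on kernels of this
vintage the global sysctl net.core.busy_poll also has to be set to a
non-zero budget. A minimal, untested sketch:

#include <sys/socket.h>

#ifndef SO_BUSY_POLL
#define SO_BUSY_POLL 46
#endif

/* Busy-poll the device queue for up to 'usecs' before sleeping when
 * this socket blocks in read()/poll().  For epoll()-based busy polling
 * the net.core.busy_poll sysctl supplies the budget instead. */
static int enable_busy_poll(int fd, int usecs)
{
	return setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL,
			  &usecs, sizeof(usecs));
}
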
> >
> > 1.2. I see that SO_INCOMING_NAPI_ID is tightly coupled with busy_poll.
> > It is enabled only if CONFIG_NET_RX_BUSY_POLL is set. Is there a real
> > reason why it can't be used without busy_poll? In other words, if we
> > modify the kernel to drop this requirement, will the kernel still
> > schedule the application thread on the same CPU as NAPI when busy_poll
> > is not used?
>
> It should be OK to remove this restriction, but it also requires enabling
> this in skb_mark_napi_id() and sk_mark_napi_id().
>
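For reference, both helpers currently record the NAPI ID only under
CONFIG_NET_RX_BUSY_POLL; they have roughly the following shape
(paraphrased from include/net/busy_poll.h, exact code differs across
kernel versions):

/* Driver RX path: stamp the skb with the NAPI instance it came from. */
static inline void skb_mark_napi_id(struct sk_buff *skb,
				    struct napi_struct *napi)
{
#ifdef CONFIG_NET_RX_BUSY_POLL
	skb->napi_id = napi->napi_id;
#endif
}

/* Protocol handler: propagate the skb's napi_id to the socket. */
static inline void sk_mark_napi_id(struct sock *sk, const struct sk_buff *skb)
{
#ifdef CONFIG_NET_RX_BUSY_POLL
	WRITE_ONCE(sk->sk_napi_id, skb->napi_id);
#endif
	sk_rx_queue_set(sk, skb);
}
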
> >
> > 2. Can you compare ADQ to aRFS+XPS? aRFS provides a way to steer traffic
> > to the application's CPU in an automatic fashion, and xps_rxqs can be
> > used to transmit from the corresponding queues. This setup doesn't need
> > manual configuration of TCs and is not limited to 4 applications. The
> > difference with ADQ is that (in my understanding) it moves the application
> > to the RX CPU, while aRFS steers the traffic to the RX queue handled by
> > the application's CPU. Is there any advantage of ADQ over aRFS that I
> > failed to find?
>
> aRFS+XPS ties an app thread to a CPU, whereas ADQ ties an app thread to
> a NAPI ID, which in turn ties to a queue (or set of queues).
>
> ADQ also provides two levels of filtering compared to aRFS+XPS. The first
> level of filtering selects a queue set associated with the application,
> and the second-level filter (or RSS) selects a queue within that queue
> set that is associated with an app thread.
>
The association between queues and threads is implicit in ADQ and
depends on some assumptions, particularly symmetric queueing, which
don't always hold (TX/RX devices are different, traffic is
uni-directional, the peer uses some encapsulation that the tc filter
misses). Please look at Per Thread Queues
(https://lwn.net/Articles/824414/), which aims to make this association
of queues to threads explicit.

> The current interface to configure ADQ limits us to supporting up to 16
> application-specific queue sets (TC_MAX_QUEUE).
>
>
> >
> > 3. At [1], you mention that ADQ can be used to create separate RSS sets.
> > Could you elaborate on the API used? Does the tc mqprio
> > configuration also affect RSS? Can it be turned on/off?
>
> Yes. tc mqprio allows creating queue sets per application, and the driver
> configures RSS per queue set.
>
> >
> > 4. How is tc flower used in the context of ADQ? Does the user need to
> > reflect the configuration in both mqprio qdisc (for TX) and tc flower
> > (for RX)? It looks like tc flower maps incoming traffic to TCs, but what
> > is the mechanism of mapping TCs to RX queues?
>
> tc mqprio is used to map TCs to RX queues.
>
> tc flower is used to configure the first-level filter that redirects
> packets to a queue set associated with an application.
>
> >
> > I really hope you will be able to shed more light on this feature, to
> > improve my understanding of how to use it and of how it compares with
> > aRFS.
>
> Hope this helps; we will go over this in more detail in our netdev session.
>
Also, please add a document under Documentation/networking that describes
the feature, its configuration, any limitations, and its relationship to
other packet steering features.

> >
> > Thanks,
> > Max
> >
> > [1]:
> > https://netdevconf.info/0x14/session.html?talk-ADQ-for-system-level-network-io-performance-improvements
> >
