Date:   Tue, 13 Sep 2016 22:13:04 -0700
From:   Michael Ma <make0818@...il.com>
To:     Eric Dumazet <eric.dumazet@...il.com>
Cc:     netdev <netdev@...r.kernel.org>
Subject: Re: Modification to skb->queue_mapping affecting performance

2016-09-13 18:18 GMT-07:00 Eric Dumazet <eric.dumazet@...il.com>:
> On Tue, 2016-09-13 at 17:23 -0700, Michael Ma wrote:
>
>> If I understand correctly this is still to associate a qdisc with each
>> ifb TXQ. How should I do this if I want to use HTB? I guess I'll need
>> to divide the bandwidth of each class in HTB by the number of TX
>> queues for each individual HTB qdisc associated?
>>
>> My original idea was to attach a HTB qdisc for each ifb queue
>> representing a set of flows not sharing bandwidth with others so that
>> root lock contention still happens but only affects flows in the same
>> HTB. Did I understand the root lock contention issue incorrectly for
>> ifb? I do see some comments in __dev_queue_xmit() about using a
>> different code path for software devices which bypasses
>> __dev_xmit_skb(). Does this mean ifb won't go through
>> __dev_xmit_skb()?
>
> You can install HTB on all of your MQ children for sure.
>
> Again, there is no qdisc lock contention if you properly use MQ.
>
> Now if you _need_ to install a single qdisc for whatever reason, then
> maybe you want to use a single rx queue on the NIC, to reduce lock
> contention ;)
>
>
I don't really intend to install multiple qdiscs - the only reason I'm
doing this now is to leverage MQ to work around the lock contention,
and based on the profile this all worked. However, to simplify the HTB
setup I wanted to use the TXQs to partition the HTB classes so that an
HTB class only belongs to one TXQ, which also requires mapping each skb
to a TXQ using some rule (here I'm using priority, but I assume it's
straightforward to use other information such as classid). The problem
I found is that when priority is used to infer the TXQ, so that
queue_mapping is changed, bandwidth is affected significantly - the
only thing I can guess is that the queue switches cause more cache
misses, assuming processor cores have a static mapping to the queues.
Any suggestion on what to do next for the investigation?
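For reference, this is roughly the setup I'm describing (the interface
name, queue count, rates and handles below are just placeholders for
illustration, not the exact values I use):

    # ifb created with multiple TX queues so MQ can be the root qdisc
    ip link add ifb0 numtxqueues 4 type ifb
    ip link set ifb0 up

    # MQ root with one independent HTB child per TXQ
    tc qdisc add dev ifb0 root handle 1: mq
    tc qdisc add dev ifb0 parent 1:1 handle 11: htb default 10
    tc qdisc add dev ifb0 parent 1:2 handle 12: htb default 10

    # each HTB class lives entirely under a single TXQ, so flows in
    # different classes never contend on the same qdisc root lock
    tc class add dev ifb0 parent 11: classid 11:10 htb rate 1gbit
    tc class add dev ifb0 parent 12: classid 12:10 htb rate 1gbit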

I would also guess that this should be a common problem for anyone who
wants to use MQ+IFB to work around the qdisc lock contention on the
receiver side while using a classful qdisc on the IFB device, but I
haven't really found a similar thread here...
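In case it clarifies the receive-side part, the redirect itself looks
roughly like the following (the match rule and queue number are only
illustrative - in my case the TXQ is actually derived from the skb
priority, so take this as a sketch rather than my exact filters):

    # send ingress traffic from the NIC through the ifb device,
    # setting queue_mapping before the redirect
    tc qdisc add dev eth0 handle ffff: ingress
    tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 \
        action skbedit queue_mapping 1 \
        action mirred egress redirect dev ifb0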
