netdev - Re: qdisc spin lock

Message-ID: <CAAmHdhy_foFw6udQnFk+KUNUjt+zCPRbq5QDBcw0gZ_-jmb4ow@mail.gmail.com>
Date:	Wed, 20 Apr 2016 22:51:07 -0700
From:	Michael Ma <make0818@...il.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Cong Wang <xiyou.wangcong@...il.com>,
	Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: qdisc spin lock

2016-04-20 15:34 GMT-07:00 Eric Dumazet <eric.dumazet@...il.com>:
> On Wed, 2016-04-20 at 14:24 -0700, Michael Ma wrote:
>> 2016-04-08 7:19 GMT-07:00 Eric Dumazet <eric.dumazet@...il.com>:
>> > On Thu, 2016-03-31 at 16:48 -0700, Michael Ma wrote:
>> >> I didn't really know that multiple qdiscs can be isolated using MQ so
>> >> that each txq can be associated with a particular qdisc. Also we don't
>> >> really have multiple interfaces...
>> >>
>> >> With this MQ solution we'll still need to assign transmit queues to
>> >> different classes by doing some math on the bandwidth limit if I
>> >> understand correctly, which seems to be less convenient compared with
>> >> a solution purely within HTB.
>> >>
>> >> I assume that with this solution I can still share qdisc among
>> >> multiple transmit queues - please let me know if this is not the case.
>> >
>> > Note that this MQ + HTB thing works well, unless you use a bonding
>> > device. (Or you need the MQ+HTB on the slaves, with no way of sharing
>> > tokens between the slaves)
>>
>> Actually MQ+HTB works well for small packets - for example, a flow of
>> 512-byte packets can be throttled by HTB using one txq without being
>> affected by other flows with small packets. However, I found that with
>> this solution large packets (10k for example) only achieve very limited
>> bandwidth. In my test I used MQ to assign one txq to an HTB that sets
>> the rate at 1Gbit/s. 512-byte packets can reach the ceiling rate using
>> 30 threads, but sending 10k packets using 10 threads reaches only 10
>> Mbit/s with the same TC configuration. If I increase burst and cburst
>> of HTB to some extremely large value (like 50MB) the ceiling rate can
>> be hit.
>>
>> The strange thing is that I don't see this problem when using HTB as
>> the root. So txq number seems to be a factor here - however it's
>> really hard to understand why it would only affect larger packets. Is
>> this a known issue? Any suggestions on how to investigate the issue
>> further? Profiling shows that the CPU utilization is pretty low.
>
> You could try
>
> perf record -a -g -e skb:kfree_skb sleep 5
> perf report
>
> So that you see where the packets are dropped.
>
> Chances are that your UDP sockets SO_SNDBUF is too big, and packets are
> dropped at qdisc enqueue time, instead of having backpressure.
>

Thanks for the hint - how should I read the perf report? Also, we're
using TCP sockets in this test, not UDP - the TCP window size is set to 70kB.

-  35.88%             init  [kernel.kallsyms]  [k] intel_idle
     intel_idle
-  15.83%          strings  libc-2.5.so        [.] __GI___connect_internal
   - __GI___connect_internal
      - 50.00% get_mapping
           __nscd_get_map_ref
        50.00% __nscd_open_socket
-  13.19%          strings  libc-2.5.so        [.] __GI___libc_recvmsg
   - __GI___libc_recvmsg
      + 64.52% getifaddrs
      + 35.48% __check_pf
-  10.55%          strings  libc-2.5.so        [.] __sendto_nocancel
   - __sendto_nocancel
        100.00% 0
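
For reference, the MQ + HTB setup described in the quoted text above could
look roughly like the sketch below. The device name eth0, the handles 1: and
10:, and the choice of the first txq (1:1) are assumptions for illustration,
not the exact commands used in the test; only the 1Gbit/s rate and the 50MB
burst/cburst values come from the description.

```shell
DEV=eth0   # assumed device name

# Root mq qdisc: exposes one class per hardware tx queue (1:1, 1:2, ...).
tc qdisc add dev "$DEV" root handle 1: mq

# Attach an HTB instance to the first tx queue only (assumed choice of txq).
# "default 1" sends unclassified traffic on this txq to class 10:1.
tc qdisc add dev "$DEV" parent 1:1 handle 10: htb default 1

# Single class limited to 1Gbit/s, with burst/cburst raised to the 50MB
# value mentioned in the quoted text.
tc class add dev "$DEV" parent 10: classid 10:1 htb \
    rate 1gbit ceil 1gbit burst 50mb cburst 50mb

# Inspect per-class and per-qdisc counters (sent bytes, drops, overlimits).
tc -s class show dev "$DEV"
tc -s qdisc show dev "$DEV"
```

Comparing the counters from the last two commands with and without the
enlarged burst/cburst is one way to see whether packets are being dropped or
only delayed by the shaper.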