Message-ID: <CAAmHdhwpVOCv=4Y+pb9PfGKWV0ooqnr7eC58ZYfRTtYjC35EFw@mail.gmail.com>
Date:	Thu, 31 Mar 2016 16:41:30 -0700
From:	Michael Ma <make0818@...il.com>
To:	Jesper Dangaard Brouer <brouer@...hat.com>
Cc:	netdev@...r.kernel.org
Subject: Re: qdisc spin lock

Thanks for the suggestion - I'll try the MQ solution out. It seems it
would solve the problem well, under the assumption that bandwidth can
be statically partitioned.
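
For concreteness, here is roughly the setup I have in mind (just a
sketch; the device name, handle numbers, rates and CPU masks below are
placeholders, not our real configuration):

  # mq as root qdisc: it creates one class per hardware TX queue
  tc qdisc replace dev eth0 root handle 100: mq

  # attach an independent HTB instance (each with its own qdisc lock)
  # under each mq class and statically partition the bandwidth,
  # e.g. for the first two TX queues:
  tc qdisc add dev eth0 parent 100:1 handle 101: htb default 1
  tc class add dev eth0 parent 101: classid 101:1 htb rate 2gbit ceil 2gbit
  tc qdisc add dev eth0 parent 100:2 handle 102: htb default 1
  tc class add dev eth0 parent 102: classid 102:1 htb rate 2gbit ceil 2gbit
  # ... and so on for the remaining tx-* queues

  # use XPS to bind CPUs to TX queues so that each sending thread
  # mostly hits a single HTB instance (CPU masks are placeholders)
  echo 1 > /sys/class/net/eth0/queues/tx-0/xps_cpus
  echo 2 > /sys/class/net/eth0/queues/tx-1/xps_cpus

That way each TX queue gets its own HTB and its own lock, so senders on
different CPUs should stop contending on a single root qdisc lock; the
downside is that every rate limit has to be split statically across the
queues.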

2016-03-31 12:18 GMT-07:00 Jesper Dangaard Brouer <brouer@...hat.com>:
>
> On Wed, 30 Mar 2016 00:20:03 -0700 Michael Ma <make0818@...il.com> wrote:
>
>> I know this might be an old topic, so bear with me. What we are
>> facing is that applications are sending small packets from hundreds
>> of threads, so contention on the spin lock in __dev_xmit_skb
>> increases the latency of dev_queue_xmit significantly. We're
>> building a network QoS solution based on HTB to avoid interference
>> between different applications.
>
> Yes, as you have noticed, with HTB there is a single qdisc lock, and
> contention obviously happens :-)
>
> It is possible with different tricks to make it scale.  I believe
> Google is using a variant of HTB, and it scales for them.  They have
> not open sourced their modifications to HTB (which likely also
> involve a great deal of setup tricks).
>
> If your purpose is to limit traffic/bandwidth per "cloud" instance,
> then you can just use another TC setup structure.  Like using MQ and
> assigning an HTB per MQ queue (where the MQ queues are bound to each
> CPU/HW queue)... But you have to figure out this setup yourself...
>
>
>> But in this case, when some applications send massive numbers of
>> small packets in parallel, the application to be protected sees its
>> throughput suffer (because it's doing synchronous network
>> communication using multiple threads, and its throughput is
>> sensitive to the increased latency).
>>
>> Here is the profiling from perf:
>>
>> -  67.57%  iperf  [kernel.kallsyms]  [k] _spin_lock
>>    - 99.94% dev_queue_xmit
>>       - 96.91% _spin_lock
>>       - 2.62%  __qdisc_run
>>          - 98.98% sch_direct_xmit
>>             - 99.98% _spin_lock
>>
>> As far as I understand, the design of TC is to simplify the locking
>> scheme and minimize the work done in __qdisc_run so that throughput
>> won't be affected, especially with large packets. However, if the
>> scenario is that multiple classes in the queueing discipline only
>> have a shaping limit, there isn't really any necessary correlation
>> between the different classes. The only synchronization point should
>> be when a packet is dequeued from the qdisc queue and enqueued to
>> the transmit queue of the device. My question is: is it worth
>> investing in avoiding the lock contention by partitioning the
>> queue/lock so that this scenario is handled with lower latency?
>
> Yes, there is a lot to gain, but it is not easy ;-)
>
>> I must have oversimplified a lot of details, since I'm not familiar
>> with the TC implementation at this point. I just want to get your
>> input on whether this is a worthwhile effort, or whether there is
>> something fundamental that I'm not aware of. If it is just a matter
>> of quite a bit of additional work, I would also appreciate help
>> outlining the required work here.
>>
>> Also, I would appreciate any information about the latest status of
>> this work: http://www.ijcset.com/docs/IJCSET13-04-04-113.pdf
>
> This article seems to be very low quality... spelling errors, only 5
> pages, no real code, etc.
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   Author of http://www.iptv-analyzer.org
>   LinkedIn: http://www.linkedin.com/in/brouer
