Message-ID: <5711FD35.90108@seti.kr.ua>
Date: Sat, 16 Apr 2016 11:52:05 +0300
From: Andrew <nitr0@...i.kr.ua>
To: Michael Ma <make0818@...il.com>, Jesper Dangaard Brouer <brouer@...hat.com>
Cc: netdev@...r.kernel.org
Subject: Re: qdisc spin lock

I don't think that's a good solution - unless you can bind a specific
host (src/dst) to a specific txq. Traffic is usually spread across txqs
by a hash of src+dst IP (or even IP:port), which scatters it among all
the MQ queues on the device and breaks the bandwidth limit: a
multi-session load such as p2p or server traffic gets N*bandwidth (an
XPS-based workaround is sketched after the thread)...

People have said that the HFSC shaper performs better, but I haven't
tested it.

On 01.04.2016 02:41, Michael Ma wrote:
> Thanks for the suggestion - I'll try the MQ solution out. It seems to
> solve the problem well under the assumption that bandwidth can be
> statically partitioned.
>
> 2016-03-31 12:18 GMT-07:00 Jesper Dangaard Brouer <brouer@...hat.com>:
>> On Wed, 30 Mar 2016 00:20:03 -0700 Michael Ma <make0818@...il.com> wrote:
>>
>>> I know this might be an old topic, so bear with me - what we are
>>> facing is that applications are sending small packets from hundreds
>>> of threads, and the contention on the spin lock in __dev_xmit_skb
>>> increases the latency of dev_queue_xmit significantly. We're building
>>> a network QoS solution based on HTB to avoid interference between
>>> different applications.
>>
>> Yes, as you have noticed, with HTB there is a single qdisc lock, and
>> congestion obviously happens :-)
>>
>> It is possible to make it scale with various tricks. I believe Google
>> is using a variant of HTB, and it scales for them. They have not
>> open-sourced their modifications to HTB (which likely also involve a
>> great deal of setup tricks).
>>
>> If your purpose is to limit traffic/bandwidth per "cloud" instance,
>> then you can just use another TC setup structure, like MQ with one
>> HTB instance per MQ queue (where the MQ queues are bound to each
>> CPU/HW queue)... But you have to figure out this setup yourself (a
>> sketch of it follows the thread)...
>>
>>> But in this case, when some applications send massive numbers of
>>> small packets in parallel, the application to be protected still sees
>>> its throughput drop (it does synchronous network communication from
>>> multiple threads, so its throughput is sensitive to the added
>>> latency).
>>>
>>> Here is the profile from perf:
>>>
>>> - 67.57% iperf [kernel.kallsyms] [k] _spin_lock
>>>    - 99.94% dev_queue_xmit
>>>       - 96.91% _spin_lock
>>>    - 2.62% __qdisc_run
>>>       - 98.98% sch_direct_xmit
>>>          - 99.98% _spin_lock
>>>
>>> As far as I understand, the design of TC is to keep the locking
>>> scheme simple and minimize the work done in __qdisc_run so that
>>> throughput isn't hurt, especially with large packets. However, when
>>> the classes in the queueing discipline carry only a shaping limit,
>>> there is no necessary coupling between the different classes. The
>>> only synchronization point should be when a packet is dequeued from
>>> the qdisc queue and enqueued on the device's transmit queue. My
>>> question is - is it worth investing in avoiding the lock contention
>>> by partitioning the queue/lock so that this scenario is handled with
>>> lower latency?
>> Yes, there is a lot to gain, but it is not easy ;-)
>>
>>> I must have oversimplified a lot of details, since I'm not familiar
>>> with the TC implementation at this point - I just want your input on
>>> whether this is a worthwhile effort, or whether there is something
>>> fundamental I'm not aware of. If it is just a matter of quite a bit
>>> of additional work, I would also appreciate help outlining the
>>> required work.
>>>
>>> I would also appreciate any information on the latest status of this
>>> work: http://www.ijcset.com/docs/IJCSET13-04-04-113.pdf
>>
>> This article seems to be of very low quality... spelling errors, only
>> 5 pages, no real code, etc.
>>
>> --
>> Best regards,
>>   Jesper Dangaard Brouer
>>   MSc.CS, Principal Kernel Engineer at Red Hat
>>   Author of http://www.iptv-analyzer.org
>>   LinkedIn: http://www.linkedin.com/in/brouer
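A minimal sketch of the MQ-plus-HTB layout Jesper describes above,
assuming a hypothetical 4-queue NIC named eth0; the device name,
handles, rates, and queue count are illustrative, not taken from the
thread. Each hardware TX queue gets its own HTB instance, and therefore
its own qdisc lock:

    # Replace the single root qdisc with mq; mq exposes one class per
    # hardware TX queue (1:1 ... 1:4 on a 4-queue NIC).
    tc qdisc add dev eth0 root handle 1: mq

    # Attach an independent HTB shaper under each hardware queue.
    # 250mbit per queue approximates a 1gbit aggregate cap, but only
    # if traffic spreads evenly across the queues.
    for i in 1 2 3 4; do
        tc qdisc add dev eth0 parent 1:$i handle ${i}0: htb default 1
        tc class add dev eth0 parent ${i}0: classid ${i}0:1 htb rate 250mbit
    done

The caveat is exactly the one Andrew raises: the limit is now per
queue, so a host whose flows hash across all four queues can reach
four times the intended rate.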
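For that queue-spreading problem, one possible workaround (an addition
of this note, not something proposed in the thread) is XPS (Transmit
Packet Steering), which picks the TX queue from the sending CPU rather
than from the flow hash. Combined with CPU pinning, it keeps all of one
application's flows on one queue and thus under one HTB instance. Again
assuming the hypothetical 4-queue eth0:

    # xps_cpus is a hexadecimal CPU bitmask: map each CPU to one queue.
    echo 1 > /sys/class/net/eth0/queues/tx-0/xps_cpus   # CPU 0 -> tx-0
    echo 2 > /sys/class/net/eth0/queues/tx-1/xps_cpus   # CPU 1 -> tx-1
    echo 4 > /sys/class/net/eth0/queues/tx-2/xps_cpus   # CPU 2 -> tx-2
    echo 8 > /sys/class/net/eth0/queues/tx-3/xps_cpus   # CPU 3 -> tx-3

    # Pin the protected application to CPU 0 so its traffic stays on
    # tx-0 and is shaped by tx-0's HTB alone (binary name illustrative).
    taskset -c 0 ./protected_app

Note that established sockets may cache their queue choice, so the
steering is most reliable when configured before the flows start.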