Message-ID: <50DD6DF1.7080304@markandruth.co.uk>
Date: Fri, 28 Dec 2012 10:01:21 +0000
From: Mark Zealey <netdev@...kandruth.co.uk>
To: netdev@...r.kernel.org
Subject: UDP multi-core performance on a single socket and SO_REUSEPORT

I appreciate that this question has come up a number of times over the years, most recently as far as I can see in this thread: http://markmail.org/message/hcc7zn5ln5wktypv . I'm going to explain my problem and present some performance numbers to back it up.

The problem: I'm doing some research on scaling a dns server (powerdns) to work well on multi-core boxes (in this case testing with 2*E5-2650 processors, i.e. linux sees 32 cores). My powerdns configuration uses a shared socket, with one thread for each core in the box listening on that socket using poll()/recvmsg(). I've modified powerdns so that in my tests it does the absolute minimum of work to answer packets (all queries are for the same record; it keeps the response in memory and just changes a few fields before calling sendmsg()). I'm binding to a single 10.xxx address and using this for all local and remote tests. The numbers below are generated using 16 parallel queryperf instances on localhost (it doesn't really matter whether they run from remote hosts or the localhost; the numbers don't change much).

Using the stock centos 6.3 kernel, I see powerdns performing at around 120kqps (it uses at most about 12 cpus).

Using a 3.7.1 kernel (from elrepo), I see this increase to 200-240kqps, maxing out all cpus in the box (soft interrupt cpu time is about 8x higher than on the centos 6.3 kernel, at 40%, and system cpu time is at 50%; powerdns itself only uses 10% of the cpu time).

Using the stock centos 6.3 kernel with the Google SO_REUSEPORT patch from 2010 (modified slightly so that it applies), I see 500-600kqps from remote hosts, or 1mqps when doing localhost queries. powerdns doesn't go past using 8 cpus; the limit it is hitting then appears to be some lock in sendmsg(). I've not been able to get the 2010 SO_REUSEPORT patch working on the 3.7.1 kernel; I suspect it would give even better performance there, as sendmsg() should have been significantly improved. (A minimal sketch of this per-thread-socket setup is included after this message.)

Now, I don't believe that SO_REUSEPORT is needed in the kernel for this case; however, the numbers above clearly show that the current UDP implementation for recvmsg() on a single socket across multiple cores on kernel 3.7.1 still suffers from heavy lock contention. A perf report on 3.7.1 (using 16 local queryperf instances) shows:

 68.34%  pdns_server  [kernel.kallsyms]  [k] _raw_spin_lock_bh
         |
         --- 0x7fa472023a2d
             system_call_fastpath
             sys_recvmsg
             __sys_recvmsg
             sock_recvmsg
             inet_recvmsg
             udp_recvmsg
             skb_free_datagram_locked
             |
             |--100.00%-- lock_sock_fast
             |            _raw_spin_lock_bh
              --0.00%-- [...]

  3.10%  pdns_server  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
         |
         --- 0x7fa472023a2d
             system_call_fastpath
             sys_recvmsg
             __sys_recvmsg
             sock_recvmsg
             inet_recvmsg
             udp_recvmsg
             |
             |--99.69%-- __skb_recv_datagram
             |          |
             |          |--77.68%-- _raw_spin_lock_irqsave
             |          |
             |          |--14.56%-- prepare_to_wait_exclusive
             |          |           _raw_spin_lock_irqsave
             |          |
             |           --7.76%-- finish_wait
             |                     _raw_spin_lock_irqsave
              --0.31%-- [...]

...

Any advice or patches welcome... :-)

Mark
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
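
For reference, here is a minimal sketch of the per-thread-socket SO_REUSEPORT setup described in the message above. This is not the poster's powerdns code, only an illustration under assumptions: the port (5300), worker count (16) and buffer size are arbitrary placeholders, and SO_REUSEPORT only has an effect on a kernel carrying the 2010 patch (it was not yet in mainline at the time).

/*
 * Sketch: one UDP socket per worker thread, all bound to the same
 * addr:port via SO_REUSEPORT, instead of N threads blocking in
 * recvmsg() on a single shared socket. Port/thread count are
 * arbitrary example values, not from the original post.
 */
#include <netinet/in.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef SO_REUSEPORT
#define SO_REUSEPORT 15   /* Linux value; only honoured by a patched kernel */
#endif

#define NUM_WORKERS 16
#define LISTEN_PORT 5300  /* example port */

static void *worker(void *arg)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) { perror("socket"); return NULL; }

    int one = 1;
    /* Each worker sets SO_REUSEPORT before bind(), so all workers can
     * bind the same addr:port and each gets its own receive queue,
     * rather than all contending on one socket's locks. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one)) < 0) {
        perror("setsockopt(SO_REUSEPORT)");
        close(fd);
        return NULL;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(LISTEN_PORT);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        close(fd);
        return NULL;
    }

    char buf[512];
    for (;;) {
        struct sockaddr_in peer;
        socklen_t peerlen = sizeof(peer);
        ssize_t n = recvfrom(fd, buf, sizeof(buf), 0,
                             (struct sockaddr *)&peer, &peerlen);
        if (n < 0)
            continue;
        /* ... build the reply in buf, then answer on the same fd ... */
        sendto(fd, buf, (size_t)n, 0, (struct sockaddr *)&peer, peerlen);
    }
}

int main(void)
{
    pthread_t tids[NUM_WORKERS];
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_create(&tids[i], NULL, worker, NULL);
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(tids[i], NULL);
    return 0;
}

The point of the design is that each worker owns its own socket, so each socket has its own receive queue and locks; with a single shared socket every recvmsg() caller contends on that socket's locks, which is what the skb_free_datagram_locked / lock_sock_fast path in the perf report above is showing.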