Date: Fri, 15 Jun 2012 13:37:42 +0800
From: Li Yu <raise.sail@...il.com>
To: Changli Gao <xiaosuo@...il.com>
CC: Linux Netdev List <netdev@...r.kernel.org>,
    Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
    davidel@...ilserver.org
Subject: Re: [RFC] Introduce to batch variants of accept() and epoll_ctl() syscall

On 2012-06-15 12:29, Changli Gao wrote:
> On Fri, Jun 15, 2012 at 12:13 PM, Li Yu<raise.sail@...il.com> wrote:
>> Hi,
>>
>> We have run into a performance problem in a large-scale computer
>> cluster that needs to handle a lot of concurrent incoming TCP
>> connection requests.
>>
>> top shows that the kernel is the biggest CPU consumer. The test is
>> simple, just an accept() -> epoll_ctl(ADD) loop, and the ratio of
>> CPU utilization sys% to si% is about 2:5.
>>
>> I also asked some experienced webserver/proxy developers on my team
>> for suggestions. It seems that many userland programs already call
>> accept() multiple times after being woken up by epoll_wait(). The
>> common next step is to add the fd that accept() returns to the epoll
>> interface with the epoll_ctl() syscall.
>>
>> Therefore, I think we had better introduce batch variants of the
>> accept() and epoll_ctl() syscalls, just like sendmmsg() or recvmmsg().
>>
>> For accept(), we may need a new syscall. It could look like this:
>>
>>     struct accept_result {
>>         int fd;
>>         struct sockaddr addr;
>>         socklen_t addr_len;
>>     };
>>
>>     int maccept4(int fd, int flags, int nr_accept_result,
>>                  struct accept_result *results);
>>
>> For epoll_ctl(), there are two ways to extend it. I prefer to extend
>> the current interface instead of introducing a new syscall. We may
>> introduce a new flag, EPOLL_CTL_BATCH. If userland calls epoll_ctl()
>> with this flag set, the meaning of the last two arguments of
>> epoll_ctl() changes, e.g.:
>>
>>     struct batch_epoll_event batch_events[] = {
>>         {
>>             .fd = a_newsock_fd,
>>             .epoll_event = { ... },
>>         },
>>         ...
>>     };
>>
>>     ret = epoll_ctl(fd, EPOLL_CTL_ADD|EPOLL_CTL_BATCH, nr_batch_events,
>>                     batch_events);
>>
>
> I think it is a good idea. Would you please implement a prototype and
> give some numbers? This kind of data may help sell this idea.
> Thanks.
>

Of course. I think implementing them should not be hard work :)

Hmm. I really do not know whether it is necessary to introduce a new
syscall here. An alternative would be to add a new socket option to
handle this batching requirement, so that applications could also detect
whether the kernel has this extended ability with a simple getsockopt()
call.

Anyway, I am going to try to write a prototype first.

Thanks

Yu

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html