lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Fri, 3 May 2013 14:38:25 -0700 From: Tom Herbert <therbert@...gle.com> To: David Majnemer <majnemer@...gle.com> Cc: "David S. Miller" <davem@...emloft.net>, Maciej Żenczykowski <maze@...gle.com>, Paul Turner <pjt@...gle.com>, Linux Netdev List <netdev@...r.kernel.org> Subject: Re: [PATCH RFC] net: Update RPS target at poll for inet I think this is nice idea, if someone is polling and then reading from another thread (CPU) that's not going to be good performance anyway. Acked-by: Tom Herbert <therbert@...gle.com> On Wed, Apr 17, 2013 at 1:10 PM, David Majnemer <majnemer@...gle.com> wrote: > Hello, > > The current state of affairs is that read(2)/write(2) will engage > RPS (receive packet steering) for internet protocol sockets while > poll(2) does not. > > We noticed that this is suboptimal for scenarios similar to the following: > 1. Perform a poll(2) on a socket. > 2. Block in sock_poll_wait (or something similar). > 3. Perform a read(2) on the previously mentioned socket. > > This patch updates the RPS target during poll(2) which improves this scenario. > > Concretely, we saw improvements on the order of ~20% for the median and > ~6% for the mean on our internal packet forwarding workload. Let me know if > there is another workload I should take a look at. > > Thanks, > David > > -----------------------------8<--------------------------- > > When poll(2) gets called with an ip socket, update the flow target to be the > poller. The data will be on the appropriate CPU when the poller later calls > recvmsg(2). > > Signed-off-by: David Majnemer <majnemer@...gle.com> > --- > include/net/inet_common.h | 2 ++ > net/ipv4/af_inet.c | 13 ++++++++++++- > net/ipv4/tcp.c | 2 ++ > net/ipv4/udp.c | 2 ++ > 4 files changed, 18 insertions(+), 1 deletion(-) > > diff --git a/include/net/inet_common.h b/include/net/inet_common.h > index 2340087..21ecdb3 100644 > --- a/include/net/inet_common.h > +++ b/include/net/inet_common.h > @@ -20,6 +20,8 @@ extern int __inet_stream_connect(struct socket *sock, struct sockaddr *uaddr, > int addr_len, int flags); > extern int inet_dgram_connect(struct socket *sock, struct sockaddr *uaddr, > int addr_len, int flags); > +extern unsigned int inet_dgram_poll(struct file *file, struct socket *sock, > + poll_table *wait); > extern int inet_accept(struct socket *sock, struct socket *newsock, int flags); > extern int inet_sendmsg(struct kiocb *iocb, struct socket *sock, > struct msghdr *msg, size_t size); > diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c > index c61b3bb..2c030ef 100644 > --- a/net/ipv4/af_inet.c > +++ b/net/ipv4/af_inet.c > @@ -567,6 +567,17 @@ int inet_dgram_connect(struct socket *sock, struct sockaddr *uaddr, > } > EXPORT_SYMBOL(inet_dgram_connect); > > +unsigned int inet_dgram_poll(struct file *file, struct socket *sock, > + poll_table *wait) > +{ > + struct sock *sk = sock->sk; > + > + sock_rps_record_flow(sk); > + > + return datagram_poll(file, sock, wait); > +} > +EXPORT_SYMBOL(inet_dgram_poll); > + > static long inet_wait_for_connect(struct sock *sk, long timeo, int writebias) > { > DEFINE_WAIT(wait); > @@ -1001,7 +1012,7 @@ static const struct proto_ops inet_sockraw_ops = { > .socketpair = sock_no_socketpair, > .accept = sock_no_accept, > .getname = inet_getname, > - .poll = datagram_poll, > + .poll = inet_dgram_poll, > .ioctl = inet_ioctl, > .listen = sock_no_listen, > .shutdown = inet_shutdown, > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c > index dcb116d..8d80092 100644 > --- a/net/ipv4/tcp.c > +++ b/net/ipv4/tcp.c > @@ -436,6 +436,8 @@ unsigned int tcp_poll(struct file *file, struct socket *sock, poll_table *wait) > struct sock *sk = sock->sk; > const struct tcp_sock *tp = tcp_sk(sk); > > + sock_rps_record_flow(sk); > + > sock_poll_wait(file, sk_sleep(sk), wait); > if (sk->sk_state == TCP_LISTEN) > return inet_csk_listen_poll(sk); > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c > index 3159d16..5f115cb 100644 > --- a/net/ipv4/udp.c > +++ b/net/ipv4/udp.c > @@ -1967,6 +1967,8 @@ unsigned int udp_poll(struct file *file, struct socket *sock, poll_table *wait) > unsigned int mask = datagram_poll(file, sock, wait); > struct sock *sk = sock->sk; > > + sock_rps_record_flow(sk); > + > /* Check for false positives due to checksum errors */ > if ((mask & POLLRDNORM) && !(file->f_flags & O_NONBLOCK) && > !(sk->sk_shutdown & RCV_SHUTDOWN) && !first_packet_length(sk)) > -- > 1.8.2.1 > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists