[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF=yD-LecYGeH1Wg3kL-pDDKaFX0pbXzr_dfVPrsrBL3uL8VWw@mail.gmail.com>
Date: Wed, 21 Feb 2018 16:54:56 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Sowmini Varadhan <sowmini.varadhan@...cle.com>
Cc: Network Development <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>, rds-devel@....oracle.com,
Santosh Shilimkar <santosh.shilimkar@...cle.com>
Subject: Re: [PATCH net-next] RDS: deliver zerocopy completion notification
with data as an optimization
On Wed, Feb 21, 2018 at 3:19 PM, Sowmini Varadhan
<sowmini.varadhan@...cle.com> wrote:
> This commit is an optimization that builds on top of commit 01883eda72bd
> ("rds: support for zcopy completion notification") for PF_RDS sockets.
>
> Cookies associated with zerocopy completion are passed up on the POLLIN
> channel, piggybacked with data whereever possible. Such cookies are passed
> up as ancillary data (at level SOL_RDS) in a struct rds_zcopy_cookies when
> the returned value of recvmsg() is >= 0. A max of SO_EE_ORIGIN_MAX_ZCOOKIES
> may be passed with each message.
I'd put this optimization behind a socket option.
Either that, or always store cookies on an RDS specific queue, return as recv
cmsg and then remove the alternative error queue path completely.
Having both is confusing, also in terms of system behavior (e.g., see signaling
and sk_err handling in sock_dequeue_err_skb).
> +static int rds_recvmsg_zcookie(struct rds_sock *rs, struct msghdr *msg)
> +{
> + struct sk_buff *skb, *tmp;
> + struct sock_exterr_skb *serr;
> + struct sock *sk = rds_rs_to_sk(rs);
> + struct sk_buff_head *q = &sk->sk_error_queue;
> + struct rds_zcopy_cookies done;
> + u32 *ptr;
> + int i;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&q->lock, flags);
This seems expensive on every recvmsg. Even if zerocopy is not enabled.
> + if (skb_queue_empty(q)) {
> + spin_unlock_irqrestore(&q->lock, flags);
> + return 0;
> + }
> + skb_queue_walk_safe(q, skb, tmp) {
This too. If anything, just peek at the queue head and skip otherwise.
Anything else will cause an error, which blocks regular read and write
on the socket.
> int rds_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
> int msg_flags)
> {
> @@ -586,6 +623,7 @@ int rds_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
> int ret = 0, nonblock = msg_flags & MSG_DONTWAIT;
> DECLARE_SOCKADDR(struct sockaddr_in *, sin, msg->msg_name);
> struct rds_incoming *inc = NULL;
> + int ncookies;
>
> /* udp_recvmsg()->sock_recvtimeo() gets away without locking too.. */
> timeo = sock_rcvtimeo(sk, nonblock);
> @@ -609,6 +647,14 @@ int rds_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
> break;
> }
>
> + if (list_empty(&rs->rs_recv_queue) && nonblock) {
> + ncookies = rds_recvmsg_zcookie(rs, msg);
> + if (ncookies) {
> + ret = 0;
This probably changes semantics for MSG_DONTWAIT. Will it ever return 0 now?
Powered by blists - more mailing lists