[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF=yD-KuLHa=og50nw5MQpsN8dgCPJ+iFhiCVF=OvkVJFe8knA@mail.gmail.com>
Date: Wed, 21 Feb 2018 17:50:44 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Sowmini Varadhan <sowmini.varadhan@...cle.com>
Cc: Network Development <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>, rds-devel@....oracle.com,
Santosh Shilimkar <santosh.shilimkar@...cle.com>
Subject: Re: [PATCH net-next] RDS: deliver zerocopy completion notification
with data as an optimization
On Wed, Feb 21, 2018 at 5:14 PM, Sowmini Varadhan
<sowmini.varadhan@...cle.com> wrote:
> On (02/21/18 16:54), Willem de Bruijn wrote:
>>
>> I'd put this optimization behind a socket option.
>
> Yes, that thought occurred to me as well- I think RDS applications
> are unlikely to use the error_queue path because, as I mentioned
> before, these are heavily request-response based, so you're
> going to be getting something back for each sendmsg anyway.
>
> So if I had a sockopt, it would be something that would be
> "piggyback completion" most (all?) of the time anyway, but I suppose
> making it explicit is useful to have.
>
>> Either that, or always store cookies on an RDS specific queue, return as recv
>> cmsg and then remove the alternative error queue path completely.
>
> I think the error queue path is good to have, if only to be aligned
> with other socket families.
>
>> > + spin_lock_irqsave(&q->lock, flags);
>>
>> This seems expensive on every recvmsg. Even if zerocopy is not enabled.
>
> noted, I'll predicate it on both zcopy and the sockopt suggestion.
>
>> > + if (skb_queue_empty(q)) {
>> > + spin_unlock_irqrestore(&q->lock, flags);
>> > + return 0;
>> > + }
>> > + skb_queue_walk_safe(q, skb, tmp) {
>>
>> This too. If anything, just peek at the queue head and skip otherwise.
>> Anything else will cause an error, which blocks regular read and write
>> on the socket.
>
> but that would be technically incorrect, because you could have a
> mix/match of the zcopy notification with other sk_error_queue messages
> (given that rds_rm_zerocopy_callback looks at the tail, and
> creates a new skb if the tail is not a SO_EE_ORIGIN_ZCOOKIE).
>
> Of course, its true that *today* the only thing in the rds socket
> error queue is the cookie list, so this queue walk is a bit of overkill..
>
> but maybe I am missing something you are concerned about? The queue walk is
> intended to pull out the first SO_EE_ORIGIN_ZCOOKIE (if it exists)
> and return quietly otherwise
Yes, avoiding out of order completions is sensible.
In the common case no more than one notification will be outstanding,
but with a fixed number of notifications per packet, in edge cases this
list may be long.
Reading using a reverse walk avoids that problem.
> (leaving err_queue unchanged)- where
> do you see the blocking error?
Socket functions block if sk_err is non-zero. See for instance
tcp_sendmsg_locked. It is set by most code that also calls
sock_queue_err_skb and also on dequeue from err skb.
This is the main reason that I would consider dropping error
queue completely if you expect all users of RDS to use the
cmsg on regular read to get these notifications.
>> > + if (list_empty(&rs->rs_recv_queue) && nonblock) {
>> > + ncookies = rds_recvmsg_zcookie(rs, msg);
>> > + if (ncookies) {
>> > + ret = 0;
>>
>> This probably changes semantics for MSG_DONTWAIT. Will it ever return 0 now?
>
> The only time MSG_DONTWAIT was previously returning 0 for PF_RDS
> was when there was a congestion notification or rdma completion
> i.e., the checks for rs_notify_queue and rs_cong_notify earlier
> in the function, and in those 2 cases it was not returning data
> (i.e. ret was always 0). In all other cases rds_recvmsg would
> always return either
> - ret == -1, with errno set to EAGAIN or ETIMEDOUT, depending on the value
> of MSG_DONTWAIT, or,
> - ret > 0 with data bytes.
>
> That behavior (as well existing behavior for congestion notification
> and rdma completion) has not changed. The only thing that changed is that
> if there is no data to pass up, the ret will be 0, errno will be 0,
> and the ancillary data may have completion cookies.
Okay. If callers must already handle 0 as a valid return code, then
it is fine to add another case that does this.
The extra branch in the hot path is still rather unfortunately. Could
this be integrated in the existing if (nonblock) branch below?
Powered by blists - more mailing lists