lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 21 Feb 2018 17:50:44 -0500
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Sowmini Varadhan <sowmini.varadhan@...cle.com>
Cc:     Network Development <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>, rds-devel@....oracle.com,
        Santosh Shilimkar <santosh.shilimkar@...cle.com>
Subject: Re: [PATCH net-next] RDS: deliver zerocopy completion notification
 with data as an optimization

On Wed, Feb 21, 2018 at 5:14 PM, Sowmini Varadhan
<sowmini.varadhan@...cle.com> wrote:
> On (02/21/18 16:54), Willem de Bruijn wrote:
>>
>> I'd put this optimization behind a socket option.
>
> Yes, that thought occurred to me as well- I think RDS applications
> are unlikely to use the error_queue path because, as I mentioned
> before, these are heavily request-response based, so you're
> going to be getting something back for each sendmsg anyway.
>
> So if I had a sockopt, it would be something that would be
> "piggyback completion" most (all?) of the time anyway, but I suppose
> making it explicit is useful to have.
>
>> Either that, or always store cookies on an RDS specific queue, return as recv
>> cmsg and then remove the alternative error queue path completely.
>
> I think the error queue path is good to have, if only to be aligned
> with other socket families.
>
>> > +       spin_lock_irqsave(&q->lock, flags);
>>
>> This seems expensive on every recvmsg. Even if zerocopy is not enabled.
>
> noted, I'll predicate it on both zcopy and the sockopt suggestion.
>
>> > +       if (skb_queue_empty(q)) {
>> > +               spin_unlock_irqrestore(&q->lock, flags);
>> > +               return 0;
>> > +       }
>> > +       skb_queue_walk_safe(q, skb, tmp) {
>>
>> This too. If anything, just peek at the queue head and skip otherwise.
>> Anything else will cause an error, which blocks regular read and write
>> on the socket.
>
> but that would be technically incorrect, because you could have a
> mix/match of the zcopy notification with other sk_error_queue messages
> (given that rds_rm_zerocopy_callback looks at the tail, and
> creates a new skb if the tail is not a SO_EE_ORIGIN_ZCOOKIE).
>
> Of course, its true that *today* the only thing in the rds socket
> error queue is the cookie list, so this queue walk is a bit of overkill..
>
> but maybe I am missing something you are concerned about? The queue walk is
> intended to pull out the first SO_EE_ORIGIN_ZCOOKIE (if it exists)
> and return quietly otherwise

Yes, avoiding out of order completions is sensible.

In the common case no more than one notification will be outstanding,
but with a fixed number of notifications per packet, in edge cases this
list may be long.

Reading using a reverse walk avoids that problem.

> (leaving err_queue unchanged)- where
> do you see the blocking error?

Socket functions block if sk_err is non-zero. See for instance
tcp_sendmsg_locked. It is set by most code that also calls
sock_queue_err_skb and also on dequeue from err skb.

This is the main reason that I would consider dropping error
queue completely if you expect all users of RDS to use the
cmsg on regular read to get these notifications.

>> > +               if (list_empty(&rs->rs_recv_queue) && nonblock) {
>> > +                       ncookies = rds_recvmsg_zcookie(rs, msg);
>> > +                       if (ncookies) {
>> > +                               ret = 0;
>>
>> This probably changes semantics for MSG_DONTWAIT. Will it ever return 0 now?
>
> The only time MSG_DONTWAIT was previously returning 0 for PF_RDS
> was when there was a congestion notification or rdma completion
> i.e., the checks for rs_notify_queue and rs_cong_notify earlier
> in the function, and in those 2 cases it was not returning data
> (i.e. ret was always 0). In all other cases  rds_recvmsg would
> always return either
> - ret == -1, with errno set to EAGAIN or ETIMEDOUT, depending on the value
>   of MSG_DONTWAIT, or,
> - ret > 0 with data bytes.
>
> That behavior (as well existing behavior for congestion notification
> and rdma completion) has not changed. The only thing that changed is that
> if there is no data to pass up, the ret will be 0, errno will be 0,
> and the ancillary data may have completion cookies.

Okay. If callers must already handle 0 as a valid return code, then
it is fine to add another case that does this.

The extra branch in the hot path is still rather unfortunately. Could
this be integrated in the existing if (nonblock) branch below?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ