netdev - Re: [PATCH net-next 4/7] rds: support for zcopy completion notification

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20180128161513.GA22885@oracle.com>
Date:   Sun, 28 Jan 2018 11:15:35 -0500
From:   Sowmini Varadhan <sowmini.varadhan@...cle.com>
To:     Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc:     Network Development <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>, rds-devel@....oracle.com,
        santosh.shilimkar@...cle.com
Subject: Re: [PATCH net-next 4/7] rds: support for zcopy completion
 notification

thanks for taking the time to go through the code!

> 
> An alternative that does not require a timer is to batch on the sk
> error queue itself, like tcp zerocopy. That queues the first notification
> skb on the error queue without any notification latency.
> 
> Then, if a subsequent notification comes in while another is pending
> with < MAX zcookies, it coalesces the new notification onto the pending
> skb and consumes the other. For RDS notifications, the implementation
> is an extra skb_put + uint32_t assignment.

This is an interesting idea, let me give it a try.

> Optionally, the socket can trigger another sk_error_report on each
> new notification.

I was trying to avoid that- an upcall for each message is not a good
idea when you have high traffic and are able to free multiple rds_messages
at a  time.

> > +static void rds_rm_zerocopy_callback(struct rds_sock *rs,
> > +                                    struct rds_znotifier *znotifier,
> > +                                    bool force)
> > +{
> > +       struct sock *sk = rds_rs_to_sk(rs);
> > +       struct sk_buff *skb;
> > +       struct sock_exterr_skb *serr;
> > +       unsigned long flags;
> > +       u32 *ptr;
> > +       int ncookies = 0, i;
> > +       struct rds_znotifier *znotif, *ztmp, *first;
> > +       LIST_HEAD(tmp_list);
> > +
> > +       spin_lock_irqsave(&rs->rs_lock, flags);
> > +       ncookies = rs->rs_ncookies;
> > +       if (ncookies < SO_EE_ORIGIN_MAX_ZCOOKIES && !force) {
> > +               if (znotifier) { /* add this cookie to the list and return */
> 
> can be checked before taking lock.
> 
> More importantly, when is this ever NULL? 

It is null when invoked from tthe timer callback (rs_zcopy_notify() 
going off because havent had any traffic for the expiration interval
so we want to send out pending notifications, but dont have any znotifier
in this case). But you are right in that:

> This function is a callback
> for a zerocopy struct of type znotifier. Is it doing double duty to flush
> any outstanding if znotifier == NULL && force == true? If so, the first
> condition probably never occurs unless force == true and thus the
> second is redundant.

yes, force can simply be !znotifier. 

I dont quite follow the "can be checked before taking the lock
comment though"- the lock is needed to make sure we atomically do
the lists "add new entry and potentially flush" operation.
the check is for the "potentially" part of that operation, so 
I'm not seeing how it would help to move it out of the lock.

having said  all that, I like the earlier suggestion of just 
batching on the error_queue itself. If that works out without any
issues, all of this stuff may not be needed, so let me give that
a shot first.

> This adds unnecessary notification latency to delivery of current
> notification. The latest notification can be appended to tmp_list and
> sent up immediately.

true it if fits within the MAX cookies limit. 

> > +               (void)mod_timer(&rs->rs_cookie_timer, RDS_REAP_TIMEOUT);
> 
> Resetting timeout on each queued notification causes unbound
> notification latency for previous notifications on the queue.

I'm not sure I get that comment. RDS_REAP_TIMEOUT is the upper bound
for sending notification. If, in the interim, we get back a TCP
ack that lets us reap a bunch of messages, we'd go and flush the
queue anyway, so the RDS_REAP_TIMEOUT will not matter. Can you 
elaborate on your concern?

> > +
> > +               sock_put(rds_rs_to_sk(rs));
> > +               rm->m_rs = NULL;
> 
> These two lines are now called only if znotifier is true, but
> used to be called whenever rm->m_rs != NULL. Intentional?

good catch, no that's a bug. Thanks for flagging.

Also, ACK to all the comments for patch 5/7. Will fix for next round.