lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 27 Nov 2016 17:21:33 +0100
From:   Paolo Abeni <pabeni@...hat.com>
To:     Eric Dumazet <eric.dumazet@...il.com>
Cc:     netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Hannes Frederic Sowa <hannes@...essinduktion.org>,
        Sabrina Dubroca <sd@...asysnail.net>
Subject: Re: [PATCH net-next 4/5] net/socket: add helpers for recvmmsg

Hi Eric,

On Fri, 2016-11-25 at 14:30 -0800, Eric Dumazet wrote:
> On Fri, 2016-11-25 at 16:39 +0100, Paolo Abeni wrote:
> > _skb_try_recv_datagram_batch dequeues multiple skb's from the
> > socket's receive queue, and runs the bulk_destructor callback under
> > the receive queue lock.
> 
> ...
> 
> > +	last = (struct sk_buff *)queue;
> > +	first = (struct sk_buff *)queue->next;
> > +	skb_queue_walk(queue, skb) {
> > +		last = skb;
> > +		totalsize += skb->truesize;
> > +		if (++datagrams == batch)
> > +			break;
> > +	}
> 
> This is absolutely not good.
> 
> Walking through a list, bringing 2 cache lines per skb, is not the
> proper way to deal with bulking.
> 
> And I do not see where 'batch' value coming from user space is capped ?
> 
> Is it really vlen argument coming from recvmmsg() system call ???
> 
> This code runs with BH masked, so you do not want to give user a way to
> make you loop there 1000 times 
> 
> Bulking is nice, only if you do not compromise with system stability and
> latency requirements from other users/applications.

Thank you for reviewing this.

You are right, the cacheline miss while accessing skb->truesize has
measurable performance impact, and the max burst length comes in from
recvmmsg(). 

We can easily cap the burst to some max value (e.g. 64 or less) and we
can pass to the bulk destructor the skb list and burst length without
accessing skb truesize beforehand. If the burst is short, say 8 skbs or
less, the bulk destructor walk again the list and release the memory,
elsewhere it defers the release after __skb_try_recv_datagram_batch()
completion: we walk the list without the lock held and we acquire it
later again to release all the memory. 

Thank you again,

Paolo


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ