Date:	Mon, 23 May 2016 23:35:04 +0300
From:	"Michael S. Tsirkin" <mst@...hat.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	linux-kernel@...r.kernel.org, Jason Wang <jasowang@...hat.com>,
	davem@...emloft.net, netdev@...r.kernel.org,
	Steven Rostedt <rostedt@...dmis.org>, brouer@...hat.com
Subject: Re: [PATCH v5 0/2] skb_array: array based FIFO for skbs

On Mon, May 23, 2016 at 06:31:46AM -0700, Eric Dumazet wrote:
> On Mon, 2016-05-23 at 13:43 +0300, Michael S. Tsirkin wrote:
> > This is in response to the proposal by Jason to make tun
> > rx packet queue lockless using a circular buffer.
> > My testing seems to show that at least for the common usecase
> > in networking, which isn't lockless, circular buffer
> > with indices does not perform that well, because
> > each index access causes a cache line to bounce between
> > CPUs, and index access causes stalls due to the dependency.
> > 
> > By comparison, an array of pointers where NULL means invalid
> > and !NULL means valid, can be updated without messing up barriers
> > at all and does not have this issue.
> 
> Note that both consumers and producers write in the array, so in light
> load (like TCP_RR), there are 2 cache lines used by the producers, and 2
> cache lines used by the consumers, with potential bouncing.

The shared part is read-only for both producer and consumer,
so it does not bounce: it can stay shared in both caches.

Clearly the memory footprint of this data structure is bigger,
so it might cause more cache misses.
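
To make the "NULL means invalid, !NULL means valid" idea concrete, here is
a minimal userspace sketch. It is not the skb_array code from the patch:
the names (ptr_fifo, ptr_fifo_produce, ptr_fifo_consume) are hypothetical,
and it is single-threaded, so the locking and barriers a real
producer/consumer pair needs are left out. The point it illustrates is
that each side decides full/empty by looking at its own slot only, never
at the other side's index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch: an array of pointers used as a FIFO. */
struct ptr_fifo {
	void **queue;    /* a slot is NULL when empty, non-NULL when it holds an item */
	size_t size;     /* number of slots; never written after init */
	size_t producer; /* next slot to fill; touched only by the producer */
	size_t consumer; /* next slot to drain; touched only by the consumer */
};

/* Allocate a zeroed slot array. Returns 0 on success, -1 on allocation failure. */
static int ptr_fifo_init(struct ptr_fifo *f, size_t size)
{
	assert(size > 0);
	f->queue = calloc(size, sizeof(*f->queue));
	if (!f->queue)
		return -1;
	f->size = size;
	f->producer = 0;
	f->consumer = 0;
	return 0;
}

/* Returns 0 on success, -1 if item is NULL or the next slot is still occupied (full). */
static int ptr_fifo_produce(struct ptr_fifo *f, void *item)
{
	if (!item || f->queue[f->producer])
		return -1;
	f->queue[f->producer] = item;
	if (++f->producer == f->size)
		f->producer = 0;
	return 0;
}

/* Returns the oldest item, or NULL if the next slot is empty. */
static void *ptr_fifo_consume(struct ptr_fifo *f)
{
	void *item = f->queue[f->consumer];

	if (!item)
		return NULL;
	f->queue[f->consumer] = NULL; /* hand the slot back to the producer */
	if (++f->consumer == f->size)
		f->consumer = 0;
	return item;
}
```

Note that the consumer does write into the array (it stores NULL back), so
the array lines themselves can still move between CPUs; only size and the
queue pointer stay read-only after init.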

> On the other hand, the traditional sk_buff_head has one cache line,
> holding the spinlock and list head/tail.
>
> We might use the 'shared cache line' :
> 
> +       /* Shared consumer/producer data */
> +       int size ____cacheline_aligned_in_smp; /* max entries in queue */
> +       struct sk_buff **queue;
> 
> 
> To put here some fast path involving a single cache line access when
> queue has 0 or 1 item.
> 

I will try to experiment with it, but please note that
this cache line is currently read-only for both producer and consumer;
if we make it writable, it will start bouncing.
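
The layout concern can be sketched in plain C11. This is an illustration
only, not the struct from the patch: the struct name, the CACHE_LINE value
of 64 bytes, and the cache_line_of() helper are all assumptions. Producer
and consumer state each get their own line, and the fields that are only
read after init (size, queue) share a third line, which is why that line
can sit in both CPUs' caches without bouncing.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64 /* assumed cache line size; varies by CPU */

/* Hypothetical layout: one line per writer, one shared read-only line. */
struct fifo_layout {
	alignas(CACHE_LINE) size_t producer; /* written by the producer only */
	alignas(CACHE_LINE) size_t consumer; /* written by the consumer only */
	alignas(CACHE_LINE) size_t size;     /* shared, read-only after init */
	void **queue;                        /* shared, read-only after init */
};

/* Index of the cache line (relative to the struct start) holding an offset. */
static size_t cache_line_of(size_t offset)
{
	return offset / CACHE_LINE;
}

/* The two read-only fields are expected to land on the same line. */
static_assert(offsetof(struct fifo_layout, size) / CACHE_LINE ==
	      offsetof(struct fifo_layout, queue) / CACHE_LINE,
	      "shared read-only fields should share one cache line");
```

Putting a fast-path field that gets written into that third line would
turn it into one more line that both CPUs modify.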

-- 
MST
