Date:	Sun, 9 May 2010 11:57:33 +0300
From:	"Michael S. Tsirkin" <mst@...hat.com>
To:	Rusty Russell <rusty@...tcorp.com.au>
Cc:	netdev@...r.kernel.org, virtualization@...ts.linux-foundation.org,
	kvm@...r.kernel.org, linux-kernel@...r.kernel.org, mingo@...e.hu,
	linux-mm@...ck.org, akpm@...ux-foundation.org, hpa@...or.com,
	gregory.haskins@...il.com, s.hetze@...ux-ag.com,
	Daniel Walker <dwalker@...o99.com>,
	Eric Dumazet <eric.dumazet@...il.com>
Subject: Re: virtio: put last_used and last_avail index into ring itself.

On Fri, May 07, 2010 at 12:35:39PM +0930, Rusty Russell wrote:
> On Thu, 6 May 2010 03:57:55 pm Michael S. Tsirkin wrote:
> > On Thu, May 06, 2010 at 10:22:12AM +0930, Rusty Russell wrote:
> > > On Wed, 5 May 2010 03:52:36 am Michael S. Tsirkin wrote:
> > > > What do you think?
> > > 
> > > I think everyone is settled on 128 byte cache lines for the forseeable
> > > future, so it's not really an issue.
> > 
> > You mean with 64 bit descriptors we will be bouncing a cache line
> > between host and guest, anyway?
> 
> I'm confused by this entire thread.
> 
> Descriptors are 16 bytes.  They are at the start, so presumably aligned to
> cache boundaries.
> 
> Available ring follows that at 2 bytes per entry, so it's also packed nicely
> into cachelines.
> 
> Then there's padding to page boundary.  That puts us on a cacheline again
> for the used ring; also 2 bytes per entry.
> 

Hmm, is the used ring really 2 bytes per entry?


/* u32 is used here for ids for padding reasons. */
struct vring_used_elem {
        /* Index of start of used descriptor chain. */
        __u32 id;
        /* Total length of the descriptor chain which was used (written to) */
        __u32 len;
};

struct vring_used {
        __u16 flags;
        __u16 idx;
        struct vring_used_elem ring[];
};
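
To double-check the sizes, here is a minimal userspace sketch that mirrors
the two structs above. It assumes uint16_t/uint32_t as stand-ins for the
kernel's __u16/__u32; used_elem_offset() is a hypothetical helper for
illustration, not something in the virtio headers.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* uint16_t, uint32_t */

/* Userspace mirror of the structs quoted above; the fixed-width
 * types stand in for the kernel's __u16/__u32 (assumption). */
struct vring_used_elem {
        uint32_t id;    /* index of start of used descriptor chain */
        uint32_t len;   /* total length written to the chain */
};

struct vring_used {
        uint16_t flags;
        uint16_t idx;
        struct vring_used_elem ring[];
};

/* Each used entry is 8 bytes, and the ring starts 4 bytes into the struct. */
static_assert(sizeof(struct vring_used_elem) == 8, "used entry is 8 bytes");
static_assert(offsetof(struct vring_used, ring) == 4, "ring[] starts at offset 4");

/* Hypothetical helper: byte offset of used entry i from the start of
 * struct vring_used. */
static inline size_t used_elem_offset(size_t i)
{
        return offsetof(struct vring_used, ring)
                + i * sizeof(struct vring_used_elem);
}
```

So used_elem_offset(0) is 4 and used_elem_offset(15) is 124, i.e. 8 bytes
per entry starting 4 bytes after the start of the struct.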

> I don't see how any change in layout could be more cache friendly?
> Rusty.

I thought that the used ring has 8 bytes per entry, and that struct
vring_used is aligned at a page boundary. This would mean that the first
ring element is at an offset of 4 bytes from the page boundary.
Thus with a cacheline size of 128 bytes, every 16th element (index 15,
31, ...) crosses a cacheline boundary. If we had a 4 byte padding after idx,
each used element would always be completely within a single cacheline.
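
The arithmetic can be checked with a rough sketch like the one below. It
assumes 8-byte used entries and a struct base aligned to the cacheline size
(a page boundary satisfies that); elem_straddles() and count_straddling()
are hypothetical helpers written for this illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define USED_ELEM_SIZE 8u   /* bytes per used ring entry (id + len) */

/* True if used entry i spans two cache lines, given that ring[] starts
 * ring_offset bytes after a cacheline-aligned base address. */
static inline bool elem_straddles(size_t ring_offset, size_t i, size_t cacheline)
{
        assert(cacheline >= USED_ELEM_SIZE);
        size_t first = ring_offset + i * USED_ELEM_SIZE;  /* first byte of entry */
        size_t last = first + USED_ELEM_SIZE - 1;         /* last byte of entry */
        return first / cacheline != last / cacheline;
}

/* Number of entries among the first n that span two cache lines. */
static inline size_t count_straddling(size_t ring_offset, size_t n, size_t cacheline)
{
        size_t count = 0;
        for (size_t i = 0; i < n; i++)
                if (elem_straddles(ring_offset, i, cacheline))
                        count++;
        return count;
}
```

With ring_offset 4 (the current layout) and 128-byte cachelines,
count_straddling(4, 256, 128) gives 16, the straddling entries being
15, 31, 47 and so on. With ring_offset 8, which is what 4 bytes of padding
after idx would give, count_straddling(8, 256, 128) is 0.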

What am I missing?
-- 
MST