Message-Id: <201005101241.57237.rusty@rustcorp.com.au>
Date: Mon, 10 May 2010 12:41:56 +0930
From: Rusty Russell <rusty@...tcorp.com.au>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: netdev@...r.kernel.org, virtualization@...ts.linux-foundation.org,
kvm@...r.kernel.org, linux-kernel@...r.kernel.org, mingo@...e.hu,
linux-mm@...ck.org, akpm@...ux-foundation.org, hpa@...or.com,
gregory.haskins@...il.com, s.hetze@...ux-ag.com,
Daniel Walker <dwalker@...o99.com>,
Eric Dumazet <eric.dumazet@...il.com>
Subject: Re: virtio: put last_used and last_avail index into ring itself.

On Sun, 9 May 2010 06:27:33 pm Michael S. Tsirkin wrote:
> On Fri, May 07, 2010 at 12:35:39PM +0930, Rusty Russell wrote:
> > Then there's padding to page boundary. That puts us on a cacheline again
> > for the used ring; also 2 bytes per entry.
> >
>
> Hmm, is used ring really 2 bytes per entry?

Err, no, I am an idiot.

> /* u32 is used here for ids for padding reasons. */
> struct vring_used_elem {
> 	/* Index of start of used descriptor chain. */
> 	__u32 id;
> 	/* Total length of the descriptor chain which was used (written to) */
> 	__u32 len;
> };
>
> struct vring_used {
> 	__u16 flags;
> 	__u16 idx;
> 	struct vring_used_elem ring[];
> };

OK, now I get it. Sorry, I was focussed on the avail ring.

> I thought that used ring has 8 bytes per entry, and that struct
> vring_used is aligned at page boundary, this
> would mean that ring element is at offset 4 bytes from page boundary.
> Thus with cacheline size 128 bytes, each 4th element crosses
> a cacheline boundary. If we had a 4 byte padding after idx, each
> used element would always be completely within a single cacheline.

I think the numbers are: every 16th entry hits two cachelines. So currently
the first 15 entries are "free" (assuming we hit the idx cacheline anyway),
then 1 in 16 costs 2 cachelines. That makes the aligned version win when
N > 240.
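
The "every 16th entry" figure can be checked with a short sketch. This is
an illustration only, not code from the thread or the kernel: the helper
names are made up, and it assumes 8-byte used elements, a 128-byte
cacheline, and a ring[] that starts either 4 bytes into the page (current
layout) or 8 bytes in (with the proposed 4 bytes of padding after idx).

```c
#include <stdbool.h>
#include <stddef.h>

/* sizeof(struct vring_used_elem): two __u32 fields. */
#define USED_ELEM_SIZE 8u

/*
 * Hypothetical helper: does used-ring entry i span two cachelines?
 * ring_offset is the byte offset of ring[0] from the (page-aligned)
 * start of struct vring_used.
 */
static bool used_elem_straddles(size_t i, size_t ring_offset, size_t cacheline)
{
	size_t first = ring_offset + i * USED_ELEM_SIZE;
	size_t last = first + USED_ELEM_SIZE - 1;

	return first / cacheline != last / cacheline;
}

/* Hypothetical helper: count the straddling entries among the first n. */
static size_t count_straddlers(size_t n, size_t ring_offset, size_t cacheline)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (used_elem_straddles(i, ring_offset, cacheline))
			count++;
	return count;
}
```

Under those assumptions, used_elem_straddles(i, 4, 128) is true for
i = 15, 31, 47, ... and false for entries 0 to 14, while
count_straddlers(n, 8, 128) stays at zero for any n.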

But, we access the array linearly. So the extra cacheline cost is in fact
amortized. I doubt it could be measured, but maybe vring_get_buf() should
prefetch? While you're there, we could use an & rather than a mod on the
calculation, which may actually be measurable :)
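
A minimal sketch of both suggestions, for illustration only. It is not the
vring_get_buf() code: the function and struct names below are hypothetical,
the mask form is only equivalent to the mod when the ring size num is a
power of two, and __builtin_prefetch is a GCC/Clang builtin.

```c
#include <stdint.h>

/* Stand-in for struct vring_used_elem, using fixed-width userspace types. */
struct used_elem {
	uint32_t id;
	uint32_t len;
};

/* Slot for a free-running 16-bit index, using a modulo. */
static inline unsigned int ring_slot_mod(uint16_t idx, unsigned int num)
{
	return idx % num;
}

/* Same slot using a mask; valid only if num is a power of two. */
static inline unsigned int ring_slot_mask(uint16_t idx, unsigned int num)
{
	return idx & (num - 1);
}

/*
 * Hypothetical consumer step: return the entry at last_used and hint the
 * CPU to start fetching the following entry, since access is linear.
 */
static inline const struct used_elem *
next_used(const struct used_elem *ring, uint16_t last_used, unsigned int num)
{
	const struct used_elem *e = &ring[ring_slot_mask(last_used, num)];

	__builtin_prefetch(&ring[ring_slot_mask((uint16_t)(last_used + 1), num)]);
	return e;
}
```

The prefetch is only a hint and does not change what next_used() returns.
Whether it or the mask helps in practice would need measuring.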

Cheers,
Rusty.