Message-ID: <20130825114852.GA1829@redhat.com>
Date:	Sun, 25 Aug 2013 14:48:52 +0300
From:	"Michael S. Tsirkin" <mst@...hat.com>
To:	Jason Wang <jasowang@...hat.com>
Cc:	kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/6] vhost_net: use vhost_add_used_and_signal_n() in
 vhost_zerocopy_signal_used()

On Fri, Aug 23, 2013 at 04:50:38PM +0800, Jason Wang wrote:
> On 08/20/2013 10:33 AM, Jason Wang wrote:
> > On 08/16/2013 05:54 PM, Michael S. Tsirkin wrote:
> >> On Fri, Aug 16, 2013 at 01:16:26PM +0800, Jason Wang wrote:
> >>>> Switch to use vhost_add_used_and_signal_n() to avoid multiple calls to
> >>>> vhost_add_used_and_signal(). With the patch we will call at most 2 times
> >>>> (consider done_idx wrap-around) compared to N times w/o this patch.
> >>>>
> >>>> Signed-off-by: Jason Wang <jasowang@...hat.com>
> >> So? Does this help performance then?
> >>
> > Looks like it can, especially when the guest supports event index. When
> > the guest enables tx interrupts, this can save us some unnecessary
> > signals to the guest. I will run some tests.
> 
> I have run some tests. I see a 2% - 3% increase in both aggregate
> transaction rate and per-cpu transaction rate in the TCP_RR and UDP_RR tests.
> 
> I'm using ixgbe. W/o this patch, I can see more than 100 calls of
> vhost_add_used_and_signal() in one vhost_zerocopy_signal_used(). This is
> because ixgbe (and other modern ethernet drivers) tends to free old tx
> skbs in a loop during the tx interrupt, and vhost tends to batch the adding
> of used entries and the signal in vhost_zerocopy_callback(). Switching to
> vhost_add_used_and_signal_n() means saving roughly 100 used idx updates
> and memory barriers.

Well it's only smp_wmb so a nop on most architectures, so
a 2% gain is surprising.
I'm guessing the cache miss on the write is what's
giving us a speedup here.
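
To make the "at most 2 calls vs. N calls" point concrete, here is a minimal
userspace sketch of the batching idea. It is not the vhost code: RING_SIZE,
struct ring_stats, publish_n(), signal_one_by_one() and signal_batched() are
all hypothetical names, and publish_n() merely counts how many times the used
index would be written (each write implying one barrier and one possibly
cache-missing store).

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8 /* hypothetical ring size, for illustration only */

struct ring_stats {
    unsigned flush_calls; /* how many times the used index was published */
    size_t entries;       /* total number of entries published */
};

/* Stand-in for publishing n contiguous entries starting at 'start':
 * one used-index update (and one write barrier) per call. */
static void publish_n(struct ring_stats *st, size_t start, size_t n)
{
    (void)start;
    st->flush_calls++;
    st->entries += n;
}

/* Unbatched variant: one publish per completed entry (N calls). */
static void signal_one_by_one(struct ring_stats *st, size_t done_idx,
                              size_t count)
{
    assert(done_idx < RING_SIZE && count <= RING_SIZE);
    for (size_t i = 0; i < count; i++)
        publish_n(st, (done_idx + i) % RING_SIZE, 1);
}

/* Batched variant: publish each contiguous run in one call, splitting
 * only where the run wraps past the end of the ring (at most 2 calls). */
static void signal_batched(struct ring_stats *st, size_t done_idx,
                           size_t count)
{
    assert(done_idx < RING_SIZE && count <= RING_SIZE);
    while (count) {
        size_t run = RING_SIZE - done_idx; /* entries left before the wrap */
        if (run > count)
            run = count;
        publish_n(st, done_idx, run);
        done_idx = (done_idx + run) % RING_SIZE;
        count -= run;
    }
}
```

With 5 completed entries starting at index 6 of an 8-entry ring, the
one-by-one variant publishes 5 times and the batched variant publishes twice
(entries 6-7, then 0-2). The sketch only counts index updates; it says nothing
about which of the barrier or the cache miss dominates the cost.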

I'll review the code, thanks.


-- 
MST