Date:   Thu, 12 Jul 2018 08:24:57 +0300
From:   "Michael S. Tsirkin" <mst@...hat.com>
To:     Jason Wang <jasowang@...hat.com>
Cc:     Tonghao Zhang <xiangxia.m.yue@...il.com>,
        makita.toshiaki@....ntt.co.jp,
        virtualization@...ts.linux-foundation.org,
        Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next v5 0/4] net: vhost: improve performance when enable busyloop

On Thu, Jul 12, 2018 at 01:21:03PM +0800, Jason Wang wrote:
> 
> 
> > On 2018-07-12 11:34, Michael S. Tsirkin wrote:
> > On Thu, Jul 12, 2018 at 11:26:12AM +0800, Jason Wang wrote:
> > > 
> > > > On 2018-07-11 19:59, Michael S. Tsirkin wrote:
> > > > On Wed, Jul 11, 2018 at 01:12:59PM +0800, Jason Wang wrote:
> > > > > On 2018-07-11 11:49, Tonghao Zhang wrote:
> > > > > > On Wed, Jul 11, 2018 at 10:56 AM Jason Wang <jasowang@...hat.com> wrote:
> > > > > > > On 2018-07-04 12:31, xiangxia.m.yue@...il.com wrote:
> > > > > > > > From: Tonghao Zhang <xiangxia.m.yue@...il.com>
> > > > > > > > 
> > > > > > > > These patches improve guest receive and transmit performance.
> > > > > > > > On the handle_tx side, we poll the sock receive queue at the
> > > > > > > > same time; handle_rx does the same.
> > > > > > > > 
> > > > > > > > For the full performance report, see patch 4.
> > > > > > > > 
> > > > > > > > v4 -> v5:
> > > > > > > > fix some issues
> > > > > > > > 
> > > > > > > > v3 -> v4:
> > > > > > > > fix some issues
> > > > > > > > 
> > > > > > > > v2 -> v3:
> > > > > > > > These patches were split out of a previous big patch:
> > > > > > > > http://patchwork.ozlabs.org/patch/934673/
> > > > > > > > 
> > > > > > > > Tonghao Zhang (4):
> > > > > > > >       vhost: lock the vqs one by one
> > > > > > > >       net: vhost: replace magic number of lock annotation
> > > > > > > >       net: vhost: factor out busy polling logic to vhost_net_busy_poll()
> > > > > > > >       net: vhost: add rx busy polling in tx path
> > > > > > > > 
> > > > > > > >      drivers/vhost/net.c   | 108 ++++++++++++++++++++++++++++----------------------
> > > > > > > >      drivers/vhost/vhost.c |  24 ++++-------
> > > > > > > >      2 files changed, 67 insertions(+), 65 deletions(-)
> > > > > > > > 
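For context, the tx-side polling described above is roughly the loop below,
modeled on the busy-poll helpers already in drivers/vhost/net.c
(busy_clock(), vhost_can_busy_poll(), sk_has_rx_data(),
vhost_vq_avail_empty()); this is a sketch of the idea, not the patches
themselves:

/* Sketch of the tx-path busy-poll loop: spin until the timeout
 * expires, the socket has rx data to process, or the guest makes
 * new buffers available, instead of going to sleep immediately.
 */
static void vhost_net_busy_poll_sketch(struct vhost_net *net,
				       struct vhost_virtqueue *vq,
				       struct sock *sk)
{
	unsigned long endtime;

	preempt_disable();
	endtime = busy_clock() + vq->busyloop_timeout;
	while (vhost_can_busy_poll(&net->dev, endtime) &&
	       !sk_has_rx_data(sk) &&
	       vhost_vq_avail_empty(&net->dev, vq))
		cpu_relax();
	preempt_enable();
}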
> > > > > > > Hi, any progress on the new version?
> > > > > > > 
> > > > > > > I plan to send a new series adding packed virtqueue support to vhost. If
> > > > > > > you plan to send it soon, I can wait. Otherwise, I will send my series.
> > > > > > I rebased the code and found there is no improvement anymore; the
> > > > > > patches from Makita may have solved the problem. Jason, you may send
> > > > > > your patches, and I will do some research on busy polling.
> > > > > I see. Maybe you can try some bi-directional traffic.
> > > > > 
> > > > > Btw, lots of optimizations could be done for busy polling, e.g.
> > > > > integrating with host NAPI busy polling or a 100% busy-polling
> > > > > vhost_net. You're welcome to work on these or propose new ideas.
> > > > > 
> > > > > Thanks
> > > > It seems clear we do need adaptive polling.
> > > Yes.
> > > 
> > > > The difficulty with NAPI
> > > > polling is that it can't access guest memory easily. But maybe
> > > > get_user_pages() on the polled memory plus NAPI polling can work.
> > > You mean something like zerocopy? Looks like we can do busy polling without
> > > it. I mean something like https://patchwork.kernel.org/patch/8707511/.
> > > 
> > > Thanks
> > How does this patch work? vhost_vq_avail_empty() can sleep,
> > but you are calling it within an RCU read-side critical section.
> 
> Ok, I get your meaning. I have patches to access the vring through
> get_user_pages() + vmap(), which should help here. (And it increases PPS
> by about 10%-20%.)
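For reference, the get_user_pages() + vmap() approach Jason mentions would
look roughly like the sketch below; the helper name and the (elided) error
handling are illustrative, not his actual patches:

/* Illustrative sketch: pin the avail ring pages and map them into the
 * kernel, so the polling path can read the ring through a kernel
 * virtual address instead of copy_from_user(), which may fault and
 * sleep (and is therefore unsafe under rcu_read_lock()).
 */
static void *vhost_map_avail(struct vhost_virtqueue *vq,
			     struct page **pages, int npages)
{
	int pinned;

	pinned = get_user_pages_fast((unsigned long)vq->avail, npages,
				     1 /* write */, pages);
	if (pinned != npages)
		return NULL;	/* a real version must release partial pins */

	return vmap(pages, npages, VM_MAP, PAGE_KERNEL);
}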

Remember you must mark the pages as dirty on unpin too ...
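Concretely, the teardown then needs something like the sketch below;
set_page_dirty_lock() before put_page() is the part that is easy to forget:

/* Illustrative sketch: the vmap()ed alias may have been written through
 * (e.g. the used ring), so each page must be marked dirty before it is
 * released, otherwise migration or swapout can lose the writes.
 */
static void vhost_unmap_vring(void *addr, struct page **pages, int npages)
{
	int i;

	vunmap(addr);
	for (i = 0; i < npages; i++) {
		set_page_dirty_lock(pages[i]);
		put_page(pages[i]);
	}
}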


> > 
> > That's not the only problem, btw; another one is that the
> > CPU time spent polling isn't accounted to the VM.
> 
> 
> Yes, but it's not an issue introduced by this patch.

Yes it is. Polling within thread context accounts CPU time correctly.

> And I believe cgroups can help?
> 
> Thanks


cgroups are what's broken by polling in irq context.

> > 
> > > > > > > Thanks
