Message-ID: <OFAA7EBCA4.A8F764FD-ON65257798.0035E8C2-65257798.00385450@in.ibm.com>
Date: Wed, 8 Sep 2010 15:47:35 +0530
From: Krishna Kumar2 <krkumar2@...ibm.com>
To: Avi Kivity <avi@...hat.com>
Cc: anthony@...emonkey.ws, davem@...emloft.net, kvm@...r.kernel.org,
mst@...hat.com, netdev@...r.kernel.org, rusty@...tcorp.com.au
Subject: Re: [RFC PATCH 0/4] Implement multiqueue virtio-net
Avi Kivity <avi@...hat.com> wrote on 09/08/2010 02:58:21 PM:
> >>> 1. This feature was first implemented with a single vhost.
> >>> Testing showed 3-8% performance gain for up to 8 netperf
> >>> sessions (and sometimes 16), but BW dropped with more
> >>> sessions. However, implementing per-txq vhost improved
> >>> BW significantly all the way to 128 sessions.
> >> Why were vhost kernel changes required? Can't you just instantiate
> >> more vhost queues?
> > I did try using a single thread processing packets from multiple
> > vq's on host, but the BW dropped beyond a certain number of
> > sessions.
>
> Oh - so the interface has not changed (which can be seen from the
> patch). That was my concern, I remembered that we planned for vhost-net
> to be multiqueue-ready.
>
> The new guest and qemu code work with old vhost-net, just with reduced
> performance, yes?
Yes, I have tested the new guest/qemu with the old vhost, but
only with numtxqs=1 (or without passing any MQ arguments to
qemu at all). Specifying numtxqs > 1 fails with ENOBUFS in
vhost, since vhost_net_set_backend in the unmodified vhost
rejects queue indices beyond its limit.
I have also tested running an unmodified guest with the new
vhost/qemu, provided qemu does not specify numtxqs > 1.
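For reference, a rough userspace sketch of the kind of bound check
that produces the ENOBUFS (the VHOST_NET_VQ_MAX name follows the
upstream single-queue driver; everything else below is a simplified
stand-in, not the actual vhost code):

/* Simplified stand-in for the queue-index bound in vhost_net_set_backend. */
#include <errno.h>
#include <stdio.h>

#define VHOST_NET_VQ_MAX 2   /* one RX + one TX virtqueue in the old driver */

static int set_backend(unsigned int index, int fd)
{
        if (index >= VHOST_NET_VQ_MAX)
                return -ENOBUFS;   /* what numtxqs > 1 runs into on old vhost */
        /* ... attach fd as the backend of virtqueue 'index' ... */
        (void)fd;
        return 0;
}

int main(void)
{
        printf("index 1 -> %d\n", set_backend(1, -1));   /* accepted */
        printf("index 2 -> %d\n", set_backend(2, -1));   /* -ENOBUFS */
        return 0;
}

(The modified vhost presumably just raises that bound to cover the
per-txq virtqueues, which is why the new qemu must not ask the old
module for more than one TX queue.)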
> > Are you suggesting this
> > combination:
> > IRQ on guest:
> > 40: CPU0
> > 41: CPU1
> > 42: CPU2
> > 43: CPU3 (all CPUs are on socket #0)
> > vhost:
> > thread #0: CPU0
> > thread #1: CPU1
> > thread #2: CPU2
> > thread #3: CPU3
> > qemu:
> > thread #0: CPU4
> > thread #1: CPU5
> > thread #2: CPU6
> > thread #3: CPU7 (all CPUs are on socket#1)
>
> May be better to put vcpu threads and vhost threads on the same socket.
>
> Also need to affine host interrupts.
>
> > netperf/netserver:
> > Run on CPUs 0-4 on both sides
> >
> > The reason I did not optimize anything from user space is because
> > I felt showing the default works reasonably well is important.
>
> Definitely. Heavy tuning is not a useful path for general end users.
> We need to make sure that the scheduler is able to arrive at the optimal
> layout without pinning (but perhaps with hints).
OK, I will see if I can get results with this.
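In case it helps reproduce the pinning above, here is a rough sketch
of the helper I would use: it pins one task (a vcpu or vhost thread,
identified by its TID) onto a CPU with sched_setaffinity() and steers
a host IRQ onto the same CPU through /proc/irq/<n>/smp_affinity. The
TID, IRQ number and CPU are placeholders, not values from the runs
above; it needs root and assumes fewer than 32 CPUs for the mask.

/* Rough helper sketch: pin a task and an IRQ to one CPU (needs root). */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

static int pin_task_to_cpu(pid_t tid, int cpu)
{
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        return sched_setaffinity(tid, sizeof(set), &set);
}

static int affine_irq_to_cpu(int irq, int cpu)
{
        char path[64];
        FILE *f;

        snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
        f = fopen(path, "w");
        if (!f)
                return -1;
        fprintf(f, "%x\n", 1u << cpu);   /* single-CPU mask, cpu < 32 */
        return fclose(f);
}

int main(int argc, char **argv)
{
        if (argc != 4) {
                fprintf(stderr, "usage: %s <tid> <irq> <cpu>\n", argv[0]);
                return 1;
        }
        if (pin_task_to_cpu((pid_t)atoi(argv[1]), atoi(argv[3])))
                perror("sched_setaffinity");
        if (affine_irq_to_cpu(atoi(argv[2]), atoi(argv[3])))
                perror("smp_affinity");
        return 0;
}

(taskset on the TIDs and echoing masks into /proc/irq/*/smp_affinity
does the same thing from a shell; the guest-side IRQ affinity is of
course set inside the guest, not on the host.)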
Thanks for your suggestions,
- KK