Message-ID: <20100916100201.GI20864@redhat.com>
Date: Thu, 16 Sep 2010 12:02:01 +0200
From: "Michael S. Tsirkin" <mst@...hat.com>
To: "Xin, Xiaohui" <xiaohui.xin@...el.com>
Cc: Shirley Ma <mashirle@...ibm.com>, Arnd Bergmann <arnd@...db.de>,
    Avi Kivity <avi@...hat.com>,
    David Miller <davem@...emloft.net>,
    "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
    "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
    "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

On Thu, Sep 16, 2010 at 04:18:10PM +0800, Xin, Xiaohui wrote:
> >From: Michael S. Tsirkin [mailto:mst@...hat.com]
> >Sent: Wednesday, September 15, 2010 5:59 PM
> >To: Xin, Xiaohui
> >Cc: Shirley Ma; Arnd Bergmann; Avi Kivity; David Miller; netdev@...r.kernel.org;
> >kvm@...r.kernel.org; linux-kernel@...r.kernel.org
> >Subject: Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel
> >
> >On Wed, Sep 15, 2010 at 10:46:02AM +0800, Xin, Xiaohui wrote:
> >> >From: Michael S. Tsirkin [mailto:mst@...hat.com]
> >> >Sent: Wednesday, September 15, 2010 12:30 AM
> >> >To: Shirley Ma
> >> >Cc: Arnd Bergmann; Avi Kivity; Xin, Xiaohui; David Miller; netdev@...r.kernel.org;
> >> >kvm@...r.kernel.org; linux-kernel@...r.kernel.org
> >> >Subject: Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel
> >> >
> >> >On Tue, Sep 14, 2010 at 09:00:25AM -0700, Shirley Ma wrote:
> >> >> On Tue, 2010-09-14 at 17:22 +0200, Michael S. Tsirkin wrote:
> >> >> > I would expect this to hurt performance significantly.
> >> >> > We could do this for asynchronous requests only to avoid the
> >> >> > slowdown.
> >> >>
> >> >> Is kiocb in sendmsg helpful here? It is not used now.
> >> >>
> >> >> Shirley
> >> >
> >> >Precisely. This is what the patch from Xin Xiaohui does. That code
> >> >already seems to do most of what you are trying to do, right?
> >> >
> >> >The main thing missing seems to be macvtap integration, so that we can fall back
> >> >on data copy if zero copy is unavailable?
> >> >How hard would it be to basically link the mp and macvtap modules
> >> >together to get us this functionality? Anyone?
> >> >
> >> Michael,
> >> Is to support macvtap with zero-copy through mp device the functionality
> >> you mentioned above?
> >
> >I have trouble parsing the above question. At some point Arnd suggested
> >that the mp device functionality would fit nicely as part of the macvtap
> >driver. It seems to make sense superficially, the advantage if it
> >worked would be that we would get zero copy (mostly) transparently.
> >
> >Do you agree with this goal?
> >
>
> I would say yes.
In that case, it's a blocker for upstream merge because this change
affects userspace.
> >> Earlier, Shirley Ma suggested moving the zero-copy functionality into
> >> the tun/tap device or the macvtap device. What do you think about that?
> >> I suspect there will be a lot of duplicate code in those three drivers
> >> unless we can extract the zero-copy code into kernel APIs and vhost APIs.
> >
> >
> >tap would be very hard at this point as it does not bind to a device.
> >macvtap might work, we mainly need to figure out a way to detect that
> >device can do zero copy so the right mode is used. I think a first step
> >could be to simply link mp code into macvtap module, pass necessary
> >ioctls on, then move some code around as necessary. This might get rid
> >of code duplication nicely.
>
> I'll look into this to see how much effort would be.
>
> >
> >
> >> Do you think that's worth doing, and would it help the current process,
> >> which has been blocked for longer than I expected?
> >
> >I think it's nice to have.
> >
> >And if done hopefully this will get the folk working on the macvtap
> >driver to review the code, which will help find all issues faster.
> >
> >I think if you post some performance numbers,
> >this will also help get people excited and looking at the code.
> >
>
> The performance data I posted before was compared with raw socket on vhost-net.
> But currently, the raw socket backend has been removed from the qemu side.
> So I can only compare with tap on vhost-net. Unfortunately, I am missing
> something and can't even bring it up. I have been blocked by this for a while.
Hey, maybe you are seeing the bug that was reported recently.
Could you try running tcpdump -i on the tap interface in the host and
on ethX in the guest, and tell me what you see?
If you see packets in the guest but not in the host, could you try
adding printks in vhost handle_tx to see whether it gets called
and, if so, where it fails?
> >I also don't see the process as completely blocked, each review round points
> >out more issues: we aren't going back and forth changing
> >same lines again and again, are we?
> >
> >One thing that might help is increase the frequency of updates,
> >try sending them out sooner.
> >On the other hand 10 new patches each revision is a lot:
> >if there is a part of patchset that has stabilised you can split it out,
> >post once and keep posting the changing part separately.
> >
> >I hope these suggestions help.
>
> Thanks, Michael!
>
> >
> >> >
> >> >--
> >> >MST