Message-ID: <1284476719.13351.35.camel@localhost.localdomain>
Date: Tue, 14 Sep 2010 08:05:19 -0700
From: Shirley Ma <mashirle@...ibm.com>
To: Avi Kivity <avi@...hat.com>
Cc: David Miller <davem@...emloft.net>, arnd@...db.de, mst@...hat.com,
xiaohui.xin@...el.com, netdev@...r.kernel.org, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

On Tue, 2010-09-14 at 11:12 +0200, Avi Kivity wrote:
> >> + base = (unsigned long)from->iov_base + offset1;
> >> + size = ((base & ~PAGE_MASK) + len + ~PAGE_MASK) >> PAGE_SHIFT;
> >> + num_pages = get_user_pages_fast(base, size, 0, &page[i]);
> >> + if ((num_pages != size) ||
> >> +     (num_pages > MAX_SKB_FRAGS - skb_shinfo(skb)->nr_frags))
> >> + /* put_page is in skb free */
> >> + return -EFAULT;
> > What keeps the user from writing to these pages in its address space
> > after the write call returns?
> >
> > A write() return of success means:
> >
> > "I wrote what you gave to me"
> >
> > not
> >
> > "I wrote what you gave to me, oh and BTW don't touch these
> > pages for a while."
> >
> > In fact "a while" isn't even defined in any way, as there is no way
> > for the write() invoker to know when the networking card is done with
> > those pages.
>
> That's what io_submit() is for. Then io_getevents() tells you what "a
> while" actually was.

This macvtap zero copy uses the iov buffers from the vhost ring, which are
allocated by the guest kernel. In the host kernel, vhost calls macvtap
sendmsg, and macvtap sendmsg calls get_user_pages_fast to pin these
buffers' pages for zero copy.

The patch relies on how vhost handles these buffers. I need to look at
the vhost code (qemu) first before addressing the questions here.

Thanks
Shirley