Message-ID: <1284476719.13351.35.camel@localhost.localdomain>
Date:	Tue, 14 Sep 2010 08:05:19 -0700
From:	Shirley Ma <mashirle@...ibm.com>
To:	Avi Kivity <avi@...hat.com>
Cc:	David Miller <davem@...emloft.net>, arnd@...db.de, mst@...hat.com,
	xiaohui.xin@...el.com, netdev@...r.kernel.org, kvm@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

On Tue, 2010-09-14 at 11:12 +0200, Avi Kivity wrote:
> >> +            base = (unsigned long)from->iov_base + offset1;
> >> +            size = ((base & ~PAGE_MASK) + len + ~PAGE_MASK) >> PAGE_SHIFT;
> >> +            num_pages = get_user_pages_fast(base, size, 0, &page[i]);
> >> +            if ((num_pages != size) ||
> >> +                (num_pages > MAX_SKB_FRAGS - skb_shinfo(skb)->nr_frags))
> >> +                    /* put_page is in skb free */
> >> +                    return -EFAULT;
> > What keeps the user from writing to these pages in its address space
> > after the write call returns?
> >
> > A write() return of success means:
> >
> >       "I wrote what you gave to me"
> >
> > not
> >
> >       "I wrote what you gave to me, oh and BTW don't touch these
> >           pages for a while."
> >
> > In fact "a while" isn't even defined in any way, as there is no way
> > for the write() invoker to know when the networking card is done with
> > those pages.
> 
> That's what io_submit() is for.  Then io_getevents() tells you what
> "a while" actually was.

This macvtap zero copy uses iov buffers from the vhost ring, which are
allocated by the guest kernel. In the host kernel, vhost calls macvtap
sendmsg, and macvtap sendmsg calls get_user_pages_fast to pin these
buffers' pages for zero copy.
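
For reference, the page count passed to get_user_pages_fast in the quoted
hunk can be sketched in userspace as below. This is only an illustration,
not code from the patch: the DEMO_* constants and demo_pages_spanned() are
made-up names, and a 4 KiB page size is assumed. It relies on the kernel
convention that PAGE_MASK is ~(PAGE_SIZE - 1), so ~PAGE_MASK equals
PAGE_SIZE - 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel's PAGE_SHIFT, PAGE_SIZE
 * and PAGE_MASK; a 4 KiB page is an assumption made for this sketch. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  ((uintptr_t)1 << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))

/* Number of pages touched by the byte range [base, base + len).
 * Same arithmetic as the quoted hunk: add the offset of base within its
 * page to len, then round up to a whole number of pages. */
static size_t demo_pages_spanned(uintptr_t base, size_t len)
{
    uintptr_t offset_in_page = base & ~DEMO_PAGE_MASK;

    return (offset_in_page + len + ~DEMO_PAGE_MASK) >> DEMO_PAGE_SHIFT;
}
```

With these assumptions, a 2-byte buffer starting at the last byte of a
page spans two pages, so a small iov entry can still need more than one
pinned page.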

The patch relies on how vhost handles these buffers. I need to look at
the vhost code (qemu) first before addressing the questions here.

Thanks
Shirley
