Date: Thu, 12 Apr 2007 06:32:25 +0300
From: Avi Kivity <avi@...ranet.com>
To: Rusty Russell <rusty@...tcorp.com.au>
Cc: Ingo Molnar <mingo@...e.hu>, kvm-devel@...ts.sourceforge.net,
    netdev <netdev@...r.kernel.org>
Subject: Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

Rusty Russell wrote:
> On Wed, 2007-04-11 at 17:28 +0300, Avi Kivity wrote:
>> Rusty Russell wrote:
>>> On Wed, 2007-04-11 at 07:26 +0300, Avi Kivity wrote:
>>>> Nope.  Being async is critical for copyless networking:
>>
>> With async operations, the saga continues like this: the host-side
>> driver allocates an skb, get_page()s and attaches the data to the new
>> skb, this skb crosses the bridge, trickles into the real ethernet
>> device, gets queued there, sent, interrupts fire, triggering async
>> completion.  On this completion, we send a virtual interrupt to the
>> guest, which tells it to destroy the skb and reclaim the pages attached
>> to it.
>
> Hi Avi!
>
> Thanks for spelling it out, I now understand your POV.  I had
> considered it obvious that a (non-async) write which didn't copy would
> block until the skb was finished with, which is easy to code up within
> the tap device itself.  Otherwise it's actually an async write without a
> notification mechanism, which I agree is broken.

I hadn't considered an always-blocking (or unbuffered) networking API.
It's very counter to current APIs, but does make sense with things like
syslets.  Without syslets, I don't think it's very useful as you need
some artificial threads to keep things humming along.

(How would userspace specify it?  O_DIRECT when opening the tap?)

I don't think there's a lot of difference between implementing aio or
always-blocking copyless writes for tap.  They just differ in how they
sleep and in how to access user pages.

> Note though: if the guest can change the packet headers they can
> subvert some firewall rules and possibly crash the host.  None of the
> networking code I wrote expects packets to change in flight 8(
>
> This applies to a userspace or kernelspace driver.

Umm, right.  We could write-protect the packets (which would be very
expensive).  We could set the evil bit on guest-originated packets, and
rewrite the entire networking stack to copy any part which is inspected
if the evil bit is set.

We need more head-scratching on this.

>>> Yes, and this is already present in the tap device.  Anthony suggested a
>>> slightly nasty hack for multiple sg packets in one writev()/readv, which
>>> could also give us batching.
>>
>> No need for hacks if we get list aio support one day.
>
> As you point out though, aio is not something we want to hold our breath
> for.  Plus, aio never makes things simpler, and complexity kills
> puppies.

The puppies had better stay away from qemu then, as it is completely
async.

Always-blocking writes won't reduce complexity.  Suddenly you need a
thread for each request batch and some pleasant code for joining the
threads when done.  Syslets do make it go away, though they're more for
the mostly-nonblocking-with-occasional-blockage stuff rather than the
always blocking thingie you describe.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
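
[Editorial sketch, not part of the original message.] The "packets changing in flight" concern above can be illustrated with a toy model in Python. It is not kernel code: the four-byte header layout, the port-80 rule, and every function name here are assumptions made up for illustration. It contrasts inspecting a guest-writable buffer in place with the "copy any part which is inspected" idea from the message.

```python
"""Toy model of a guest-writable packet buffer that changes after inspection.

Everything here (header layout, firewall rule, function names) is
hypothetical and exists only to illustrate the race discussed in the thread.
"""

HEADER_LEN = 4  # assumed layout: 2-byte source port, 2-byte destination port


def make_packet(dst_port: int, payload: bytes) -> bytearray:
    """Build a mutable buffer standing in for guest-owned pages."""
    header = (1234).to_bytes(2, "big") + dst_port.to_bytes(2, "big")
    return bytearray(header + payload)


def allowed(header: bytes) -> bool:
    """Hypothetical firewall rule: only destination port 80 may pass."""
    return int.from_bytes(header[2:4], "big") == 80


def retarget_to_ssh(shared: bytearray) -> None:
    """The guest rewrites the destination port after the check has run."""
    shared[2:4] = (22).to_bytes(2, "big")


def send_zero_copy(shared: bytearray, guest_race=None):
    """Check the shared buffer in place, then transmit from that same buffer."""
    if not allowed(bytes(shared[:HEADER_LEN])):
        return None
    if guest_race is not None:
        guest_race(shared)  # the guest wins the race between check and send
    return bytes(shared)  # bytes that would reach the wire


def send_copy_inspected(shared: bytearray, guest_race=None):
    """Snapshot the inspected part first; check and transmit that snapshot."""
    header = bytes(shared[:HEADER_LEN])  # private copy of what gets inspected
    if not allowed(header):
        return None
    if guest_race is not None:
        guest_race(shared)  # too late: the header was already copied
    return header + bytes(shared[HEADER_LEN:])  # payload still read from the guest


if __name__ == "__main__":
    leaked = send_zero_copy(make_packet(80, b"hi"), retarget_to_ssh)
    safe = send_copy_inspected(make_packet(80, b"hi"), retarget_to_ssh)
    print("zero-copy dst port:", int.from_bytes(leaked[2:4], "big"))
    print("copy-inspected dst port:", int.from_bytes(safe[2:4], "big"))
```

In this model only the inspected header is copied and the payload is still read from the guest's buffer, which is roughly the trade-off the "evil bit" suggestion implies. It says nothing about what doing this throughout a real networking stack would cost.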