Date:	Sun, 08 Apr 2007 08:36:14 +0300
From:	Avi Kivity <avi@...ranet.com>
To:	Rusty Russell <rusty@...tcorp.com.au>
Cc:	Ingo Molnar <mingo@...e.hu>, kvm-devel@...ts.sourceforge.net,
	netdev <netdev@...r.kernel.org>
Subject: Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

Rusty Russell wrote:
> On Thu, 2007-04-05 at 10:17 +0300, Avi Kivity wrote:
>   
>> Rusty Russell wrote:
>>     
>>> You didn't quote Anthony's point about "it's more about there not being
>>> good enough userspace interfaces to do network IO."
>>>
>>> It's easier to write a kernel-space network driver, but it's not
>>> obviously the right thing to do until we can show that an efficient
>>> packet-level userspace interface isn't possible.  I don't think that's
>>> been done, and it would be interesting to try.
>>>   
>>>       
>> In the case of networking, the copyful interfaces on receive are driven 
>> by the hardware not knowing how to split the header from the data.  On 
>> transmit I agree, it could be made copyless from userspace (something 
>> like sendfilev, only not file oriented).
>>     
>
> Hi Avi,
>
> 	I don't think you've thought about this very hard.  The receive copy is
> completely independent of whether the packet is going to the guest via
> a kernel driver or via userspace, so not relevant.
>   

A packet received in the kernel cannot be made available to userspace in 
a safe manner without a copy, as it will not be aligned with page 
boundaries, so userspace cannot examine the packet until after one copy 
has occurred.  After userspace has determined what to do with the packet, 
another copy must take place to get it there.
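
To make the alignment point concrete, here is a rough sketch of the arithmetic (Python, purely illustrative). It assumes 4 KiB pages, and the helper names exposed_bytes() and can_map_without_leak() are made up for this example; they are not kernel interfaces. Mapping a packet into userspace means mapping whole pages, so any bytes on those pages that do not belong to the packet become visible too.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86


def exposed_bytes(offset, length, page_size=PAGE_SIZE):
    """Return how many bytes outside the packet become visible if the
    pages covering [offset, offset + length) are mapped as a whole.

    offset is the packet's position inside the kernel buffer.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first_page = offset // page_size
    last_page = (offset + length - 1) // page_size
    mapped = (last_page - first_page + 1) * page_size
    return mapped - length


def can_map_without_leak(offset, length, page_size=PAGE_SIZE):
    """True only if the packet exactly fills the pages it touches."""
    return exposed_bytes(offset, length, page_size) == 0
```

A 1514-byte frame that lands at offset 2 in a buffer shares its page with 2582 bytes of something else, which is why it cannot simply be handed to userspace as a mapping.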

There's a counterexample, mmapped sockets, but that works only when all 
packets arriving on a card are exposed to the same process.  This is 
useful for tcpdump or for what you outline below but is hardly generic.

> 	And if all packets from the card are going to the guest, you can
> deliver directly.  Userspace or kernel, no difference.
>   

That is not the common case.  Nor is it true when there is a mismatch 
between the card's capabilities and guest expectations and constraints.  
For example, guest memory is not physically contiguous so a NIC that 
won't do scatter/gather will require bouncing (or an iommu, but that's 
not here yet).
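
As a sketch of why scatter/gather matters here (Python, illustrative only): a guest buffer that looks contiguous to the guest is backed by host pages that may be scattered. The function names sg_segments() and needs_bounce(), the 4 KiB page size, and the idea of describing a NIC by a maximum segment count are all assumptions made for the example, not real driver interfaces.

```python
def sg_segments(host_pages, page_size=4096):
    """Merge physically adjacent host pages into (start_address, length)
    segments, in the order the guest buffer uses them.

    host_pages is a list of host page frame numbers backing the buffer.
    """
    segments = []
    for pfn in host_pages:
        addr = pfn * page_size
        if segments and segments[-1][0] + segments[-1][1] == addr:
            # This page directly follows the previous segment: extend it.
            start, length = segments[-1]
            segments[-1] = (start, length + page_size)
        else:
            segments.append((addr, page_size))
    return segments


def needs_bounce(host_pages, nic_max_segments, page_size=4096):
    """True if the buffer has more segments than the NIC can take in one
    transfer, so the data must first be copied to a contiguous buffer."""
    return len(sg_segments(host_pages, page_size)) > nic_max_segments
```

With nic_max_segments=1 (no scatter/gather), any guest buffer whose host pages are not adjacent has to be bounced.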

> 	And we have a "sendfilev not file oriented": it's called "writev" 8)
>   

writev() cannot be made copyless for networking: it is synchronous, so 
the caller may reuse its buffers as soon as the call returns.  One needs 
an async interface so the kernel can complete the write after the NIC 
acks the DMA transfer, or a kernel driver.
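
The contract in question can be seen from userspace with a small sketch (Python, illustrative only). It uses a Unix socketpair rather than a NIC, so it shows the writev() semantics and not any driver behaviour; the function name writev_then_scribble() is made up for the example.

```python
import os
import socket


def writev_then_scribble():
    """Send two buffers with writev(), overwrite them right after the
    call returns, then read what the peer actually receives."""
    a, b = socket.socketpair()
    try:
        header = bytearray(b"HDR:")
        payload = bytearray(b"payload")
        sent = os.writev(a.fileno(), [header, payload])
        # writev() has returned, so the caller is free to reuse the buffers.
        header[:] = b"XXXX"
        payload[:] = b"scribbl"
        received = b.recv(64)
        return sent, received
    finally:
        a.close()
        b.close()
```

The peer still sees the original bytes, so the kernel must have taken its own copy before returning. Avoiding that copy needs a separate completion notification telling the caller when the buffers are free again.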

> 	An in-kernel driver can avoid system call overhead and page references.
> But a better tap device helps more than just KVM.
>   

I'll believe it when I see it.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
