Message-ID: <58505535.1080908@gmail.com>
Date: Tue, 13 Dec 2016 12:08:21 -0800
From: John Fastabend <john.fastabend@...il.com>
To: David Miller <davem@...emloft.net>
Cc: brouer@...hat.com, cl@...ux.com, rppt@...ux.vnet.ibm.com,
netdev@...r.kernel.org, linux-mm@...ck.org,
willemdebruijn.kernel@...il.com, bjorn.topel@...el.com,
magnus.karlsson@...el.com, alexander.duyck@...il.com,
mgorman@...hsingularity.net, tom@...bertland.com,
bblanco@...mgrid.com, tariqt@...lanox.com, saeedm@...lanox.com,
jesse.brandeburg@...el.com, METH@...ibm.com, vyasevich@...il.com
Subject: Re: Designing a safe RX-zero-copy Memory Model for Networking

On 16-12-13 11:53 AM, David Miller wrote:
> From: John Fastabend <john.fastabend@...il.com>
> Date: Tue, 13 Dec 2016 09:43:59 -0800
>
>> What does "zero-copy send packet-pages to the application/socket that
>> requested this" mean? At the moment on x86 page-flipping appears to be
>> more expensive than memcpy (I can post some data shortly) and shared
>> memory was proposed and rejected for security reasons when we were
>> working on bifurcated driver.
>
> The whole idea is that we map all the active RX ring pages into
> userspace from the start.
>
> And just how Jesper's page pool work will avoid DMA map/unmap,
> it will also avoid changing the userspace mapping of the pages
> as well.
>
> Thus avoiding the TLB/VM overhead altogether.
>

I get this, but it requires applications to be isolated. The pages from
a queue cannot be shared between multiple applications in different
trust domains. And the application has to be cooperative, meaning it
can't "look" at data that has not been marked by the stack as OK. In
these schemes we tend to end up with something like virtio/vhost or
af_packet.

Any ACLs/filtering/switching/headers need to be done in hardware or
the application trust boundaries are broken.

If the above cannot be met then a copy is needed. What I am trying
to tease out is how the above comment squares with other statements
like "can be done without HW filter features".
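
A minimal sketch of the cooperative model described above, assuming a
ring of pre-mapped slots with a per-slot ownership word (loosely in the
spirit of the status word in af_packet's mmap ring). All names here
(rx_ring, ring_publish, ring_peek, ring_release) are hypothetical and
not an existing kernel or driver API, and the memcpy stands in for the
NIC writing the packet by DMA. The point it illustrates is that the
ownership check is only a convention: the slot memory is mapped the
whole time, so nothing stops an uncooperative application from reading
it early.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout for illustration only. */
enum slot_owner { SLOT_KERNEL = 0, SLOT_USER = 1 };

#define RING_SLOTS 4
#define SLOT_BYTES 2048

struct rx_slot {
    _Atomic uint32_t owner;    /* SLOT_KERNEL or SLOT_USER */
    uint32_t len;              /* valid bytes once owner == SLOT_USER */
    uint8_t data[SLOT_BYTES];  /* stays mapped in userspace throughout */
};

struct rx_ring {
    struct rx_slot slot[RING_SLOTS];
    unsigned int prod;         /* next slot the stack fills */
    unsigned int cons;         /* next slot the application reads */
};

/* Stack side: fill the next slot, then hand it to the application.
 * Returns 0 on success, -1 if the packet is too big or the ring is full. */
static int ring_publish(struct rx_ring *r, const void *pkt, uint32_t len)
{
    struct rx_slot *s = &r->slot[r->prod % RING_SLOTS];

    if (len > SLOT_BYTES)
        return -1;
    if (atomic_load_explicit(&s->owner, memory_order_acquire) != SLOT_KERNEL)
        return -1;
    memcpy(s->data, pkt, len);  /* stand-in for the device DMA write */
    s->len = len;
    /* Release: data and len become visible before the owner flip. */
    atomic_store_explicit(&s->owner, SLOT_USER, memory_order_release);
    r->prod++;
    return 0;
}

/* Application side: a cooperative reader only touches a slot marked
 * SLOT_USER. Returns NULL if the next slot is not yet published. */
static const uint8_t *ring_peek(struct rx_ring *r, uint32_t *len)
{
    struct rx_slot *s = &r->slot[r->cons % RING_SLOTS];

    if (atomic_load_explicit(&s->owner, memory_order_acquire) != SLOT_USER)
        return NULL;
    *len = s->len;
    return s->data;
}

/* Application side: return the current slot to the stack for reuse. */
static void ring_release(struct rx_ring *r)
{
    struct rx_slot *s = &r->slot[r->cons % RING_SLOTS];

    atomic_store_explicit(&s->owner, SLOT_KERNEL, memory_order_release);
    r->cons++;
}
```

Because ring_peek is the only thing standing between the application
and unpublished data, any filtering that decides which packets an
application may see has to happen before the packet lands in that
application's slots; otherwise a copy into separate memory is needed.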
.John