[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <543265A5.8000606@redhat.com>
Date: Mon, 06 Oct 2014 11:49:25 +0200
From: Daniel Borkmann <dborkman@...hat.com>
To: John Fastabend <john.fastabend@...il.com>
CC: Florian Westphal <fw@...len.de>, gerlitz.or@...il.com,
hannes@...essinduktion.org, netdev@...r.kernel.org,
john.ronciak@...el.com, amirv@...lanox.com, eric.dumazet@...il.com,
danny.zhou@...el.com
Subject: Re: [net-next PATCH v1 1/3] net: sched: af_packet support for direct
ring access
Hi John,
On 10/06/2014 03:12 AM, John Fastabend wrote:
> On 10/05/2014 05:29 PM, Florian Westphal wrote:
>> John Fastabend <john.fastabend@...il.com> wrote:
>>> There is one critical difference when running with these interfaces
>>> vs running without them. In the normal case the af_packet module
>>> uses a standard descriptor format exported by the af_packet user
>>> space headers. In this model because we are working directly with
>>> driver queues the descriptor format maps to the descriptor format
>>> used by the device. User space applications can learn device
>>> information from the socket option PACKET_DEV_DESC_INFO which
>>> should provide enough details to extrapulate the descriptor formats.
>>> Although this adds some complexity to user space it removes the
>>> requirement to copy descriptor fields around.
>>
>> I find it very disappointing that we seem to have to expose such
>> hardware specific details to userspace via hw-independent interface.
>
> Well it was only for convenience if it doesn't fit as a socket
> option we can remove it. We can look up the device using the netdev
> name from the bind call. I see your point though so if there is
> consensus that this is not needed that is fine.
>
>> How big of a cost are we talking about when you say that it 'removes
>> the requirement to copy descriptor fields'?
>
> This was likely a poor description. If you want to let user space
> poll on the ring (without using system calls or interrupts) then
> I don't see how you can _not_ expose the ring directly complete with
> the vendor descriptor formats.
But how big is the concrete performance degradation you're seeing if you
use an e.g. `netmap-alike` Linux-own variant as a hw-neutral interface
that does *not* directly expose hw descriptor formats to user space?
With 1 core netmap does 10G line-rate on 64b; I don't know their numbers
on 40G when run on decent hardware though.
It would really be great if we have something vendor neutral exposed as
a stable ABI and could leverage emerging infrastructure we already have
in the kernel such as eBPF and recent qdisc batching for raw sockets
instead of reinventing the wheels. (Don't get me wrong, I would love to
see AF_PACKET improved ...)
Thanks,
Daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists