netdev - Re: [PATCH] af_packet: add interframe drop cmsg (v2)

Date:	Mon, 28 Sep 2009 13:00:05 -0300
From:	Arnaldo Carvalho de Melo <acme@...hat.com>
To:	Neil Horman <nhorman@...driver.com>
Cc:	Eric Dumazet <eric.dumazet@...il.com>,
	David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [PATCH] af_packet: add interframe drop cmsg (v2)

On Thu, Sep 24, 2009 at 11:24:51AM -0400, Neil Horman wrote:
> On Thu, Sep 24, 2009 at 04:10:43PM +0200, Eric Dumazet wrote:
> > Neil Horman wrote:
> > > On Thu, Sep 24, 2009 at 07:24:47AM +0200, Eric Dumazet wrote:
> > >> Neil Horman wrote:
> > > P.S. I was thinking about your mmap comments last night, and I recalled that
> > > Arnaldo was pushing some patches to allow for multiple skb receive and send
> > > operations with a single syscall.  I'm wondering if the judicious use of such a
> > > syscall might mitigate the performance advantage of mmap for libpcap?
> > 
> > Quite frankly I don't like adding super-syscalls like this one. This is bloat,
> > and actually slows down the normal/legacy apps (larger kernel -> bigger latencies)
> > 
> > It would be better to sit down and think about better practices, to speed up
> > the normal path (i.e. existing syscalls)
> > 
> > The performance problem comes from cache line ping pongs between cpus, and
> > lock contention.
> > Hitting the spinlock each time a frame is queued, and each time a frame is
> > dequeued is the problem.

Yeah, we have to remove that skb_queue_tail call (which grabs a lock) in
sock_queue_rcv_skb; just as with TCP, we don't need it if we hold the
socket lock over all of udp_recvmsg. I started working on a patch to do
that while in Portland, and will continue now that I'm back home.

Also Herbert's GRO packet train idea will help here, so that we can push
all packets received in a NAPI period grabbing the socket lock just once,
and then have recvmmsg fill the iovec from there.
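
For illustration, a minimal user-space sketch of draining several
datagrams with one syscall. It assumes the recvmmsg(2) interface as it
later appeared in mainline Linux (2.6.33) and glibc, not the patch
under discussion here; recv_batch(), BATCH and FRAME_LEN are
hypothetical names chosen for this example.

```c
#define _GNU_SOURCE             /* recvmmsg() and struct mmsghdr */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define BATCH     8             /* hypothetical: max datagrams per call */
#define FRAME_LEN 2048          /* hypothetical: per-datagram buffer size */

/*
 * Receive up to BATCH datagrams from fd with a single syscall.
 * Returns the number of datagrams received (lens[i] holds the length
 * of datagram i), or -1 with errno set if none could be received.
 */
int recv_batch(int fd, char bufs[BATCH][FRAME_LEN], unsigned int lens[BATCH])
{
	struct mmsghdr msgs[BATCH];
	struct iovec iov[BATCH];
	int i, n;

	assert(fd >= 0);
	memset(msgs, 0, sizeof(msgs));

	/* One iovec per message: each datagram lands in its own buffer. */
	for (i = 0; i < BATCH; i++) {
		iov[i].iov_base = bufs[i];
		iov[i].iov_len = FRAME_LEN;
		msgs[i].msg_hdr.msg_iov = &iov[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}

	/* MSG_DONTWAIT: take whatever is queued now, do not block for more. */
	n = recvmmsg(fd, msgs, BATCH, MSG_DONTWAIT, NULL);

	for (i = 0; i < n; i++)
		lens[i] = msgs[i].msg_len;

	return n;
}
```

The caller pays the syscall entry cost once per batch instead of once
per datagram; how often the socket lock is taken underneath is a
kernel-side matter that this sketch does not show.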

> > We could imagine a two level receive queue, so that the application
> > touches the "provider cache line" only to transfer a batch of skbs, when/if the first queue
> > is empty.
> > 
> I think this is (in a sense) what acme's patch did, only using 1 queue.  A
> single acquisition of the socket lock allows for a dequeue of N frames, where N
> is equal to min(number of frames requested from user space, number of frames on
> queue).  That reduces lock contention by reducing the number of times it needs
> to be acquired.
> 
> Of course, that's all a thought experiment, I'd certainly defer to evidence to the
> contrary.  Either way, the patches are either on netdev (or may already be in
> the net-next tree), I don't quite recall.

Not yet on any tree, but Dave said he would look at the patches soon.
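
To make the single-queue batching Neil describes above concrete, here
is a rough user-space analogy, not kernel code and not taken from the
patches. A pthread mutex stands in for the socket lock, and struct
frame, struct frame_queue and the fq_* helpers are hypothetical names
invented for this sketch.

```c
#include <pthread.h>
#include <stddef.h>

struct frame {
	struct frame *next;
	int id;
};

struct frame_queue {
	pthread_mutex_t lock;   /* stand-in for the socket lock */
	struct frame *head, *tail;
	size_t len;
};

void fq_init(struct frame_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->head = q->tail = NULL;
	q->len = 0;
}

/* Producer side: one lock acquisition per frame queued. */
void fq_enqueue(struct frame_queue *q, struct frame *f)
{
	f->next = NULL;
	pthread_mutex_lock(&q->lock);
	if (q->tail)
		q->tail->next = f;
	else
		q->head = f;
	q->tail = f;
	q->len++;
	pthread_mutex_unlock(&q->lock);
}

/*
 * Consumer side: one lock acquisition for the whole batch.
 * Moves N = min(want, frames on queue) frames into out[] in FIFO
 * order and returns N.
 */
size_t fq_dequeue_batch(struct frame_queue *q, struct frame **out, size_t want)
{
	size_t n = 0;

	pthread_mutex_lock(&q->lock);
	while (n < want && q->head) {
		struct frame *f = q->head;

		q->head = f->next;
		f->next = NULL;
		out[n++] = f;
		q->len--;
	}
	if (!q->head)
		q->tail = NULL;
	pthread_mutex_unlock(&q->lock);

	return n;
}
```

The consumer takes the lock once per batch rather than once per frame,
but the producer still takes it for every frame queued, which is the
part Eric's two level queue idea is aimed at.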
 
> > See previous thread from Christoph Lameter, that showed that something as simple
> > as a UDP echoer (of say 100 frames per second) is worse with current kernels.
> > 
> > I am not saying the mmap() thing is the way to go, maybe a new ring buffer interface could
> > give us a faster af_packet, not requiring pre-allocation of huge zones
> > (__get_free_pages() of insane order pages. This is just not working !!!),
> > allowing splice() or things like that...
> > 
> I don't think acme's patch did that, all it did was allow for the reception of
> multiple skbs using 1 syscall.

- Arnaldo