Message-ID: <20101025233558.GA30118@hmsreliant.think-freely.org>
Date: Mon, 25 Oct 2010 19:35:58 -0400
From: Neil Horman <nhorman@...driver.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: netdev@...r.kernel.org, davem@...emloft.net, jpirko@...hat.com
Subject: Re: [PATCH] Enhance AF_PACKET implementation to not require high
order contiguous memory allocation
On Tue, Oct 26, 2010 at 12:30:56AM +0200, Eric Dumazet wrote:
> On Monday, 25 October 2010 at 18:14 -0400, nhorman@...driver.com wrote:
> > I think I remember those changes and IIRC yes, tcpdump will make
> > several attempts to get buffers of an appropriate size. But while it
> > tries to do that it bogs the system trying to write out pagecache,
> > swap, etc. And that activity doesn't guarantee success. This doesn't
> > either, but getting 5 order 0 pages is far easier and less intrusive
> > to a loaded system than trying to get 1 order 4 chunk. That's all I'm
> > trying to accomplish here. Just making it easier to use af_packet
> > sockets without interfering with system performance.
> >
>
> Actually, using vmalloc() would probably hurt performance, because of
> extra TLB pressure.
>
> Of course, on recent x86 hardware you don't notice that much...
>
Exactly, you notice it a good deal less than you do the swapping that occurs if
you try to allocate a contiguous order 4 chunk of RAM. That will bog down the
system, even if the allocation ultimately fails.
> If not, why would af_packet use such a convoluted double array of
> 'compound pages'?
>
Gah! Because I have blinders on, apparently. The original implementation used
a ring of pointers, and apparently I was so focused on keeping with that
implementation, it never occurred to me to just use vmalloc. That was stupid of
me, I'll respin this and get rid of my idiocy.
> Also, on x86_32, vmalloc()/vmap() space is small (128 MB) so you might
> exhaust it pretty fast with several sniffers running.
>
You might, although (assuming no other significant users), 64K * 32 = 2MB.
You could run 10 sniffers and only consume about 15% of the vmalloc space.
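For reference, a rough userspace sketch of that arithmetic. The figures are assumptions taken from this thread (128 MB of vmalloc space on x86_32, rings of 32 blocks of 64K each), and the helper names ring_bytes() and vmalloc_percent() are made up for illustration, they are not anything in the kernel. Note that 64K * 32 works out to 2 MB exactly.

```c
#include <stddef.h>

/* Assumed figure from the discussion: 128 MB of vmalloc space on x86_32. */
#define VMALLOC_SPACE_BYTES (128ULL * 1024 * 1024)

/* Hypothetical helper: bytes of address space one ring needs. */
static unsigned long long ring_bytes(unsigned long long block_size,
				     unsigned long long nr_blocks)
{
	return block_size * nr_blocks;
}

/*
 * Hypothetical helper: percentage of the assumed vmalloc space used by
 * nr_sniffers rings of per_ring bytes each (integer, rounded down).
 */
static unsigned long long vmalloc_percent(unsigned long long nr_sniffers,
					  unsigned long long per_ring)
{
	return (nr_sniffers * per_ring * 100) / VMALLOC_SPACE_BYTES;
}
```

With those assumptions, ring_bytes(64 * 1024, 32) gives 2 MB per ring, and vmalloc_percent(10, ...) for ten such rings comes out to 15, assuming nothing else is using the space.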
> I would try a two-level approach: try to get high order pages, and
> fall back on low order pages, but normally libpcap does this for us?
>
>
It does, but it tries them in that order, which causes the problem I'm
describing: attempting to get a large high order allocation causes the system
to dig into swap and become unresponsive while it tries to assemble those
allocations. I would suggest a vmalloc, with a fallback to high order
allocation if that fails.
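To illustrate the ordering I mean, here is a minimal userspace sketch, not the patch itself. Everything in it is hypothetical: ring_alloc(), struct ring_buf and the stand-in allocators are made up for illustration, and malloc() stands in for both the vmalloc-style and the high order allocation, so only the try-first-then-fall-back control flow is modeled.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator signature; kernel allocators are not modeled. */
typedef void *(*ring_alloc_fn)(size_t size);

enum ring_source { RING_NONE, RING_VIRT, RING_CONTIG };

struct ring_buf {
	void *mem;
	enum ring_source source;	/* which allocator satisfied the request */
};

/* Stand-in for a virtually contiguous allocation (plain malloc here). */
static void *virt_alloc(size_t size) { return malloc(size); }

/* Stand-in for a physically contiguous high order allocation. */
static void *contig_alloc(size_t size) { return malloc(size); }

/* Stand-in that always fails, to exercise the fallback path. */
static void *failing_alloc(size_t size) { (void)size; return NULL; }

/* Try the cheap allocator first; only on failure try the expensive one. */
static struct ring_buf ring_alloc(size_t size, ring_alloc_fn first,
				  ring_alloc_fn second)
{
	struct ring_buf rb = { NULL, RING_NONE };

	rb.mem = first(size);
	if (rb.mem) {
		rb.source = RING_VIRT;
		return rb;
	}

	rb.mem = second(size);
	if (rb.mem)
		rb.source = RING_CONTIG;
	return rb;
}

/* Release the buffer and reset the bookkeeping. */
static void ring_free(struct ring_buf *rb)
{
	free(rb->mem);
	rb->mem = NULL;
	rb->source = RING_NONE;
}
```

The point of the ordering is that the expensive request is only issued once the cheap one has already failed, so a loaded system is not pushed into reclaim in the common case.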
I'll post a new patch shortly.
Neil