Message-ID: <D3F292ADF945FB49B35E96C94C2061B91257D6E1@nsmail.netscout.com>
Date: Tue, 5 Jul 2011 13:35:13 -0400
From: "Loke, Chetan" <Chetan.Loke@...scout.com>
To: "Eric Dumazet" <eric.dumazet@...il.com>
Cc: "Victor Julien" <victor@...iniac.net>,
"David Miller" <davem@...emloft.net>, <netdev@...r.kernel.org>
Subject: RE: [PATCH 2/2] packet: Add fanout support.
> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@...il.com]
> Sent: July 05, 2011 12:16 PM
> To: Loke, Chetan
> Cc: Victor Julien; David Miller; netdev@...r.kernel.org
> Subject: RE: [PATCH 2/2] packet: Add fanout support.
>
> On Tuesday, 5 July 2011 at 12:03 -0400, Loke, Chetan wrote:
> > > -----Original Message-----
> > > From: netdev-owner@...r.kernel.org [mailto:netdev-
> > > owner@...r.kernel.org] On Behalf Of Eric Dumazet
> > > Sent: July 05, 2011 3:00 AM
> > > To: Victor Julien
> > > Cc: David Miller; netdev@...r.kernel.org
> > > Subject: Re: [PATCH 2/2] packet: Add fanout support.
> > >
> > > On Tuesday, 5 July 2011 at 08:56 +0200, Victor Julien wrote:
> > >
> > > > Is this still also true for IP fragments?
> > > >
> > >
> > > This point was already raised. IP fragments have rxhash = 0, obviously,
> > > since we don't have full information (source / destination ports, for
> > > example).
> >
> > Can we not do something like:
> >
> > a = src_ip_addr;
> > b = dst_ip_addr;
> >
> > if (ip_is_fragment(ip_hdr(skb)))
> >         c = ip_hdr(skb)->id;
> > else
> >         c = src_port | dest_port; /* port_32 etc - similar to what we have today */
> >
> > /* swap a/b etc */
> > jhash_3words(a, b, c, hashrnd);
> >
> >
> >
>
> Sure, but non fragmented packets will then get a different rxhash.
>
> Remember, goal is that _all_ packets of a given flow end in same queue.
>
>
Sure, a lookup is needed (to steer what I call hot/cold flows); I proposed that on the oisf mailing list. Do we always use the ip_id bit then? Another problem that needs to be solved: what happens when some decoders are overloaded? How will this scheme work then, and how will we utilize the other CPUs? RPS is needed for sure.
If we maintain i) a per-port lookup table, ii) 2^20 flows per table, and iii) 16 bytes per flow (one can also squeeze that down to 8 bytes), then we need around 16MB of memory per port (2^20 x 16 bytes). That's not huge memory pressure for folks who want to use Linux for IPS/IDS sort of stuff.
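A rough sizing sketch for the per-port table described above. The struct layout, field names, and the flow_table_bytes() helper are assumptions made for illustration, not an existing kernel structure. Counting only the entries themselves, 2^20 slots of 16 bytes come to 16 MiB per port, and 8 MiB with 8-byte entries; chaining, buckets, or a second table would add to that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte flow entry; the field layout is an assumption. */
struct flow_entry {
    uint32_t saddr;   /* IPv4 source address */
    uint32_t daddr;   /* IPv4 destination address */
    uint16_t sport;   /* source port */
    uint16_t dport;   /* destination port */
    uint16_t queue;   /* decoder/queue this flow is steered to */
    uint8_t  proto;   /* IP protocol number */
    uint8_t  flags;   /* e.g. a hot/cold marker */
};

/* Fail the build if the entry grows past the 16 bytes assumed above. */
static_assert(sizeof(struct flow_entry) == 16, "flow entry must stay 16 bytes");

#define FLOW_TABLE_BITS 20u
#define FLOW_TABLE_SIZE ((size_t)1 << FLOW_TABLE_BITS)

/* Bytes needed for one per-port table of FLOW_TABLE_SIZE entries. */
static size_t flow_table_bytes(size_t entry_size)
{
    return FLOW_TABLE_SIZE * entry_size;
}
```

Calling flow_table_bytes(sizeof(struct flow_entry)) gives the per-port footprint for the 16-byte case; passing 8 gives the squeezed-down variant.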
User-space decoders end up copying the packet anyway, so fanout can be implemented in user space to achieve effective CPU utilization.
As long as we don't bounce between CPU sockets, we should be OK.
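A minimal sketch of what user-space fanout could look like, assuming a capture thread that hands each packet to one of n decoder threads. The flow_key struct, mix32(), flow_hash(), and pick_worker() are hypothetical names, and the mixer is a simple stand-in rather than the kernel's jhash. The endpoints are ordered before hashing (the "swap a/b" step) so that both directions of a flow land on the same decoder.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key; field names are illustrative only. */
struct flow_key {
    uint32_t saddr;
    uint32_t daddr;
    uint16_t sport;
    uint16_t dport;
};

/* Simple 32-bit integer mixer (a stand-in, not the kernel's jhash). */
static uint32_t mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Direction-independent hash: order the two endpoints, then mix. */
static uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t a = k->saddr, b = k->daddr;
    uint16_t pa = k->sport, pb = k->dport;
    uint32_t h;

    /* Swap address and port together so each endpoint stays paired. */
    if (a > b || (a == b && pa > pb)) {
        uint32_t ta = a;
        uint16_t tp = pa;
        a = b;  b = ta;
        pa = pb; pb = tp;
    }

    h = mix32(a);
    h = mix32(h ^ b);
    h = mix32(h ^ (((uint32_t)pa << 16) | pb));
    return h;
}

/* Map a flow to one of n_workers decoder threads. */
static unsigned pick_worker(const struct flow_key *k, unsigned n_workers)
{
    assert(n_workers > 0);
    return flow_hash(k) % n_workers;
}
```

The dispatcher would call pick_worker() per packet and enqueue to that decoder; keeping the decoders on the same socket as the capture thread is left to the thread-pinning setup. Fragments that carry no ports would still need the lookup discussed above.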
Chetan Loke