Message-ID: <CALCETrWhsj+byMeT=WHb9eLR_vgsUe58Li8JDLaTvsSPWp1DKQ@mail.gmail.com>
Date: Fri, 14 Nov 2014 14:27:45 -0800
From: Andy Lutomirski <luto@...capital.net>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Tom Herbert <therbert@...gle.com>,
    Michael Kerrisk <mtk.manpages@...il.com>,
    David Miller <davem@...emloft.net>,
    netdev <netdev@...r.kernel.org>, Ying Cai <ycai@...gle.com>,
    Willem de Bruijn <willemb@...gle.com>,
    Neal Cardwell <ncardwell@...gle.com>,
    Linux API <linux-api@...r.kernel.org>
Subject: Re: [PATCH net-next] net: introduce SO_INCOMING_CPU

On Fri, Nov 14, 2014 at 2:24 PM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Fri, 2014-11-14 at 14:10 -0800, Andy Lutomirski wrote:
>
>> I have a bunch of threads that are pinned to various CPUs or groups of
>> CPUs. Each thread is responsible for a fixed set of flows. I'd like
>> those flows to go to those CPUs.
>>
>> RFS will eventually do it, but it would be nice if I could
>> deterministically ask for a flow to be routed to the right CPU. Also,
>> if my thread bounces temporarily to another CPU, I don't really need
>> the flow to follow it -- I'd like it to stay put.
>>
>> This has a significant benefit over using automatic steering: with
>> automatic steering, I have to make all of the hash tables have a size
>> around the square of the total number of the flows in order to make it
>> reliable.
>>
>> Something like SO_STEER_TO_THIS_CPU would be fine, as long as it
>> reported whether it worked (for my diagnostics).
>
> This requires some kind of hardware support, and unfortunately this is
> not generic.
>
> With SO_INCOMING_CPU, you can simply pass socket fds between threads,
> so that a dumb RSS multiqueue NIC is OK (assuming you are not using some
> encapsulation that the NIC is not able to parse to find L4 information)

I can't really do this. It means that the performance of my system
will be wildly different every time I restart it. I don't have enough
connections for everything to average out.
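
[Editorial sketch, not part of the original message.] The fd-passing scheme Eric describes for a plain RSS NIC could look roughly like the following in Python. The helper names (`incoming_cpu`, `pick_worker`, `accept_and_dispatch`) and the idea of one queue per pinned worker thread are assumptions made for illustration; only `accept()`, `getsockopt()` and `SO_INCOMING_CPU` come from the thread. The numeric fallback 49 is the asm-generic Linux value and is used only when the Python build does not export the constant.

```python
import socket

# Linux value from asm-generic/socket.h (x86, arm); a few architectures use
# a different number. Newer Python versions export the constant, older ones
# do not, hence the getattr fallback.
SO_INCOMING_CPU = getattr(socket, "SO_INCOMING_CPU", 49)


def incoming_cpu(sock):
    """Return the CPU the kernel recorded for this socket's RX path.

    Returns None when the option is unsupported or no CPU is recorded yet.
    """
    try:
        cpu = sock.getsockopt(socket.SOL_SOCKET, SO_INCOMING_CPU)
    except OSError:
        return None
    return cpu if cpu >= 0 else None


def pick_worker(cpu, workers_by_cpu, fallback):
    """Choose the worker pinned to `cpu`, or `fallback` if there is none."""
    if cpu is None:
        return fallback
    return workers_by_cpu.get(cpu, fallback)


def accept_and_dispatch(listener, workers_by_cpu, fallback):
    """Accept one connection and hand it to the worker for its RX CPU.

    Workers are assumed to expose put(), e.g. queue.Queue instances that
    pinned threads read from (hypothetical setup).
    """
    conn, _peer = listener.accept()
    worker = pick_worker(incoming_cpu(conn), workers_by_cpu, fallback)
    worker.put(conn)
    return worker
```

Note that this only follows whatever CPU the NIC's hash happened to pick for each flow; it does not choose the CPU, which is the objection raised in the reply above.
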
>
> Steering is a dream; I really think it's easier to build flows so that
> their RX queue matches your requirements.

I have supporting hardware :) I just want it to work without
programming the ntuple table myself.

>
> We usually can pick at least one element of the 4-tuple, so it's actually
> possible to get this before connect().
>

Hmm. An API for that would be quite nice :)

>
> Two cases:
>
> 1) Passive connections.
>
> After accept(), get SO_INCOMING_CPU, then pass the fd to the appropriate
> thread of your pool.
>
> 2) Active connections.
>
> Find a proper 4-tuple, bind(), then connect(). Eventually check
> SO_INCOMING_CPU to verify your expectations.

The people at the other end will be really pissed if that results in
lots of reconnections.

--Andy
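
[Editorial sketch, not part of the original message.] The active-connection case in the quoted text (bind(), connect(), then check SO_INCOMING_CPU) might be sketched as below. The function names `connect_from` and `landed_on` are hypothetical, and the step of finding a local port whose 4-tuple hashes to the wanted RX queue is deliberately left out, since it depends on the NIC's hash configuration, which the thread does not specify. The value 49 is the asm-generic fallback for builds that lack the constant.

```python
import socket

# 49 is the asm-generic Linux value; used only if Python lacks the constant.
SO_INCOMING_CPU = getattr(socket, "SO_INCOMING_CPU", 49)


def connect_from(local_addr, remote_addr, timeout=5.0):
    """bind() to a chosen local (ip, port), then connect() to the remote."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        sock.bind(local_addr)
        sock.connect(remote_addr)
    except OSError:
        sock.close()
        raise
    return sock


def landed_on(sock, wanted_cpu):
    """True/False if the kernel reports an RX CPU, None if it cannot tell."""
    try:
        cpu = sock.getsockopt(socket.SOL_SOCKET, SO_INCOMING_CPU)
    except OSError:
        return None
    if cpu < 0:
        return None
    return cpu == wanted_cpu
```

If `landed_on()` returns False, the only remedy within this scheme is to close the socket and reconnect from a different local port, which is the reconnection cost objected to above.
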
>
>
>
--
Andy Lutomirski
AMA Capital Management, LLC