[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87blsxngvh.fsf@cloudflare.com>
Date: Wed, 27 Nov 2019 22:34:10 +0100
From: Jakub Sitnicki <jakub@...udflare.com>
To: Martin Lau <kafai@...com>
Cc: John Fastabend <john.fastabend@...il.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
"bpf\@vger.kernel.org" <bpf@...r.kernel.org>,
"netdev\@vger.kernel.org" <netdev@...r.kernel.org>,
"kernel-team\@cloudflare.com" <kernel-team@...udflare.com>
Subject: Re: [PATCH bpf-next 5/8] bpf: Allow selecting reuseport socket from a SOCKMAP
On Tue, Nov 26, 2019 at 08:03 PM CET, Martin Lau wrote:
> On Tue, Nov 26, 2019 at 03:30:57PM +0100, Jakub Sitnicki wrote:
>> On Mon, Nov 25, 2019 at 11:07 PM CET, Martin Lau wrote:
>> > On Mon, Nov 25, 2019 at 11:40:41AM +0100, Jakub Sitnicki wrote:
[...]
>> I agree, it's not obvious. When I first saw this check in
>> reuseport_array_update_check it got me puzzled too. I should have added
>> an explanatory comment there.
>>
>> Thing is we're not matching on just TCP_LISTEN. REUSEPORT_SOCKARRAY
>> allows selecting a connected UDP socket as a target as well. It takes
>> some effort to set up but it's possible even if obscure.
> How about this instead:
> if (!reuse)
> /* reuseport_array only has sk that has non NULL sk_reuseport_cb.
> * The only (!reuse) case here is, the sk has already been removed from
> * reuseport_array, so treat it as -ENOENT.
> *
> * Other maps (e.g. sock_map) do not provide this guarantee and the sk may
> * never be in the reuseport to begin with.
> */
> return map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY ? -ENOENT : -EINVAL;
Right, apart from established TCP sockets we must not select a listening
socket that's not in a reuseport group either. This covers both
cases. Clever. Thanks for the suggestion.
>
>>
>> > Note that the SOCK_RCU_FREE check at the 'slow-path'
>> > reuseport_array_update_check() is because reuseport_array does depend on
>> > call_rcu(&sk->sk_rcu,...) to work, e.g. the reuseport_array
>> > does not hold the sk_refcnt.
>>
>> Oh, so it's not only about socket state like I thought.
>>
>> This raises the question - does REUSEPORT_SOCKARRAY allow storing
>> connected UDP sockets by design or is it a happy accident? It doesn't
>> seem particularly useful.
> Not by design/accident on the REUSEPORT_SOCKARRAY side ;)
>
> The intention of REUSEPORT_SOCKARRAY is to allow sk that can be added to
> reuse->socks[].
Ah, makes sense. REUSEPORT_SOCKARRAY had to mimic reuseport groups.
-Jakub
Powered by blists - more mailing lists