Message-ID: <CA+FuTSdV0mAZ+-GzikjTJWMxW70q4DLSKAaKu8hXMeoFCoWSWg@mail.gmail.com>
Date: Wed, 4 Sep 2019 10:23:29 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Mark KEATON <mark.keaton@...theon.com>,
Steve Zabele <zabele@...cast.net>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
Network Development <netdev@...r.kernel.org>,
"shum@...ndrew.org" <shum@...ndrew.org>,
"vladimir116@...il.com" <vladimir116@...il.com>,
"saifi.khan@...ikr.in" <saifi.khan@...ikr.in>,
Daniel Borkmann <daniel@...earbox.net>,
"on2k16nm@...il.com" <on2k16nm@...il.com>,
Stephen Hemminger <stephen@...workplumber.org>
Subject: Re: Is bug 200755 in anyone's queue??
On Wed, Sep 4, 2019 at 8:23 AM Eric Dumazet <eric.dumazet@...il.com> wrote:
>
>
>
> On 9/4/19 2:00 PM, Mark KEATON wrote:
> > Hi Willem,
> >
> > I am the person who commented on the original bug report in bugzilla.
> >
> > In communicating with Steve just now about possible solutions that maintain the efficiency that you are after, what would you think of the following: keep two lists of UDP sockets, those connected and those not connected, and always searching the connected list first.
>
> This was my suggestion.
>
> Note that this requires adding yet another hash table, and yet another lookup
> (another cache line miss per incoming packet)
>
> This lookup will slow down DNS and QUIC servers, or any application solely using not connected sockets.
Exactly.

The only way around it that I see is to keep the single list and
optionally mark a struct sock_reuseport as having no connected
members, in which case the search can break on the first reuseport
match, as it does today.
"
On top of the main patch it requires something like
@@ -22,6 +22,7 @@ struct sock_reuseport {
 	/* ID stays the same even after the size of socks[] grows. */
 	unsigned int reuseport_id;
 	bool bind_inany;
+	unsigned int connected;
 	struct bpf_prog __rcu *prog; /* optional BPF sock selector */
 	struct sock *socks[0]; /* array of sock pointers */
 };
@@ -73,6 +74,15 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
 	sk_set_txhash(sk);
 	inet->inet_id = jiffies;
+	if (rcu_access_pointer(sk->sk_reuseport_cb)) {
+		struct sock_reuseport *reuse;
+
+		rcu_read_lock();
+		reuse = rcu_dereference(sk->sk_reuseport_cb);
+		reuse->connected = 1;
+		rcu_read_unlock();
+	}
+
 	sk_dst_set(sk, &rt->dst);
 	err = 0;
"
plus a way for reuseport_select_sock to tell its caller whether the
group has any connected members. Probably a variant
__reuseport_select_sock with an extra argument.
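
To make the control flow concrete, here is a rough userspace model of
the idea, not kernel code. Every model_* name is hypothetical, the
"hash % num_socks" selection is a stand-in for the real selection
logic, and the scoring is reduced to port and peer address. The only
pieces taken from the proposal above are the per-group "connected"
flag and a selector variant with an extra argument that reports it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only; these are NOT the kernel definitions. */
struct model_sock;

struct model_reuseport {
	unsigned int connected;		/* set once any member connects */
	unsigned int num_socks;
	struct model_sock *socks[8];
};

struct model_sock {
	uint16_t local_port;
	uint32_t peer_addr;		/* 0 means not connected */
	struct model_reuseport *reuse;	/* NULL if not in a group */
};

/*
 * Hypothetical selector variant: the extra argument tells the caller
 * whether the group may contain connected sockets.
 */
static struct model_sock *
model_reuseport_select(struct model_reuseport *reuse, uint32_t hash,
		       bool *may_have_conns)
{
	*may_have_conns = reuse->connected != 0;
	if (reuse->num_socks == 0)
		return NULL;
	return reuse->socks[hash % reuse->num_socks];
}

/* -1: no match, 1: unconnected port match, 2: port and peer match. */
static int model_score(const struct model_sock *sk, uint16_t port,
		       uint32_t src)
{
	if (sk->local_port != port)
		return -1;
	if (sk->peer_addr == 0)
		return 1;
	return sk->peer_addr == src ? 2 : -1;
}

static struct model_sock *
model_lookup(struct model_sock **chain, size_t n, uint16_t port,
	     uint32_t src, uint32_t hash)
{
	struct model_sock *result = NULL;
	int best = 0;

	for (size_t i = 0; i < n; i++) {
		struct model_sock *sk = chain[i];
		int score = model_score(sk, port, src);

		if (score <= best)
			continue;

		if (sk->reuse && sk->peer_addr == 0) {
			bool may_have_conns;
			struct model_sock *sel;

			sel = model_reuseport_select(sk->reuse, hash,
						     &may_have_conns);
			/* Fast path: no connected members, stop here. */
			if (sel && !may_have_conns)
				return sel;
		}

		/* Slow path: keep scanning for a better-scoring socket. */
		result = sk;
		best = score;
	}
	return result;
}
```

With the flag clear, the lookup returns on the first reuseport match,
so groups made up only of unconnected sockets pay no extra cost. Once
a member has connected, the scan continues and a socket connected to
the packet's source can win over the unconnected ones.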
As for BPF: the example I pointed out does read IP addresses and uses
a BPF map for socket selection. But as that feature is new with 4.19
it is probably moot for this purpose, as we are targeting a fix that
can be backported to 4.19 stable.