Date:   Tue, 4 May 2021 23:54:18 -0700
From:   Martin KaFai Lau <kafai@...com>
To:     Kuniyuki Iwashima <kuniyu@...zon.co.jp>, <edumazet@...gle.com>,
        <jbaron@...mai.com>
CC:     <andrii@...nel.org>, <ast@...nel.org>, <benh@...zon.com>,
        <bpf@...r.kernel.org>, <daniel@...earbox.net>,
        <davem@...emloft.net>, <kuba@...nel.org>, <kuni1840@...il.com>,
        <linux-kernel@...r.kernel.org>, <netdev@...r.kernel.org>
Subject: Re: [PATCH v4 bpf-next 00/11] Socket migration for SO_REUSEPORT.

On Thu, Apr 29, 2021 at 12:16:09PM +0900, Kuniyuki Iwashima wrote:
[ ... ]

> > > > It may be but perhaps its more flexible? It gives the new server the
> > > > chance to re-use the existing listen fds, close, drain and/or start new
> > > > ones. It also addresses the non-REUSEPORT case where you can't bind right
> > > > away.

> > > If the flexibility is really worth the complexity, we do not care about it.
> > > But, SO_REUSEPORT can give enough flexibility we want.
> > >
> > > With socket migration, there is no need to reuse listener (fd passing),
> > > drain children (incoming connections are automatically migrated if there is
> > > already another listener bind()ed), and of course another listener can
> > > close itself and migrated children.
> > >
> > > If two different approaches resolves the same issue and one does not need
> > > complexity in userspace, we select the simpler one.

> > 
> > Kernel bloat and complexity is _not_ the simplest choice.
> > 
> > Touching a complex part of TCP stack is quite risky.

> 
> Yes, we understand that is not a simple decision and your concern. So many
> reviews are needed to see if our approach is really risky or not.

If fd passing is sufficient for a set of use cases, that is great.

However, it does not work well for everyone.  Nor are we saying
that SO_REUSEPORT (+ optional bpf) is better in all cases.

After SO_REUSEPORT was added, some people moved from fd passing
to SO_REUSEPORT and now have one bpf policy to select the sk for
both TCP and UDP.

Since SO_REUSEPORT was first added, there have been multiple contributions
from different people and companies.  For example, first adding bpf
support to UDP, then to TCP, then a much more flexible way to select sk
from reuseport_array, and then sock_map/sock_hash support.  That is another
perspective showing that people find it useful.  Each of those contributions
also changed the kernel code for practical use cases.

This set is an extension/improvement to address a gap in SO_REUSEPORT
when some of the sk are closed.  Patches 2 to 4 are the prep work
in sock_reuseport.c and they have the most changes in this set.
Patches 5 to 7 are the changes in tcp.  The code has been structured
to be as isolated as possible.  It would be most useful to at least
review and get feedback on this part.  The rest is bpf related.
