lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 24 Mar 2016 17:50:47 +0100
From:	Willy Tarreau <w@....eu>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Tolga Ceylan <tolga.ceylan@...il.com>,
	Tom Herbert <tom@...bertland.com>, cgallek@...gle.com,
	Josh Snyder <josh@...e406.com>,
	Aaron Conole <aconole@...heb.org>,
	"David S. Miller" <davem@...emloft.net>,
	Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as
 drain mode

On Thu, Mar 24, 2016 at 09:33:11AM -0700, Eric Dumazet wrote:
> > --- a/net/ipv4/inet_hashtables.c
> > +++ b/net/ipv4/inet_hashtables.c
> > @@ -189,6 +189,8 @@ static inline int compute_score(struct sock *sk, struct net *net,
> >                                 return -1;
> >                         score += 4;
> >                 }
> > +               if (sk->sk_reuseport)
> > +                       score++;
> 
> This wont work with BPF
> 
> >                 if (sk->sk_incoming_cpu == raw_smp_processor_id())
> >                         score++;
> 
> This one does not work either with BPF

But this *is* in 4.5. Does this mean that this part doesn't work anymore or
just that it's not usable in conjunction with BPF ? In this case I'm less
worried, because it would mean that we have a solution for non-BPF aware
applications and that BPF-aware applications can simply use BPF.

> Whole point of BPF was to avoid iterate through all sockets [1],
> and let user space use whatever selection logic it needs.
> 
> [1] This was okay with up to 16 sockets. But with 128 it does not scale.

Indeed.

> If you really look at how BPF works, implementing another 'per listener' flag
> would break the BPF selection.

OK.

> You can certainly implement the SO_REUSEPORT_LISTEN_OFF by loading an
> updated BPF, why should we add another way in the kernel to do the same,
> in a way that would not work in some cases ?

I don't try to reimplement something already available, but I'm confused
by a few points :
  - the code above already exists and you mention it cannot be used with BPF
  - for the vast majority of applications not using BPF, would the above *still*
    work (it worked in 4.4-rc at least)
  - it seems to me that for BPF to be usable on process shutting down, we'd
    need to have some form of central knowledge if the goal is to redefine
    how to distribute the load. In my case there are multiple independant
    processes forked on startup, so it's unclear to me how each of them could
    reconfigure BPF when shutting down without risking to break the other ones.
  - the doc makes me believe that BPF would require privileges to be unset, so
    that would not be compatible with a process shutting down which has already
    dropped its privileges after startup, but I could be wrong.

Thanks for your help on this,
Willy

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ