Message-ID: <1482209536.1521.21.camel@edumazet-glaptop3.roam.corp.google.com>
Date:   Mon, 19 Dec 2016 20:52:16 -0800
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Josef Bacik <jbacik@...com>
Cc:     Tom Herbert <tom@...bertland.com>,
        David Miller <davem@...emloft.net>,
        Hannes Frederic Sowa <hannes@...essinduktion.org>,
        Craig Gallek <kraigatgoog@...il.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: Soft lockup in inet_put_port on 4.6

On Tue, 2016-12-20 at 03:40 +0000, Josef Bacik wrote:
> > On Dec 19, 2016, at 9:42 PM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> > 
> >> On Mon, 2016-12-19 at 18:07 -0800, Tom Herbert wrote:
> >> 
> >> When sockets created with SO_REUSEPORT move to TW state they are placed
> >> back on the tb->owners list. fastreuseport is no longer set, so we have
> >> to walk a potentially long list of sockets in tb->owners to open a new
> >> listener socket. I imagine this happens when we try to open a new
> >> SO_REUSEPORT listener after the system has been running a while and so
> >> we hit the long tb->owners list.
> > 
> > Hmm...  __inet_twsk_hashdance() does not change tb->fastreuse
> > 
> > So where is tb->fastreuse cleared?
> > 
> > If all your sockets have SO_REUSEPORT set, this should not happen.
> > 
> 
> The app starts out with no SO_REUSEPORT, and then we restart it with
> that option enabled.

But... why would the application do this dance?

I now better understand why we never had these issues...


> What I suspect is that we still have all the twsks from the original
> service, and the fastreuse stuff is cleared.  My naive patch resets it
> once we add a reuseport sk to the tb, and that makes the problem go away.
> I'm reworking all of this logic and adding some extra info to the tb to
> make the reset actually safe.  I'll send those patches out tomorrow.
> Thanks,

Okay, we will review them ;)

Note that Willy Tarreau wants some mechanism to be able to freeze a
listener, to allow haproxy to be replaced without closing any sessions.
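
For anyone following along, the scenario discussed above can be sketched
as a toy model. This is not kernel code: the class and function names, the
naive_reset switch, and the rule that the fast-path flag is only
initialised while the bucket is empty are simplifying assumptions made for
illustration. Conflict checks, uid matching and locking are left out. It
only counts how many tb->owners entries a bind would have to look at.

```python
from dataclasses import dataclass, field


@dataclass
class Sock:
    reuseport: bool
    state: str = "LISTEN"  # "LISTEN" or "TIME_WAIT" in this toy model


@dataclass
class BindBucket:
    """Toy stand-in for the per-port bucket (the thread's `tb`)."""
    owners: list = field(default_factory=list)
    fastreuseport: bool = False
    naive_reset: bool = False  # hypothetical switch modelling the "naive patch"
    walked: int = 0            # owners entries examined on the slow path

    def bind(self, sk: Sock) -> None:
        fast_path = self.fastreuseport and sk.reuseport
        if self.owners and not fast_path:
            # Slow path: every existing owner would have to be inspected.
            self.walked += len(self.owners)
        if not self.owners:
            # Assumption: the flag is only initialised on an empty bucket.
            self.fastreuseport = sk.reuseport
        elif self.fastreuseport and not sk.reuseport:
            self.fastreuseport = False
        elif self.naive_reset and sk.reuseport:
            # Assumption: the naive patch re-arms the flag here.
            self.fastreuseport = True
        self.owners.append(sk)


def simulate(n_timewait: int, n_listeners: int, naive_reset: bool) -> int:
    """Return how many owners entries the restarted listeners walk."""
    tb = BindBucket(naive_reset=naive_reset)
    old = Sock(reuseport=False)
    tb.bind(old)  # original service, no SO_REUSEPORT
    # Its connections end up as TIME_WAIT entries that stay on the list.
    tb.owners.extend(Sock(reuseport=False, state="TIME_WAIT")
                     for _ in range(n_timewait))
    tb.owners.remove(old)  # the old listener goes away on restart
    for _ in range(n_listeners):
        tb.bind(Sock(reuseport=True))  # restarted with SO_REUSEPORT
    return tb.walked
```

With 1000 leftover TIME_WAIT entries and 4 new listeners, simulate(1000, 4,
False) returns 4006 in this model, while simulate(1000, 4, True) returns
1000, since only the first bind takes the slow path once the flag is reset.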


