linux-kernel - Re: Bug 4.1.16: self-detected stall in net/unix/?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <56B4BF9D.9070609@pmhahn.de>
Date:	Fri, 5 Feb 2016 16:28:29 +0100
From:	Philipp Hahn <pmhahn@...ahn.de>
To:	Hannes Frederic Sowa <hannes@...essinduktion.org>,
	Sasha Levin <sasha.levin@...cle.com>,
	Rainer Weikusat <rweikusat@...ileactivedefense.com>,
	Andrey Vagin <avagin@...nvz.org>,
	Aaron Conole <aconole@...heb.org>,
	"David S. Miller" <davem@...emloft.net>,
	linux-kernel@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>
Cc:	Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: Bug 4.1.16: self-detected stall in net/unix/?

Hello Hannes,

thank you for your previous reply.

Am 03.02.2016 um 02:43 schrieb Hannes Frederic Sowa:
> On 02.02.2016 17:25, Philipp Hahn wrote:
>> we recently updated our kernel to 4.1.16 + patch for "unix: properly
>> account for FDs passed over unix sockets" and have since then
>> self-detected stalls triggered by the Samba daemon:
...
>> We have not yet been able to reproduce the hang, but going back to our
>> previous kernel 4.1.12 makes the problem go away.
> 
> Can you remove the patch "unix: properly account for FDs passed over
> unix sockets" and see if the problem still happens?

I will try.
The problem is that I can't trigger the bug reliably. It always happens
to "smbd", but I don't know the triggering condition.

> I couldn't quickly see any problems with your added patch. I currently
> suspect a tight loop because of a SOCK_DEAD flag set but the socket not
> removed from unix_socket_table or the vfs. Hmmm...

All occurrences of the bug show _raw_spin_lock() as the sleeping
function, never anything other thus far.
I have seen the bug both on amd64 and i386.

If it would spin in
> 1097 restart:
> 1098 »···»···other = unix_find_other(net, sunaddr, alen, sock->type, hash, &err);
> 1099 »···»···if (!other)
> 1100 »···»···»···goto out;
> 1101 
> 1102 »···»···unix_state_double_lock(sk, other);
> 1103 
> 1104 »···»···/* Apparently VFS overslept socket death. Retry. */
> 1105 »···»···if (sock_flag(other, SOCK_DEAD)) {
> 1106 »···»···»···unix_state_double_unlock(sk, other);
> 1107 »···»···»···sock_put(other);
> 1108 »···»···»···goto restart;
> 1109 »···»···}

I would expect some other function to show up in the trace, but it's
always the call chain "unix_dgram_connect -> unix_state_double_lock ->
_raw_spin_lock".

The only case I see is that unix_find_other() returns the same "other"
again and again.

Q: How will that dead "other" be garbage collected finally? The kernel is
> # CONFIG_PREEMPT_NONE is not set
> CONFIG_PREEMPT_VOLUNTARY=y
> # CONFIG_PREEMPT is not set


During my hunt I found the following difference between 4.4 and 4.1.16
in net/unix/af_unix.c:

> commit 1586a5877db9eee313379738d6581bc7c6ffb5e3
> Author: Eric Dumazet <edumazet@...gle.com>
> Date:   Fri Oct 23 10:59:16 2015 -0700
> 
>     af_unix: do not report POLLOUT on listeners
>     
>     poll(POLLOUT) on a listener should not report fd is ready for
>     a write().
>     
>     This would break some applications using poll() and pfd.events = -1,
>     as they would not block in poll()
>     
>     Signed-off-by: Eric Dumazet <edumazet@...gle.com>
>     Reported-by: Alan Burlison <Alan.Burlison@...cle.com>
>     Tested-by: Eric Dumazet <edumazet@...gle.com>
>     Signed-off-by: David S. Miller <davem@...emloft.net>
...
> -static inline int unix_writable(struct sock *sk)
> +static int unix_writable(const struct sock *sk)
>  {
> -       return (atomic_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf;
> +       return sk->sk_state != TCP_LISTEN &&
> +              (atomic_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf;
>  }

Q: That *TCP*_LISTEN in net/*unix*/ looks strange to my eyes, but I'm
not a kernel guru (yet).

There is one hunk in 5c77e26862ce604edea05b3442ed765e9756fe0f, which
uses the result of that function, which might explain the difference,
why it shows with 4.1.16 but not with 4.4?

> @@ -2245,14 +2388,16 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock,
>         writable = unix_writable(sk);
> -       other = unix_peer_get(sk);
> -       if (other) {
> -               if (unix_peer(other) != sk) {
> -                       sock_poll_wait(file, &unix_sk(other)->peer_wait, wait);
> -                       if (unix_recvq_full(other))
> -                               writable = 0;
> -               }
> -               sock_put(other);
> +       if (writable) {
> +               unix_state_lock(sk);
> +
> +               other = unix_peer(sk);
> +               if (other && unix_peer(other) != sk &&
> +                   unix_recvq_full(other) &&
> +                   unix_dgram_peer_wake_me(sk, other))
> +                       writable = 0;
> +
> +               unix_state_unlock(sk);
>         }
>         if (writable)


I will for now build a new kernel with
> $ git log --oneline  v4.1.12..v4.1.17 -- net/unix
> dc6b0ec unix: properly account for FDs passed over unix sockets
> cc01a0a af_unix: Revert 'lock_interruptible' in stream receive code
> 5c77e26 unix: avoid use-after-free in ep_remove_wait_queue
reverted to see if it still happens. The "middle" patch seems harmless,
as it only changes a code path for STREAMS, while the bug triggers with
DGRAMS only.

> The stack trace is rather unreliable, maybe something completely
> different happend. Do you happend to see better reports?

So far they look all the same.
Anything more I can do to prepare for collection better information next
time I get that bug?

Thanks again.

Philipp