[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <56B4BF9D.9070609@pmhahn.de>
Date: Fri, 5 Feb 2016 16:28:29 +0100
From: Philipp Hahn <pmhahn@...ahn.de>
To: Hannes Frederic Sowa <hannes@...essinduktion.org>,
Sasha Levin <sasha.levin@...cle.com>,
Rainer Weikusat <rweikusat@...ileactivedefense.com>,
Andrey Vagin <avagin@...nvz.org>,
Aaron Conole <aconole@...heb.org>,
"David S. Miller" <davem@...emloft.net>,
linux-kernel@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: Bug 4.1.16: self-detected stall in net/unix/?
Hello Hannes,
thank you for your previous reply.
Am 03.02.2016 um 02:43 schrieb Hannes Frederic Sowa:
> On 02.02.2016 17:25, Philipp Hahn wrote:
>> we recently updated our kernel to 4.1.16 + patch for "unix: properly
>> account for FDs passed over unix sockets" and have since then
>> self-detected stalls triggered by the Samba daemon:
...
>> We have not yet been able to reproduce the hang, but going back to our
>> previous kernel 4.1.12 makes the problem go away.
>
> Can you remove the patch "unix: properly account for FDs passed over
> unix sockets" and see if the problem still happens?
I will try.
The problem is that I can't trigger the bug reliably. It always happens
to "smbd", but I don't know the triggering condition.
> I couldn't quickly see any problems with your added patch. I currently
> suspect a tight loop because of a SOCK_DEAD flag set but the socket not
> removed from unix_socket_table or the vfs. Hmmm...
All occurrences of the bug show _raw_spin_lock() as the sleeping
function, never anything other thus far.
I have seen the bug both on amd64 and i386.
If it would spin in
> 1097 restart:
> 1098 »···»···other = unix_find_other(net, sunaddr, alen, sock->type, hash, &err);
> 1099 »···»···if (!other)
> 1100 »···»···»···goto out;
> 1101
> 1102 »···»···unix_state_double_lock(sk, other);
> 1103
> 1104 »···»···/* Apparently VFS overslept socket death. Retry. */
> 1105 »···»···if (sock_flag(other, SOCK_DEAD)) {
> 1106 »···»···»···unix_state_double_unlock(sk, other);
> 1107 »···»···»···sock_put(other);
> 1108 »···»···»···goto restart;
> 1109 »···»···}
I would expect some other function to show up in the trace, but it's
always the call chain "unix_dgram_connect -> unix_state_double_lock ->
_raw_spin_lock".
The only case I see is that unix_find_other() returns the same "other"
again and again.
Q: How will that dead "other" be garbage collected finally? The kernel is
> # CONFIG_PREEMPT_NONE is not set
> CONFIG_PREEMPT_VOLUNTARY=y
> # CONFIG_PREEMPT is not set
During my hunt I found the following difference between 4.4 and 4.1.16
in net/unix/af_unix.c:
> commit 1586a5877db9eee313379738d6581bc7c6ffb5e3
> Author: Eric Dumazet <edumazet@...gle.com>
> Date: Fri Oct 23 10:59:16 2015 -0700
>
> af_unix: do not report POLLOUT on listeners
>
> poll(POLLOUT) on a listener should not report fd is ready for
> a write().
>
> This would break some applications using poll() and pfd.events = -1,
> as they would not block in poll()
>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Reported-by: Alan Burlison <Alan.Burlison@...cle.com>
> Tested-by: Eric Dumazet <edumazet@...gle.com>
> Signed-off-by: David S. Miller <davem@...emloft.net>
...
> -static inline int unix_writable(struct sock *sk)
> +static int unix_writable(const struct sock *sk)
> {
> - return (atomic_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf;
> + return sk->sk_state != TCP_LISTEN &&
> + (atomic_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf;
> }
Q: That *TCP*_LISTEN in net/*unix*/ looks strange to my eyes, but I'm
not a kernel guru (yet).
There is one hunk in 5c77e26862ce604edea05b3442ed765e9756fe0f, which
uses the result of that function, which might explain the difference,
why it shows with 4.1.16 but not with 4.4?
> @@ -2245,14 +2388,16 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock,
> writable = unix_writable(sk);
> - other = unix_peer_get(sk);
> - if (other) {
> - if (unix_peer(other) != sk) {
> - sock_poll_wait(file, &unix_sk(other)->peer_wait, wait);
> - if (unix_recvq_full(other))
> - writable = 0;
> - }
> - sock_put(other);
> + if (writable) {
> + unix_state_lock(sk);
> +
> + other = unix_peer(sk);
> + if (other && unix_peer(other) != sk &&
> + unix_recvq_full(other) &&
> + unix_dgram_peer_wake_me(sk, other))
> + writable = 0;
> +
> + unix_state_unlock(sk);
> }
> if (writable)
I will for now build a new kernel with
> $ git log --oneline v4.1.12..v4.1.17 -- net/unix
> dc6b0ec unix: properly account for FDs passed over unix sockets
> cc01a0a af_unix: Revert 'lock_interruptible' in stream receive code
> 5c77e26 unix: avoid use-after-free in ep_remove_wait_queue
reverted to see if it still happens. The "middle" patch seems harmless,
as it only changes a code path for STREAMS, while the bug triggers with
DGRAMS only.
> The stack trace is rather unreliable, maybe something completely
> different happend. Do you happend to see better reports?
So far they look all the same.
Anything more I can do to prepare for collection better information next
time I get that bug?
Thanks again.
Philipp
Powered by blists - more mailing lists