Message-ID: <AANLkTikPvJLEZetB-CbVSO0VuYLnAQz5y8D_F0HGML=5@mail.gmail.com>
Date: Wed, 5 Jan 2011 08:27:01 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: David Miller <davem@...emloft.net>,
Network Development <netdev@...r.kernel.org>,
Jeremy Fitzhardinge <jeremy@...p.org>,
James Morris <jmorris@...ei.org>
Subject: Gaah: selinux_socket_unix_stream_connect oops
This was actually a regression entry, but only ever reported once by
Jeremy, I think. So it was basically ignored as not being very common
and there not being any hints about what causes it.
But after doing the 2.6.37 release, and intending to put it on all the
machines I have access to, guess what I find on the kids' computer?
Right.
It must be a reasonably rare race condition, because that computer had
been up for three weeks or so (since middle of December), but
yesterday evening it crashed due to that thing.
The code disassembly is
  13:   55                      push   %ebp
  14:   89 e5                   mov    %esp,%ebp
  16:   57                      push   %edi
  17:   8d 7d 90                lea    -0x70(%ebp),%edi
  1a:   56                      push   %esi
  1b:   53                      push   %ebx
  1c:   83 ec 6c                sub    $0x6c,%esp
  1f:   8b 40 14                mov    0x14(%eax),%eax
  22:   8b 52 14                mov    0x14(%edx),%edx
  25:   8b 98 58 01 00 00       mov    0x158(%eax),%ebx
  2b:*  8b 82 58 01 00 00       mov    0x158(%edx),%eax     <-- trapping instruction
  31:   89 45 8c                mov    %eax,-0x74(%ebp)
  34:   31 c0                   xor    %eax,%eax
  36:   8b b1 58 01 00 00       mov    0x158(%ecx),%esi
  3c:   89 7d 88                mov    %edi,-0x78(%ebp)
which means that it's "other->sk" that is NULL, which I think matches
Jeremy's case exactly.
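For readers mapping the disassembly back to C, here is a minimal sketch of the two-step load that traps. The struct names are mock-ups with only the fields that matter here, not the real kernel layouts. Reading the field at offset 0x158 as "sk_security" is an assumption based on the hook's name, and the NULL check is added purely for illustration so the sketch returns instead of faulting.

```c
#include <stddef.h>

/* Mock-ups only: the real kernel structures have many more fields. */
struct mock_sock {
    void *sk_security;      /* assumed to be the field at offset 0x158 */
};

struct mock_socket {
    struct mock_sock *sk;   /* the field read at offset 0x14 */
};

/*
 * Mirrors the pair of loads applied to "other" in the listing:
 * first socket->sk, then sk->sk_security. Returns NULL where the
 * second load would have dereferenced a NULL "sk".
 */
void *mock_get_security(const struct mock_socket *socket)
{
    const struct mock_sock *sk = socket->sk;    /* mov 0x14(%edx),%edx */

    if (sk == NULL)
        return NULL;        /* illustration only: the trapping code has no check */
    return sk->sk_security; /* mov 0x158(%edx),%eax */
}
```

Calling mock_get_security() on a mock socket whose "sk" has been cleared returns NULL here; in the listing above the equivalent load at 2b is the one that traps.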
The logs have a hint: this seems to have coincided with the
console-kit-daemon giving a warning like:
WARNING: Couldn't read /proc/13585/environ: Failed to open file
'/proc/13585/environ': No such file or directory
and then NetworkManager having a bunch of authentication warnings that
end up being about
Could not get UID of name ':1.3871': no such name
(full text in the attachment).
So I wonder if there is some subtle race that happens when one end of
a unix domain socket attaches just as another end disconnects?
Especially as "security_unix_stream_connect()" is called before the
whole connect sequence is really final. It's generally
"unix_release()" that sets 'sock->sk' to NULL.
Btw, why do we pass in "sock" and "other->sk_socket" ("struct
socket"), when it appears that what the security code really wants is
"struct sock" (which would be "sk" and "other" in the caller)? The
calling convention seems to result in (a) this NULL pointer thing and
(b) all these extra dereferences.
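The difference between the two calling conventions can be sketched in user-space C as follows. Everything here is hypothetical: the mock_* names, the "secid" field, and the comparison the hooks perform are stand-ins, and mock_release() only models the one effect named above (clearing 'sock->sk'). The sketch also assumes the caller's "struct sock" pointers stay valid for the duration of the call, which it does not attempt to verify.

```c
#include <stddef.h>

/* Mock-ups only, not the kernel's definitions. */
struct mock_sock {
    int secid;                  /* hypothetical security label */
};

struct mock_socket {
    struct mock_sock *sk;
};

/* Models the one effect attributed to unix_release(): sock->sk becomes NULL. */
void mock_release(struct mock_socket *socket)
{
    socket->sk = NULL;
}

/*
 * Convention taking "struct socket": every access goes through
 * socket->sk, so a concurrent release leaves a NULL to trip over.
 * Returns -1 where the unchecked kernel code would have oopsed.
 */
int mock_check_via_socket(const struct mock_socket *sock,
                          const struct mock_socket *other)
{
    if (sock->sk == NULL || other->sk == NULL)
        return -1;
    return sock->sk->secid == other->sk->secid;
}

/*
 * Convention taking "struct sock": the caller hands over the pointers
 * it already holds, so there is no socket->sk load to race with and
 * one dereference fewer per argument.
 */
int mock_check_via_sock(const struct mock_sock *sk,
                        const struct mock_sock *other)
{
    return sk->secid == other->secid;
}
```

After mock_release() on the peer's mock socket, mock_check_via_socket() hits the NULL case, while mock_check_via_sock() still works on the "struct mock_sock" pointers the caller kept.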
Comments? Ideas?
Linus
View attachment "kids.txt" of type "text/plain" (5215 bytes)