Message-ID: <AANLkTikPvJLEZetB-CbVSO0VuYLnAQz5y8D_F0HGML=5@mail.gmail.com>
Date:	Wed, 5 Jan 2011 08:27:01 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	David Miller <davem@...emloft.net>,
	Network Development <netdev@...r.kernel.org>,
	Jeremy Fitzhardinge <jeremy@...p.org>,
	James Morris <jmorris@...ei.org>
Subject: Gaah: selinux_socket_unix_stream_connect oops

This was actually a regression entry, but only ever reported once by
Jeremy, I think. So it was basically ignored as not being very common
and there not being any hints about what causes it.

But after doing the 2.6.37 release, and intending to put it on all the
machines I have access to, guess what I find on the kids' computer?
Right.

It must be a reasonably rare race condition, because that computer had
been up for three weeks or so (since middle of December), but
yesterday evening it crashed due to that thing.

The code disassembly is

  13:	55                   	push   %ebp
  14:	89 e5                	mov    %esp,%ebp
  16:	57                   	push   %edi
  17:	8d 7d 90             	lea    -0x70(%ebp),%edi
  1a:	56                   	push   %esi
  1b:	53                   	push   %ebx
  1c:	83 ec 6c             	sub    $0x6c,%esp
  1f:	8b 40 14             	mov    0x14(%eax),%eax
  22:	8b 52 14             	mov    0x14(%edx),%edx
  25:	8b 98 58 01 00 00    	mov    0x158(%eax),%ebx
  2b:*	8b 82 58 01 00 00    	mov    0x158(%edx),%eax     <-- trapping instruction
  31:	89 45 8c             	mov    %eax,-0x74(%ebp)
  34:	31 c0                	xor    %eax,%eax
  36:	8b b1 58 01 00 00    	mov    0x158(%ecx),%esi
  3c:	89 7d 88             	mov    %edi,-0x78(%ebp)

which means that it's "other->sk" that is NULL, which I think matches
Jeremy's case exactly.
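
To spell out that reading of the disassembly: the load at 0x22 replaces
%edx with the pointer stored at offset 0x14 of the second argument, and
the trapping load at 0x2b then reads offset 0x158 from that pointer. A
minimal sketch in user-space C, assuming that %edx holds "other" on
entry (32-bit x86 kernels are normally built with register arguments)
and that offset 0x14 is the "sk" field. The macro and function names
are made up for illustration and are not kernel identifiers:

```c
#include <stdint.h>

/* Offsets copied from the disassembly above (32-bit x86 build). */
#define SOCKET_SK_OFFSET 0x14u   /* mov 0x14(%edx),%edx : assumed to be other->sk */
#define SOCK_SEC_OFFSET  0x158u  /* mov 0x158(%edx),%eax: assumed per-sock security field */

/*
 * Address touched by the trapping load at 0x2b, given the value that
 * the load at 0x22 left in %edx (assumed to be other->sk).
 */
static uintptr_t trapping_load_address(uintptr_t other_sk)
{
    return other_sk + SOCK_SEC_OFFSET;
}
```

With a NULL "other->sk" the access lands at address 0x158, so an oops
from this instruction would be expected to report a NULL pointer
dereference at about that address.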

The logs have a hint: this seems to have coincided with the
console-kit-daemon giving a warning like:

  WARNING: Couldn't read /proc/13585/environ: Failed to open file '/proc/13585/environ': No such file or directory

and then NetworkManager having a bunch of authentication warnings that
end with

  Could not get UID of name ':1.3871': no such name

(full text in the attachment).

So I wonder if there is some subtle race that happens when one end of
a unix domain socket attaches just as another end disconnects?
Especially as "security_unix_stream_connect()" is called before the
whole connect sequence is really final. It's generally
"unix_release()" that sets 'sock->sk' to NULL.
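
The suspected ordering can be modelled sequentially in user space. This
is only an illustration under assumptions: the "model_" types and
functions are hypothetical stand-ins, not kernel code, and the two
steps run one after the other instead of racing on two CPUs. It also
only detects the bad state; it is not a proposed fix:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct sock / struct socket. */
struct model_sock {
    int security_id;            /* stands in for the per-sock security data */
};

struct model_socket {
    struct model_sock *sk;      /* cleared when the socket is released */
};

/* Models the release path setting sock->sk to NULL. */
static void model_release(struct model_socket *sock)
{
    sock->sk = NULL;
}

/*
 * Returns 1 if a hook that reads sock->sk->... and other->sk->...
 * would dereference a NULL pointer in the current state, else 0.
 */
static int model_hook_would_oops(const struct model_socket *sock,
                                 const struct model_socket *other)
{
    return sock->sk == NULL || other->sk == NULL;
}
```

If the release of the listening side is modelled before the hook runs,
"model_hook_would_oops()" reports the NULL "other->sk" that the
disassembly points at.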

Btw, why do we pass in "sock" and "other->sk_socket" ("struct
socket"), when it appears that what the security code really wants is
"struct sock" (which would be "sk" and "other" in the caller)? The
calling convention seems to result in (a) this NULL pointer thing and
(b) all these extra dereferences.
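
A rough user-space sketch of the two calling conventions, to make the
difference concrete. Everything here is hypothetical: the "demo_" types
and hooks are simplified stand-ins, the equality test is a placeholder
for whatever permission check the security code performs, and the
explicit NULL test exists only so the model can report where the real
code would fault:

```c
#include <stddef.h>

struct demo_socket;

/* Simplified stand-in for struct sock. */
struct demo_sock {
    int security_id;                /* placeholder for security data */
    struct demo_socket *sk_socket;  /* back-pointer to the struct socket */
};

/* Simplified stand-in for struct socket. */
struct demo_socket {
    struct demo_sock *sk;           /* NULL once the socket is released */
};

/*
 * Convention described above: the caller passes struct socket pointers
 * and the hook walks back to ->sk. Returns -1 where the real code
 * would dereference NULL.
 */
static int demo_hook_via_socket(const struct demo_socket *sock,
                                const struct demo_socket *other)
{
    if (!sock || !other || !sock->sk || !other->sk)
        return -1;
    return sock->sk->security_id == other->sk->security_id;
}

/*
 * Alternative: the caller passes the struct sock pointers it already
 * holds, so there is no trip through ->sk_socket->sk.
 */
static int demo_hook_via_sock(const struct demo_sock *sk,
                              const struct demo_sock *other)
{
    return sk->security_id == other->security_id;
}
```

In this model, once the peer's "struct socket" has had its "sk" cleared,
the first hook hits the NULL case while the second still works from the
"struct sock" the caller holds, with one dereference per argument
instead of two.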

Comments? Ideas?

                              Linus

View attachment "kids.txt" of type "text/plain" (5215 bytes)
