linux-kernel - Re: use-after-free in sock_wake

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87io4qevdp.fsf@doppelsaurus.mobileactivedefense.com>
Date:	Wed, 25 Nov 2015 01:10:58 +0000
From:	Rainer Weikusat <rweikusat@...ileactivedefense.com>
To:	Eric Dumazet <edumazet@...gle.com>
Cc:	Rainer Weikusat <rweikusat@...ileactivedefense.com>,
	Dmitry Vyukov <dvyukov@...gle.com>,
	Benjamin LaHaise <bcrl@...ck.org>,
	"David S. Miller" <davem@...emloft.net>,
	Hannes Frederic Sowa <hannes@...essinduktion.org>,
	Al Viro <viro@...iv.linux.org.uk>,
	David Howells <dhowells@...hat.com>,
	Ying Xue <ying.xue@...driver.com>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	netdev <netdev@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	syzkaller <syzkaller@...glegroups.com>,
	Kostya Serebryany <kcc@...gle.com>,
	Alexander Potapenko <glider@...gle.com>,
	Sasha Levin <sasha.levin@...cle.com>
Subject: Re: use-after-free in sock_wake_async

Eric Dumazet <edumazet@...gle.com> writes:
> On Tue, Nov 24, 2015 at 3:34 PM, Rainer Weikusat
> <rweikusat@...ileactivedefense.com> wrote:
>> Eric Dumazet <edumazet@...gle.com> writes:
>>> On Tue, Nov 24, 2015 at 6:18 AM, Dmitry Vyukov <dvyukov@...gle.com> wrote:
>>>> Hello,
>>>>
>>>> The following program triggers use-after-free in sock_wake_async:
>>
>> [...]
>>
>>>> void *thr1(void *arg)
>>>> {
>>>>         syscall(SYS_close, r2, 0, 0, 0, 0, 0);
>>>>         return 0;
>>>> }
>>>>
>>>> void *thr2(void *arg)
>>>> {
>>>>         syscall(SYS_write, r3, 0x20003000ul, 0xe7ul, 0, 0, 0);
>>>>         return 0;
>>>> }
>>
>> [...]
>>
>>>>         pthread_t th[3];
>>>>         pthread_create(&th[0], 0, thr0, 0);
>>>>         pthread_create(&th[1], 0, thr1, 0);
>>>>         pthread_create(&th[2], 0, thr2, 0);
>>>>         pthread_join(th[0], 0);
>>>>         pthread_join(th[1], 0);
>>>>         pthread_join(th[2], 0);
>>>>         return 0;
>>>> }
>>
>> [...]
>>
>>> Looks like commit 830a1e5c212fb3fdc83b66359c780c3b3a294897 should be reverted ?
>>>
>>> commit 830a1e5c212fb3fdc83b66359c780c3b3a294897
>>> Author: Benjamin LaHaise <benjamin.c.lahaise@...el.com>
>>> Date:   Tue Dec 13 23:22:32 2005 -0800
>>>
>>>     [AF_UNIX]: Remove superfluous reference counting in unix_stream_sendmsg
>>>
>>>     AF_UNIX stream socket performance on P4 CPUs tends to suffer due to a
>>>     lot of pipeline flushes from atomic operations.  The patch below
>>>     removes the sock_hold() and sock_put() in unix_stream_sendmsg().  This
>>>     should be safe as the socket still holds a reference to its peer which
>>>     is only released after the file descriptor's final user invokes
>>>     unix_release_sock().  The only consideration is that we must add a
>>>     memory barrier before setting the peer initially.
>>>
>>>     Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@...el.com>
>>>     Signed-off-by: David S. Miller <davem@...emloft.net>
>>
>> JFTR: This seems to be unrelated. (As far as I understand this), the
>> problem is that sk_wake_async accesses sk->sk_socket. That's invoked via
>> the
>>
>> other->sk_data_ready(other)
>>
>> in unix_stream_sendmsg after an
>>
>> unix_state_unlock(other);
>>
>> because of this, it can race with the code in unix_release_sock clearing
>> this pointer (via sock_orphan). The structure this pointer points to is
>> freed via iput in sock_release (net/socket.c) after the af_unix release
>> routine returned (it's really one part of a "twin structure" with the
>> socket inode being the other).
>>
>> A quick way to test if this was true would be to swap the
>>
>> unix_state_unlock(other);
>> other->sk_data_ready(other);
>>
>> in unix_stream_sendmsg and in case it is, a very 'hacky' fix could be to
>> put a pointer to the socket inode into the struct unix_sock, do an iget
>> on that in unix_create1 and a corresponding iput in
>> unix_sock_destructor.
>
> This is interesting, but is not the problem or/and the fix.
>
> We are supposed to own a reference on the 'other' socket or make sure
> it cannot disappear under us.

The af_unix part of this, yes, ie, what gets allocated in
unix_create1. But neither the socket inode nor the struct sock
originally passed to unix_create. Since these are part of the same
umbrella structure, they'll both be freed as consequence of the
sock_release iput. As far as I can tell (I don't claim that I'm
necessarily right on this, this is just the result of spending ca 2h
reading the code with the problem report in mind and looking for
something which could cause it), doing a sock_hold on the unix peer of
the socket in unix_stream_sendmsg is indeed not needed, however, there's
no additional reference to the inode or the struct sock accompanying it,
ie, both of these will be freed by unix_release_sock. This also affects
unix_dgram_sendmsg.

It's also easy to verify: Swap the unix_state_lock and
other->sk_data_ready and see if the issue still occurs. Right now (this
may change after I had some sleep as it's pretty late for me), I don't
think there's another local fix: The ->sk_data_ready accesses a
pointer after the lock taken by the code which will clear and
then later free it was released.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/