Message-ID: <874mgtn49l.fsf@doppelsaurus.mobileactivedefense.com>
Date: Tue, 10 Nov 2015 17:38:46 +0000
From: Rainer Weikusat <rweikusat@...ileactivedefense.com>
To: Jason Baron <jbaron@...mai.com>
Cc: Rainer Weikusat <rweikusat@...ileactivedefense.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
syzkaller <syzkaller@...glegroups.com>,
Michal Kubecek <mkubecek@...e.cz>,
Al Viro <viro@...iv.linux.org.uk>,
"linux-fsdevel\@vger.kernel.org" <linux-fsdevel@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
David Miller <davem@...emloft.net>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
David Howells <dhowells@...hat.com>,
Paul Moore <paul@...l-moore.com>, salyzyn@...roid.com,
sds@...ho.nsa.gov, ying.xue@...driver.com,
netdev <netdev@...r.kernel.org>,
Kostya Serebryany <kcc@...gle.com>,
Alexander Potapenko <glider@...gle.com>,
Andrey Konovalov <andreyknvl@...gle.com>,
Sasha Levin <sasha.levin@...cle.com>,
Julien Tinnes <jln@...gle.com>,
Kees Cook <keescook@...gle.com>,
Mathias Krause <minipli@...glemail.com>
Subject: Re: [PATCH] unix: avoid use-after-free in ep_remove_wait_queue

Jason Baron <jbaron@...mai.com> writes:
> On 11/09/2015 09:40 AM, Rainer Weikusat wrote:
[...]
>> -	if (unix_peer(other) != sk && unix_recvq_full(other)) {
>> +	if (!unix_dgram_peer_recv_ready(sk, other)) {
>>  		if (!timeo) {
>> -			err = -EAGAIN;
>> -			goto out_unlock;
>> +			if (unix_dgram_peer_wake_me(sk, other)) {
>> +				err = -EAGAIN;
>> +				goto out_unlock;
>> +			}
>> +
>> +			goto restart;
>>  		}
>
>
> So this will cause 'unix_state_lock(other) to be called twice in a
> row if we 'goto restart' (and hence will softlock the box). It just
> needs a 'unix_state_unlock(other);' before the 'goto restart'.
The goto restart was nonsense to begin with in this code path:
restarting is only necessary after sleeping for some time, but in the
case above, execution just continues. I've changed that (an updated
patch should follow 'soon') to:
	if (!unix_dgram_peer_recv_ready(sk, other)) {
		if (timeo) {
			timeo = unix_wait_for_peer(other, timeo);
			err = sock_intr_errno(timeo);
			if (signal_pending(current))
				goto out_free;
			goto restart;
		}

		if (unix_dgram_peer_wake_me(sk, other)) {
			err = -EAGAIN;
			goto out_unlock;
		}
	}
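
(For completeness, the idea behind unix_dgram_peer_wake_me is to
register on the peer's wait queue *before* re-testing the receive
queue, so that a wakeup occurring between the test and the -EAGAIN
return can't be lost. A rough sketch -- the two helpers it calls are
named illustratively here, this is not the actual patch text:)

/* Sketch only: register first, then re-test. Returning 1 means "still
 * full, the caller may block or return -EAGAIN knowing a wakeup will be
 * relayed"; returning 0 means "not (or no longer) full". */
static int unix_dgram_peer_wake_me(struct sock *sk, struct sock *other)
{
	int registered;

	/* illustrative helper: hook sk's wait-queue entry into the
	 * peer's peer_wait queue (no-op if it is already there) */
	registered = unix_dgram_peer_wake_connect(sk, other);

	/* re-check after registering */
	if (unix_recvq_full(other))
		return 1;

	/* queue drained in the meantime: drop the registration again
	 * so no spurious wakeup is relayed later */
	if (registered)
		unix_dgram_peer_wake_disconnect(sk, other);

	return 0;
}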
> I also tested this patch with a single unix server and 200 client
> threads doing roughly epoll() followed by write() until -EAGAIN in a
> loop. The throughput for the test was roughly the same as current
> upstream, but the cpu usage was a lot higher. I think its b/c this patch
> takes the server wait queue lock in the _poll() routine. This causes a
> lot of contention. The previous patch you posted for this where you did
> not clear the wait queue on every wakeup and thus didn't need the queue
> lock in poll() (unless we were adding to it), performed much better.
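
(As I read it, that test is a client loop roughly like the sketch below
-- plain epoll plus non-blocking send on an AF_UNIX SOCK_DGRAM socket;
the function name, buffer name and message size are made up:)

#include <sys/epoll.h>
#include <sys/socket.h>
#include <errno.h>

/* one client thread: wait until the socket polls writable, then send
 * datagrams until the next send would block again */
static void client_loop(int fd)	/* fd: connected AF_UNIX/SOCK_DGRAM socket */
{
	char buf[64] = { 0 };
	struct epoll_event ev = { .events = EPOLLOUT, .data.fd = fd };
	int ep = epoll_create1(0);

	epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev);

	for (;;) {
		/* block until the peer's receive queue has room again */
		if (epoll_wait(ep, &ev, 1, -1) < 0)
			break;

		/* write until -EAGAIN */
		while (send(fd, buf, sizeof(buf), MSG_DONTWAIT) >= 0)
			;

		if (errno != EAGAIN && errno != EWOULDBLOCK)
			break;	/* real error, give up */
	}
}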
I'm somewhat unsure what to make of that: The previous patch would also
take the wait queue lock whenever poll was about to return 'not
writable' because of the length of the server receive queue unless
another thread using the same client socket also noticed this and
enqueued this same socket already. And "hundreds of clients using a
single client socket in order to send data to a single server socket"
doesn't seem very realistic to me.
Also, this code shouldn't usually be executed, as the server should
normally be capable of keeping up with the data sent by clients. If it's
permanently incapable of that, you're effectively performing a
(successful) DDoS against it, which should result in "high CPU
utilization" in either case. It may be possible to improve this by
tuning/changing the flow control mechanism. Off the top of my head, I'd
suggest making the queue longer (the default value is 10) and delaying
wake-ups until the server has actually caught up, IOW, until the receive
queue is empty or almost empty. But this ought to be done with a
different patch.
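
(For reference, the "10" is the net.unix.max_dgram_qlen sysctl which,
if I remember correctly, is copied into sk_max_ack_backlog when the
socket is created, and the "queue full" test the flow control hangs off
is simply -- quoting net/unix/af_unix.c from memory:)

/* a peer counts as congested once its receive queue is longer than its
 * backlog, ie, net.unix.max_dgram_qlen (default 10) at creation time */
static inline int unix_recvq_full(struct sock const *sk)
{
	return skb_queue_len(&sk->sk_receive_queue) > sk->sk_max_ack_backlog;
}

"Making the queue longer" would thus just mean raising that sysctl on
the receiving side (affecting sockets created afterwards, AFAIK);
delaying the wake-ups until the queue is (almost) empty would need
actual code changes.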