netdev - Re: Killing sk->sk_callback


Message-ID: <4857A334.5020501@gmail.com>
Date:	Tue, 17 Jun 2008 07:42:44 -0400
From:	Gregory Haskins <gregory.haskins@...il.com>
To:	David Miller <davem@...emloft.net>
CC:	herbert@...dor.apana.org.au, pmullaney@...ell.com,
	chuck.lever@...cle.com, netdev@...r.kernel.org,
	Gregory Haskins <ghaskins@...ell.com>
Subject: Re: Killing sk->sk_callback_lock

>>> On Tue, Jun 17, 2008 at 12:56 AM, in message
<20080616.215632.119969915.davem@...emloft.net>, David Miller
<davem@...emloft.net> wrote:
> From: "Gregory Haskins" <ghaskins@...ell.com>
> Date: Mon, 16 Jun 2008 22:01:13 -0600
> 
>> This seemed odd to us, so we investigated further to see if an
>> improvement was lurking or whether this was expected.  We traced
>> back the source of each wakeup to be coming from 1) the wmem/nospace
>> code, and 2) from the rx-wakeup code from the softirq.  First the
>> softirq would process the tx-completions which would wake_up() the
>> wait-queue for NOSPACE signaling.  Since the client was waiting for
>> a packet on the same wait-queue, this was where the first wakeup
>> came from.  Then later the softirq finally pushed an actual packet
>> to the queue, and the client was once again re-awoken via the same
>> overloaded wait-queue.  This time it would successfully find a
>> packet and return to userspace.
>>
>> Since the client does not care about wmem/nospace in the UDP rx
>> path, yet the two events share a single wait-queue, the first wakeup
>> was completely wasted.  It just causes extra scheduling activity
>> that does not help in any way (and is quite expensive in the
>> grand-scheme of things).  Based on this lead, Pat devised a solution
>> which eliminates the extra wake-up() when there are no clients
>> waiting for that particular NOSPACE event.  With his patch applied,
>> we observed two things:
> 
> Why is the application checking for receive packets even on the
> write-space wakeup?
> 
> poll/select/epoll should be giving the correct event indication,
> therefore the application would know to not check for receive
> packets when a write-wakeup event occurs.
> 
> Yes the wakeup is spurious and we should avoid it.  But this
> application is also buggy.

The application is blocked inside a system call (I forget which one
right now, probably recv()), so the wakeup is not against a
poll/select.  Rather, the kernel is in
net/core/datagram.c::wait_for_packet() (blocked on sk->sk_sleep).

Since both the wmem code and the rx code use sk->sk_sleep to wake up
waiters, the wmem processing inadvertently kicks the client to go
through __skb_recv_datagram() one more time.  And since there aren't yet
any packets in sk->sk_receive_queue, the client loops and once again
calls wait_for_packet().
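
To make the sequence concrete, here is a rough user-space model of it.
This is only a sketch and not kernel code or Pat's patch: model_sock,
deliver() and run_sequence() are hypothetical names invented for the
illustration, and the "shared_wakeup" flag simply stands in for
"write-space events wake whoever sleeps on the one wait-queue".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names are kernel APIs. */
enum event { EV_WRITE_SPACE, EV_RX_PACKET };

struct model_sock {
    int  rx_queue_len;     /* stands in for sk->sk_receive_queue */
    bool reader_waiting;   /* a task is blocked in the receive path */
    int  reader_wakeups;   /* times the blocked reader was scheduled */
    int  spurious_wakeups; /* wakeups that found no packet queued */
};

/*
 * Deliver one softirq event to the modelled socket.
 * shared_wakeup == true:  every event wakes the reader (one wait-queue).
 * shared_wakeup == false: only rx events wake the reader.
 */
static void deliver(struct model_sock *sk, enum event ev, bool shared_wakeup)
{
    assert(sk != NULL);

    if (ev == EV_RX_PACKET)
        sk->rx_queue_len++;

    bool wakes_reader = shared_wakeup || ev == EV_RX_PACKET;
    if (!sk->reader_waiting || !wakes_reader)
        return;

    sk->reader_wakeups++;
    if (sk->rx_queue_len == 0) {
        /* Nothing queued yet: the reader loops and sleeps again. */
        sk->spurious_wakeups++;
        return;
    }

    /* A packet is available: dequeue it and return to userspace. */
    sk->rx_queue_len--;
    sk->reader_waiting = false;
}

/* Replay the order seen in the trace: tx completion, then rx packet. */
static struct model_sock run_sequence(bool shared_wakeup)
{
    struct model_sock sk = { .rx_queue_len = 0, .reader_waiting = true };

    deliver(&sk, EV_WRITE_SPACE, shared_wakeup);
    deliver(&sk, EV_RX_PACKET, shared_wakeup);
    return sk;
}
```

In this model run_sequence(true) schedules the reader twice, with the
first wakeup finding an empty queue, while run_sequence(false)
schedules it once.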

So long story short:  This is entirely a kernel-space issue (unless you 
believe the usage of that system-call itself is a bug?)

HTH

Regards,
-Greg





