Message-ID: <1267529273.2964.111.camel@edumazet-laptop>
Date:	Tue, 02 Mar 2010 12:27:53 +0100
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Mike Galbraith <efault@....de>
Cc:	netdev <netdev@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...e.hu>
Subject: Re: [rfc/rft][patch] should use scheduler sync hint in
 tcp_prequeue()?

On Tuesday, 02 March 2010 at 10:41 +0100, Mike Galbraith wrote:
> Greetings network land.
> 
> The reason for this query is that wake_affine() fails if there is one
> and only one task on a runqueue to encourage tasks spreading out, which
> increases cpu utilization.  However, for tasks which are communicating
> at high frequency, the cost of the resulting cache misses, should
> partners land in non-shared caches, is horrible to behold.  My Q6600 has
> shared caches, which may or may not be hit IFF something perturbs the
> system, and bounces partner to the right core.  That won't happen on a
> box with no shared caches of course, and even with shared caches
> available, the pain is highly visible in the TCP numbers below. 
> 
> The sync hint tells wake_affine() that the waker is likely going to
> sleep soonish, so it subtracts the waker from the load imbalance
> calculation, allowing the partner task to be awakened affine.  In the
> shared cache available case, that is also an enabler that the task be
> placed in an idle shared cache, which can increase throughput quite a
> bit (see .31 vs .33 AF UNIX), or may cost a bit if there is little to no
> execution overlap (see pipe).
> 
> Now, I _could_ change wake_affine() to globally succeed in the one task
> case, but am loath to do so because that may very well upset the delicate
> load balancing apple cart.  I think it's much safer to target the spot
> that I know hurts like hell.  Thoughts?
> 
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 34f5cc2..ba3fc64 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -939,7 +939,7 @@ static inline int tcp_prequeue(struct sock *sk, struct sk_buff *skb)
>  
>  		tp->ucopy.memory = 0;
>  	} else if (skb_queue_len(&tp->ucopy.prequeue) == 1) {
> -		wake_up_interruptible_poll(sk->sk_sleep,
> +		wake_up_interruptible_sync_poll(sk->sk_sleep,
>  					   POLLIN | POLLRDNORM | POLLRDBAND);
>  		if (!inet_csk_ack_scheduled(sk))
>  			inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK,
> 

I suspect this discussion is more of an LKML topic, but anyway...

This wake_up_interruptible_sync_poll() change might be good for loopback
communications (and it pleases tbench), but is it desirable for regular
multi-flow NIC traffic?

Ingo can probably answer this question, since he changed
sock_def_readable() (and others) in commit 6f3d09291b498299.
I suspect he missed the tcp_prequeue() case, but maybe not...

sched, net: socket wakeups are sync

'sync' wakeups are a hint towards the scheduler that (certain)
networking related wakeups likely create coupling between tasks.
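
To make the effect of the hint concrete, here is a rough userspace sketch
of the decision Mike describes. It is a toy model, not the kernel's
wake_affine(): the function name toy_wake_affine(), its parameters, and
the percentage-based comparison are assumptions made for illustration.
The only behaviour taken from the thread is that a sync wakeup subtracts
the waker from the load imbalance calculation.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model only (hypothetical, NOT the kernel implementation).
 *
 * this_load     load on the waker's CPU, including the waker itself
 * prev_load     load on the CPU where the wakee last ran (wakee asleep,
 *               so it is not counted)
 * waker_weight  load contributed by the waker
 * wakee_weight  load the wakee would add if pulled to the waker's CPU
 * sync          true if the waker is expected to sleep soon
 * imbalance_pct tolerated imbalance, e.g. 125 means 25% slack (assumed value)
 *
 * Returns true if the wakee should be woken on the waker's CPU.
 */
bool toy_wake_affine(unsigned long this_load, unsigned long prev_load,
                     unsigned long waker_weight, unsigned long wakee_weight,
                     bool sync, unsigned long imbalance_pct)
{
    /* The waker runs on this CPU, so its weight is part of this_load. */
    assert(this_load >= waker_weight);

    /* Sync hint: the waker is about to sleep, so do not count it. */
    if (sync)
        this_load -= waker_weight;

    /* Nothing else is running here: an affine wakeup costs nothing. */
    if (this_load == 0)
        return true;

    /* Otherwise pull the wakee only if this CPU does not end up
     * noticeably busier than the wakee's previous CPU. */
    return 100 * (this_load + wakee_weight) <= imbalance_pct * prev_load;
}
```

In this model, a lone waker of weight 1024 whose partner last ran on an
idle CPU gets toy_wake_affine(1024, 0, 1024, 1024, false, 125) == false,
so the partner stays on the other CPU. The same call with sync set to
true returns true, which is the case the tcp_prequeue() patch targets.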

