Message-Id: <1236505323.6281.57.camel@marge.simson.net>
Date:	Sun, 08 Mar 2009 10:42:03 +0100
From:	Mike Galbraith <efault@....de>
To:	Balazs Scheidler <bazsi@...abit.hu>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: scheduler oddity [bug?]

On Sat, 2009-03-07 at 18:47 +0100, Balazs Scheidler wrote:
> Hi,
> 
> I'm experiencing an odd behaviour from the Linux scheduler. I have an
> application that feeds data to another process using a pipe. Both
> processes use a fair amount of CPU time apart from writing to/reading
> from this pipe.
> 
> The machine I'm running on is an Opteron Quad-Core CPU:
> model name	: Quad-Core AMD Opteron(tm) Processor 2347 HE
> stepping	: 3
> 
> What I see is that only one of the cores is used; the other three sit
> idle without doing any work. If I explicitly set the CPU affinity of
> the processes to distinct CPUs, performance goes up significantly
> (i.e. it starts to use the other cores and the load scales
> linearly).
> 
> I've tried to reproduce the problem by writing a small test program,
> which you can find attached. The program creates two processes, one
> feeds the other using a pipe and each does a series of memset() calls to
> simulate CPU load. I've also added an option for the program to set its
> own CPU affinity. The results (higher is better):
> 
> Without enabling CPU affinity:
> $ ./a.out
> Check: 0 loops/sec, sum: 1 
> Check: 12 loops/sec, sum: 13 
> Check: 41 loops/sec, sum: 54 
> Check: 41 loops/sec, sum: 95 
> Check: 41 loops/sec, sum: 136 
> Check: 41 loops/sec, sum: 177 
> Check: 41 loops/sec, sum: 218 
> Check: 40 loops/sec, sum: 258 
> Check: 41 loops/sec, sum: 299 
> Check: 41 loops/sec, sum: 340 
> Check: 41 loops/sec, sum: 381 
> Check: 41 loops/sec, sum: 422 
> Check: 41 loops/sec, sum: 463 
> Check: 41 loops/sec, sum: 504 
> Check: 41 loops/sec, sum: 545 
> Check: 40 loops/sec, sum: 585 
> Check: 41 loops/sec, sum: 626 
> Check: 41 loops/sec, sum: 667 
> Check: 41 loops/sec, sum: 708 
> Check: 41 loops/sec, sum: 749 
> Check: 41 loops/sec, sum: 790 
> Check: 41 loops/sec, sum: 831 
> Final: 39 loops/sec, sum: 831
> 
> 
> With CPU affinity:
> # ./a.out 1
> Check: 0 loops/sec, sum: 1 
> Check: 41 loops/sec, sum: 42 
> Check: 49 loops/sec, sum: 91 
> Check: 49 loops/sec, sum: 140 
> Check: 49 loops/sec, sum: 189 
> Check: 49 loops/sec, sum: 238 
> Check: 49 loops/sec, sum: 287 
> Check: 50 loops/sec, sum: 337 
> Check: 49 loops/sec, sum: 386 
> Check: 49 loops/sec, sum: 435 
> Check: 49 loops/sec, sum: 484 
> Check: 49 loops/sec, sum: 533 
> Check: 49 loops/sec, sum: 582 
> Check: 49 loops/sec, sum: 631 
> Check: 49 loops/sec, sum: 680 
> Check: 49 loops/sec, sum: 729 
> Check: 49 loops/sec, sum: 778 
> Check: 49 loops/sec, sum: 827 
> Check: 49 loops/sec, sum: 876 
> Check: 49 loops/sec, sum: 925 
> Check: 50 loops/sec, sum: 975 
> Check: 49 loops/sec, sum: 1024 
> Final: 48 loops/sec, sum: 1024
> 
> The difference is about 20%, which is about the same work performed by
> the slave process. If the two processes race for the same CPU this 20%
> of performance is lost.
> 
> I've tested this on 3 computers and each showed the same symptoms:
>  * quad core Opteron, running Ubuntu kernel 2.6.27-13.29
>  * Core 2 Duo, running Ubuntu kernel 2.6.27-11.27
>  * Dual Core Opteron, Debian backports.org kernel 2.6.26-13~bpo40+1
> 
> Is this a bug, or a feature?

Both.  Affine wakeups are cache friendly, and generally a feature, but
can lead to underutilized CPUs in some cases, thus turning feature into
bug as your testcase demonstrates.  The metric we use for the affinity
hint works well, but clearly wants some refinement.
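
For readers without the attachment, here is a minimal sketch of the explicit-affinity workaround described in the quoted message. It is not the original test program: it is written in Python rather than C, omits the memset() load and the loops/sec reporting, and the names pin_to_cpu, write_all and run_pair are made up for illustration. It assumes Linux, where os.sched_setaffinity() is available.

```python
import os


def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU (Linux only)."""
    os.sched_setaffinity(0, {cpu})


def write_all(fd, data):
    """Write the whole buffer, looping over any partial writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def run_pair(producer_cpu=None, consumer_cpu=None, chunks=1000, chunk_size=4096):
    """Fork a consumer fed through a pipe; return the byte count it reports.

    Passing CPU numbers pins each side explicitly; passing None leaves
    placement to the scheduler.
    """
    data_r, data_w = os.pipe()
    result_r, result_w = os.pipe()
    original_mask = os.sched_getaffinity(0)
    pid = os.fork()
    if pid == 0:
        # Child: the consumer. Always leave via os._exit().
        status = 1
        try:
            os.close(data_w)
            os.close(result_r)
            if consumer_cpu is not None:
                pin_to_cpu(consumer_cpu)
            total = 0
            while True:
                buf = os.read(data_r, chunk_size)
                if not buf:  # EOF: producer closed its end
                    break
                total += len(buf)
            write_all(result_w, str(total).encode())
            status = 0
        finally:
            os._exit(status)
    # Parent: the producer.
    os.close(data_r)
    os.close(result_w)
    try:
        if producer_cpu is not None:
            pin_to_cpu(producer_cpu)
        payload = b"\0" * chunk_size
        for _ in range(chunks):
            write_all(data_w, payload)
    finally:
        os.close(data_w)
        # Undo the pinning so the caller keeps its original mask.
        os.sched_setaffinity(0, original_mask)
    report = os.read(result_r, 64)
    os.close(result_r)
    os.waitpid(pid, 0)
    return int(report.decode())
```

Calling run_pair() with no CPU arguments leaves both processes to the wakeup heuristics discussed here, while run_pair(0, 1) forces them apart, provided CPUs 0 and 1 are in the allowed set.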

You can turn this scheduler hint off via:
	echo NO_SYNC_WAKEUPS > /sys/kernel/debug/sched_features
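
To check which way a hint is currently set, the same file can be read back: it lists feature names separated by whitespace, with disabled ones carrying a NO_ prefix. A minimal sketch of that check follows, assuming debugfs is mounted at /sys/kernel/debug and the file is readable (usually root only). The helper name feature_enabled and the sample string in the usage note are illustrative, not actual kernel output.

```python
def feature_enabled(features_text, name):
    """Return True or False for a feature named in sched_features text.

    Returns None if the feature is not listed at all.
    """
    for token in features_text.split():
        if token == name:
            return True
        if token == "NO_" + name:
            return False
    return None


def read_sched_features(path="/sys/kernel/debug/sched_features"):
    """Read the raw feature list; needs debugfs mounted and read access."""
    with open(path) as handle:
        return handle.read()
```

For example, feature_enabled("AFFINE_WAKEUPS NO_SYNC_WAKEUPS", "SYNC_WAKEUPS") evaluates to False, which is the state the echo above is meant to produce.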

	-Mike
