Date:	Tue, 16 Mar 2010 17:18:31 +0000
From:	David Howells <dhowells@...hat.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	dhowells@...hat.com, torvalds@...ux-foundation.org, mingo@...e.hu,
	peterz@...radead.org, awalls@...ix.net,
	linux-kernel@...r.kernel.org, jeff@...zik.org,
	akpm@...ux-foundation.org, jens.axboe@...cle.com,
	rusty@...tcorp.com.au, cl@...ux-foundation.org,
	arjan@...ux.intel.com, avi@...hat.com, johannes@...solutions.net,
	andi@...stfloor.org, oleg@...hat.com
Subject: Re: [PATCHSET] workqueue: concurrency managed workqueue, take#4

Tejun Heo <tj@...nel.org> wrote:

> Sure, there could be a bug in the non-reentrance implementation but
> I'm leaning more towards a bug in work flushing before freeing thing
> which also seems to show up in the debugfs path.  I'll try to
> reproduce the problem here and debug it.

I haven't managed to reproduce it since I reported it. :-/

> That said, the numbers look generally favorable to CMWQ although the
> sample size is too small to draw conclusions.  I'll try to get things
> fixed up so that testing can be smoother.

You have to take the numbers with a large pinch of salt, I think, in both
cases.  Pulling the data over an otherwise unladen GigE network from a server
that has it all in RAM is somewhat faster than reading it from disk.
Furthermore, since the test is massively parallel, with each thread reading
separate data, the result depends very much on the order in which the reads
happen to be issued this time compared to the order in which they were issued
when the cache was filled.
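
For reference, the shape of the read test is roughly the sketch below: a pile
of threads, each pulling its own file, so the order the reads are issued in
depends entirely on scheduling.  This is not the actual program; the path and
thread count are made up for illustration.

/*
 * Rough sketch of a massively parallel read test (hypothetical; build
 * with -lpthread).  Each thread reads a separate file, so the order in
 * which reads reach the cache backend varies from run to run.
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define NR_THREADS 64

static void *reader(void *arg)
{
	char path[64], buf[65536];
	ssize_t n;
	int fd;

	snprintf(path, sizeof(path), "/mnt/nfs/data/file-%ld", (long)arg);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;	/* discard the data; only the I/O pattern matters */
	close(fd);
	return NULL;
}

int main(void)
{
	pthread_t tid[NR_THREADS];
	long i;

	for (i = 0; i < NR_THREADS; i++)
		pthread_create(&tid[i], NULL, reader, (void *)i);
	for (i = 0; i < NR_THREADS; i++)
		pthread_join(tid[i], NULL);
	return 0;
}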

I need to fix my slow-test server that's dangling at the end of an ethernet
over mains connection.  That gives much more consistent results as the disk
speed is greater than the network connection speed.


Looking at the numbers, I think CMWQ may appear to give better results in the
cold-cache case by initially confining many of the cache accesses to a single
CPU, given that cache object creation and data storage are done asynchronously
in the background.  Object creation gets deferred until index creation is
complete (several lookups, mkdirs and setxattrs), and then all the deferred
work is dumped at once onto the CPU that handled the index creation, as we
discussed elsewhere.
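
To illustrate the pattern (a rough sketch only, not the actual
fscache/cachefiles code): object creation is packaged up as a work item and
queued from whichever CPU finished the index work, and schedule_work() queues
on the local CPU by default, so a burst of creations all lands on that one
CPU.

/*
 * Hypothetical sketch of the deferral pattern; the struct and
 * function names are made up for illustration.
 */
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

struct object_creation {
	struct work_struct work;
	/* ... details of the cache object to create ... */
};

static void create_object_work(struct work_struct *work)
{
	struct object_creation *oc =
		container_of(work, struct object_creation, work);

	/* do the lookups, mkdirs and setxattrs for this object here */
	kfree(oc);
}

/* Called on the CPU that has just completed index creation. */
static void defer_object_creation(void)
{
	struct object_creation *oc = kzalloc(sizeof(*oc), GFP_KERNEL);

	if (!oc)
		return;
	INIT_WORK(&oc->work, create_object_work);
	schedule_work(&oc->work);	/* queued on the local CPU */
}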

The program I'm using to read the data doesn't suffer any real penalty when
its threads can't actually run in parallel, so it probably doesn't mind being
largely confined to the other CPU.  But that's benchmarking for you...

You should probably also disregard the coldish-server numbers.  I'm not sure
my desktop machine (which was acting as the server) had been purged of the
dataset.  I'd need to reboot the server to be sure, but that's inconvenient
since it's also my desktop.

But, at a glance, the numbers don't appear to be too different.  There are
cases where CMWQ definitely appears better, and some where it definitely
appears worse, but the spread is so huge that it could just be noise.

David
