Date:	Tue, 16 Mar 2010 17:18:31 +0000
From:	David Howells <dhowells@...hat.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	dhowells@...hat.com, torvalds@...ux-foundation.org, mingo@...e.hu,
	peterz@...radead.org, awalls@...ix.net,
	linux-kernel@...r.kernel.org, jeff@...zik.org,
	akpm@...ux-foundation.org, jens.axboe@...cle.com,
	rusty@...tcorp.com.au, cl@...ux-foundation.org,
	arjan@...ux.intel.com, avi@...hat.com, johannes@...solutions.net,
	andi@...stfloor.org, oleg@...hat.com
Subject: Re: [PATCHSET] workqueue: concurrency managed workqueue, take#4

Tejun Heo <tj@...nel.org> wrote:

> Sure, there could be a bug in the non-reentrance implementation but
> I'm leaning more towards a bug in work flushing before freeing thing
> which also seems to show up in the debugfs path.  I'll try to
> reproduce the problem here and debug it.

I haven't managed to reproduce it since I reported it. :-/

> That said, the numbers look generally favorable to CMWQ although the
> sample size is too small to draw conclusions.  I'll try to get things
> fixed up so that testing can be smoother.

You have to take the numbers with a large pinch of salt, I think, in both
cases.  Pulling the data over an otherwise unladen GigE network from a server
that has it all in RAM is somewhat faster than reading it from disk.
Furthermore, since the test is massively parallel, with each thread reading
separate data, the result depends very much on the order in which the reads
happen to be issued this time compared to the order in which they were issued
when the cache was filled.
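
For reference, the shape of the read test is roughly the sketch below: a pile
of threads, each pulling its own file, so the order the reads are issued in
depends entirely on scheduling.  This is not the actual program; the path and
thread count are made up for illustration.

/*
 * Rough sketch of a massively parallel read test (hypothetical; build
 * with -lpthread).  Each thread reads a separate file, so the order in
 * which reads reach the cache backend varies from run to run.
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define NR_THREADS 64

static void *reader(void *arg)
{
	char path[64], buf[65536];
	ssize_t n;
	int fd;

	snprintf(path, sizeof(path), "/mnt/nfs/data/file-%ld", (long)arg);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;	/* discard the data; only the I/O pattern matters */
	close(fd);
	return NULL;
}

int main(void)
{
	pthread_t tid[NR_THREADS];
	long i;

	for (i = 0; i < NR_THREADS; i++)
		pthread_create(&tid[i], NULL, reader, (void *)i);
	for (i = 0; i < NR_THREADS; i++)
		pthread_join(tid[i], NULL);
	return 0;
}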

I need to fix my slow-test server that's dangling at the end of an ethernet
over mains connection.  That gives much more consistent results as the disk
speed is greater than the network connection speed.


Looking at the numbers, I think CMWQ may appear to give better results in the
cold-cache case by initially confining many of the cache accesses to a single
CPU, given that cache object creation and data storage are done asynchronously
in the background.  Object creation gets deferred until index creation is
complete (several lookups, mkdirs and setxattrs), and then all the deferred
work is dumped at once onto the CPU that handled the index creation, as we
discussed elsewhere.
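
To illustrate the pattern (a rough sketch only, not the actual
fscache/cachefiles code): object creation is packaged up as a work item and
queued from whichever CPU finished the index work, and schedule_work() queues
on the local CPU by default, so a burst of creations all lands on that one
CPU.

/*
 * Hypothetical sketch of the deferral pattern; the struct and
 * function names are made up for illustration.
 */
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

struct object_creation {
	struct work_struct work;
	/* ... details of the cache object to create ... */
};

static void create_object_work(struct work_struct *work)
{
	struct object_creation *oc =
		container_of(work, struct object_creation, work);

	/* do the lookups, mkdirs and setxattrs for this object here */
	kfree(oc);
}

/* Called on the CPU that has just completed index creation. */
static void defer_object_creation(void)
{
	struct object_creation *oc = kzalloc(sizeof(*oc), GFP_KERNEL);

	if (!oc)
		return;
	INIT_WORK(&oc->work, create_object_work);
	schedule_work(&oc->work);	/* queued on the local CPU */
}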

The program I'm using to read the data doesn't suffer any real penalty when
its threads can't actually run in parallel, so it probably doesn't mind being
largely confined to the other CPU.  But that's benchmarking for you...

You should probably also disregard the coldish-server numbers.  I'm not sure
my desktop machine (which was acting as the server) had been purged of the
dataset.  I'd need to reboot the server to be sure, but that's inconvenient
since it's also my desktop.

But, at a glance, the numbers don't appear to be too different.  There are
cases where CMWQ definitely appears better, and some where it definitely
appears worse, but the spread is so huge that it could just be noise.

David
