Message-ID: <47DA6C1E.8010000@hp.com>
Date: Fri, 14 Mar 2008 08:14:22 -0400
From: "Alan D. Brunelle" <Alan.Brunelle@...com>
To: linux-kernel@...r.kernel.org
Cc: Jens Axboe <jens.axboe@...cle.com>, npiggin@...e.de, dgc@....com
Subject: IO CPU affinity test results
Good morning Jens -
I had two machines running the latest patches hang last night:
o 2-way AMD64 - I inadvertently left the patched kernel running, and I was moving a ton of data (100+GB) back up over the net to this node. It hard hung (believe it or not) about 99% of the way through - wouldn't respond to anything.
o 4-way IA64 - I was performing a simple test: [mkfs / mount / untar linux sources / make allnoconfig / make -j 5 / umount], toggling rq_affinity between 0 and 1 on each pass (a rough sketch of the loop is below). After 22 passes it had a hard hang with rq_affinity set to 1.
Of course, there is no way of knowing if either hang had anything to do with the patches, but it seems a bit ominous as RQ=1 was set in both cases.
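For reference, each pass of the loop looks roughly like the sketch below. This is a simplification rather than the exact harness - the device, mount point and source tarball paths are placeholders, and it assumes the patched kernel exposes the rq_affinity knob under /sys/block/<dev>/queue/:

    import subprocess
    import time

    DEV  = "/dev/sdb1"                        # placeholder test device
    MNT  = "/mnt/test"                        # placeholder mount point
    SRC  = "/root/linux-2.6.tar"              # placeholder kernel source tarball
    RQ   = "/sys/block/sdb/queue/rq_affinity" # affinity knob from the patches
    JOBS = 5                                  # -j 5 on the 4-way, -j 3 on the 2-way

    def sh(cmd):
        subprocess.check_call(cmd, shell=True)

    def one_pass(rq):
        # flip rq_affinity for this pass, then run the whole sequence
        with open(RQ, "w") as f:
            f.write(str(rq))
        start = time.time()
        sh("mkfs -t ext2 %s && sync && sync" % DEV)
        sh("mount %s %s" % (DEV, MNT))
        sh("tar xf %s -C %s" % (SRC, MNT))
        sh("cd %s/linux-2.6 && make allnoconfig && make -j %d" % (MNT, JOBS))
        sh("umount %s && sync && sync" % MNT)
        return time.time() - start

    for n in range(30):
        elapsed = one_pass(n % 2)             # alternate rq_affinity 0/1
        print("pass %2d rq=%d %8.3f secs" % (n, n % 2, elapsed))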
This same test worked fine for 30 passes on a 2-way AMD64 box, with the following results:
Part   RQ      MIN      AVG      MAX     Dev
-----  --  -------  -------  -------  ------
mkfs    0   41.656   41.862   42.086   0.141
mkfs    1   41.618   41.909   42.270   0.192
untar   0   18.055   19.611   20.906   0.720
untar   1   18.523   19.905   21.988   0.738
make    0   50.480   50.991   51.752   0.340
make    1   49.819   50.442   51.000   0.292
comb    0  110.433  112.464  114.176   0.932
comb    1  110.694  112.256  114.683   0.948
psys    0   10.28%   10.91%   11.29%   0.243
psys    1   10.21%   11.05%   11.80%   0.350
All results are in seconds (as measured by Python's time.time()), except for psys, which is the average of mpstat's %sys column over the life of the whole run. The mkfs part consisted of [mkfs -t ext2; sync; sync], the untar part of [mount; untar linux sources; umount; sync; sync], and the make part of [mount; make allnoconfig; make -j 3; umount; sync; sync]; comb is the combined time of the mkfs, untar and make parts.
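The per-part columns are reduced from the per-pass times roughly like this (again a simplified sketch - Dev is taken here as the standard deviation over the passes, and the sample values below are made up just to show usage):

    import math

    def summarize(samples):
        # reduce one part's per-pass times to the MIN / AVG / MAX / Dev columns
        avg = sum(samples) / float(len(samples))
        var = sum((s - avg) ** 2 for s in samples) / (len(samples) - 1)
        return min(samples), avg, max(samples), math.sqrt(var)

    # made-up per-pass times for one part / rq setting:
    times = [18.4, 19.1, 19.8, 20.2, 19.5]
    print("MIN %.3f  AVG %.3f  MAX %.3f  Dev %.3f" % summarize(times))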
So, in a nutshell: with rq_affinity set to 1 we saw slightly better overall performance, though not conclusively so, along with slightly elevated %system time to accomplish the same work.
On the 4-way, results were much worse: the last data captured before the system hung showed the rq=1 passes taking significantly longer, albeit at lower %system. I'm going to try the runs again, but I have a feeling that the latest "clean" patch based upon Nick's single-call mechanism is a step backwards.
Alan