Message-ID: <20070128184116.GA12150@elte.hu>
Date: Sun, 28 Jan 2007 19:41:16 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Christoph Hellwig <hch@...radead.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Linus Torvalds <torvalds@...l.org>,
Andrew Morton <akpm@...l.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/7] breaking the global file_list_lock

* Christoph Hellwig <hch@...radead.org> wrote:
> On Sun, Jan 28, 2007 at 12:51:18PM +0100, Peter Zijlstra wrote:
> > This patch-set breaks up the global file_list_lock which was found to be a
> > severe contention point under basically any filesystem intensive workload.
>
> Benchmarks, please. Where exactly do you see contention for this?

the following very simple workload:

	http://redhat.com/~mingo/file-ops-test/file-ops-test.c

starts one process per CPU, each of which open()s and close()s a file
over and over again, simulating an open/close-intense workload. This
pattern is quite typical of several key Linux applications.
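
for reference, here is a minimal sketch of the pattern - the real test
is file-ops-test.c at the URL above, this simplified version only
illustrates the idea:

/*
 * Simplified illustration of the file-ops-test workload: fork one
 * process per CPU, each of which open()s and close()s the same file
 * in a tight loop, and report the per-iteration cost in usecs.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>

#define LOOPS		1000000
#define TESTFILE	"/tmp/file-ops-test.tmp"

static void worker(void)
{
	struct timeval start, end;
	long usecs;
	int i, fd;

	gettimeofday(&start, NULL);
	for (i = 0; i < LOOPS; i++) {
		fd = open(TESTFILE, O_RDONLY);
		if (fd < 0)
			exit(1);
		close(fd);
	}
	gettimeofday(&end, NULL);

	usecs = (end.tv_sec - start.tv_sec) * 1000000L +
		(end.tv_usec - start.tv_usec);
	printf("%.2f usecs/op\n", (double)usecs / LOOPS);
}

int main(void)
{
	long i, ncpus = sysconf(_SC_NPROCESSORS_ONLN);

	/* make sure the test file exists: */
	close(open(TESTFILE, O_CREAT | O_RDWR, 0644));

	for (i = 0; i < ncpus; i++) {
		if (!fork()) {
			worker();
			return 0;
		}
	}
	while (wait(NULL) > 0)
		;
	return 0;
}
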
Using Peter's s_files patchset the following scalability improvement can
be measured (lower numbers are better):

 ------------------------------------------------------------------------
               v2.6.20-rc6      |   v2.6.20-rc6 + Peter's s_files queue
 ------------------------------------------------------------------------
 dual-core:    2.11 usecs/op    |   1.51 usecs/op    (  +39.7% win )
 8-socket:     6.30 usecs/op    |   2.70 usecs/op    ( +133.3% win )
 ------------------------------------------------------------------------

[ i'd expect a 64-way box to improve its open/close scalability
  dramatically, by a factor of 10x or 20x. On a 1024-CPU (or 4096-CPU)
  system the effects are very likely even more dramatic. ]

Why does this patch make such a difference? Not because the critical
section is in any way contended - it isn't, we only do a simple list
operation there. But this lock is touched in every sys_open() and
sys_close() system call, so its cacheline is accessed at very high
frequency. The cost is the global cacheline ping-pong of files_lock.
Furthermore, in this very important VFS codepath this is the 'last'
global cacheline that got eliminated, hence all the scalability
benefits (of previous improvements) get reaped all at once.
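
if anyone wants to see this effect in isolation, here's a hypothetical
user-space demo (not from the patchset): each thread takes a lock
around an empty critical section, either one shared lock or a
per-thread private one. The work done is identical in both runs - the
only difference is whose cacheline the lock sits in:

/*
 * Hypothetical demo of global-lock cacheline ping-pong: with
 * per-thread private locks the loop runs at cache speed, with one
 * shared lock the same loop is dominated by cross-CPU cacheline
 * transfers.
 *
 * Build:  gcc -O2 -o pingpong pingpong.c -lpthread
 * Run:    ./pingpong 0   (private locks)
 *         ./pingpong 1   (shared lock)
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

#define NTHREADS	8
#define LOOPS		1000000

static pthread_spinlock_t shared_lock;
static int use_shared;

static void *worker(void *arg)
{
	pthread_spinlock_t private_lock, *lock;
	int i;

	pthread_spin_init(&private_lock, PTHREAD_PROCESS_PRIVATE);
	lock = use_shared ? &shared_lock : &private_lock;

	for (i = 0; i < LOOPS; i++) {
		pthread_spin_lock(lock);
		/* empty critical section - any extra cost is purely
		 * the movement of the lock's cacheline between CPUs: */
		pthread_spin_unlock(lock);
	}
	return NULL;
}

int main(int argc, char **argv)
{
	pthread_t threads[NTHREADS];
	struct timeval start, end;
	long usecs;
	int i;

	use_shared = argc > 1 && atoi(argv[1]);
	pthread_spin_init(&shared_lock, PTHREAD_PROCESS_PRIVATE);

	gettimeofday(&start, NULL);
	for (i = 0; i < NTHREADS; i++)
		pthread_create(&threads[i], NULL, worker, NULL);
	for (i = 0; i < NTHREADS; i++)
		pthread_join(threads[i], NULL);
	gettimeofday(&end, NULL);

	usecs = (end.tv_sec - start.tv_sec) * 1000000L +
		(end.tv_usec - start.tv_usec);
	printf("%s lock: %.2f usecs/op\n",
	       use_shared ? "shared" : "private",
	       (double)usecs / LOOPS);
	return 0;
}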

Now could you please tell me why i had to waste 3.5 hours on measuring
and profiling this /again/, when a tiny little bit of goodwill from
your side could have avoided it? I told you that we lock-profiled this
under -rt, and that it gives an accurate measurement of such things -
as the numbers above prove, too. Would it have been so hard to say
something like: "Cool, Peter! That lock has been in the way of good
open()/close() scalability for a long time, and it's an obviously good
idea to eliminate it. Now here are a couple of suggestions for how to
do it even more simply: [...]." Why did you have to, in essence, piss
on his patchset? Any rational explanation?

Ingo