Message-ID: <20100328200411.GC5116@nowhere>
Date: Sun, 28 Mar 2010 22:04:14 +0200
From: Frederic Weisbecker <fweisbec@...il.com>
To: Arnd Bergmann <arnd@...db.de>
Cc: linux-kernel@...r.kernel.org, Matthew Wilcox <matthew@....cx>,
Thomas Gleixner <tglx@...utronix.de>, jblunck@...e.de,
Alan Cox <alan@...ux.intel.com>, Ingo Molnar <mingo@...e.hu>
Subject: Re: [GIT, RFC] Killing the Big Kernel Lock
On Wed, Mar 24, 2010 at 10:40:54PM +0100, Arnd Bergmann wrote:
> I've spent some time continuing the work of the people on Cc and many others
> to remove the big kernel lock from Linux and I now have a bkl-removal branch
> in my git tree at git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git
> that lets me run a kernel on my quad-core machine with the only users of the BKL
> being mostly obscure device driver modules.
>
> The oldest patch in this series is roughly eight years old and is Willy's patch
> to remove the BKL from fs/locks.c, and I took a series of patches from Jan that
> removes it from most of the VFS.
>
> The other non-obvious changes are:
>
> - all file operations that either have an .ioctl method or do not have their
> own .llseek method used to implicitly require the BKL. I've changed that
> so they need to explicitly set .llseek = default_llseek, .unlocked_ioctl =
> default_ioctl, and changed all the code that either has supplied a .ioctl
> method or looks like it needs the BKL somewhere else, meaning the
> default_llseek function might actually do something.
>
> - The block layer now has a global blkdev_mutex that is used in all block
> drivers in place of the BKL. The only recursive instance of the BKL was
> __blkdev_get(), which is now called with the blkdev_mutex held instead of
> grabbing the BKL. This has some possible performance implications that
> need to be looked into.
>
> - The init/main.c code no longer takes the BKL. I figured that this was
> completely unnecessary because there is no other code running at the
> same time that takes the BKL.
>
> - The most invasive change is in the TTY layer, which has a new global
> mutex (sorry!). I know that Alan has plans of his own to remove the BKL
> from this subsystem, so my patches may not go anywhere, but they seem
> to work fine for me.
> I've called the new lock the 'Big TTY Mutex' (BTM), a name that probably
> makes more sense if you happen to speak German.
> The basic idea here is to make recursive locking and the release-on-sleep
> explicit, so every mutex_lock, wait_event, workqueue_flush and schedule
> in the TTY layer now explicitly releases the BTM before blocking.
>
> - All drivers that still require the BKL are now listed as 'depends on BKL'
> in Kconfig, and you can set that symbol to 'y', 'm' or 'n'. If the lock
> itself is a module, only other modules can use it, and /proc/modules
> will tell you exactly which ones those are. I've thought about adding
> a module_init function in that module that will taint the kernel, but so
> far I haven't done that.
>
> - Included is a debugfs file that gives statistics over the BKL usage from
> early boot on. This is now obsolete and will not get merged, but I'm
> including it for reference.
>
> Frederic has volunteered to help merging all of this upstream, which I
> very much welcome. The shape that the tree is in now is very inconsistent,
> especially some of the bits at the end are a bit dodgy and all of it needs
> more testing.
>
> I've built-tested an allmodconfig kernel with CONFIG_BKL disabled
> on x86_64, i386, powerpc64, powerpc32, s390 and arm to make sure I
> catch all the modules that depend on BKL, and I've been running
> various versions of this tree on my desktop machine over the last few
> weeks while adding stuff.
>
> Arnd
>
> ---
>
> Arnd Bergmann (44):
> input: kill BKL, fix input_open_file locking
> ptrace: kill BKL
> procfs: kill BKL in llseek
> random: forbid llseek on random chardev
> x86/microcode: use nonseekable_open
> perf_event: use nonseekable_open
I just queued the perf_event one. It looks pretty good. I'm also
looking at some of the most trivial ones (ehm.. the less hard ones) in
the list to see which we can submit right away.
Thanks.
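For anyone following the nonseekable_open patches in the list above, here
is a rough userspace sketch (not kernel code) of the idea behind the
.llseek change: a seek only reaches the lock-taking default when a driver
asks for it explicitly, and a driver that opts out never touches the lock.
Every name ending in _model, the flag value and the lock counter are made
up for illustration. Returning -ESPIPE when .llseek is missing is an
assumption of this sketch, not something stated in the series.

```c
/* Userspace model only: the types and *_model names are hypothetical
 * stand-ins, not the kernel's struct file or its VFS helpers. */
#include <errno.h>
#include <stddef.h>

#define SEEK_SET_MODEL    0
#define SEEK_CUR_MODEL    1
#define FMODE_LSEEK_MODEL 0x4   /* assumed flag value, illustration only */

struct file_model;

struct file_ops_model {
    long long (*llseek)(struct file_model *f, long long off, int whence);
};

struct file_model {
    const struct file_ops_model *ops;
    unsigned int mode;
    long long pos;
};

/* Stand-in for the BKL: counts how often the big lock would be taken. */
static int big_lock_acquisitions;

/* Models default_llseek: the position update is the part that would run
 * with the big lock held. */
static long long default_llseek_model(struct file_model *f, long long off,
                                      int whence)
{
    long long newpos;

    big_lock_acquisitions++;
    newpos = (whence == SEEK_CUR_MODEL) ? f->pos + off : off;
    if (newpos < 0)
        return -EINVAL;
    f->pos = newpos;
    return newpos;
}

/* Models nonseekable_open: mark the file as not supporting seeks. */
static int nonseekable_open_model(struct file_model *f)
{
    f->mode &= ~FMODE_LSEEK_MODEL;
    return 0;
}

/* Models the dispatch: only an explicitly set .llseek is ever called.
 * Assumption: with no .llseek, or with seeking disabled, fail with
 * -ESPIPE instead of silently falling back to the default. */
static long long vfs_llseek_model(struct file_model *f, long long off,
                                  int whence)
{
    if (!(f->mode & FMODE_LSEEK_MODEL) || !f->ops || !f->ops->llseek)
        return -ESPIPE;
    return f->ops->llseek(f, off, whence);
}
```

In this model a driver that sets .llseek = default_llseek_model pays for
the lock on every seek, while one that calls nonseekable_open_model() from
its open handler gets -ESPIPE back without the counter ever moving.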