Message-ID: <520f0cf11003290545k752e9a32v7c6ae15e515ecb51@mail.gmail.com>
Date: Mon, 29 Mar 2010 14:45:34 +0200
From: John Kacur <jkacur@...il.com>
To: Arnd Bergmann <arnd@...db.de>
Cc: Frederic Weisbecker <fweisbec@...il.com>,
linux-kernel@...r.kernel.org, Matthew Wilcox <matthew@....cx>,
Thomas Gleixner <tglx@...utronix.de>, jblunck@...e.de,
Alan Cox <alan@...ux.intel.com>, Ingo Molnar <mingo@...e.hu>
Subject: Re: [GIT, RFC] Killing the Big Kernel Lock
On Wed, Mar 24, 2010 at 11:40 PM, Arnd Bergmann <arnd@...db.de> wrote:
> I've spent some time continuing the work of the people on Cc and many others
> to remove the big kernel lock from Linux, and I now have a bkl-removal branch
> in my git tree at git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git
> that lets me run a kernel on my quad-core machine with the only users of the BKL
> being mostly obscure device driver modules.
>
> The oldest patch in this series is roughly eight years old and is Willy's patch
> to remove the BKL from fs/locks.c, and I took a series of patches from Jan that
> removes it from most of the VFS.
>
> The other non-obvious changes are:
>
> - all file operations that either have an .ioctl method or do not have their
> own .llseek method used to implicitly require the BKL. I've changed that
> so they need to explicitly set .llseek = default_llseek, .unlocked_ioctl =
> default_ioctl, and changed all the code that either has supplied a .ioctl
> method or looks like it needs the BKL somewhere else, meaning the
> default_llseek function might actually do something.
>
> - The block layer now has a global blkdev_mutex that is used in all block
> drivers in place of the BKL. The only recursive instance of the BKL was
> __blkdev_get(), which is now called with the blkdev_mutex held instead of
> grabbing the BKL. This has some possible performance implications that
> need to be looked into.
>
> - The init/main.c code no longer takes the BKL. I figured that this was
> completely unnecessary because there is no other code running at the
> same time that takes the BKL.
>
> - The most invasive change is in the TTY layer, which has a new global
> mutex (sorry!). I know that Alan has plans of his own to remove the BKL
> from this subsystem, so my patches may not go anywhere, but they seem
> to work fine for me.
> I've called the new lock the 'Big TTY Mutex' (BTM), a name that probably
> makes more sense if you happen to speak German.
> The basic idea here is to make recursive locking and the release-on-sleep
> explicit, so every mutex_lock, wait_event, workqueue_flush and schedule
> in the TTY layer now explicitly releases the BTM before blocking.
>
> - All drivers that still require the BKL are now listed as 'depends on BKL'
> in Kconfig, and you can set that symbol to 'y', 'm' or 'n'. If the lock
> itself is a module, only other modules can use it, and /proc/modules
> will tell you exactly which ones those are. I've thought about adding
> a module_init function in that module that will taint the kernel, but so
> far I haven't done that.
>
> - Included is a debugfs file that gives statistics on BKL usage from
> early boot on. This is now obsolete and will not get merged, but I'm
> including it for reference.
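
To make the first item in the list above concrete, here is a minimal userspace sketch in C of an operations table where the locked seek helper has to be named explicitly. It is not code from the bkl-removal tree. The struct and function names (file_model, file_ops_model, default_llseek_model, seek_unlocked, do_llseek) are hypothetical, the BKL is modeled only as a counter, and returning -ESPIPE for a missing .llseek is an assumption made for illustration, not something the message states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of an open file; bkl_acquisitions counts how often
 * the (modeled) big lock would have been taken on its behalf. */
struct file_model {
    long pos;
    long size;
    int bkl_acquisitions;
};

typedef long (*llseek_fn)(struct file_model *f, long offset, int whence);

/* Hypothetical model of an operations table with a single method. */
struct file_ops_model {
    llseek_fn llseek;
};

enum { MODEL_SEEK_SET = 0, MODEL_SEEK_CUR = 1, MODEL_SEEK_END = 2 };

/* Seek helper for code that does its own locking; never touches the big lock. */
static long seek_unlocked(struct file_model *f, long offset, int whence)
{
    long base, target;

    if (whence < MODEL_SEEK_SET || whence > MODEL_SEEK_END)
        return -EINVAL;
    base = (whence == MODEL_SEEK_SET) ? 0 :
           (whence == MODEL_SEEK_CUR) ? f->pos : f->size;
    target = base + offset;
    if (target < 0)
        return -EINVAL;
    f->pos = target;
    return target;
}

/* Stand-in for a default_llseek that still takes the big lock. */
static long default_llseek_model(struct file_model *f, long offset, int whence)
{
    f->bkl_acquisitions++;  /* models lock_kernel() ... unlock_kernel() */
    return seek_unlocked(f, offset, whence);
}

/* Dispatcher: the locked helper runs only when the table names it. */
static long do_llseek(const struct file_ops_model *ops, struct file_model *f,
                      long offset, int whence)
{
    assert(ops != NULL && f != NULL);
    if (ops->llseek == NULL)
        return -ESPIPE;  /* ASSUMPTION: treat a missing method as "not seekable" */
    return ops->llseek(f, offset, whence);
}
```

With this shape, a table that sets .llseek to default_llseek_model documents its dependence on the big lock in the source, and a table that sets seek_unlocked never touches it, so the remaining users can be found by searching for the helper's name.
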
>
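
The TTY item in the list above describes making release-on-sleep explicit. A minimal userspace sketch of that pattern in C follows, under loud assumptions: it is not code from the bkl-removal tree, the names big_lock, big_unlock, big_lock_is_held and call_blocking_unlocked are hypothetical, and a toy spinning lock built on C11 atomic_flag stands in for the kernel mutex so that the example needs no threading library.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the Big TTY Mutex: unlike the BKL it is not
 * recursive and is never dropped implicitly when the holder blocks. */
static atomic_flag big_lock_flag = ATOMIC_FLAG_INIT;

static bool big_trylock(void)
{
    /* test_and_set returns the previous value; false means we got the lock. */
    return !atomic_flag_test_and_set(&big_lock_flag);
}

static void big_lock(void)
{
    while (!big_trylock())
        ;  /* toy model: spin here, where a real mutex would sleep */
}

static void big_unlock(void)
{
    atomic_flag_clear(&big_lock_flag);
}

/* Probe used by the sketch only: reports whether anyone holds the lock. */
static bool big_lock_is_held(void)
{
    if (big_trylock()) {
        big_unlock();
        return false;
    }
    return true;
}

/* Run a potentially blocking operation with the big lock dropped, then
 * retake it before returning. The caller must hold the lock on entry. */
static int call_blocking_unlocked(int (*fn)(void *), void *arg)
{
    int ret;

    assert(big_lock_is_held());
    big_unlock();   /* explicit release replaces the BKL's implicit one */
    ret = fn(arg);  /* may block: a wait, an inner mutex, a flush, ... */
    big_lock();     /* reacquire; state protected by the lock may have changed */
    return ret;
}

/* Example blocking operation: records whether the big lock was held
 * while it ran, and returns an arbitrary value. */
static int probe_blocking_op(void *arg)
{
    *(int *)arg = big_lock_is_held() ? 1 : 0;
    return 7;
}
```

The cost of the explicit form is visible in the last comment of call_blocking_unlocked: anything guarded by the lock has to be revalidated after the call, which the implicit BKL behaviour used to hide.
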
> Frederic has volunteered to help merge all of this upstream, which I
> very much welcome. The tree is currently in a very inconsistent state:
> some of the bits at the end in particular are a bit dodgy, and all of it
> needs more testing.
>
> I've built-tested an allmodconfig kernel with CONFIG_BKL disabled
> on x86_64, i386, powerpc64, powerpc32, s390 and arm to make sure I
> catch all the modules that depend on BKL, and I've been running
> various versions of this tree on my desktop machine over the last few
> weeks while adding stuff.
>
> Arnd
>
> ---
>
> Arnd Bergmann (44):
> input: kill BKL, fix input_open_file locking
> ptrace: kill BKL
> procfs: kill BKL in llseek
> random: forbid llseek on random chardev
> x86/microcode: use nonseekable_open
> perf_event: use nonseekable_open
> dm: use nonseekable_open
> vgaarb: use nonseekable_open
> kvm: don't require BKL
> nvram: kill BKL
> do_coredump: do not take BKL
> hpet: kill BKL, add compat_ioctl
> proc/pci: kill BKL
> autofs/autofs4: move compat_ioctl handling into fs
> usb/mon: kill BKL usage
> fat: push down BKL
> sunrpc: push down BKL
> pcmcia: push down BKL
> vfs: kill BKL in default_llseek
> BKL: introduce CONFIG_BKL.
> bkl-removal: make fops->ioctl and default_llseek optional
> x86: update defconfig to CONFIG_BKL=m
> bkl removal: make unlocked_ioctl mandatory
> bkl removal: use default_llseek in code that uses the BKL
> BKL removal: mark remaining users as 'depends on BKL'
> tty: replace BKL with a new tty_lock
> tty: make atomic_write_lock release tty_lock
> tty: make tty_port->mutex nest under tty_lock
> tty: make termios mutex nest under tty_lock
> tty: make ldisc_mutex nest under tty_lock
> tty: never hold tty_lock() while getting tty_mutex
> ppp: use big tty mutex
> tty: release tty lock when blocking
> tty: implement BTM as mutex instead of BKL
> briq_panel: do not use BTM
> affs: remove leftover unlock_kernel
> kvm: don't require BKL
> block: replace BKL with global mutex
> init: kill BKL usage
> debug: instrument big kernel lock
> BKL removal: make the BKL modular
>
> Matthew Wilcox (1):
> [RFC] Remove BKL from fs/locks.c
>
> Jan Blunck (19):
> JFS: Free sbi memory in error path
> BKL: Explicitly add BKL around get_sb/fill_super
> BKL: Remove outdated comment and include
> BKL: Remove BKL from Amiga FFS
> BKL: Remove BKL from BFS
> BKL: Remove BKL from CifsFS
> BKL: Remove BKL from ext3 fill_super()
> BKL: Remove BKL from ext3_put_super() and ext3_remount()
> BKL: Remove BKL from ext4 filesystem
> BKL: Remove smp_lock.h from exofs
> BKL: Remove BKL from HFS
> BKL: Remove BKL from HFS+
> BKL: Remove BKL from JFS
> BKL: Remove BKL from NILFS2
> BKL: Remove BKL from NTFS
> BKL: Remove BKL from cgroup
> BKL: Remove BKL from do_new_mount()
> ext2: Add ext2_sb_info s_lock spinlock
> BKL: Remove BKL from ext2 filesystem
> --
Great, Arnd, I like this.
I also have a private but stale tree where I have collected some
BKL-removal patches, which I will review against your tree.
I think that it is important that we keep chipping away at it though,
and that we all keep sending stuff upstream when it is ready.
Thanks
John