lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120922200629.GC14004@kroah.com>
Date:	Sat, 22 Sep 2012 13:06:29 -0700
From:	Greg KH <gregkh@...uxfoundation.org>
To:	Daniel Vetter <daniel.vetter@...ll.ch>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Intel Graphics Development <intel-gfx@...ts.freedesktop.org>,
	DRI Development <dri-devel@...ts.freedesktop.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Alan Cox <alan@...rguk.ukuu.org.uk>
Subject: Re: [Intel-gfx] [PATCH] console: implement lockdep support for
 console_lock

On Sat, Sep 22, 2012 at 07:52:11PM +0200, Daniel Vetter wrote:
> Dave Airlie recently discovered a locking bug in the fbcon layer,
> where a timer_del_sync (for the blinking cursor) deadlocks with the
> timer itself, since both (want to) hold the console_lock:
> 
> https://lkml.org/lkml/2012/8/21/36
> 
> Unfortunately the console_lock isn't a plain mutex and hence has no
> lockdep support. Which resulted in a few days wasted of tracking down
> this bug (complicated by the fact that printk doesn't show anything
> when the console is locked) instead of noticing the bug much earlier
> with the lockdep splat.
> 
> Hence I've figured I need to fix that for the next deadlock involving
> console_lock - and with kms/drm growing ever more complex locking
> that'll eventually happen.
> 
> Now the console_lock has rather funky semantics, so after a quick irc
> discussion with Thomas Gleixner and Dave Airlie I've quickly ditched
> the original idead of switching to a real mutex (since it won't work)
> and instead opted to annotate the console_lock with lockdep
> information manually.
> 
> There are a few special cases:
> - The console_lock state is protected by the console_sem, and usually
>   grabbed/dropped at _lock/_unlock time. But the suspend/resume code
>   drops the semaphore without dropping the console_lock (see
>   suspend_console/resume_console). But since the same thread that did
>   the suspend will do the resume, we don't need to fix up anything.
> 
> - In the printk code there's a special trylock, only used to kick off
>   the logbuffer printk'ing in console_unlock. But all that happens
>   while lockdep is disable (since printk does a few other evil
>   tricks). So no issue there, either.
> 
> - The console_lock can also be acquired form irq context (but only
>   with a trylock). lockdep already handles that.
> 
> This all leaves us with annotating the normal console_lock, _unlock
> and _trylock functions.
> 
> And yes, it works - simply unloading a drm kms driver resulted in
> lockdep complaining about the deadlock in fbcon_deinit:
> 
> ======================================================
> [ INFO: possible circular locking dependency detected ]
> 3.6.0-rc2+ #552 Not tainted
> -------------------------------------------------------
> kms-reload/3577 is trying to acquire lock:
>  ((&info->queue)){+.+...}, at: [<ffffffff81058c70>] wait_on_work+0x0/0xa7
> 
> but task is already holding lock:
>  (console_lock){+.+.+.}, at: [<ffffffff81264686>] bind_con_driver+0x38/0x263
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (console_lock){+.+.+.}:
>        [<ffffffff81087440>] lock_acquire+0x95/0x105
>        [<ffffffff81040190>] console_lock+0x59/0x5b
>        [<ffffffff81209cb6>] fb_flashcursor+0x2e/0x12c
>        [<ffffffff81057c3e>] process_one_work+0x1d9/0x3b4
>        [<ffffffff810584a2>] worker_thread+0x1a7/0x24b
>        [<ffffffff8105ca29>] kthread+0x7f/0x87
>        [<ffffffff813b1204>] kernel_thread_helper+0x4/0x10
> 
> -> #0 ((&info->queue)){+.+...}:
>        [<ffffffff81086cb3>] __lock_acquire+0x999/0xcf6
>        [<ffffffff81087440>] lock_acquire+0x95/0x105
>        [<ffffffff81058cab>] wait_on_work+0x3b/0xa7
>        [<ffffffff81058dd6>] __cancel_work_timer+0xbf/0x102
>        [<ffffffff81058e33>] cancel_work_sync+0xb/0xd
>        [<ffffffff8120a3b3>] fbcon_deinit+0x11c/0x1dc
>        [<ffffffff81264793>] bind_con_driver+0x145/0x263
>        [<ffffffff81264a45>] unbind_con_driver+0x14f/0x195
>        [<ffffffff8126540c>] store_bind+0x1ad/0x1c1
>        [<ffffffff8127cbb7>] dev_attr_store+0x13/0x1f
>        [<ffffffff8116d884>] sysfs_write_file+0xe9/0x121
>        [<ffffffff811145b2>] vfs_write+0x9b/0xfd
>        [<ffffffff811147b7>] sys_write+0x3e/0x6b
>        [<ffffffff813b0039>] system_call_fastpath+0x16/0x1b
> 
> other info that might help us debug this:
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(console_lock);
>                                lock((&info->queue));
>                                lock(console_lock);
>   lock((&info->queue));
> 
>  *** DEADLOCK ***
> 
> v2: Mark the lockdep_map static, noticed by Jani Nikula.
> 
> Cc: Dave Airlie <airlied@...il.com>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: Alan Cox <alan@...rguk.ukuu.org.uk>
> Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> Signed-off-by: Daniel Vetter <daniel.vetter@...ll.ch>
> ---
>  kernel/printk.c |    9 +++++++++
>  1 file changed, 9 insertions(+)

So I'm guessing I should take this through the tty tree, right?  Any
objections to that for 3.7?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ