lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 28 Jan 2016 15:08:19 -0800
From:	Peter Hurley <peter@...leysoftware.com>
To:	Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Cc:	Byungchul Park <byungchul.park@....com>, akpm@...ux-foundation.org,
	mingo@...nel.org, linux-kernel@...r.kernel.org,
	akinobu.mita@...il.com, jack@...e.cz,
	torvalds@...ux-foundation.org,
	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Subject: Re: [PATCH v4] lib/spinlock_debug.c: prevent a recursive cycle in the
 debug code

On 01/28/2016 07:42 AM, Sergey Senozhatsky wrote:
> On (01/28/16 19:53), Sergey Senozhatsky wrote:
>>> ah... silly me... you mean the first CPU that triggers the spin_dump() will
>> 					^^^ this, of course, is true for
>> 					console_sem->lock and logbuf_lock
>> 					only.
>>
>>> deadlock itself, so the rest of CPUs will see endless recursive
>>> spin_lock()->spin_dump()->spin_lock()->spin_dump() calls?
> [..]
>>> Can you please update your bug description in the commit message?
>>> It's the deadlock that is causing the recursion on other CPUs in the
>>> first place.
> 
> no, don't update anything. I was completely wrong. it's not a deadlock
> that is the root cause here.
> 
> even if at some level of recursion (nested printk calls)
> spin_dump()->__spin_lock_debug()->arch_spin_trylock() acquires the
> lock, it returns back with the spin lock unlocked anyway.
> 
> vprintk_emit()
>  console_trylock()
>   spin_lock()
>    spin_dump()
>     vprintk_emit()
>      console_trylock()
>       spin_lock()
>        spin_dump()
>         vprintk_emit()
>          console_trylock()
>           spin_lock()     << OK, got the lock finally

The problem is you have postulated a very shallow recursion.
This looks much worse if this happens 1000 times, and
probably won't recover to output anything.

Additionally, what if the console_sem is simply corrupted?
A livelock with no output ever is not very helpful.

As I wrote earlier, I don't think this is the way to fix
recursion problems with printk() [by eliding output].

Rather, a way to effectively determine a recursion is in progress,
and _at a minimum_ guaranteeing that the recursive output will
eventually be output should be the goal.

Including dumb recursion like a console driver printing
an error :/

Then, lockdep could remain enabled while calling console drivers.

Regards,
Peter Hurley

>            sem->count--
>           spin_unlock()   << unlock, return
>        arch_spin_lock()   << got the lock, return
>       sem->count--
>       spin_unlock() << unlock, return
>    arch_spin_lock() << got the lock, return
>   sem->count--
>   spin_unlock() << unlock, return
> 
> 
> ...um
> 
> 
>> But I found there's a possiblity in the debug code *itself* to cause a
>> lockup.
> 
> please explain.
> 
> 	-ss
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ