Date:   Fri, 15 Dec 2017 11:10:24 +0900
From:   Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     Tejun Heo <tj@...nel.org>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Petr Mladek <pmladek@...e.com>, Jan Kara <jack@...e.cz>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Rafael Wysocki <rjw@...ysocki.net>,
        Pavel Machek <pavel@....cz>,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        linux-kernel@...r.kernel.org,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Subject: Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread

Hello,

On (12/14/17 10:11), Tejun Heo wrote:
> Hey, Steven.
> 
> On Thu, Dec 14, 2017 at 12:55:06PM -0500, Steven Rostedt wrote:
> > Yes! Please create a reproducer, because I still don't believe there is
> > one. And it's all hand waving until there's an actual report that we can
> > lock up the system with my approach.
> 
> Yeah, will do, but out of curiosity, Sergey and I already described
> what the root problem was and you didn't really seem to take that.  Is
> that because the explanation didn't make sense to you or us
> misunderstanding what your code does?

I second _everything_ that Tejun has said.


Steven, your approach works ONLY when we have the following preconditions:

 a) there is a CPU that is calling printk() from the 'safe' (non-atomic,
    etc) context

        what guarantees that? what happens if there is NO non-atomic
        CPU, or that non-atomic CPU simply misses the console_owner != false
        point? are we going to conclude

        "if printk() doesn't work for you, it's because you are holding it wrong"?


        what if that non-atomic CPU does not call printk(), but instead
        it does console_lock()/console_unlock()? why is there no handoff?

        CPU0				CPU1 ~ CPU10
					in atomic contexts [!]. ping-ponging console_sem
					ownership to each other. while what they really
					need to do is to simply up() and let CPU0
					handle it.
					printk
	console_lock()
	 schedule()
					...
					printk
					printk
					...
					printk
					printk

					up()

	// woken up
	console_unlock()
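
The timeline above can be sketched as a toy userspace model. This is not kernel code: `wait_kind` and `next_owner()` are hypothetical names made up for illustration, and the rules are a simplified reading of the scenario described here (a direct handoff goes only to a spinning printk() caller, while a task sleeping in console_lock() is woken only by up()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* How a CPU is currently trying to get console_sem (toy model only). */
enum wait_kind {
	WAIT_NONE,	/* not waiting for the console */
	WAIT_SPINNING,	/* printk() caller spinning for a handoff */
	WAIT_SLEEPING,	/* console_lock() caller asleep on console_sem */
};

/*
 * Decide who owns console_sem after the current owner printed one record.
 * Returns the index of the next owner, or -1 if the semaphore ends up free.
 */
int next_owner(const enum wait_kind *cpus, int ncpus, int owner, bool log_empty)
{
	int i;

	assert(cpus != NULL && owner >= 0 && owner < ncpus);

	/* Handoff: only a spinning printk() caller can take over directly. */
	for (i = 0; i < ncpus; i++)
		if (i != owner && cpus[i] == WAIT_SPINNING)
			return i;

	/* No spinner and more records pending: the owner keeps printing. */
	if (!log_empty)
		return owner;

	/* Log drained: up(), which is the only thing that wakes a sleeper. */
	for (i = 0; i < ncpus; i++)
		if (i != owner && cpus[i] == WAIT_SLEEPING)
			return i;

	return -1;
}
```

With CPU0 marked WAIT_SLEEPING, the model never returns 0 while any other CPU is spinning or while records are still pending, which is the starvation the timeline shows.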

        why do we put the emphasis on fixing vprintk_printk()?


 b) non-atomic CPU sees console_owner set (which is set for a very short
    period of time)

        again. what if that non-atomic CPU does not see console_owner?
        "don't use printk()"?

 c) the task that is looping in console_unlock() sees non-atomic CPU when
    console_owner is set.


IOW, we need to have


   the right CPU (a) at the very right moment (b && c) doing the very right thing.


   * and the "very right moment" is tiny and additionally depends
     on a foreign CPU [the one that is looping in console_unlock()].
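
To make the three preconditions concrete, here is a minimal sketch of them as a single predicate. It is an illustration only: `owner_window` and `useful_handoff()` are hypothetical names, time is an abstract counter, and the windows stand for the periods in which the CPU looping in console_unlock() has console_owner set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Half-open interval [set, clear) during which console_owner is set. */
struct owner_window {
	long set;
	long clear;
};

/*
 * Returns true only if a printk() issued at t_printk would hand the
 * console over to a CPU that can safely keep printing.
 */
bool useful_handoff(bool caller_non_atomic, long t_printk,
		    const struct owner_window *win, size_t nwin)
{
	size_t i;

	assert(win != NULL || nwin == 0);

	/* (a) the caller must be in a safe, non-atomic context */
	if (!caller_non_atomic)
		return false;

	/* (b) && (c): the call must land while console_owner is set */
	for (i = 0; i < nwin; i++)
		if (t_printk >= win[i].set && t_printk < win[i].clear)
			return true;

	return false;
}
```

A printk() from a non-atomic CPU that lands between two windows, or any printk() from an atomic CPU, yields false here; neither case moves printing to a safe context.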



a simple question - how is that going to work for everyone? are we
"fixing" a small fraction of possible use-cases?



Steven, I thought we reached an agreement [**] that the solution we should
be working on is a combination of printk_kthread and console_sem hand
off. Simply because it adds the missing "there is a non-atomic CPU wishing
to console_unlock()" thing.

	lkml.kernel.org/r/20171108162813.GA983427@...big577.frc2.facebook.com

	https://marc.info/?l=linux-kernel&m=151011840830776&w=2
	https://marc.info/?l=linux-kernel&m=151015141407368&w=2
	https://marc.info/?l=linux-kernel&m=151018900919386&w=2
	https://marc.info/?l=linux-kernel&m=151019815721161&w=2
	https://marc.info/?l=linux-kernel&m=151020275921953&w=2
**	https://marc.info/?l=linux-kernel&m=151020404622181&w=2
**	https://marc.info/?l=linux-kernel&m=151020565222469&w=2


what am I missing?

	-ss
