Message-ID: <20161005103717.GD23809@pathway.suse.cz>
Date:   Wed, 5 Oct 2016 12:37:17 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Cc:     Matt Fleming <matt@...eblueprint.co.uk>,
        Byungchul Park <byungchul.park@....com>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Jan Kara <jack@...e.cz>, Luca Abeni <luca.abeni@...tn.it>,
        Rik van Riel <riel@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Wanpeng Li <wanpeng.li@...mail.com>,
        Yuyang Du <yuyang.du@...el.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Mike Galbraith <umgwanakikbuti@...il.com>,
        Tejun Heo <tj@...nel.org>, Calvin Owens <calvinowens@...com>,
        linux-kernel@...r.kernel.org,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Subject: Re: [RFC 0/5] printk: Implement WARN_*DEFERRED()

On Fri 2016-09-30 09:48:32, Sergey Senozhatsky wrote:
> On (09/29/16 13:28), Petr Mladek wrote:
> > On Wed 2016-09-28 10:18:45, Sergey Senozhatsky wrote:
> > > On (09/27/16 18:02), Petr Mladek wrote:
> > > > The main trick is that we replace the per-CPU function pointer
> > > > by a preempt_count-like variable that could track the printk context.
> > > > 
> > > > I know that Sergey has other ideas in this area. But I wanted to see
> > > > what this approach would look like.
> > > 
> > > well, yes. I was looking at WARN_*_DEFERRED [1] for some time, and, I
> > > think, the maintenance cost of that solution is just too high:
> > > 
> > > a) every existing WARN_* in sched/timekeeping/who knows where else
> > >    must be evaluated to ensure that it can't be called from printk()
> > >    path. if `false' - then the corresponding macro must be replaced
> > >    with _DEFERRED flavor.
> > > 
> > > b) any patch that adds new WARN_* usages must be additionally checked
> > >    to ensure that each of new WARN_* macros cannot be called from printk
> > >    path. if `false' -- the corresponding macro must be replaced with
> > >    _DEFERRED flavor.
> > > 
> > > c) any patch that refactors the code or moves some function calls around
> > >    etc. must be additionally checked for any accidental WARN_* from printk
> > >    path. even if none of the patches added any new WARN_* to the code.
> > > 
> > > d) apart from WARN_* there can be `accidental' pr_err/pr_debug/etc. not
> > >    necessarily newly added (see 'c').
> > > 
> > > 
> > > that's too much.
> > > 
> > > it takes a lot of additional effort, because both reviewer and contributor
> > > must consider printk() internals. and, what's worse, if something goes
> > > unnoticed we end up having a printk() deadlock again.
> > > 
> > > so I decided to address some of printk() issues in printk.c, not in
> > > kernel/time/timekeeping.c or kernel/sched/core.c or anywhere else.

I no longer see how this might be achieved. If a printk()/WARN()
in the scheduler/timekeeping code can be reached from printk() then
it might also be reached from outside printk(). In this case, printk()
will not know about it and will happily call the scheduler/timekeeping
code recursively. This might still cause a deadlock.


> > I see the point.
> 
> well, just my 5 cents.
> 
> > Your approach (alt buffer) adds some complexity to the printk code
> 
> it does.
> the other thing is that there are several ways to deadlock printk().
> alt_printk is addressing deadlocks that were caused by printk()
> recursion only.
> 
>    printk()
>      acquire_lock(&foo)
>        printk()
>          acquire_lock(&foo)

This looks theoretical. Recursion in printk() is not easily
possible at the moment. It is prevented by the logbuf_cpu check when
logbuf_lock is taken, and by console_trylock() when
console_sem is taken.
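
To illustrate the two guards named above, here is a minimal user-space
sketch. It is not kernel code: every model_* name is hypothetical, the
model is single-threaded, and "taking" a lock only records its owner.
It assumes that a recursive call on the owning CPU is refused and that
the console semaphore is only ever acquired with a trylock.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
#define MODEL_NO_CPU (-1)

static int model_logbuf_cpu = MODEL_NO_CPU; /* CPU owning the log-buffer lock */
static bool model_console_sem_held;         /* stands in for console_sem */
static char model_log[256];                 /* stands in for the log buffer */
static int model_recursion_hits;            /* recursive calls that were refused */

/* Trylock never blocks, so a holder cannot deadlock against itself. */
static bool model_console_trylock(void)
{
	if (model_console_sem_held)
		return false;
	model_console_sem_held = true;
	return true;
}

static void model_console_unlock(void)
{
	assert(model_console_sem_held);
	model_console_sem_held = false;
}

/*
 * Stores msg in the log unless this CPU already owns the log-buffer lock.
 * When recurse is true, a nested call is made while the lock is owned,
 * which models a printk() reached from inside printk().
 * Returns true if the message was stored.
 */
static bool model_printk(int this_cpu, const char *msg, bool recurse)
{
	if (model_logbuf_cpu == this_cpu) {
		/* Owner check: refuse instead of spinning on our own lock. */
		model_recursion_hits++;
		return false;
	}

	model_logbuf_cpu = this_cpu; /* record ownership ("take" the lock) */
	strncat(model_log, msg, sizeof(model_log) - strlen(model_log) - 1);
	if (recurse)
		model_printk(this_cpu, "nested", false);
	model_logbuf_cpu = MODEL_NO_CPU; /* "release" the lock */
	return true;
}
```

Calling model_printk(0, "a", true) stores "a" and refuses the nested
call. Note that neither guard helps when the lock is owned by a path
that did not enter through printk() in the first place.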

> which is a sub-set of all of the printk() deadlock scenarios. all of
> the locks that printk() acquires can be taken outside of printk() path.
> 
> for example, cat /proc/consoles takes console_lock() for seq output.
> thus we can have something like
> 
>         console_unlock()	// lock  &sem->lock
>           up()
>             activate_task()
>               WARN_ON()
>                 printk()
>                   console_trylock() // lock &sem->lock

The WARN_ON() here is called under &p->pi_lock, which is taken
by try_to_wake_up(). This WARN_ON() can also be triggered
outside printk()/console_unlock(). Therefore it needs to be
replaced by WARN_DEFERRED() anyway.


> DEFERRED_WARN is a good thing; it's just quite hard to keep everything
> working, given that any of those "9 patches per hour" can break something
> with just one WARN_ON().
> 
> 
> I assume that doing something like this
> 
> #define WARN_ON(condition, format...) ({	\
> 	printk_deferred_enter();		\
> 	WARN(condition, ##format);		\
> 	printk_deferred_exit();			\
> })
>
> is less than exciting because WARN_ON from irq won't immediately print
> the backtrace anymore.

Yup, we might need WARN_ON_DEFERRED() variant.
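
A rough user-space sketch of what such a variant could do, combined
with the preempt_count-like context counter from the cover letter.
All sketch_* names are hypothetical and the buffers are plain strings.
The assumption is that a message printed while the counter is non-zero
is only stored, and is pushed to the console later from a safe context.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model; these names are illustrative, not kernel API. */
static int sketch_printk_context;     /* nesting depth, preempt_count-like */
static char sketch_deferred_buf[256]; /* messages waiting for a safe context */
static char sketch_console[256];      /* what has reached the "console" */

/* Append msg to dst, truncating if dst is full. */
static void sketch_append(char *dst, size_t size, const char *msg)
{
	size_t used = strlen(dst);

	if (used < size - 1)
		snprintf(dst + used, size - used, "%s", msg);
}

static void sketch_deferred_enter(void)
{
	sketch_printk_context++;
}

static void sketch_deferred_exit(void)
{
	assert(sketch_printk_context > 0);
	sketch_printk_context--;
}

/* Print directly, or only store the message when in deferred context. */
static void sketch_printk(const char *msg)
{
	if (sketch_printk_context > 0)
		sketch_append(sketch_deferred_buf, sizeof(sketch_deferred_buf), msg);
	else
		sketch_append(sketch_console, sizeof(sketch_console), msg);
}

/* Meant to be called later, from a context where console output is safe. */
static void sketch_flush_deferred(void)
{
	sketch_append(sketch_console, sizeof(sketch_console), sketch_deferred_buf);
	sketch_deferred_buf[0] = '\0';
}

/* Warn without touching the console; returns the condition like WARN_ON(). */
static int sketch_warn_on_deferred(int cond, const char *msg)
{
	if (cond) {
		sketch_deferred_enter();
		sketch_printk(msg);
		sketch_deferred_exit();
	}
	return cond;
}
```

Only the call sites that can run under scheduler or console locks would
use the deferred helper. A plain WARN_ON() elsewhere keeps printing
immediately, which avoids the drawback mentioned in the quoted text.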

Best Regards,
Petr