linux-kernel - Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20180120104931.1942483e@gandalf.local.home>
Date:   Sat, 20 Jan 2018 10:49:31 -0500
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Cc:     Tejun Heo <tj@...nel.org>, Petr Mladek <pmladek@...e.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        akpm@...ux-foundation.org, linux-mm@...ck.org,
        Cong Wang <xiyou.wangcong@...il.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Mel Gorman <mgorman@...e.de>, Michal Hocko <mhocko@...nel.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Peter Zijlstra <peterz@...radead.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Jan Kara <jack@...e.cz>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        rostedt@...e.goodmis.org, Byungchul Park <byungchul.park@....com>,
        Pavel Machek <pavel@....cz>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

On Sat, 20 Jan 2018 16:14:02 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@...il.com> wrote:

> [..]
> >  asmlinkage int vprintk_emit(int facility, int level,
> >  			    const char *dict, size_t dictlen,
> > @@ -1849,6 +1918,17 @@ asmlinkage int vprintk_emit(int facility, int level,
> >  
> >  	/* This stops the holder of console_sem just where we want him */
> >  	logbuf_lock_irqsave(flags);
> > +
> > +	if (recursion_check_test()) {
> > +		/* A printk happened within a printk at the same context */
> > +		if (this_cpu_inc_return(recursion_count) > recursion_max) {
> > +			atomic_inc(&recursion_overflow);
> > +			logbuf_unlock_irqrestore(flags);
> > +			printed_len = 0;
> > +			goto out;
> > +		}
> > +	}  
> 
> didn't have time to look at this carefully, but is this possible?
> 
> printks from console_unlock()->call_console_drivers() are redirected
> to printk_safe buffer. we need irq_work on that CPU to flush its
> printk_safe buffer.

So is the issue that we keep triggering this irq work then? Then this
solution does seem to be one that would work. Because after x amount of
recursive printks (printk called by printk) it would just stop printing
them, and end the irq work.

Perhaps what Tejun is seeing is:

 printk()
   net_console()
     printk() --> redirected to irq work

 <irq work>
  printk
    net_console()
      printk() --> redirected to another irq work

and so on and so on.

This solution would need to be tweaked to add a timer to allow only so
many nested printks in a given time. Otherwise it too would be an issue:

 printk()
   net_console()
     printk() -> redirected
     printk() -> throttled

But the first x printk()s would still be redirected. and that x gets
reset in this current patch at he end of the outermost printk. Perhaps
it shouldn't reset x, or it can flush the printk safe buffer first. Is
there a reason that console_unlock() doesn't flush the
printk_safe_buffer? With a throttle number and flushing the
printk_safe_buffer, that should solve the issue Tejun explained.


> 
> how are we going to distinguish between lockdep splats, for instance,
> or WARNs from call_console_drivers() -> foo_write(), which are valuable,
> and kmalloc() print outs, which might be less valuable? are we going to

The problem is that printk causing more printks is extremely dangerous,
and ANY printk that is caused by a printk is of equal value, whether it
is a console driver running out of memory or a lockdep splat. And
the chances of having two hit at the same time is extremely low.

> lose all of them now? then we can do a much simpler thing - steal one
> bit from `printk_context' and use if for a new PRINTK_NOOP_CONTEXT, which
> will be set around call_console_drivers(). vprintk_func() would redirect
> printks to vprintk_noop(fmt, args), which will do nothing.

Not sure what you mean here. Have some pseudo code to demonstrate with?

-- Steve