linux-kernel - Re: [printk] fbc14616f4: BUG:kernel_reboot-without-warning_in_test

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170413140321.GH16938@pathway.suse.cz>
Date:   Thu, 13 Apr 2017 16:03:21 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Cc:     Pavel Machek <pavel@....cz>, Jan Kara <jack@...e.cz>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Ye Xiaolong <xiaolong.ye@...el.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        "Rafael J . Wysocki" <rjw@...ysocki.net>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Jiri Slaby <jslaby@...e.com>, Len Brown <len.brown@...el.com>,
        linux-kernel@...r.kernel.org,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Subject: Re: [printk]  fbc14616f4:
 BUG:kernel_reboot-without-warning_in_test_stage

On Wed 2017-04-12 01:19:53, Sergey Senozhatsky wrote:
> On (04/11/17 10:46), Sergey Senozhatsky wrote:
> > On (04/10/17 20:48), Pavel Machek wrote:
> > [..]
> > > > but, once again, I see your point.
> > > 
> > > Good. Does that mean that the next version of patches will work ok in
> > > that case?
> > 
> > yes.
> 
> ok... so I'm looking at something like below right now.
> not really tested yet.
> 
> I put some comments into the code.
> 
> it does offloading after X printed lines by the same process.
> if we reschedule, then the counter resets. which is probably OK,
> we don't really want any process, except for printk_kthread, to
> stay in console_unlock() forever. "number of lines printed" is
> probably easier to understand (easily converted to the number of
> pageup/pagedown you need to press, terminal buffer history size,
> etc.) than seconds we spent on printing (which doesn't even
> correspond to messages' timestamps in general case).
> 
> when the limit of "number of lines printed" is 0, then no
> offloading takes place.
> 
> it also has some simple mechanism to handle cases when
> we try to wake_up() printk_kthread, but it never becomes alive.
> it's a bit simple minded, probably.
> 
> wake_up is done from printk_safe context, so warnings/printks
> from there should do no harm (in fact, we even do pr_crit()
> error reporting, when we enforce printk_emergency mode).
> 
> I'll do more tests tomorrow, and will take a closer look.
> this code is basically just compiles, boots and passes some
> trivial tests. quite possible I've missed something important.
> 
> once verified, then the next question will be -- do we even
> need printk_emergency_begin/end or we can leave without it.
> 
> // given that printk_emergency enforcement works properly
> 
> 
> but, once again, the code might be stupid and wrong.
> and I need some sleep.
> 
> ---
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 6cf756dbee39..c0075e8b3a09 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2141,6 +2197,89 @@ static inline int can_use_console(void)
>  	return cpu_online(raw_smp_processor_id()) || have_callable_console();
>  }
>  
> +/*
> + * Under heavy printing load/slow serial console/etc console_unlock() can
> + * stall CPUs, which can result in soft/hard-lockups, lost interrupts, RCU
> + * stalls, etc. Therefore we attempt to print the messages to console from
> + * a dedicated printk_kthread, which always runs in schedulable context.
> + *
> + * There are several possible scenarios:
> + *
> + * a) When we got a large number of pending messages to print.
> + * e.g.
> + *    vprintk_emit() or console_lock()
> + *     console_unlock()
> + *      <<massive dump>>
> + *
> + * b) When we printk() a large number of messages.
> + * e.g.
> + *   vprintk_emit()->console_unlock()   <<print 1 message>>
> + *   vprintk_emit()->console_unlock()   <<print 1 message>>
> + *   ...
> + *   vprintk_emit()->console_unlock()   <<print 1 message>>
> + *
> + * In all those cases we can be in atomic context, we need to offload
> + * printing at some point.
> + *
> + * This function must be called from 'printk_safe' context.
> + */
> +static inline bool console_offload_printing(void)
> +{
> +	static struct task_struct *printing_task = NULL;
> +	static unsigned long lines_printed = 0;
> +
> +	if (!atomic_print_limit || !printk_kthread_enabled())
> +		return false;
> +
> +	/* We rescheduled - reset the counters. */
> +	if (printing_task != current) {
> +		lines_printed = 0;
> +		printing_task = current;
> +		return false;
> +	}

If we want to check that the process rescheduled, we should
store/check also current->nvcsw + current->nivcsw.

This might be even more important in the detection of
the emergency situation, see below.


> +	if (current == printk_kthread)
> +		return false;

Yup, printk_kthread is special. You suggest another solution
in the other reply.

IMHO, the best solution would be if printk_kthread calls
console_unlock() with disabled preemption and does the offload
(releases console_sem) when need_resched() returns true.

By other words, printk_kthread should use its allotted time
as much as possible. But it should not block the console_sem
when sleeping.

My only fear is that it is getting more and more complicated.
On the other hand, any partial solution is asking for
troubles and complains.

> +
> +	/*
> +	 * Don't reset the counter, let CPU overrun the limit.
> +	 * The idea is that
> +	 *
> +	 *   a) woken up printk_kthread (if succeeded)
> +	 * or
> +	 *   b) concurrent printk from another CPU (if any)
> +	 *
> +	 * will change `printing_task' and reset the counter. This also
> +	 * let us to introduce additional policies later, for instance,
> +	 * if we can't wakeup printk_kthread for Y times, e.g.
> +	 *
> +	 *            lines_printed > 2 * atomic_print_limit
> +	 *
> +	 * then we can declare emergency and stop relying on printk_kthread.
> +	 *
> +	 * If neither a) nor b) happens - we continue printing from
> +	 * current process. Which is bad and can be risky, but we can't
> +	 * wake_up() printk_kthread, so things already don't look normal.
> +	 */
> +	lines_printed++;
> +	if (lines_printed < atomic_print_limit)
> +		return false;
> +
> +	/*
> +	 * A trivial emergency enforcement.
> +	 * Assumes that `atomic_print_limit' is large enough.
> +	 */
> +	if (lines_printed > 2 * (unsigned long)atomic_print_limit) {
> +		printk_enforce_emergency = true;
> +		pr_crit("Declaring printk emergency mode.\n");
> +		return false;
> +	}

The only messages that are printed on my workstation are the same
few lines everytime I connect my phone over USB to get it charged.
I am not sure if they are printed by the same process. But I would
get scared if printk switches to the emergency mode just because
there is only one process "regularly" producing messages.

It might help to check the number of process switch counts as
suggested above.


> +	/* Must be executed in 'printk_safe' context. */
> +	wake_up_process(printk_kthread);
> +	return true;
> +}
> +
>  /**
>   * console_unlock - unlock the console system
>   *
> @@ -2163,8 +2302,17 @@ void console_unlock(void)
>  	unsigned long flags;
>  	bool wake_klogd = false;
>  	bool do_cond_resched, retry;
> +	bool did_offload;
>  
>  	if (console_suspended) {
> +		/*
> +		 * Here and later, we need to clear the PENDING_OUTPUT bit
> +		 * in order to avoid an infinite loop in printk_kthread
> +		 * function when console_unlock() cannot flush messages
> +		 * because we suspended consoles. Someone else will print
> +		 * the messages from resume_console().
> +		 */
> +		clear_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
>  		up_console_sem();
>  		return;
>  	}
> @@ -2186,6 +2334,7 @@ void console_unlock(void)
>  	do_cond_resched = console_may_schedule;
>  again:
>  	console_may_schedule = 0;
> +	did_offload = 0;

It would make more sense to clear the variable before the again:
target. In fact, there is a logic mistake, see below.
>  
>  	/*
>  	 * We released the console_sem lock, so we need to recheck if
> @@ -2193,6 +2342,7 @@ void console_unlock(void)
>  	 * console.
>  	 */
>  	if (!can_use_console()) {
> +		clear_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
>  		console_locked = 0;
>  		up_console_sem();
>  		return;
> @@ -2221,7 +2371,7 @@ void console_unlock(void)
>  			len = 0;
>  		}
>  skip:
> -		if (console_seq == log_next_seq)
> +		if (did_offload || console_seq == log_next_seq)
>  			break;
>  
>  		msg = log_from_idx(console_idx);
> @@ -2253,9 +2403,28 @@ void console_unlock(void)
>  		stop_critical_timings();	/* don't trace print latency */
>  		call_console_drivers(ext_text, ext_len, text, len);
>  		start_critical_timings();
> +
> +		/*
> +		 * Sometimes we may lock console_sem before printk_kthread.
> +		 * In this case we will jump to `again' label (if there are
> +		 * pending messages), print one more line from current
> +		 * process, break out of printing loop (we don't reset the
> +		 * counter of printed lines) and do up_console_sem() to
> +		 * wakeup printk_kthread again.
> +		 *
> +		 * If printk_kthread never wakes up (which may indicate that
> +		 * the system is unstable or something weird is going on),
> +		 * then we will keep jumping to `again' label and printing
> +		 * one message from the logbuf. This is a bit ugly, but at
> +		 * least we will print out the logbuf.
> +		 *
> +		 * If such condition occurs, console_offload_printing() can
> +		 * declare `printk_emergency' at some point.

I am a bit confused by the comment above. The again goto target is
used only when there is a race between leaving the loop and releasing
the console_sem. It is a corner case.

If there are messages from other CPUs, they most likely appear during
the slow call_console_drivers(). They are handled inside the
for(;;) cycle unless we reach the atomic_print_limit and
wake up the printk_kthread.

IMHO, we should just mention here that the jobs might get offloaded
to prevent softlookups when console_unlock() is called in atomic
context.

> +		 */
> +		did_offload = console_offload_printing();
>  		printk_safe_exit_irqrestore(flags);
>  
> -		if (do_cond_resched)
> +		if (!did_offload && do_cond_resched)
>  			cond_resched();
>  	}
>  	console_locked = 0;
> @@ -2276,6 +2445,8 @@ void console_unlock(void)
>  	 */
>  	raw_spin_lock(&logbuf_lock);
>  	retry = console_seq != log_next_seq;

This should be:

	retry = console_seq != log_next_seq && !did_offload;

Otherwise, it would never leave.

I know that this was only proposal and you did it late until
late night.

Anyway, I like the proposal. I have a good feeling about
this way. Also I really like how the main logic is localized
in console_offload_printing().

Best Regards,
Petr