linux-kernel - Re: [PATCH 2/2] Subject: printk: Don't trap random context in infinite log

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20171107002254.GI3252168@devbig577.frc2.facebook.com>
Date:   Mon, 6 Nov 2017 16:22:54 -0800
From:   Tejun Heo <tj@...nel.org>
To:     Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Cc:     Petr Mladek <pmladek@...e.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        linux-kernel@...r.kernel.org, kernel-team@...com,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 2/2] Subject: printk: Don't trap random context in
 infinite log_buf flush

Hello,

On Sat, Nov 04, 2017 at 01:24:08PM +0900, Sergey Senozhatsky wrote:
> thanks for the patch set. we are currently looking at another approach:
> lkml.kernel.org/r/20171102134515.6eef16de@...dalf.local.home
> 
> would you be interested in taking a look?
 
Hmm... It took some tweaking but the code at the end locks up the
machine w/ Steven's patch applied and it's not that contrived a case
(e.g. out of memory messages from packet tx/rx paths while OOM is in
progress).

> there are some concerns, like a huge number of printk-s happening while
> console_sem is locked. e.g. console_lock()/console_unlock() on one of the
> CPUs, or console_lock(); printk(); ... printk(); console_unlock();

Unless we make all messages fully synchronous, I don't think there's a
good solution for that and I don't think we wanna make everything
fully synchronous.  The information value of messages go down pretty
fast in a message deluge and as long as we stay synchronous in the
beginning, information loss usually isn't the problem.

The current code can lose a lot of messages but the way it loses
messages usually doesn't lead to loss of important information.  The
proposed patch doesn't materially change when and what we lose.  It
just moves the flush infinite loop into a safe context.

Thanks.

#include <linux/module.h>
#include <linux/delay.h>
#include <linux/sched.h>
#include <linux/mutex.h>
#include <linux/workqueue.h>
#include <linux/hrtimer.h>

static bool in_printk;
static bool stop_testing;
static struct hrtimer printk_timer;
static ktime_t timer_interval;

static enum hrtimer_restart printk_timerfn(struct hrtimer *timer)
{
	int i;

	if (READ_ONCE(in_printk))
		for (i = 0; i < 1000; i++)
			printk("%-80s\n", "XXX TIMER");

	hrtimer_forward_now(&printk_timer, timer_interval);
	return READ_ONCE(stop_testing) ? HRTIMER_NORESTART : HRTIMER_RESTART;
}	

static void preempt_printk_workfn(struct work_struct *work)
{
	int i;

	hrtimer_init(&printk_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
	printk_timer.function = printk_timerfn;
	timer_interval = ktime_set(0, NSEC_PER_MSEC);
	hrtimer_start(&printk_timer, timer_interval, HRTIMER_MODE_REL);

	while (!READ_ONCE(stop_testing)) {
		preempt_disable();
		WRITE_ONCE(in_printk, true);
		for (i = 0; i < 100; i++)
			printk("%-80s\n", "XXX PREEMPT");
		WRITE_ONCE(in_printk, false);
		preempt_enable();
		msleep(1);
	}
}
static DECLARE_WORK(preempt_printk_work, preempt_printk_workfn);

static int __init test_init(void)
{
	queue_work_on(0, system_wq, &preempt_printk_work);
	return 0;
}

static void __exit test_exit(void)
{
	WRITE_ONCE(stop_testing, true);
	flush_work(&preempt_printk_work);
	hrtimer_cancel(&printk_timer);
}

module_init(test_init);
module_exit(test_exit);
MODULE_LICENSE("GPL");