linux-kernel - Re: [RFC][PATCH v4 2/2] printk: Skip messages on oops

Date:	Thu, 17 Mar 2016 19:56:34 +0900
From:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To:	Jan Kara <jack@...e.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Petr Mladek <pmladek@...e.com>, Tejun Heo <tj@...nel.org>,
	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
	linux-kernel@...r.kernel.org,
	Byungchul Park <byungchul.park@....com>,
	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
	Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Subject: Re: [RFC][PATCH v4 2/2] printk: Skip messages on oops

Hello Jan,

On (03/14/16 23:13), Sergey Senozhatsky wrote:
> 
> From: Jan Kara <jack@...e.cz>
> 
> When there are too many messages in the kernel printk buffer it can take
> very long to print them to console (especially when using slow serial
> console). This is undesirable during oops so when we encounter oops and
> there are more than 100 messages to print, print just the newest 100
> messages and then the oops message.

I think this patch will introduce a regression, so I'd probably prefer
not to include it now in the series.

the pattern "print something important, then panic()" is quite common.
given that other CPUs can printk() a lot before panic_cpu sends out
stop_ipi, we can lose the "print something important" part.

...
arch/metag/kernel/cachepart.c:                  pr_emerg("Potential cache aliasing detected in %s on Thread %d\n",
arch/metag/kernel/cachepart.c-                           cache_type ? "DCACHE" : "ICACHE", thread_id);
arch/metag/kernel/cachepart.c-                  pr_warn("Total %s size: %u bytes\n",
arch/metag/kernel/cachepart.c-                          cache_type ? "DCACHE" : "ICACHE",
arch/metag/kernel/cachepart.c-                          cache_type ? get_dcache_size()
arch/metag/kernel/cachepart.c-                          : get_icache_size());
arch/metag/kernel/cachepart.c-                  pr_warn("Thread %s size: %d bytes\n",
arch/metag/kernel/cachepart.c-                          cache_type ? "CACHE" : "ICACHE",
arch/metag/kernel/cachepart.c-                          thread_cache_size);
arch/metag/kernel/cachepart.c-                  pr_warn("Page Size: %lu bytes\n", PAGE_SIZE);
arch/metag/kernel/cachepart.c-                  panic("Potential cache aliasing detected");
...
arch/s390/kernel/jump_label.c:  pr_emerg("Jump label code mismatch at %pS [%p]\n", ipc, ipc);
arch/s390/kernel/jump_label.c:  pr_emerg("Found:    %6ph\n", ipc);
arch/s390/kernel/jump_label.c:  pr_emerg("Expected: %6ph\n", ipe);
arch/s390/kernel/jump_label.c:  pr_emerg("New:      %6ph\n", ipn);
arch/s390/kernel/jump_label.c-  panic("Corrupted kernel text");
...

another example is the hardlockup detector with sysctl_hardlockup_all_cpu_backtrace set.

static void watchdog_overflow_callback(...)
{
	...
	if (is_hardlockup()) {
	...
		if (sysctl_hardlockup_all_cpu_backtrace &&
				!test_and_set_bit(0, &hardlockup_allcpu_dumped))
			trigger_allbutself_cpu_backtrace();

		nmi_panic(regs, msg);
	...
	}
	...
}

trigger_allbutself_cpu_backtrace() can produce much more than 100 lines
of output, and it may or may not be implemented via NMI. for example,
arch/sparc/kernel/process_64.c

thus, I think we'd better avoid skipping any messages when in panic().
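
to illustrate the concern, here is a minimal user-space sketch of the
"print just the newest 100 messages" policy as I read it from the patch
description. it is not the patch code: OOPS_KEEP_NEWEST, oops_start_seq()
and msg_survives() are hypothetical names, and messages are modelled as
plain sequence numbers.

```c
#include <stdbool.h>

/* Hypothetical limit, taken from the "newest 100 messages" wording. */
#define OOPS_KEEP_NEWEST 100UL

/*
 * first_seq: sequence number of the oldest message not yet printed.
 * next_seq:  sequence number one past the newest message in the buffer.
 *
 * Returns the sequence number the console would start printing from
 * if everything but the newest OOPS_KEEP_NEWEST messages is skipped.
 */
static unsigned long oops_start_seq(unsigned long first_seq,
				    unsigned long next_seq)
{
	unsigned long pending = next_seq - first_seq;

	if (pending > OOPS_KEEP_NEWEST)
		return next_seq - OOPS_KEEP_NEWEST;
	return first_seq;
}

/* True if message 'seq' would still reach the console under this policy. */
static bool msg_survives(unsigned long seq, unsigned long first_seq,
			 unsigned long next_seq)
{
	return seq >= oops_start_seq(first_seq, next_seq);
}
```

with nothing printed yet (first_seq == 0), a pr_emerg() line logged at
seq 10, followed by 150 lines from other CPUs and then the panic message
(next_seq == 162), printing would start at seq 62, so the pr_emerg() line
is skipped while the panic message is kept.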

	-ss