Date:	Thu, 03 Feb 2011 13:13:07 -0800
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Seiji Aguchi <seiji.aguchi@....com>
Cc:	Vivek Goyal <vgoyal@...hat.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	linux kernel mailing list <linux-kernel@...r.kernel.org>,
	Jarod Wilson <jwilson@...hat.com>
Subject: Re: Query about kdump_msg hook into crash_kexec()

Seiji Aguchi <seiji.aguchi@....com> writes:

> Hi,
>
>>PS: FWIW, Hitachi folks have usage idea for their enterprise purpose, but
>>    unfortunately I don't know its detail. I hope anyone tell us it.
>
> I explain the usage of kmsg_dump(KMSG_DUMP_KEXEC) in enterprise area.
>
> [Background]
> In our support service experience, we always need to detect root cause 
> of OS panic.
> So, customers in enterprise area never forgive us if kdump fails and 
> we can't detect the root cause of panic due to lack of materials for 
> investigation.
>
>>- Why do you need a notification from inside crash_kexec(). IOW, what
>>  is the usage of KMSG_DUMP_KEXEC.
>
>
> The usage of kmsg_dump(KMSG_DUMP_KEXEC) in enterprise area is getting
> useful information for investigating kernel crash in case kdump
> kernel doesn't boot.
>
> Kdump kernel may not start booting because there is a sha256 checksum
> verified over the kdump kernel before it starts booting.
> This means kdump kernel may fail even if there is no bug in kdump and
> we can't get any information for detecting root cause of kernel crash

Sure it is theoretically possible that the sha256 checksum gets
corrupted (I have never seen it happen or heard reports of it
happening).  It is a feature that if someone has corrupted your code the
code doesn't try and run anyway and corrupt anything else.

That you are arguing against having such a feature in the code you use
to write to persistent storage is scary.
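
To illustrate the verify-before-run behavior described above, here is a
minimal userspace sketch in Python. It is not the kexec purgatory code:
the function name verify_and_run and its parameters are hypothetical,
and hashlib.sha256 merely stands in for the checksum done before the
kdump kernel boots.

```python
import hashlib


def verify_and_run(image: bytes, expected_digest: str, entry) -> bool:
    """Call entry(image) only if the SHA-256 of image matches.

    Hypothetical helper for illustration; `entry` stands in for
    "jump to the kdump kernel".
    """
    actual = hashlib.sha256(image).hexdigest()
    if actual != expected_digest:
        # The image was modified after the digest was recorded.
        # Refuse to run it rather than let damaged code touch
        # anything else (for example persistent storage).
        return False
    entry(image)
    return True


# Digest recorded at load time, while the system is still healthy.
image = b"example crash kernel image"
digest = hashlib.sha256(image).hexdigest()

booted = []
ok_intact = verify_and_run(image, digest, booted.append)

# Simulate one corrupted byte in the loaded image.
corrupted = b"X" + image[1:]
ok_corrupted = verify_and_run(corrupted, digest, booted.append)
```

The point of the design is that a failed check stops everything: the
corrupted image is never handed to the entry point.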

> As I mentioned in [Background], We must avoid lack of materials for 
> investigation.
> So, kmsg_dump(KMSG_DUMP_KEXEC) is very important feature in enterprise
> area.

That sounds wonderful, but it doesn't jibe with the code.
kmsg_dump(KMSG_DUMP_KEXEC), when I read through it, was simply not
written to be robust when most of the kernel is suspect, making it
inappropriate for use on the crash_kexec path.  I do not believe
kmsg_dump has seen any testing in kernel failure scenarios.

There is this huge assumption that kmsg_dump is more reliable than
crash_kexec.  From my review of the code, kmsg_dump is simply not safe
in the context of a broken kernel.  The kmsg_dump code, last I looked,
won't work if called with interrupts disabled.

Furthermore, kmsg_dump(KMSG_DUMP_KEXEC) is only useful for debugging
crash_kexec, which has it backwards, as it is kmsg_dump that needs the
debugging.

You just argued that it is better to corrupt the target of your
kmsg_dump in the event of a kernel failure than to fail silently.

I don't want that unreliable code that wants to corrupt my jffs
partition anywhere near my machines.

Eric