linux-kernel - RE: Query about kdump_msg hook into crash

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <5C4C569E8A4B9B42A84A977CF070A35B2C147F43B7@USINDEVS01.corp.hds.com>
Date:	Thu, 3 Feb 2011 17:08:01 -0500
From:	Seiji Aguchi <seiji.aguchi@....com>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
CC:	Vivek Goyal <vgoyal@...hat.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	linux kernel mailing list <linux-kernel@...r.kernel.org>,
	Jarod Wilson <jwilson@...hat.com>
Subject: RE: Query about kdump_msg hook into crash_kexec()

Hi Eric,

Thank you for your prompt reply.

I would like to consider "Needs in enterprise area" and "Implementation of kmsg_dump()" separately.

(1) Needs in enterprise area
  In case of kdump failure, we would like to store kernel buffer to NVRAM/flush memory
  for detecting root cause of kernel crash.

(2) Implementation of kmsg_dump 
  You suggest to review/test cording of kmsg_dump() more.

What do you think about (1)?
Is it acceptable for you?

Seiji

>-----Original Message-----
>From: Eric W. Biederman [mailto:ebiederm@...ssion.com]
>Sent: Thursday, February 03, 2011 4:13 PM
>To: Seiji Aguchi
>Cc: Vivek Goyal; KOSAKI Motohiro; linux kernel mailing list; Jarod Wilson
>Subject: Re: Query about kdump_msg hook into crash_kexec()
>
>Seiji Aguchi <seiji.aguchi@....com> writes:
>
>> Hi,
>>
>>>PS: FWIW, Hitach folks have usage idea for their enterprise purpose, but
>>>    unfortunately I don't know its detail. I hope anyone tell us it.
>>
>> I explain the usage of kmsg_dump(KMSG_DUMP_KEXEC) in enterprise area.
>>
>> [Background]
>> In our support service experience, we always need to detect root cause
>> of OS panic.
>> So, customers in enterprise area never forgive us if kdump fails and
>> we can't detect the root cause of panic due to lack of materials for
>> investigation.
>>
>>>- Why do you need a notification from inside crash_kexec(). IOW, what
>>>  is the usage of KMSG_DUMP_KEXEC.
>>
>>
>> The usage of kdump(KMSG_DUMP_KEXEC) in enterprise area is getting
>> useful information for investigating kernel crash in case kdump
>> kernel doesn't boot.
>>
>> Kdump kernel may not start booting because there is a sha256 checksum
>> verified over the kdump kernel before it starts booting.
>> This means kdump kernel may fail even if there is no bug in kdump and
>> we can't get any information for detecting root cause of kernel crash
>
>Sure it is theoretically possible that the sha256 checksum gets
>corrupted (I have never seen it happen or heard reports of it
>happening).  It is a feature that if someone has corrupted your code the
>code doesn't try and run anyway and corrupt anything else.
>
>That you are arguing against have such a feature in the code you use to
>write to persistent storage is scary.
>
>> As I mentioned in [Background], We must avoid lack of materials for
>> investigation.
>> So, kdump(KMSG_DUMP_KEXEC) is very important feature in enterprise
>> area.
>
>That sounds wonderful, but it doesn't jive with the
>code. kmsg_dump(KMSG_DUMP_KEXEC) when I read through it was simply not
>written to be robust when most of the kernel is suspect.  Making it in
>appropriate for use on the crash_kexec path.  I do not believe kmsg_dump
>has seen any testing in kernel failure scenarios.
>
>There is this huge assumption that kmsg_dump is more reliable than
>crash_kexec, from my review of the code kmsg_dump is simply not safe in
>the context of a broken kernel.  The kmsg_dump code last I looked code
>won't work if called with interrupts disabled.
>
>Furthermore kmsg_dump(KMSG_DUMP_KEXEC) is only useful for debugging
>crash_kexec.  Which has it backwards as it is kmsg_dump that needs the
>debugging.
>
>You just argued that it is better to corrupt the target of your
>kmsg_dump in the event of a kernel failure instead of to fail silently.
>
>I don't want that unreliable code that wants to corrupt my jffs
>partition anywhere near my machines.
>
>Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/