Date:	Tue, 27 Jan 2015 09:07:09 +0100
From:	"Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
To:	Vivek Goyal <vgoyal@...hat.com>
Cc:	Michael Kerrisk <mtk.manpages@...il.com>,
	lkml <linux-kernel@...r.kernel.org>,
	"linux-man@...r.kernel.org" <linux-man@...r.kernel.org>,
	Kexec Mailing List <kexec@...ts.infradead.org>,
	Andy Lutomirski <luto@...capital.net>,
	Dave Young <dyoung@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, Borislav Petkov <bp@...en8.de>,
	"Eric W. Biederman" <ebiederm@...ssion.com>
Subject: Re: Edited kexec_load(2) [kexec_file_load()] man page for review

Hello Vivek,

Ping!

Cheers,

Michael


On 16 January 2015 at 14:30, Michael Kerrisk (man-pages)
<mtk.manpages@...il.com> wrote:
> Hello Vivek,
>
> Thanks for your comments! I've added some further text to
> the page based on those comments. See some follow-up
> questions below.
>
> On 01/12/2015 11:16 PM, Vivek Goyal wrote:
>> On Wed, Jan 07, 2015 at 10:17:56PM +0100, Michael Kerrisk (man-pages) wrote:
>>
>> [..]
>>>>> .BR KEXEC_ON_CRASH " (since Linux 2.6.13)"
>>>>> Execute the new kernel automatically on a system crash.
>>>>> .\" FIXME Explain in more detail how KEXEC_ON_CRASH is actually used
>>>
>>> I wasn't expecting that you would respond to the FIXMEs that were
>>> not labeled "kexec_file_load", but I was hoping you might ;-). Thanks!
>>> I have a few additional questions to your nice notes.
>>>
>>>> Upon boot, the first kernel reserves a chunk of contiguous memory (if the
>>>> crashkernel=<> command-line parameter is passed). This memory is
>>>> used to load the crash kernel (the kernel that will be booted
>>>> if the first kernel crashes).
>>>
>>
>> Hi Michael,
>>
>>> Can I just confirm: is it in all cases only possible to use kexec_load()
>>> and kexec_file_load() if the kernel was booted with the 'crashkernel'
>>> parameter set?
>>
>> As of now, only kexec_load() and kexec_file_load() system calls can
>> make use of memory reserved by crashkernel=<> kernel parameter. And
>> this is used only if we are trying to load a crash kernel (KEXEC_ON_CRASH
>> flag specified).
>
> Okay.
>
>>>> Location of this reserved memory is exported to user space through
>>>> /proc/iomem file.
>>>
>>> Is that export via an entry labeled "Crash kernel" in the
>>> /proc/iomem file?
>>
>> Yes.
>
> Okay -- thanks.
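For illustration, the lookup confirmed above could be sketched in C roughly as follows. This is a minimal sketch, not what kexec-tools actually does: the helper names parse_crash_kernel_line() and find_crash_kernel() are made up for this example, and the code assumes the usual "start-end : name" layout of /proc/iomem lines with hexadecimal, inclusive addresses. Note also that on some kernels an unprivileged reader may see zeroed addresses in that file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse one line in /proc/iomem format.
 * Returns 1 and fills *start / *end (inclusive bounds) if the line is
 * the "Crash kernel" entry, 0 otherwise. */
int parse_crash_kernel_line(const char *line,
                            unsigned long long *start,
                            unsigned long long *end)
{
    unsigned long long s, e;
    char name[64];

    assert(line != NULL && start != NULL && end != NULL);

    /* Assumed layout: optional indent, "start-end : name", hex addresses. */
    if (sscanf(line, " %llx-%llx : %63[^\n]", &s, &e, name) != 3)
        return 0;
    if (strcmp(name, "Crash kernel") != 0)
        return 0;

    *start = s;
    *end = e;
    return 1;
}

/* Hypothetical helper: scan an iomem-style file for the entry.
 * Returns 0 on success, -1 if the file cannot be opened or no
 * "Crash kernel" line is present. */
int find_crash_kernel(const char *path,
                      unsigned long long *start,
                      unsigned long long *end)
{
    char line[256];
    int found = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    while (!found && fgets(line, sizeof(line), f) != NULL)
        found = parse_crash_kernel_line(line, start, end);
    fclose(f);
    return found ? 0 : -1;
}
```

A caller would pass "/proc/iomem" as the path and then use the returned range when choosing kexec_segment.mem values.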
>
>>>> User space can parse it and prepare list of segments
>>>> specifying this reserved memory as destination.
>>>
>>> I'm not quite clear on "specifying this reserved memory as destination".
>>> Is that done by specifying the address in the kexec_segment.mem fields?
>>
>> You are absolutely right. User space can specify in the kexec_segment.mem
>> field the memory location where it expects a particular segment to
>> be loaded by the kernel.
>>
>>>
>>>> Once kernel sees the flag KEXEC_ON_CRASH, it makes sure that all the
>>>> segments are destined for reserved memory otherwise kernel load operation
>>>> fails.
>>>
>>> Could you point me to where this checking is done? Also, what is the
>>> error (errno) that occurs when the load operation fails? (I think the
>>> answers to these questions are "at the start of kimage_alloc_init()"
>>> and "EADDRNOTAVAIL", but I'd like to confirm.)
>>
>> This checking happens in sanity_check_segment_list() which is called
>> by kimage_alloc_init().
>>
>> And yes, error code returned is -EADDRNOTAVAIL.
>
> Thanks. I added EADDRNOTAVAIL to the ERRORS.
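To make the rule concrete, a user-space pre-check that mirrors the one condition described above (every segment must lie inside the reserved region, otherwise EADDRNOTAVAIL) might look like the sketch below. It is only an approximation for illustration: check_segments_in_reserved() is a hypothetical name, the region bounds are assumed to be inclusive as in /proc/iomem, and the kernel's sanity_check_segment_list() performs further checks that are not modeled here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Segment layout as quoted in this thread. */
struct kexec_segment {
    void   *buf;        /* Buffer in user space */
    size_t  bufsz;      /* Buffer length in user space */
    void   *mem;        /* Physical address of kernel */
    size_t  memsz;      /* Physical address length */
};

/* Hypothetical pre-check: crash_start and crash_end are the inclusive
 * bounds of the reserved "Crash kernel" region.
 * Returns 0 if every segment's [mem, mem + memsz) range lies inside
 * the region, -EADDRNOTAVAIL otherwise. */
int check_segments_in_reserved(const struct kexec_segment *segs, size_t n,
                               uintptr_t crash_start, uintptr_t crash_end)
{
    assert(segs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        uintptr_t mstart = (uintptr_t)segs[i].mem;
        /* Offset of the last byte of the segment relative to mstart. */
        uintptr_t span = segs[i].memsz ? (uintptr_t)(segs[i].memsz - 1) : 0;

        /* Reject ranges that would wrap around the address space. */
        if (span > UINTPTR_MAX - mstart)
            return -EADDRNOTAVAIL;

        uintptr_t mlast = mstart + span;
        if (mstart < crash_start || mlast > crash_end)
            return -EADDRNOTAVAIL;
    }
    return 0;
}
```

Running such a check before calling kexec_load() with KEXEC_ON_CRASH would only give an earlier, friendlier diagnostic; the kernel remains the authority on whether the load is accepted.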
>
>>>> [..]
>>>>> struct kexec_segment {
>>>>>     void   *buf;        /* Buffer in user space */
>>>>>     size_t  bufsz;      /* Buffer length in user space */
>>>>>     void   *mem;        /* Physical address of kernel */
>>>>>     size_t  memsz;      /* Physical address length */
>>>>> };
>>>>> .fi
>>>>> .in
>>>>> .PP
>>>>> .\" FIXME Explain the details of how the kernel image defined by segments
>>>>> .\" is copied from the calling process into previously reserved memory.
>>>>
>>>> Kernel image defined by segments is copied into kernel either in regular
>>>> memory
>>>
>>> Could you clarify what you mean by "regular memory"?
>>
>> I meant memory which is not reserved memory.
>
> Okay.
>
>>>> or in reserved memory (if KEXEC_ON_CRASH is set). Kernel first
>>>> copies list of segments in kernel memory and then does various
>>>> sanity checks on the segments. If everything looks fine, kernel copies
>>>> segment data to kernel memory.
>>>>
>>>> In case of normal kexec, segment data is loaded in any available memory
>>>> and segment data is moved to final destination at the kexec reboot time.
>>>
>>> By "moved to final destination", do you mean "moved from user space to the
>>> final kernel-space destination"?
>>
>> No. Segment data moves from user space to kernel space once kexec_load()
>> call finishes successfully. But when user does reboot (kexec -e), at that
>> time kernel moves that segment data to its final location. Kernel could
>> not place the segment at its final location during kexec_load() time as
>> that memory is already in use by running kernel. But once we are about
>> to reboot to new kernel, we can overwrite the old kernel's memory.
>
> Got it.
>
>>>> In case of kexec on panic (KEXEC_ON_CRASH flag set), segment data is
>>>> directly loaded to reserved memory and after crash kexec simply jumps
>>>
>>> By "directly", I assume you mean "at the time of the kexec_load() call",
>>> right?
>>
>> Yes.
>
> Thanks.
>
> So, returning to the kexec_segment structure:
>
>            struct kexec_segment {
>                void   *buf;        /* Buffer in user space */
>                size_t  bufsz;      /* Buffer length in user space */
>                void   *mem;        /* Physical address of kernel */
>                size_t  memsz;      /* Physical address length */
>            };
>
> Are the following statements correct:
> * buf + bufsz identify a memory region in the caller's virtual
>   address space that is the source of the copy
> * mem + memsz specify the target memory region of the copy
> * mem is a physical memory address, as seen from kernel space
> * the number of bytes copied from userspace is min(bufsz, memsz)
> * if bufsz > memsz, then excess bytes in the user-space buffer
>   are ignored.
> * if memsz > bufsz, then excess bytes in the target kernel buffer
>   are filled with zeros.
> ?
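The copy semantics proposed in the list above can be written down as a small user-space model. This is a sketch of the statements as posed, which are still awaiting confirmation in this thread, and not a description of what the kernel is known to do: model_segment_copy() is a hypothetical name, and an ordinary byte array stands in for the target memory region.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the proposed semantics.
 * dst stands in for the target region of memsz bytes; src/bufsz is
 * the caller's buffer. Copies min(bufsz, memsz) bytes, zero-fills any
 * remainder of dst, and returns the number of bytes copied. */
size_t model_segment_copy(unsigned char *dst, size_t memsz,
                          const unsigned char *src, size_t bufsz)
{
    size_t ncopy = bufsz < memsz ? bufsz : memsz;

    assert(dst != NULL || memsz == 0);
    assert(src != NULL || ncopy == 0);

    if (ncopy > 0)
        memcpy(dst, src, ncopy);                  /* copied part */
    if (memsz > ncopy)
        memset(dst + ncopy, 0, memsz - ncopy);    /* zero-filled tail */
    return ncopy;                                 /* excess src bytes ignored */
}
```

With a 4-byte buffer and a 6-byte target, the model copies 4 bytes and zeroes the last 2; with a 4-byte buffer and a 2-byte target, it copies 2 bytes and ignores the rest.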
>
> Also, it seems to me that 'mem' need not be page aligned.
> Is that correct? Should the man page say something about that?
> (E.g., is it generally desirable that 'mem' should be page aligned?)
>
> Likewise, 'memsz' doesn't need to be a page multiple, IIUC.
> Should the man page say anything about this? For example, should
> it note that the initialized kernel segment will be of size:
>
>      (mem % PAGE_SIZE + memsz) rounded up to the next multiple of PAGE_SIZE
>
> And should it note that if 'mem' is not a multiple of the page size, then
> the initial bytes (mem % PAGE_SIZE) in the first page of the kernel segment
> will be zeros?
>
> (Hopefully I have read kimage_load_normal_segment() correctly.)
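The arithmetic in the expression above can be spelled out as follows. This is only a worked version of the formula as written in the question; whether the kernel accepts an unaligned 'mem' at all is exactly what is being asked here. The function name and the explicit page_size parameter are assumptions made for the example, and overflow of mem % page_size + memsz is not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: size of the initialized segment under the
 * formula in the question, i.e. (mem % page_size + memsz) rounded up
 * to a multiple of page_size. page_size must be a power of two. */
uintptr_t initialized_segment_size(uintptr_t mem, size_t memsz,
                                   uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uintptr_t head = mem & (page_size - 1);   /* mem % page_size */
    uintptr_t total = head + memsz;           /* leading bytes + segment */

    /* Round up to a multiple of page_size. */
    return (total + page_size - 1) & ~(page_size - 1);
}
```

For example, with a 4096-byte page, mem = 0x1010 and memsz = 4096 give 16 leading bytes, a total of 4112, and a rounded size of 8192 (two pages), whereas an aligned mem = 0x1000 with the same memsz gives exactly one page.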
>
> And one further question. Other than the fact that they are used with
> different system calls, what is the difference between KEXEC_ON_CRASH
> and KEXEC_FILE_ON_CRASH?
>
> Thanks,
>
> Michael
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/