lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87zhi51ers.fsf@x220.int.ebiederm.org>
Date:   Sat, 12 Oct 2019 22:54:31 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Dave Young <dyoung@...hat.com>
Cc:     Lianbo Jiang <lijiang@...hat.com>, linux-kernel@...r.kernel.org,
        tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, hpa@...or.com,
        x86@...nel.org, bhe@...hat.com, jgross@...e.com,
        dhowells@...hat.com, Thomas.Lendacky@....com, vgoyal@...hat.com,
        kexec@...ts.infradead.org
Subject: Re: [PATCH 3/3 v3] x86/kdump: clean up all the code related to the backup region

Dave Young <dyoung@...hat.com> writes:

> Hi Eric,
>
> On 10/12/19 at 06:26am, Eric W. Biederman wrote:
>> Lianbo Jiang <lijiang@...hat.com> writes:
>> 
>> > When the crashkernel kernel command line option is specified, the
>> > low 1MiB memory will always be reserved, which makes that the memory
>> > allocated later won't fall into the low 1MiB area, thereby, it's not
>> > necessary to create a backup region and also no need to copy the first
>> > 640k content to a backup region.
>> >
>> > Currently, the code related to the backup region can be safely removed,
>> > so lets clean up.
>> >
>> > Signed-off-by: Lianbo Jiang <lijiang@...hat.com>
>> > ---
>> 
>> > diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
>> > index eb651fbde92a..cc5774fc84c0 100644
>> > --- a/arch/x86/kernel/crash.c
>> > +++ b/arch/x86/kernel/crash.c
>> > @@ -173,8 +173,6 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
>> >  
>> >  #ifdef CONFIG_KEXEC_FILE
>> >  
>> > -static unsigned long crash_zero_bytes;
>> > -
>> >  static int get_nr_ram_ranges_callback(struct resource *res, void *arg)
>> >  {
>> >  	unsigned int *nr_ranges = arg;
>> > @@ -234,9 +232,15 @@ static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg)
>> >  {
>> >  	struct crash_mem *cmem = arg;
>> >  
>> > -	cmem->ranges[cmem->nr_ranges].start = res->start;
>> > -	cmem->ranges[cmem->nr_ranges].end = res->end;
>> > -	cmem->nr_ranges++;
>> > +	if (res->start >= SZ_1M) {
>> > +		cmem->ranges[cmem->nr_ranges].start = res->start;
>> > +		cmem->ranges[cmem->nr_ranges].end = res->end;
>> > +		cmem->nr_ranges++;
>> > +	} else if (res->end > SZ_1M) {
>> > +		cmem->ranges[cmem->nr_ranges].start = SZ_1M;
>> > +		cmem->ranges[cmem->nr_ranges].end = res->end;
>> > +		cmem->nr_ranges++;
>> > +	}
>> 
>> What is going on with this chunk?  I can guess but this needs a clear
>> comment.
>
> Indeed it needs some code comment, this is based on some offline
> discussion.  cat /proc/vmcore will give a warning because ioremap is
> mapping the system ram.
>
> We pass the first 1M to kdump kernel in e820 as system ram so that 2nd
> kernel can use the low 1M memory because for example the trampoline
> code.
>
>> 
>> >  
>> >  	return 0;
>> >  }
>> 
>> > @@ -356,9 +337,12 @@ int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params)
>> >  	memset(&cmd, 0, sizeof(struct crash_memmap_data));
>> >  	cmd.params = params;
>> >  
>> > -	/* Add first 640K segment */
>> > -	ei.addr = image->arch.backup_src_start;
>> > -	ei.size = image->arch.backup_src_sz;
>> > +	/*
>> > +	 * Add the low memory range[0x1000, SZ_1M], skip
>> > +	 * the first zero page.
>> > +	 */
>> > +	ei.addr = PAGE_SIZE;
>> > +	ei.size = SZ_1M - PAGE_SIZE;
>> >  	ei.type = E820_TYPE_RAM;
>> >  	add_e820_entry(params, &ei);
>> 
>> Likewise here.  Why do we need a special case?
>> Why the magic with PAGE_SIZE?
>
> Good catch, the zero page part is useless, I think no other special
> reason, just assumed zero page is not usable, but it should be ok to
> remove the special handling, just pass 0 - 1M is good enough.

But if we have stopped special casing the low 1M.  Why do we need a
special case here at all?

If you need the special case it is almost certainly wrong to say you
have ram above 640KiB and below 1MiB.  That is the legacy ROM and video
MMIO area.

There is a reason the original code said 640KiB.

Eric

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ