Message-ID: <20170623084210.6gn2hqguzfpbvqwi@gmail.com>
Date: Fri, 23 Jun 2017 10:42:10 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Chen Yu <yu.c.chen@...el.com>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org,
"Rafael J . Wysocki" <rafael@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Len Brown <lenb@...nel.org>, Ying Huang <ying.huang@...el.com>,
Xunlei Pang <xlpang@...hat.com>
Subject: Re: [PATCH 1/2][RFC] x86/boot/e820: Introduce e820_table_ori to
represent the real original e820 layout
* Chen Yu <yu.c.chen@...el.com> wrote:
> Hi Ingo,
> On Thu, Jun 22, 2017 at 11:40:30AM +0200, Ingo Molnar wrote:
> >
> > * Chen Yu <yu.c.chen@...el.com> wrote:
> >
> > > Currently e820_table_firmware is supposed to represent the
> > > original firmware memory layout passed to us by the bootloader.
> > > However, this is not the case: e820_table_firmware might still
> > > be modified by Linux:
> > > 1. During bootup, the EFI boot stub might allocate memory via
> > > EFI services for the PCI device information structure; later,
> > > e820__reserve_setup_data() reserves these dynamically
> > > allocated structures (AKA setup_data) in e820_table_firmware
> > > accordingly.
> > > 2. kexec might also modify e820_table_firmware.
> >
> > Hm, so why does the EFI code modify e820_table_firmware - why doesn't
> > it modify e820_table?
> >
> Both e820_table and e820_table_firmware are updated in
> e820__reserve_setup_data(), which changes the PCI device
> information structures from E820_TYPE_RAM to
> E820_TYPE_RESERVED_KERN.
> > I.e. what is the point of having 3 different versions of the
> > memory layout table?
> My original thought was that we should simply not record the
> modifications from the EFI boot stub in e820_table_firmware, and we
> would be done. But after checking the code, I realized that if we did
> so, kexec might have a potential problem.
>
> The e820_table_firmware was introduced mainly for kexec and
> was used to pass the original memory layout to the second
> kernel:
>
> commit 5dfcf14d5b28174f94cbe9b4fb35d415db61c64a
> Author: Bernhard Walle <bwalle@...e.de>
> Date: Fri Jun 27 13:12:55 2008 +0200
>
> x86: use FIRMWARE_MEMMAP on x86/E820
>
> Besides, the second kernel will not re-enter the EFI boot stub
> code and it will reuse the PCI device information structure created
> by the first kernel, which is stored in the E820_TYPE_RESERVED_KERN
> region. So these PCI device information structures will not be
> modified by the second kernel, as kexec will only pass the E820_TYPE_RAM
> regions to the second kernel; the latter can thus use ioremap to access
> the PCI information.
>
> So the problem is, if we do not record the PCI information in
> e820_table_firmware, the PCI information will be kept as
> type E820_TYPE_RAM, and all the E820_TYPE_RAM regions will
> be passed to the second kernel and might be allocated for ordinary
> use there. As a result the second kernel might not
> get valid PCI information (it might be overwritten by others). So
> currently we try to introduce a new e820_table_ori to represent
> the original one provided by the BIOS (mainly for hibernation
> memory layout MD5 checking).
So there are 3 versions we need:
- the original 'firmware' table as-is - for MD5 check and other potential
purposes
- some intermediate version of the table for kexec: what is the exact definition
of that table, what changes from the real table does it _not_ want?
- the 'real' table
All the naming should reflect that. I.e. instead of using some nonsensical "_ori"
postfix, the original table is really the _firmware table. If kexec needs a
separate one then name it _kexec and copy it at the right stage.
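
To illustrate the three-table split, here is a minimal user-space sketch in C. It is not kernel code: every name in it (the sk_* types, tables and helpers) is hypothetical, the type values are arbitrary, and the point at which the kexec copy is taken (right after the setup_data reservation) is an assumption, since the exact definition of that table is still the open question above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the e820 entry types. */
enum sk_type { SK_RAM = 1, SK_RESERVED = 2, SK_RESERVED_KERN = 3 };

struct sk_entry {
	uint64_t addr;
	uint64_t size;
	enum sk_type type;
};

#define SK_MAX_ENTRIES 8

struct sk_table {
	unsigned int nr_entries;
	struct sk_entry entries[SK_MAX_ENTRIES];
};

/* The three versions discussed above (names are illustrative only). */
static struct sk_table sk_table_firmware; /* as-is from firmware, never modified */
static struct sk_table sk_table_kexec;    /* intermediate copy handed to kexec   */
static struct sk_table sk_table;          /* the 'real' working table            */

static void sk_add(struct sk_table *t, uint64_t addr, uint64_t size,
		   enum sk_type type)
{
	assert(t->nr_entries < SK_MAX_ENTRIES);
	t->entries[t->nr_entries++] = (struct sk_entry){ addr, size, type };
}

/*
 * Retype an entry that exactly matches [addr, addr + size).
 * Simplification: a real implementation would have to split entries
 * that only partially overlap the range.
 */
static void sk_retype(struct sk_table *t, uint64_t addr, uint64_t size,
		      enum sk_type from, enum sk_type to)
{
	for (unsigned int i = 0; i < t->nr_entries; i++) {
		struct sk_entry *e = &t->entries[i];

		if (e->addr == addr && e->size == size && e->type == from)
			e->type = to;
	}
}

/* Take each copy "at the right stage" of a (much simplified) boot. */
static void sk_boot(uint64_t setup_addr, uint64_t setup_size)
{
	/* Stage 1: snapshot before the kernel modifies anything. */
	sk_table_firmware = sk_table;

	/* Stage 2: reserve the setup_data range in the working table. */
	sk_retype(&sk_table, setup_addr, setup_size, SK_RAM, SK_RESERVED_KERN);

	/* Stage 3 (assumed): kexec copy includes the setup_data reservation. */
	sk_table_kexec = sk_table;

	/* Any later changes would touch only sk_table. */
}
```

With this ordering the firmware copy still shows the setup_data range as RAM (so an MD5 over it would match the layout the firmware provided), while the kexec copy shows the range as reserved, so a second kernel would not hand it out as ordinary memory.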
Ok?
Thanks,
Ingo