Message-ID: <50ceccb8039847c253b68c59af0ceaa5e04eefb4.camel@intel.com>
Date: Fri, 5 Jul 2024 10:35:52 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>
CC: "ardb@...nel.org" <ardb@...nel.org>, "luto@...nel.org" <luto@...nel.org>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"thomas.lendacky@....com" <thomas.lendacky@....com>, "tzimmermann@...e.de"
<tzimmermann@...e.de>, "akpm@...ux-foundation.org"
<akpm@...ux-foundation.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "seanjc@...gle.com" <seanjc@...gle.com>,
"mingo@...hat.com" <mingo@...hat.com>, "bhe@...hat.com" <bhe@...hat.com>,
"tglx@...utronix.de" <tglx@...utronix.de>, "hpa@...or.com" <hpa@...or.com>,
"peterz@...radead.org" <peterz@...radead.org>, "bp@...en8.de" <bp@...en8.de>,
"rafael@...nel.org" <rafael@...nel.org>, "linux-acpi@...r.kernel.org"
<linux-acpi@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH 3/3] x86/64/kexec: Rewrite init_transition_pgtable() with
kernel_ident_mapping_init()
On Thu, 2024-07-04 at 16:44 +0300, kirill.shutemov@...ux.intel.com wrote:
> On Wed, Jul 03, 2024 at 11:06:21AM +0000, Huang, Kai wrote:
> > > static int init_transition_pgtable(struct kimage *image, pgd_t *pgd)
> > > {
> > > - pgprot_t prot = PAGE_KERNEL_EXEC_NOENC;
> > > - unsigned long vaddr, paddr;
> > > - int result = -ENOMEM;
> > > - p4d_t *p4d;
> > > - pud_t *pud;
> > > - pmd_t *pmd;
> > > - pte_t *pte;
> > > + struct x86_mapping_info info = {
> > > + .alloc_pgt_page = alloc_transition_pgt_page,
> > > + .context = image,
> > > + .page_flag = __PAGE_KERNEL_LARGE_EXEC,
> > > + .kernpg_flag = _KERNPG_TABLE_NOENC,
> > > + .offset = __START_KERNEL_map - phys_base,
> > > + };
> > > + unsigned long mstart = PAGE_ALIGN_DOWN(__pa(relocate_kernel));
> > > + unsigned long mend = mstart + PAGE_SIZE;
> > >
> > > - vaddr = (unsigned long)relocate_kernel;
> > > - paddr = __pa(page_address(image->control_code_page)+PAGE_SIZE);
> >
> > Perhaps I am missing something, but this seems a functional change to me.
> >
> > IIUC the page after image->control_code_page is allocated when loading the
> > kexec kernel image. It is a different page from the page where the
> > relocate_kernel code resides in.
> >
> > The old code maps relocate_kernel kernel VA to the page after the
> > control_code_page. Later in machine_kexec(), the relocate_kernel code is
> > copied to that page so the mapping can work for that:
> >
> > control_page = page_address(image->control_code_page) + PAGE_SIZE;
> > __memcpy(control_page, relocate_kernel,
> > KEXEC_CONTROL_CODE_MAX_SIZE);
> >
> > The new code in this patch, however, seems just maps the relocate_kernel VA
> > to the PA of the relocate_kernel, which should be different from the old
> > mapping.
>
> Yes, original code maps at relocate_kernel() VA the page with copy of the
> relocate_kernel() in control_code_page. But it is safe to map original
> relocate_kernel() page there as well as it is not going to be overwritten
> until swap_pages(). We are not going to use original relocate_kernel()
> page after RET at the end of relocate_kernel().

I am not super familiar with this, but this doesn't seem 100% safe to me.
E.g., did you consider the kexec jump case?

The second half of the control page is also used to store registers in kexec
jump. If the relocate_kernel VA isn't mapped to the control page, then IIUC
after jumping back to the old kernel we won't be able to read those
registers back?
>
> Does it make any sense?
>
> I will try to explain it in the commit message in the next version.
>

I think even if it's safe to change the mapping to the relocate_kernel() page,
that should be done in a separate patch. This patch should just focus on
removing the duplicated page table setup code.