[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210201124720.GA66060@C02TD0UTHF1T.local>
Date: Mon, 1 Feb 2021 12:47:20 +0000
From: Mark Rutland <mark.rutland@....com>
To: Giancarlo Ferrari <giancarlo.ferrari89@...il.com>
Cc: linux-arm-kernel@...ts.infradead.org, linux@...linux.org.uk,
linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
rppt@...nel.org, penberg@...nel.org, geert@...ux-m68k.org,
giancarlo.ferrari@...ia.com
Subject: Re: [PATCH] ARM: kexec: Fix panic after TLB are invalidated
On Mon, Feb 01, 2021 at 12:44:56AM +0000, Giancarlo Ferrari wrote:
> machine_kexec() need to set rw permission in text and rodata sections
> to assign some variables (e.g. kexec_start_address). To do that at
> the end (after flushing pdm in memory, etc.) it needs to invalidate
> TLB [section] entries.
It'd be worth noting explicitly that set_kernel_text_rw() alters
current->active_mm...
> If during the TLB invalidation an interrupt occours, which might cause
> a context switch, there is the risk to inject invalid TLBs, with ro
> permissions.
... which is why if there's a context switch things can go wrong, since
active_mm isn't stable, and so it's possible that set_kernel_text_rw()
updates multiple tables, none of which might be the active table at the
point we try to make an access.
It would be nice to spell that out rather than saying "invalid TLBs".
We could disable preemption to prevent that, which is possibly better
than disabling interrupts.
Overall, it would be much better to avoid having to mess with the kernel
page tables. So rather than going:
1. mark kernel RW
2. alter variables in reloc code
3. copy reloc code into buffer
4. branch to buffer
... we should be able to go:
1. copy reloc code into buffer
2. alter variables in copy of reloc code
3. branch to buffer
... which would avoid this class of problem too.
Thanks,
Mark.
> When trying to assign .text labels, this lead to the following:
>
> Unable to handle kernel paging request at virtual address 80112f38
> pgd = fd7ef03e
> [80112f38] *pgd=0001141e(bad)
> Internal error: Oops: 80d [#1] PREEMPT SMP ARM
> ...
>
> Signed-off-by: Giancarlo Ferrari <giancarlo.ferrari89@...il.com>
> ---
> arch/arm/kernel/machine_kexec.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
> index 5d84ad3..23e8816 100644
> --- a/arch/arm/kernel/machine_kexec.c
> +++ b/arch/arm/kernel/machine_kexec.c
> @@ -174,6 +174,13 @@ void machine_kexec(struct kimage *image)
>
> reboot_code_buffer = page_address(image->control_code_page);
>
> + /*
> + * If below part is not atomic TLB entries might be corrupted after TLB
> + * invalidation, which leads to Data Abort in .text variable assignment
> + */
> + raw_local_irq_disable();
> + local_fiq_disable();
> +
> /* Prepare parameters for reboot_code_buffer*/
> set_kernel_text_rw();
> kexec_start_address = image->start;
> @@ -181,6 +188,9 @@ void machine_kexec(struct kimage *image)
> kexec_mach_type = machine_arch_type;
> kexec_boot_atags = image->arch.kernel_r2;
>
> + local_fiq_enable();
> + raw_local_irq_enable();
> +
> /* copy our kernel relocation code to the control code page */
> reboot_entry = fncpy(reboot_code_buffer,
> &relocate_new_kernel,
> --
> 2.7.4
>
Powered by blists - more mailing lists