[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMj1kXGdFj8KxqJB8WMOdKYWPChkHbM7+nLaAWBZ29o=q7N0Aw@mail.gmail.com>
Date: Mon, 21 Sep 2020 08:47:40 +0200
From: Ard Biesheuvel <ardb@...nel.org>
To: "Leizhen (ThunderTown)" <thunder.leizhen@...wei.com>
Cc: Russell King - ARM Linux admin <linux@...linux.org.uk>,
Jianguo Chen <chenjianguo3@...wei.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>,
Catalin Marinas <catalin.marinas@....com>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Libin <huawei.libin@...wei.com>,
Thomas Gleixner <tglx@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
patches-armlinux <patches@...linux.org.uk>
Subject: Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary
On Mon, 21 Sep 2020 at 05:35, Leizhen (ThunderTown)
<thunder.leizhen@...wei.com> wrote:
>
>
>
> On 2020/9/17 22:00, Ard Biesheuvel wrote:
> > On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin
> > <linux@...linux.org.uk> wrote:
> >>
> >> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote:
> >>> Currently, only support the kernels where the base of physical memory is
> >>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits
> >>> unrotated value. But we can use one more "add/sub" instructions to handle
> >>> bits 23-16. The performance will be slightly affected.
> >>>
> >>> Since most boards meet 16 MiB alignment, so add a new configuration
> >>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if
> >>> anyone really needs it.
> >>>
> >>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are
> >>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in
> >>> the whole head.S file. So choose it.
> >>>
> >>> Because the calculation of "y = x + __pv_offset[63:24]" have been done,
> >>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why
> >>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub()
> >>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t"
> >>> (above y).
> >>>
> >>> Signed-off-by: Zhen Lei <thunder.leizhen@...wei.com>
> >>> ---
> >>> arch/arm/Kconfig | 18 +++++++++++++++++-
> >>> arch/arm/include/asm/memory.h | 16 +++++++++++++---
> >>> arch/arm/kernel/head.S | 25 +++++++++++++++++++------
> >>> 3 files changed, 49 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> >>> index e00d94b16658765..19fc2c746e2ce29 100644
> >>> --- a/arch/arm/Kconfig
> >>> +++ b/arch/arm/Kconfig
> >>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT
> >>> kernel in system memory.
> >>>
> >>> This can only be used with non-XIP MMU kernels where the base
> >>> - of physical memory is at a 16MB boundary.
> >>> + of physical memory is at a 16MiB boundary.
> >>>
> >>> Only disable this option if you know that you do not require
> >>> this feature (eg, building a kernel for a single machine) and
> >>> you need to shrink the kernel to the minimal size.
> >>>
> >>> +config ARM_PATCH_PHYS_VIRT_RADICAL
> >>> + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary"
> >>> + default n
> >>
> >> Please drop the "default n" - this is the default anyway.
> >>
> >>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
> >>> * in place where 'r' 32 bit operand is expected.
> >>> */
> >>> __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
> >>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
> >>> + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16);
> >>
> >> t is already unsigned long, so this cast is not necessary.
> >>
> >> I've been debating whether it would be better to use "movw" for this
> >> for ARMv7. In other words:
> >>
> >> movw tmp, #16-bit
> >> adds %Q0, %1, tmp, lsl #16
> >> adc %R0, %R0, #0
> >>
> >> It would certainly be less instructions, but at the cost of an
> >> additional register - and we'd have to change the fixup code to
> >> know about movw.
> >>
> >> Thoughts?
> >>
> >
> > Since LPAE implies v7, we can use movw unconditionally, which is nice.
> >
> > There is no need to use an additional temp register, as we can use the
> > register holding the high word. (There is no need for the mov_hi macro
> > to be separate)
> >
> > 0: movw %R0, #low offset >> 16
> > adds %Q0, %1, %R0, lsl #16
> > 1: mov %R0, #high offset
> > adc %R0, %R0, #0
> > .pushsection .pv_table,"a"
> > .long 0b, 1b
> > .popsection
> >
> > The only problem is distinguishing the two mov instructions from each
>
> The #high offset can also consider use movw, it just save two bytes in
> the thumb2 scenario. We can store different imm16 value for high_offset
> and low_offset, so that we can distinguish them in __fixup_a_pv_table().
>
> This will make the final implementation of the code look more clear and
> consistent, especially THUMB2.
>
> Let me try it.
>
Hello Zhen Lei,
I am looking into this as well:
https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-p2v-v2
Could you please test this version on your hardware?
Powered by blists - more mailing lists