[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <265e97cc-1a7f-fbe6-f38b-6aba7d2061bb@huawei.com>
Date: Mon, 28 Sep 2020 17:30:20 +0800
From: "Leizhen (ThunderTown)" <thunder.leizhen@...wei.com>
To: Ard Biesheuvel <ardb@...nel.org>
CC: Jianguo Chen <chenjianguo3@...wei.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>,
Catalin Marinas <catalin.marinas@....com>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Russell King - ARM Linux admin <linux@...linux.org.uk>,
linux-kernel <linux-kernel@...r.kernel.org>,
Libin <huawei.libin@...wei.com>,
Thomas Gleixner <tglx@...utronix.de>,
"Andrew Morton" <akpm@...ux-foundation.org>,
linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
patches-armlinux <patches@...linux.org.uk>
Subject: Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB
boundary
On 2020/9/28 9:30, Leizhen (ThunderTown) wrote:
>
>
> On 2020/9/22 20:30, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2020/9/21 16:53, Leizhen (ThunderTown) wrote:
>>>
>>>
>>> On 2020/9/21 14:47, Ard Biesheuvel wrote:
>>>> On Mon, 21 Sep 2020 at 05:35, Leizhen (ThunderTown)
>>>> <thunder.leizhen@...wei.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 2020/9/17 22:00, Ard Biesheuvel wrote:
>>>>>> On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin
>>>>>> <linux@...linux.org.uk> wrote:
>>>>>>>
>>>>>>> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote:
>>>>>>>> Currently, only support the kernels where the base of physical memory is
>>>>>>>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits
>>>>>>>> unrotated value. But we can use one more "add/sub" instructions to handle
>>>>>>>> bits 23-16. The performance will be slightly affected.
>>>>>>>>
>>>>>>>> Since most boards meet 16 MiB alignment, so add a new configuration
>>>>>>>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if
>>>>>>>> anyone really needs it.
>>>>>>>>
>>>>>>>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are
>>>>>>>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in
>>>>>>>> the whole head.S file. So choose it.
>>>>>>>>
>>>>>>>> Because the calculation of "y = x + __pv_offset[63:24]" have been done,
>>>>>>>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why
>>>>>>>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub()
>>>>>>>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t"
>>>>>>>> (above y).
>>>>>>>>
>>>>>>>> Signed-off-by: Zhen Lei <thunder.leizhen@...wei.com>
>>>>>>>> ---
>>>>>>>> arch/arm/Kconfig | 18 +++++++++++++++++-
>>>>>>>> arch/arm/include/asm/memory.h | 16 +++++++++++++---
>>>>>>>> arch/arm/kernel/head.S | 25 +++++++++++++++++++------
>>>>>>>> 3 files changed, 49 insertions(+), 10 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>>>>>>>> index e00d94b16658765..19fc2c746e2ce29 100644
>>>>>>>> --- a/arch/arm/Kconfig
>>>>>>>> +++ b/arch/arm/Kconfig
>>>>>>>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT
>>>>>>>> kernel in system memory.
>>>>>>>>
>>>>>>>> This can only be used with non-XIP MMU kernels where the base
>>>>>>>> - of physical memory is at a 16MB boundary.
>>>>>>>> + of physical memory is at a 16MiB boundary.
>>>>>>>>
>>>>>>>> Only disable this option if you know that you do not require
>>>>>>>> this feature (eg, building a kernel for a single machine) and
>>>>>>>> you need to shrink the kernel to the minimal size.
>>>>>>>>
>>>>>>>> +config ARM_PATCH_PHYS_VIRT_RADICAL
>>>>>>>> + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary"
>>>>>>>> + default n
>>>>>>>
>>>>>>> Please drop the "default n" - this is the default anyway.
>>>>>>>
>>>>>>>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
>>>>>>>> * in place where 'r' 32 bit operand is expected.
>>>>>>>> */
>>>>>>>> __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
>>>>>>>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
>>>>>>>> + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16);
>>>>>>>
>>>>>>> t is already unsigned long, so this cast is not necessary.
>>>>>>>
>>>>>>> I've been debating whether it would be better to use "movw" for this
>>>>>>> for ARMv7. In other words:
>>>>>>>
>>>>>>> movw tmp, #16-bit
>>>>>>> adds %Q0, %1, tmp, lsl #16
>>>>>>> adc %R0, %R0, #0
>>>>>>>
>>>>>>> It would certainly be less instructions, but at the cost of an
>>>>>>> additional register - and we'd have to change the fixup code to
>>>>>>> know about movw.
>>>>>>>
>>>>>>> Thoughts?
>>>>>>>
>>>>>>
>>>>>> Since LPAE implies v7, we can use movw unconditionally, which is nice.
>>>>>>
>>>>>> There is no need to use an additional temp register, as we can use the
>>>>>> register holding the high word. (There is no need for the mov_hi macro
>>>>>> to be separate)
>>>>>>
>>>>>> 0: movw %R0, #low offset >> 16
>>>>>> adds %Q0, %1, %R0, lsl #16
>>>>>> 1: mov %R0, #high offset
>>>>>> adc %R0, %R0, #0
>>>>>> .pushsection .pv_table,"a"
>>>>>> .long 0b, 1b
>>>>>> .popsection
>>>>>>
>>>>>> The only problem is distinguishing the two mov instructions from each
>>>>>
>>>>> The #high offset can also consider use movw, it just save two bytes in
>>>>> the thumb2 scenario. We can store different imm16 value for high_offset
>>>>> and low_offset, so that we can distinguish them in __fixup_a_pv_table().
>>>>>
>>>>> This will make the final implementation of the code look more clear and
>>>>> consistent, especially THUMB2.
>>>>>
>>>>> Let me try it.
>>>>>
>>>>
>>>> Hello Zhen Lei,
>>>>
>>>> I am looking into this as well:
>>>>
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-p2v-v2
>>>>
>>>> Could you please test this version on your hardware?
>>>
>>> OK, I will test it on my boards.
>> Hi Ard Biesheuvel:
>> I have tested it on 16MiB aligned + LE board, it works well. I've asked my colleagues
>> from other departments to run it on 2MiB aligned + BE board. He will do it tomorrow.
>
> Hi, Ard Biesheuvel:
> I'm sorry to keep you waiting so long. You patch series works well on 2MiB aligned + BE board
> also. I spent a lot of time, because our 2MiB aligned + BE board loads zImage. Therefore, special
> processing is required for the following code:
>
> arch/arm/boot/compressed/head.S:
> #ifdef CONFIG_AUTO_ZRELADDR
> mov r4, pc
> and r4, r4, #0xf8000000 //currently only support 128MiB alignment
> add r4, r4, #TEXT_OFFSET
> #else
>
> This is a special scenario that does not conflict with your code framework. So I'm trying to fix it.
>
> Tested-by: Zhen Lei <thunder.leizhen@...wei.com>
Hi, Ard Biesheuvel:
I just sent the above problem's fix patch.
[PATCH 0/2] ARM: decompressor: relax the loading restriction of the decompressed kernel
>
>
>>
>>
>>>
>>>>
>>>> .
>>>>
>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@...ts.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
>>> .
>>>
Powered by blists - more mailing lists