lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <05648f6c-1792-80b7-ae80-388664fda580@ghiti.fr>
Date:   Mon, 15 Jun 2020 01:35:18 -0400
From:   Alex Ghiti <alex@...ti.fr>
To:     Atish Patra <atishp@...shpatra.org>,
        Palmer Dabbelt <palmer@...belt.com>
Cc:     Anup Patel <anup@...infault.org>,
        linux-riscv <linux-riscv@...ts.infradead.org>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        "linux-kernel@...r.kernel.org List" <linux-kernel@...r.kernel.org>,
        Atish Patra <Atish.Patra@....com>
Subject: Re: [PATCH 0/2] PUD/PGDIR entries for linear mapping

Hi Atish,

Le 6/12/20 à 1:43 PM, Atish Patra a écrit :
> On Fri, Jun 12, 2020 at 6:17 AM Alex Ghiti <alex@...ti.fr> wrote:
>> Le 6/12/20 à 8:59 AM, Alex Ghiti a écrit :
>>> Hi Atish,
>>>
>>> Le 6/11/20 à 1:29 PM, Atish Patra a écrit :
>>>> On Wed, Jun 10, 2020 at 11:51 PM Alex Ghiti<alex@...ti.fr>  wrote:
>>>>> Hi Atish,
>>>>>
>>>>> Le 6/10/20 à 2:32 PM, Atish Patra a écrit :
>>>>>> On Wed, Jun 3, 2020 at 8:36 AM Alexandre Ghiti<alex@...ti.fr>  wrote:
>>>>>>> This small patchset intends to use PUD/PGDIR entries for linear
>>>>>>> mapping
>>>>>>> in order to better utilize TLB.
>>>>>>>
>>>>>>> At the moment, only PMD entries can be used since on common platforms
>>>>>>> (qemu/unleashed), the kernel is loaded at DRAM + 2MB which
>>>>>>> dealigns virtual
>>>>>>> and physical addresses and then prevents the use of PUD/PGDIR
>>>>>>> entries.
>>>>>>> So the kernel must be able to get those 2MB for PAGE_OFFSET to map
>>>>>>> the
>>>>>>> beginning of the DRAM: this is achieved in patch 1.
>>>>>>>
>>>>>> I don't have in depth knowledge of how mm code works so this question
>>>>>> may be a completely
>>>>>> stupid one :). Just for my understanding,
>>>>>> As per my understanding, kernel will map those 2MB of memory but
>>>>>> never use it.
>>>>>> How does the kernel ensure that it doesn't allocate any memory from
>>>>>> those 2MB
>>>>>> memory if it is not marked as reserved?
>>>>> Yes, a 1GB hugepage will cover those 2MB: I rely on the previous boot
>>>>> stage to mark this region
>>>>> as reserved if there is something there (like opensbi). Otherwise, the
>>>>> kernel will indeed try to
>>>>> allocate memory from there :)
>>>>>
>>>> In that case, this patch mandates that the firmware region has to be
>>>> mark "reserved"
>>>> the device tree so that the Linux kernel doesn't try to allocate
>>>> memory from there.
>>>> OpenSBI is already doing it from v0.7. Thus, any user using latest
>>>> OpenSBI can leverage
>>>> this patch for a better TLB utilization.
>>>
>>> Note that *currently* OpenSBI v0.7 still adds the "no-map" property
>>> which prevents such optimization.
>>>
> Thanks for the clarification. When I said latest, I meant including
> your patch in the mailing list.
>
>>>> However, legacy previous boot stages(BBL) do not reserve this area via
>>>> DT which may
>>>> result in an unexpected crash. I am not sure how many developers still
>>>> use BBL though.
>>>>
>>>> Few general suggestions to tackle this problem:
>>>> 1. This mandatory requirement should be added to the booting document
>>>> so that any other
>>>> SBI implementation is also aware of it.
>>>> 2. You may have to move the patch1 to a separate config so that any
>>>> users of legacy boot stages
>>>> can disable this feature.
>>>
>>> IMHO, the region occupied by runtime services should be marked as
>>> reserved in the device-tree. So it seems redundant to add this as a
>>> requirement, I would rather consider its absence as a bug.
>>>
> I agree. I was just suggesting to document this bug :).

Oh ok then, we meant the same thing :)
>
>>> Even if I understand that this might break some system, I don't like
>>> the idea of a new config to support old "buggy" bootloaders: when will
>>> we be able to remove it ? We'll never know when people will stop using
>>> those bootloaders, so it will stay here forever...Where can I find the
> Personally, I am fine with that. However, there were few concerns in the past.
> I am leaving it to Palmer to decide.
>
> @Palmer Dabbelt : Any thoughts ?
>
>>> boot document you are talking about ? Can we simply state here that
>>> this kernel version will not be compatible with those bootloaders
>>> (we'll draw an exhaustive list here) ?
> Yes.
>
>> Ok, I have just found Documentation/riscv/boot-image-header.rst: could
>> we imagine doing something like incrementing the version and use that as
>> a hint in the kernel not to map the 2MB offset ? That's still legacy,
>> but at least it does not require to recompile a kernel as the check
>> would be done at runtime.
>>
> I was suggesting to add a risc-v specific booting document and
> document this "bug".
> Documentation/riscv/boot-image-header.rst can be linked from that document or
> the boot hader content can be included in that. No changes in code is necessary.
>
> Eventually, this booting document will also include other additional
> booting constraints for RISC-V
> such as minimum extension required to boot Linux, csr state upon
> entering S-mode, mmu state.


Ok I will prepare a boot document that links to the existing documents and
add all of that, I will need you for the last constraints that I don't 
know about.

Thanks Atish,

Alex

>>> Alex
>>>
>>>
>>>>> Alex
>>>>>
>>>>>
>>>>>>> But furthermore, at the moment, the firmware (opensbi) explicitly
>>>>>>> asks the
>>>>>>> kernel not to map the region it occupies, which is on those common
>>>>>>> platforms at the very beginning of the DRAM and then it also dealigns
>>>>>>> virtual and physical addresses. I proposed a patch here:
>>>>>>>
>>>>>>> https://github.com/riscv/opensbi/pull/167
>>>>>>>
>>>>>>> that removes this 'constraint' but *not* all the time as it offers
>>>>>>> some
>>>>>>> kind of protection in case PMP is not available. So sometimes, we may
>>>>>>> have a part of the memory below the kernel that is removed creating a
>>>>>>> misalignment between virtual and physical addresses. So for
>>>>>>> performance
>>>>>>> reasons, we must at least make sure that PMD entries can be used:
>>>>>>> that
>>>>>>> is guaranteed by patch 1 too.
>>>>>>>
>>>>>>> Finally the second patch simply improves best_map_size so that
>>>>>>> whenever
>>>>>>> possible, PUD/PGDIR entries are used.
>>>>>>>
>>>>>>> Below is the kernel page table without this patch on a 6G platform:
>>>>>>>
>>>>>>> ---[ Linear mapping ]---
>>>>>>> 0xffffc00000000000-0xffffc00176e00000 0x0000000080200000 5998M
>>>>>>> PMD     D A . . . W R V
>>>>>>>
>>>>>>> And with this patchset + opensbi patch:
>>>>>>>
>>>>>>> ---[ Linear mapping ]---
>>>>>>> 0xffffc00000000000-0xffffc00140000000 0x0000000080000000
>>>>>>> 5G PUD     D A . . . W R V
>>>>>>> 0xffffc00140000000-0xffffc00177000000 0x00000001c0000000 880M
>>>>>>> PMD     D A . . . W R V
>>>>>>>
>>>>>>> Alexandre Ghiti (2):
>>>>>>>      riscv: Get memory below load_pa while ensuring linear mapping
>>>>>>> is PMD
>>>>>>>        aligned
>>>>>>>      riscv: Use PUD/PGDIR entries for linear mapping when possible
>>>>>>>
>>>>>>>     arch/riscv/include/asm/page.h |  8 ++++
>>>>>>>     arch/riscv/mm/init.c          | 69
>>>>>>> +++++++++++++++++++++++++++++------
>>>>>>>     2 files changed, 65 insertions(+), 12 deletions(-)
>>>>>>>
>>>>>>> --
>>>>>>> 2.20.1
>>>>>>>
>>>>>>>
>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ