lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 17 May 2023 12:27:30 +0100
From:   Conor Dooley <conor.dooley@...rochip.com>
To:     Alexandre Ghiti <alexghiti@...osinc.com>
CC:     Song Shuai <suagrfillet@...il.com>, <robh@...nel.org>,
        Andrew Jones <ajones@...tanamicro.com>, <anup@...infault.org>,
        <palmer@...osinc.com>, <jeeheng.sia@...rfivetech.com>,
        <leyfoon.tan@...rfivetech.com>, <mason.huo@...rfivetech.com>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Guo Ren <guoren@...nel.org>, <linux-riscv@...ts.infradead.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: Bug report: kernel paniced when system hibernates

Hey Alex,

On Wed, May 17, 2023 at 10:58:02AM +0200, Alexandre Ghiti wrote:
> On Tue, May 16, 2023 at 1:12 PM Alexandre Ghiti <alexghiti@...osinc.com> wrote:

> > On Tue, May 16, 2023 at 11:24 AM Song Shuai <suagrfillet@...il.com> wrote:
> > I actually removed this flag a few years ago, and I have to admit that
> > I need to check if that's necessary: the goal of commit 3335068f8721
> > ("riscv: Use PUD/P4D/PGD pages for the linear mapping") is to expose
> > the "right" start of DRAM so that we can align virtual and physical
> > addresses on a 1GB boundary.
> >
> > So I have to check if a nomap region is actually added as a
> > memblock.memory.regions[] or not: if yes, that's perfect, let's add
> > the nomap attributes to the PMP regions, otherwise, I don't think that
> > is a good solution.
> 
> So here is the current linear mapping without nomap in openSBI:
> 
> ---[ Linear mapping ]---
> 0xff60000000000000-0xff60000000200000    0x0000000080000000         2M
> PMD     D A G . . W R V
> 0xff60000000200000-0xff60000000e00000    0x0000000080200000        12M
> PMD     D A G . . . R V
> 
> And below the linear mapping with nomap in openSBI:
> 
> ---[ Linear mapping ]---
> 0xff60000000080000-0xff60000000200000    0x0000000080080000      1536K
> PTE     D A G . . W R V
> 0xff60000000200000-0xff60000000e00000    0x0000000080200000        12M
> PMD     D A G . . . R V
> 
> So adding nomap does not misalign virtual and physical addresses, it
> prevents the usage of 1GB page for this area though, so that's a
> solution, we just lose this 1GB page here.
> 
> But even though that may be the fix, I think we also need to fix that
> in the kernel as it would break compatibility with certain versions of
> openSBI *if* we fix openSBI...So here are a few solutions:
> 
> 1. we can mark all "mmode_resv" nodes in the device tree as nomap,
> before the linear mapping is established (IIUC, those nodes are added
> by openSBI to advertise PMP regions)
>     -> This amounts to the same fix as opensbi and we lose the 1GB hugepage.

AFAIU, losing the 1 GB hugepage is a regression, which would make this
not an option, right?

> 2. we can tweak pfn_is_nosave function to *not* save pfn corresponding
> to PMP regions
>     -> We don't lose the 1GB hugepage \o/
> 3. we can use register_nosave_region() to not save the "mmode_resv"
> regions (x86 does that
> https://elixir.bootlin.com/linux/v6.4-rc1/source/arch/x86/kernel/e820.c#L753)
>     -> We don't lose the 1GB hugepage \o/
> 4. Given JeeHeng pointer to
> https://elixir.bootlin.com/linux/v6.4-rc1/source/kernel/power/snapshot.c#L1340,
> we can mark those pages as non-readable and make the hibernation
> process not save those pages
>     -> Very late-in-the-day idea, not sure what it's worth, we also
> lose the 1GB hugepage...

Ditto here re: introducing another regression.

> To me, the best solution is 3 as it would prepare for other similar
> issues later, it is similar to x86 and it allows us to keep 1GB
> hugepages.
> 
> I have been thinking, and to me nomap does not provide anything since
> the kernel should not address this memory range, so if it does, we
> must fix the kernel.
> 
> Let me know what you all think, I'll be preparing a PoC of 3 in the meantime!

#3 would probably get my vote too. It seems like you could use it
dynamically if there was to be a future other provider of "mmode_resv"
regions, rather than doing something location-specific.

We should probably document these opensbi reserved memory nodes though
in a dt-binding or w/e if we are going to be relying on them to not
crash!

Thanks for working on this,
Conor.


Download attachment "signature.asc" of type "application/pgp-signature" (229 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ