Date:   Wed, 18 Oct 2023 10:42:10 +0100
From:   Mark Rutland <mark.rutland@....com>
To:     Gavin Shan <gshan@...hat.com>
Cc:     linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        catalin.marinas@....com, will@...nel.org, ryan.roberts@....com,
        anshuman.khandual@....com, shan.gavin@...il.com
Subject: Re: [PATCH] arm64: mm: Validate CONFIG_PGTABLE_LEVELS conditionally

On Wed, Oct 18, 2023 at 04:33:09PM +1000, Gavin Shan wrote:
> Hi Mark,
> 
> On 10/17/23 21:05, Mark Rutland wrote:
> > On Tue, Oct 17, 2023 at 10:53:00AM +1000, Gavin Shan wrote:
> > > The fixmap virtual address space is allowed to span multiple PMD
> > > entries, but it isn't allowed to span multiple PUD entries. However,
> > > PMD entries are folded into the PUD and PGD entries in the following
> > > configuration. In this particular case, the validation on
> > > NR_BM_PMD_TABLES should be skipped.
> > > 
> > >    CONFIG_ARM64_PAGE_SHIFT = 14
> > >    CONFIG_ARM64_VA_BITS_36 = y
> > >    CONFIG_PGTABLE_LEVELS   = 2
> > 
> > Is this something you found by inspection, or are you hitting a real issue on a
> > particular config?
> > 
> > I built a kernel with:
> > 
> >    defconfig + CONFIG_ARM64_16K_PAGES=y + CONFIG_ARM64_VA_BITS_36=y
> > 
> > ... which gives the CONFIG_* configuration you list above, and that works just
> > fine.
> > 
> > For 2-level 16K pages we'd need to reserve more than 32M of fixmap slots for
> > the assertion to fire, and we only reserve ~6M of slots in total today, so I
> > can't see how this would be a problem unless you have 26M+ of local additions
> > to the fixmap?
> > 
> > Regardless of that, I don't think it's right to elide the check entirely.
> > 
> 
> It was found by code inspection. When CONFIG_PGTABLE_LEVELS == 2, PGD/PUD/PMD
> are equivalent, so the following two macro definitions are interchangeable.
> The static_assert() below then enforces that the fixmap virtual space can't
> span multiple PMD entries, meaning the space is limited to 32MB with the above
> configuration.
> 
>   #define NR_BM_PMD_TABLES \
>           SPAN_NR_ENTRIES(FIXADDR_TOT_START, FIXADDR_TOP, PUD_SHIFT)
>   #define NR_BM_PMD_TABLES \
>           SPAN_NR_ENTRIES(FIXADDR_TOT_START, FIXADDR_TOP, PMD_SHIFT)
> 
>   static_assert(NR_BM_PMD_TABLES == 1);
> 
> However, multiple PTE tables are allowed. That means the fixmap virtual space
> can span multiple PMD entries, which seems to contradict the above enforcement
> at the code level. Hopefully I've understood everything correctly.
> 
>   #define NR_BM_PTE_TABLES \
>           SPAN_NR_ENTRIES(FIXADDR_TOT_START, FIXADDR_TOP, PMD_SHIFT)
>   static pte_t bm_pte[NR_BM_PTE_TABLES][PTRS_PER_PTE] __page_aligned_bss;
> 

The intent is that the fixmap can span multiple PTE tables, but has to fall
within a single PMD table (and within a single PGD entry). See the next couple
of lines where we only allocate one PMD table and one PUD table:

    static pte_t bm_pte[NR_BM_PTE_TABLES][PTRS_PER_PTE] __page_aligned_bss;
    static pmd_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss __maybe_unused;
    static pud_t bm_pud[PTRS_PER_PUD] __page_aligned_bss __maybe_unused;

The NR_BM_PMD_TABLES definition is only there for the static_assert().
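
As an aside, the table-count calculation is just shift arithmetic on the two
boundary addresses. A rough, self-contained user-space illustration of the
same idea (the start address and the ~6M size below are made up, and this is
only meant to mirror what SPAN_NR_ENTRIES() computes, not to be the kernel
code itself):

    #include <stdio.h>

    /*
     * Number of SHIFT-sized entries a [start, end) range touches; roughly
     * the count behind the NR_BM_*_TABLES definitions.
     */
    static unsigned long span_nr_entries(unsigned long start, unsigned long end,
                                         unsigned int shift)
    {
            return ((end - 1) >> shift) - (start >> shift) + 1;
    }

    int main(void)
    {
            /* Hypothetical ~6M fixmap window; the real addresses differ. */
            unsigned long start = 0xfffffbfffe800000UL;
            unsigned long end   = start + (6UL << 20);

            /* 4K pages: PMD_SHIFT = 21 (2M), PUD_SHIFT = 30 (1G). */
            printf("4K:  %lu PTE tables, %lu PMD tables\n",
                   span_nr_entries(start, end, 21),
                   span_nr_entries(start, end, 30));

            /* 16K pages, 2 levels: PMD/PUD/PGD folded, all shifts are 25 (32M). */
            printf("16K: %lu PTE tables, %lu PMD tables\n",
                   span_nr_entries(start, end, 25),
                   span_nr_entries(start, end, 25));

            return 0;
    }

Multiple PTE tables are fine because bm_pte is an array of them, but the
PMD-table count has to stay at 1, which is what the static_assert() enforces;
with 2-level 16K pages that single folded check also guarantees the range sits
within one PGD entry.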

> You're correct that the following addition is needed to trigger the assert.
> The point is to make the fixmap virtual space larger than 32MB.

It's not intended to be more than 32M.

If we wanted to support 32M and larger, we'd need to rework the rest of the
code, allocating more intermediate tables and manipulating multiple PGD
entries. As we have no need for that, it's simpler to leave it as-is, with the
static_assert() there to catch if/when someone tries to expand it beyond what
is supported.
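
For reference, the 32M figure falls straight out of the 16K page-table
geometry (taking the usual 8-byte table entries):

    entries per PTE table  = 16KiB / 8 bytes      = 2048
    one (folded) PMD entry = 2048 entries * 16KiB = 32MiB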

> 
>   enum fixed_addresses {
>         FIX_HOLE,
>          :
>         FIX_PTE,
>         FIX_PMD,
>         FIX_PUD,
>         FIX_PGD,
>         FIX_DUMMY = FIX_PGD + 2048,
> 
>         __end_of_fixed_addresses
>   };
> 
> 
> > The point of the check is to make sure that the fixmap VA range doesn't span
> > across multiple PMD/PUD/P4D/PGD entries, as the early_fixmap_init() and
> > fixmap_copy() code don't handle that in general. When using 2-level 16K pages,
> > we still want to ensure the fixmap is contained within a single PGD entry,
> > and checking that it falls within a single (folded) PMD entry ensures that.
> > 
> > See the message for commit:
> > 
> >    414c109bdf496195 ("arm64: mm: always map fixmap at page granularity")
> > 
> > ... and the bits that were deleted from early_fixmap_init().
> > 
> > AFAICT this is fine as-is.
> > 
> 
> As far as I can see, multiple PMD entries are handled fine in early_fixmap_init().
> However, multiple PMD entries aren't handled in fixmap_copy(), as you said.
> 
>   early_fixmap_init
>     early_fixmap_init_pud
>       early_fixmap_init_pmd       // multiple PMD entries handled in the loop

If you remove the restriction to a single PMD entry, you also permit multiple
PUD/P4D/PGD entries, and the early_fixmap_init() code cannot handle that.
Consider how early_fixmap_init_pud() and early_fixmap_init_pmd() use bm_pud and
bm_pmd respectively.
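
To make that concrete: bm_pmd and bm_pud are each a single table, so
early_fixmap_init_pmd() and early_fixmap_init_pud() can only ever populate one
PUD entry and one P4D/PGD entry from them. Purely as a hypothetical sketch of
what lifting the restriction would imply (none of this exists today, and
NR_BM_PUD_TABLES is an invented name), the backing storage would have to grow
along the lines of:

    /* hypothetical -- mirrors how bm_pte is already sized per PMD entry */
    static pmd_t bm_pmd[NR_BM_PMD_TABLES][PTRS_PER_PMD] __page_aligned_bss;
    static pud_t bm_pud[NR_BM_PUD_TABLES][PTRS_PER_PUD] __page_aligned_bss;

... with NR_BM_PUD_TABLES being a new SPAN_NR_ENTRIES() count at PGDIR_SHIFT,
plus matching loops in early_fixmap_init_*() and fixmap_copy().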

As above, this code doesn't need to change:

* It works today; there is no configuration where the static_assert() fires
  spuriously.

* If the static_assert() were to fire, we'd need to alter some portion of the
  code to handle that case (e.g. expanding bm_pmd and/or bm_pud, altering
  fixmap_copy()).

* It's simpler and better to have the assertion today rather than making the
  code handle the currently-impossible cases. That avoids wasting memory on
  unusable tables, and avoids having code which is never tested.

Mark.
