lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZTZP66CA1r35yTmp@arm.com>
Date:   Mon, 23 Oct 2023 11:50:19 +0100
From:   Catalin Marinas <catalin.marinas@....com>
To:     Hyesoo Yu <hyesoo.yu@...sung.com>
Cc:     Alexandru Elisei <alexandru.elisei@....com>, will@...nel.org,
        oliver.upton@...ux.dev, maz@...nel.org, james.morse@....com,
        suzuki.poulose@....com, yuzenghui@...wei.com, arnd@...db.de,
        akpm@...ux-foundation.org, mingo@...hat.com, peterz@...radead.org,
        juri.lelli@...hat.com, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
        mgorman@...e.de, bristot@...hat.com, vschneid@...hat.com,
        mhiramat@...nel.org, rppt@...nel.org, hughd@...gle.com,
        pcc@...gle.com, steven.price@....com, anshuman.khandual@....com,
        vincenzo.frascino@....com, david@...hat.com, eugenis@...gle.com,
        kcc@...gle.com, linux-arm-kernel@...ts.infradead.org,
        linux-kernel@...r.kernel.org, kvmarm@...ts.linux.dev,
        linux-fsdevel@...r.kernel.org, linux-arch@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [PATCH RFC 06/37] mm: page_alloc: Allocate from movable pcp
 lists only if ALLOC_FROM_METADATA

On Mon, Oct 23, 2023 at 04:16:56PM +0900, Hyesoo Yu wrote:
> On Tue, Oct 17, 2023 at 11:26:36AM +0100, Catalin Marinas wrote:
> > BTW, I suggest for the next iteration we drop MIGRATE_METADATA, only use
> > CMA and assume that the tag storage itself supports tagging. Hopefully
> > it makes the patches a bit simpler.
> 
> I am curious about the plan for the next iteration.

Alex is working on it.

> Does tag storage itself supports tagging? Will the following version be unusable
> if the hardware does not support it? The document of google said that 
> "If this memory is itself mapped as Tagged Normal (which should not happen!)
> then tag updates on it either raise a fault or do nothing, but never change the
> contents of any other page."
> (https://github.com/google/sanitizers/blob/master/mte-dynamic-carveout/spec.md)
> 
> The support of H/W is very welcome because it is good to make the patches simpler.
> But if H/W doesn't support it, Can't the new solution be used?

AFAIK on the current interconnects this is supported but the offsets
will need to be configured by firmware in such a way that a tag access
to the tag carve-out range still points to physical RAM, otherwise, as
per Google's doc, you can get some unexpected behaviour.

Let's take a simplified example, we have:

  phys_addr - physical address, linearised, starting from 0
  ram_size - the size of RAM (also corresponds to the end of PA+1)

A typical configuration is to designate the top 1/32 of RAM for tags:

  tag_offset = ram_size - ram_size / 32
  tag_carveout_start = tag_offset

The tag address for a given phys_addr is calculated as:

  tag_addr = phys_addr / 32 + tag_offset

To keep things simple, we reserve the top 1/(32*32) of the RAM as tag
storage for the main/reusable tag carveout.

  tag_carveout2_start = tag_carveout_start / 32 + tag_offset

This gives us the end of the first reusable carveout:

  tag_carveout_end = tag_carveout2_start - 1

and this way in Linux we can have (inclusive ranges):

  0..(tag_carveout_start-1): normal memory, data only
  tag_carveout_start..tag_carveout_end: CMA, reused as tags or data
  tag_carveout2_start..(ram_size-1): reserved for tags (not touched by the OS)

For this to work, we need the last page in the first carveout to have
a tag storage within RAM. And, of course, all of the above need to be at
least 4K aligned.

The simple configuration of 1/(32*32) of RAM for the second carveout is
sufficient but not fully utilised. We could be a bit more efficient to
gain a few more pages. Apart from the page alignment requirements, the
only strict requirement we need is:

  tag_carverout2_end < ram_size

where tag_carveout2_end is the tag storage corresponding to the end of
the main/reusable carveout, just before tag_carveout2_start:

  tag_carveout2_end = tag_carveout_end / 32 + tag_offset

Assuming that my on-paper substitutions are correct, the inequality
above becomes:

  tag_offset < (1024 * ram_size + 32) / 1057

and tag_offset is a multiple of PAGE_SIZE * 32 (so that the
tag_carveout2_start is a multiple of PAGE_SIZE).

As a concrete example, for 16GB of RAM starting from 0:

  tag_offset = 0x3e0060000
  tag_carverout2_start = 0x3ff063000

Without the optimal placement, the default tag_offset of top 1/32 of RAM
would have been:

  tag_offset = 0x3e0000000
  tag_carveou2_start = 0x3ff000000

so an extra 396KB gained with optimal placement (out of 16G, not sure
it's worth).

One can put the calculations in some python script to get the optimal
tag offset in case I got something wrong on paper.

-- 
Catalin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ