linux-kernel - Re: arm64 MTE tag storage reuse

Message-ID: <e0b7c884-4345-44b1-b8c0-2711a28a980e@redhat.com>
Date: Tue, 20 Feb 2024 15:07:22 +0100
From: David Hildenbrand <david@...hat.com>
To: Alexandru Elisei <alexandru.elisei@....com>
Cc: catalin.marinas@....com, will@...nel.org, oliver.upton@...ux.dev,
 maz@...nel.org, james.morse@....com, suzuki.poulose@....com,
 yuzenghui@...wei.com, pcc@...gle.com, steven.price@....com,
 anshuman.khandual@....com, eugenis@...gle.com, kcc@...gle.com,
 hyesoo.yu@...sung.com, rppt@...nel.org, akpm@...ux-foundation.org,
 peterz@...radead.org, konrad.wilk@...cle.com, willy@...radead.org,
 jgross@...e.com, hch@....de, geert@...ux-m68k.org, vitaly.wool@...sulko.com,
 ddstreet@...e.org, sjenning@...hat.com, hughd@...gle.com,
 linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
 linux-arch@...r.kernel.org, linux-mm@...ck.org
Subject: Re: arm64 MTE tag storage reuse - alternatives to MIGRATE_CMA

>>
>> With large folios in place, we'd likely want to investigate not working on
>> individual pages, but on (possibly large) folios instead.
> 
> Yes, that would be interesting. Since the backend has no way of controlling
> what tag storage page will be needed for tags, and subsequently dropped
> from the cache, we would have to figure out what to do if one of the pages
> that is part of a large folio is dropped. The easiest solution that I can
> see is to remove the entire folio from the cleancache, but that would mean
> also dropping the rest of the pages from the folio unnecessarily.

Right, but likely that won't be an issue. Things get interesting when 
thinking about an efficient allocation approach.
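
To make the trade-off concrete, here is a minimal userspace sketch of the
"drop the entire folio" option, not kernel code. It assumes a hypothetical
cache that tracks folios by start PFN and order; struct cached_folio and
drop_folio_containing() are made-up names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one cached folio covering 2^order contiguous pages. */
struct cached_folio {
	unsigned long start_pfn;	/* first page frame number in the folio */
	unsigned int order;		/* folio spans (1 << order) pages */
	bool valid;			/* still present in the cache? */
};

/*
 * Drop the folio that contains @pfn because that page is now needed for
 * tags. Returns the number of pages evicted, or 0 if @pfn was not cached.
 * Every page other than @pfn counted in the return value is dropped
 * "unnecessarily", which is the cost discussed above.
 */
unsigned long drop_folio_containing(struct cached_folio *cache, size_t nr,
				    unsigned long pfn)
{
	assert(cache != NULL || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		struct cached_folio *f = &cache[i];
		unsigned long nr_pages = 1UL << f->order;

		if (!f->valid)
			continue;
		if (pfn >= f->start_pfn && pfn < f->start_pfn + nr_pages) {
			f->valid = false;	/* whole folio leaves the cache */
			return nr_pages;
		}
	}
	return 0;
}
```

With an order-2 folio, claiming a single page for tags evicts four pages,
three of which were not needed for tag storage.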

> 
>>
>>>
>>> I believe this is a very good fit for tag storage reuse, because it allows
>>> tag storage to be allocated even in atomic contexts, which enables MTE in
>>> the kernel. As a bonus, all of the changes to MM from the current approach
>>> wouldn't be needed, as tag storage allocation can be handled entirely in
>>> set_ptes_at(), copy_*highpage() or arch_swap_restore().
>>>
>>> Is this a viable approach that would be upstreamable? Are there other
>>> solutions that I haven't considered? I'm very much open to any alternatives
>>> that would make tag storage reuse viable.
>>
>> As raised recently, I had similar ideas with something like virtio-mem in
>> the past (wanted to call it virtio-tmem back then), but didn't have time to
>> look into it yet.
>>
>> I considered both, using special device memory as "cleancache" backend, and
>> using it as backend storage for something similar to zswap. We would not
>> need a memmap/"struct page" for that special device memory, which reduces
>> memory overhead and makes "adding more memory" a more reliable operation.
> 
> Hm... this might not work with tag storage memory, the kernel needs to
> perform cache maintenance on the memory when it transitions to and from
> storing tags and storing data, so the memory must be mapped by the kernel.

I think the direct map will definitely be required (to copy data in and
out). But a memmap for tag memory will likely not be required. Of course,
it depends on how tag storage is managed. We will likely have to store
some metadata, but hopefully we can avoid the full memmap and use
something smaller instead.
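
As one illustration of "something else", per-page metadata could be as
small as a packed state table. The sketch below is userspace C under
stated assumptions: the three states, the fixed NR_TAG_PAGES, and all
function names are hypothetical, and real tag storage management would
likely need more than two bits per page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical states for a tag storage page. */
enum tag_page_state {
	TAG_PAGE_FREE  = 0,	/* unused */
	TAG_PAGE_CACHE = 1,	/* holds clean cached data */
	TAG_PAGE_TAGS  = 2,	/* holds MTE tags */
};

#define NR_TAG_PAGES 1024u	/* assumed size of the tag storage region */

/* Two bits per page, four pages per byte, instead of a struct page each. */
static uint8_t tag_page_states[NR_TAG_PAGES / 4];

void tag_page_set_state(size_t idx, enum tag_page_state s)
{
	assert(idx < NR_TAG_PAGES);

	unsigned int shift = (unsigned int)(idx % 4) * 2;

	tag_page_states[idx / 4] &= (uint8_t)~(3u << shift);
	tag_page_states[idx / 4] |= (uint8_t)((unsigned int)s << shift);
}

enum tag_page_state tag_page_get_state(size_t idx)
{
	assert(idx < NR_TAG_PAGES);

	unsigned int shift = (unsigned int)(idx % 4) * 2;

	return (enum tag_page_state)((tag_page_states[idx / 4] >> shift) & 3u);
}
```

Here 1024 pages are tracked in 256 bytes, independent of whatever the
full memmap would cost for the same range.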

[...]

>> Similar to virtio-mem, there are ways for the hypervisor to request changes
>> to the memory consumption of a device (setting the requested size). So when
>> requested to consume less, clean pagecache pages can be dropped and the
>> memory can be handed back to the hypervisor.
>>
>> Of course, likely we would want to consider using "slower" memory in the
>> hypervisor to back such a device.
> 
> I'm not sure how useful that will be with tag storage reuse. KVM must
> assume that **all** the memory that the guest uses is tagged and it needs
> tag storage allocated (it's a known architectural limitation), so that will
> leave even less tag storage memory to distribute between the host and the
> guest(s).

Yes, I don't think this applies to tag storage.

> 
> Adding to that, at the moment Android is going to be the major (only?) user
> of tag storage reuse, and as far as I know pKVM is more restrictive with
> regards to the emulated devices and the memory that is shared between
> guests and the host.

Right, what I described here does not have overlap with tag storage 
besides requiring similar (cleancache) hooks.

-- 
Cheers,

David / dhildenb