Message-ID: <b10d52ba-4a8d-43bd-96c1-cde848bec143@redhat.com>
Date: Tue, 20 Feb 2024 17:16:26 +0100
From: David Hildenbrand <david@...hat.com>
To: Alexandru Elisei <alexandru.elisei@....com>
Cc: catalin.marinas@....com, will@...nel.org, oliver.upton@...ux.dev,
maz@...nel.org, james.morse@....com, suzuki.poulose@....com,
yuzenghui@...wei.com, pcc@...gle.com, steven.price@....com,
anshuman.khandual@....com, eugenis@...gle.com, kcc@...gle.com,
hyesoo.yu@...sung.com, rppt@...nel.org, akpm@...ux-foundation.org,
peterz@...radead.org, konrad.wilk@...cle.com, willy@...radead.org,
jgross@...e.com, hch@....de, geert@...ux-m68k.org, vitaly.wool@...sulko.com,
ddstreet@...e.org, sjenning@...hat.com, hughd@...gle.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-arch@...r.kernel.org, linux-mm@...ck.org
Subject: Re: arm64 MTE tag storage reuse - alternatives to MIGRATE_CMA

>>>>> I believe this is a very good fit for tag storage reuse, because it allows
>>>>> tag storage to be allocated even in atomic contexts, which enables MTE in
>>>>> the kernel. As a bonus, all of the changes to MM from the current approach
>>>>> wouldn't be needed, as tag storage allocation can be handled entirely in
>>>>> set_ptes(), copy_*highpage() or arch_swap_restore().
>>>>>
>>>>> Is this a viable approach that would be upstreamable? Are there other
>>>>> solutions that I haven't considered? I'm very much open to any alternatives
>>>>> that would make tag storage reuse viable.
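
(Rough sketch of how I read that, just to make sure we mean the same
thing -- all helper names below are made up, none of this exists:)

/* Illustrative only; tag_storage_reserved()/tag_storage_reserve() are
 * hypothetical, not existing kernel APIs. */
#include <linux/gfp.h>
#include <linux/mm.h>

bool tag_storage_reserved(struct page *page);		/* hypothetical */
int tag_storage_reserve(struct page *page, gfp_t gfp);	/* hypothetical */

/*
 * Would be called from set_ptes()/copy_*highpage()/arch_swap_restore()
 * before a tagged page becomes accessible. Reserving with GFP_ATOMIC is
 * what would make this usable from atomic contexts, enabling MTE in the
 * kernel.
 */
static void ensure_tag_storage(struct page *page)
{
	if (!tag_storage_reserved(page))
		WARN_ON(tag_storage_reserve(page, GFP_ATOMIC));
}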
>>>>
>>>> As raised recently, I had similar ideas with something like virtio-mem in
>>>> the past (wanted to call it virtio-tmem back then), but didn't have time to
>>>> look into it yet.
>>>>
>>>> I considered both, using special device memory as "cleancache" backend, and
>>>> using it as backend storage for something similar to zswap. We would not
>>>> need a memmap/"struct page" for that special device memory, which reduces
>>>> memory overhead and makes "adding more memory" a more reliable operation.
>>>
>>> Hm... this might not work with tag storage memory: the kernel needs to
>>> perform cache maintenance on the memory when it transitions to and from
>>> storing tags and storing data, so the memory must be mapped by the kernel.
>>
>> The direct map will definitely be required, I think (to copy data in/out).
>> But a memmap for tag memory will likely not be required. Of course, it
>> depends on how tag storage is managed. We likely have to store some
>> metadata; hopefully we can avoid the full memmap and just use something
>> else.
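
(As a rough sketch of the "something else" I have in mind -- a simple
bitmap, one bit per tag storage granule, instead of a struct page per
page; all names made up:)

#include <linux/bitmap.h>
#include <linux/spinlock.h>

/* Sketch only: 1 bit of metadata per tag storage granule. */
static unsigned long *tag_granule_bitmap;
static unsigned long nr_tag_granules;
static DEFINE_SPINLOCK(tag_granule_lock);

static long tag_granule_alloc(void)
{
	unsigned long idx;

	spin_lock(&tag_granule_lock);
	idx = find_first_zero_bit(tag_granule_bitmap, nr_tag_granules);
	if (idx < nr_tag_granules)
		__set_bit(idx, tag_granule_bitmap);
	spin_unlock(&tag_granule_lock);

	return idx < nr_tag_granules ? idx : -1;
}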
>
> So I guess instead of ZONE_DEVICE I should try to use arch_add_memory()
> directly? That has the limitation that it cannot be used by a driver
> (symbol not exported to modules).

You can certainly start with something simple, and we can work on
removing that memmap allocation later.

Maybe we have to expose new primitives in the context of such drivers.
arch_add_memory() likely also doesn't do what you need.

I recall that we had a way of only messing with the direct map. The
last time I worked with that was in the context of memtrace
(arch/powerpc/platforms/powernv/memtrace.c). There, we call
arch_create_linear_mapping()/arch_remove_linear_mapping().

... and now my memory comes back: we never finished factoring out
arch_create_linear_mapping()/arch_remove_linear_mapping() so they would
be available on all architectures.
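
(Roughly what memtrace does, simplified from memory -- not the actual
code, function names here are mine:)

#include <linux/memory_hotplug.h>
#include <linux/mm.h>

/* Drop the direct map for a region we carved out for device use ... */
static void example_offline_region(u64 start, u64 size)
{
	arch_remove_linear_mapping(start, size);
}

/* ... and restore it before handing the memory back. */
static int example_online_region(int nid, u64 start, u64 size)
{
	struct mhp_params params = { .pgprot = PAGE_KERNEL };

	return arch_create_linear_mapping(nid, start, size, &params);
}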

Your driver will be very arm64-specific, so doing it in an arm64-special
way might be good enough initially. For example, the arm64 core could
detect that special memory region, statically prepare the direct map,
and neither expose the memory to the buddy allocator nor allocate a
memmap for it. Similar to how we handle the crashkernel/kexec memory,
IIRC (we likely do not have a direct map for that, though).
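
(Conceptually something like this during early boot; the memblock calls
are real, the rest is hand-waving:)

#include <linux/memblock.h>

/*
 * Sketch: the region would be detected via DT during early boot.
 * memblock_reserve() keeps it out of the buddy; marking it nomap keeps
 * it out of the default linear map, at the cost of the arch code (or
 * driver) having to map it explicitly itself, e.g., for cache
 * maintenance.
 */
static void __init reserve_tag_storage(phys_addr_t base, phys_addr_t size)
{
	memblock_reserve(base, size);
	memblock_mark_nomap(base, size);
}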

[I was also wondering if we could simply map/unmap dynamically when
required, so you could avoid creating the entire direct map; that might
not be the best approach performance-wise, though.]
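
(E.g., something like this per access, which is why I doubt the
performance:)

#include <linux/errno.h>
#include <linux/io.h>
#include <linux/string.h>

/* Sketch: map a tag storage granule only while we actually touch it. */
static int copy_tags_out(phys_addr_t tag_phys, void *dst, size_t size)
{
	void *tags = memremap(tag_phys, size, MEMREMAP_WB);

	if (!tags)
		return -ENOMEM;
	memcpy(dst, tags, size);
	memunmap(tags);
	return 0;
}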

There are a bunch of details to be sorted out, but I don't consider the
direct map/memmap side of things a big problem.
--
Cheers,
David / dhildenb