Message-ID: <f57d73e3-fb6c-4c01-9897-c9686889fec2@redhat.com>
Date: Thu, 23 Oct 2025 13:49:22 +0200
From: David Hildenbrand <david@...hat.com>
To: Kiryl Shutsemau <kirill@...temov.name>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/filemap: Implement fast short reads
On 23.10.25 13:40, Kiryl Shutsemau wrote:
> On Thu, Oct 23, 2025 at 01:11:43PM +0200, David Hildenbrand wrote:
>> On 23.10.25 13:10, David Hildenbrand wrote:
>>> On 23.10.25 12:54, David Hildenbrand wrote:
>>>> On 23.10.25 12:31, Kiryl Shutsemau wrote:
>>>>> On Wed, Oct 22, 2025 at 07:28:27PM +0200, David Hildenbrand wrote:
>>>>>> "garbage" as in pointing at something without a direct map, something that's
>>>>>> protected differently (MTE? weird CoCo protection?) or even worse MMIO with
>>>>>> undesired read-effects.
>>>>>
>>>>> Pedro already points to the problem with missing direct mapping.
>>>>> _nofault() copy should help with this.
>>>>
>>>> Yeah, we do something similar when reading the kcore for that reason.
>>>>
>>>>>
>>>>> Can direct mapping ever be converted to MMIO? It can be converted to DMA
>>>>> buffer (which is fine), but MMIO? I have not seen it even in virtualized
>>>>> environments.
>>>>
>>>> I recall discussions in the context of PAT and the adjustment of caching
>>>> attributes of the direct map for MMIO purposes: so I suspect there are
>>>> ways that can happen, but I am not 100% sure.
>>>>
>>>>
>>>> Thinking about it, in VMs we have the direct map set on balloon inflated
>>>> pages that should not be touched, not even read, otherwise your
>>>> hypervisor might get very angry. That case we could likely handle by
>>>> checking whether the source page actually exists and doesn't have
>>>> PageOffline() set, before accessing it. A bit nasty.
>>>>
>>>> A more obscure case would probably be reading a page that was poisoned
>>>> by hardware and is not expected to be used anymore. Could also be
>>>> checked by checking the page.
>>>>
>>>> Essentially all cases where we try to avoid reading ordinary memory
>>>> already when creating memory dumps that might have a direct map.
>>>>
>>>>
>>>> Regarding MTE and load_unaligned_zeropad(): I don't know unfortunately.
>>>
>>> Looking into this, I'd assume the exception handler will take care of it.
>>>
>>> load_unaligned_zeropad() is interesting if there is a direct map but the
>>> memory should not be touched (especially regarding PageOffline and
>>> memory errors).
>>>
>>> I read drivers/firmware/efi/unaccepted_memory.c where there is a
>>> lengthy discussion about guard pages and how that works for unaccepted
>>> memory.
>>>
>>> While it works for unaccepted memory, it wouldn't work for other random
>>
>> Sorry I meant here "while that works for load_unaligned_zeropad()".
>
> Do we have other random reads?
>
> For unaccepted memory, we care about touching memory that was never
> allocated because accepting memory is one way road.
Right, but I suspect if you get a random read (as the unaccepted memory
doc states) you'd be in trouble as well.

The "nice" thing about unaccepted memory is that it's a one way road
indeed, and at some point the system will not have unaccepted memory
anymore.
>
> I only know about load_unaligned_zeropad() that does reads like this. Do
> you know others?
No, I am not aware of others. Most code that could read random memory
(kcore, vmcore) was fixed to exclude pages we know are unsafe to touch.
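To illustrate what "exclude pages we know are unsafe to touch" means, here is a minimal userspace C model of a check-before-read helper. It is only a sketch under stated assumptions: everything named model_* is hypothetical and merely stands in for kernel state. The two flags mimic what PageOffline() and PageHWPoison() report, and "present" mimics whether a struct page exists for the address at all. It is not the actual kcore or vmcore code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for kernel page state; not the real struct page. */
enum model_page_flags {
	MODEL_PG_OFFLINE  = 1u << 0, /* models PageOffline(), e.g. balloon-inflated */
	MODEL_PG_HWPOISON = 1u << 1, /* models PageHWPoison(), hardware memory error */
};

struct model_page {
	bool present;            /* models "the source page actually exists" */
	unsigned int flags;
	unsigned char data[64];  /* models the page content */
};

/* Decide from metadata alone whether the content may be read. */
static bool model_page_safe_to_read(const struct model_page *page)
{
	if (!page || !page->present)
		return false;
	if (page->flags & (MODEL_PG_OFFLINE | MODEL_PG_HWPOISON))
		return false;
	return true;
}

/*
 * Copy up to len bytes of page content into dst.
 * Returns the number of bytes copied, or 0 if the page must not be touched.
 * The content is never accessed unless the metadata check passed.
 */
static size_t model_read_page(const struct model_page *page, void *dst,
			      size_t len)
{
	assert(dst != NULL);

	if (!model_page_safe_to_read(page))
		return 0;
	if (len > sizeof(page->data))
		len = sizeof(page->data);
	memcpy(dst, page->data, len);
	return len;
}
```

The point of the split is that the decision is made purely on page metadata, so offline or poisoned content is never dereferenced. A real reader would additionally need to cope with the state changing between the check and the copy, which this model ignores.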
Code that might speculatively access the "struct page" after it might
already have been freed (speculative pagecache lookups, GUP-fast) will
just back off and never read page content.
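The back-off pattern can be sketched in userspace C11 as follows. This is a simplified model, not the kernel implementation: model_folio, model_slot, model_try_get() and model_lookup() are hypothetical names, and the refcount-unless-zero step only approximates what the kernel's speculative reference helpers do.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a refcounted page/folio. */
struct model_folio {
	atomic_int refcount; /* 0 means "already freed" in this model */
};

/* Hypothetical model of a lookup slot (e.g. a page cache entry). */
struct model_slot {
	_Atomic(struct model_folio *) folio;
};

/* Take a reference only if the refcount is not already zero. */
static bool model_try_get(struct model_folio *f)
{
	int old = atomic_load(&f->refcount);

	while (old != 0) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Speculative lookup: only metadata (the refcount and the slot) is
 * inspected. On any doubt we back off and return NULL without ever
 * reading the content the folio describes.
 */
static struct model_folio *model_lookup(struct model_slot *slot)
{
	struct model_folio *f;

	assert(slot != NULL);

	f = atomic_load(&slot->folio);
	if (!f)
		return NULL;
	if (!model_try_get(f))
		return NULL; /* freed under us: back off */
	if (atomic_load(&slot->folio) != f) {
		/* Slot changed while we took the reference: drop it, back off. */
		atomic_fetch_sub(&f->refcount, 1);
		return NULL;
	}
	return f; /* stable reference; content may now be accessed */
}
```

Only after both checks pass does the caller hold a reference that makes touching the content legitimate. Every failure path returns without reading any data.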
We avoid such random memory reads as best we can, as they are just a pain to
deal with (like load_unaligned_zeropad(), which I just wish we could get rid
of, now that it's present again in my memory :( ).
--
Cheers
David / dhildenb