Message-ID: <854511e0-d00c-477b-813a-3785baea846a@redhat.com>
Date: Thu, 25 Apr 2024 17:42:12 +0200
From: David Hildenbrand <david@...hat.com>
To: Guillaume Morin <guillaume@...infr.org>
Cc: oleg@...hat.com, linux-kernel@...r.kernel.org,
linux-trace-kernel@...r.kernel.org, muchun.song@...ux.dev
Subject: Re: [RFC][PATCH] uprobe: support for private hugetlb mappings

On 25.04.24 17:19, Guillaume Morin wrote:
> On 24 Apr 23:00, David Hildenbrand wrote:
>>> One issue here is that FOLL_FORCE|FOLL_WRITE is not implemented for
>>> hugetlb mappings. However this was also on my TODO and I have a draft
>>> patch that implements it.
>>
>> Yes, I documented it back then and added sanity checks in GUP code to fence
>> it off. Shouldn't be too hard to implement (famous last words) and would be
>> the cleaner thing to use here once I manage to switch over to
>> FOLL_WRITE|FOLL_FORCE to break COW.
>
> Yes, my patch seems to be working. The hugetlb code is pretty simple.
> And it allows ptrace and the proc pid mem file to work on the executable
> private hugetlb mappings.
>
> There is one thing I am unclear about though. hugetlb enforces that
> huge_pte_write() is true on FOLL_WRITE in both the fault and
> follow_page_mask paths. I am not sure if we can simply assume in the
> hugetlb code that if the pte is not writable and this is a write fault
> then we're in the FOLL_FORCE|FOLL_WRITE case. Or do we want to keep the
> checks but simply not enforce them for FOLL_FORCE|FOLL_WRITE?
>
> The latter is more complicated in the fault path because there is no
> FAULT_FLAG_FORCE flag.
>
handle_mm_fault()->sanitize_fault_flags() makes sure that we'll only
proceed with a write fault if either

* we have VM_WRITE set, or
* we are in a COW mapping (MAP_PRIVATE with at least VM_MAYWRITE).

Once you see FAULT_FLAG_WRITE and you do have VM_WRITE, you don't care
about FOLL_FORCE: it's simply a write fault.

Once you see FAULT_FLAG_WRITE and you *don't* have VM_WRITE, you must
have VM_MAYWRITE and are essentially in the FOLL_FORCE case.

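To make the two cases above concrete, here is a minimal user-space sketch of the decision. It is not kernel code: the SK_* flag values and the sk_* names are made up for illustration, and the COW test is only my simplified reading of "MAP_PRIVATE with at least VM_MAYWRITE".

```c
#include <stdbool.h>

/* Illustrative flag bits; the values are invented for this sketch. */
#define SK_VM_WRITE     0x1u
#define SK_VM_MAYWRITE  0x2u
#define SK_VM_SHARED    0x4u

enum sk_write_fault {
	SK_FAULT_REJECTED,	/* the sanitize step refuses the write fault */
	SK_FAULT_ORDINARY,	/* VM_WRITE is set: simply a write fault */
	SK_FAULT_FORCED,	/* no VM_WRITE, COW mapping: FOLL_FORCE-like */
};

/* COW mapping: private (not shared) with at least VM_MAYWRITE. */
static bool sk_is_cow_mapping(unsigned int vm_flags)
{
	return (vm_flags & (SK_VM_SHARED | SK_VM_MAYWRITE)) == SK_VM_MAYWRITE;
}

/* Classify a fault that carries FAULT_FLAG_WRITE by the VMA flags alone. */
static enum sk_write_fault sk_classify_write_fault(unsigned int vm_flags)
{
	if (vm_flags & SK_VM_WRITE)
		return SK_FAULT_ORDINARY;
	if (sk_is_cow_mapping(vm_flags))
		return SK_FAULT_FORCED;
	return SK_FAULT_REJECTED;
}
```

The point of the sketch is that no separate FAULT_FLAG_FORCE is needed: a write fault on a VMA without VM_WRITE can only be the forced case, so the VMA flags already tell the two apart.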
In a VMA without VM_WRITE, you must never map a PTE writable. In
ordinary COW code, that's handled in wp_page_copy(), where we *always* use
maybe_mkwrite() to do exactly what a write fault would do, but without
mapping the PTE writable.

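As a rough illustration of that rule, the following user-space model mimics the maybe_mkwrite() pattern. The struct sk_pte type, the flag value, and the sk_* helpers are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdbool.h>

#define SK_VM_WRITE 0x1u	/* invented value, for illustration only */

/* Toy page table entry: just the two bits this sketch cares about. */
struct sk_pte {
	bool dirty;
	bool write;
};

/* Make the entry writable only if the VMA itself allows writes. */
static struct sk_pte sk_maybe_mkwrite(struct sk_pte pte, unsigned int vm_flags)
{
	if (vm_flags & SK_VM_WRITE)
		pte.write = true;
	return pte;
}

/*
 * Entry installed after breaking COW: always marked dirty, but writable
 * only when the VMA has VM_WRITE. In the forced case the copy is mapped
 * without write permission.
 */
static struct sk_pte sk_install_cow_copy(unsigned int vm_flags)
{
	struct sk_pte pte = { .dirty = true, .write = false };

	return sk_maybe_mkwrite(pte, vm_flags);
}
```

With this shape, the forced write and the ordinary write fault share one code path, and the only difference is whether the resulting entry ends up writable.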
That's what the whole can_follow_write_pmd()/can_follow_write_pte() logic
is about: writing to PTEs that are not writable.

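A simplified sketch of such a check, loosely modeled on my understanding of the GUP helpers, could look like the code below. The names and flag values are hypothetical, and the sketch deliberately leaves out further conditions a real check would need (for example soft-dirty and userfaultfd write-protect handling).

```c
#include <stdbool.h>

/* Illustrative flag bits; the values are invented for this sketch. */
#define SK_VM_WRITE     0x1u
#define SK_VM_MAYWRITE  0x2u
#define SK_VM_SHARED    0x4u
#define SK_FOLL_FORCE   0x1u

/*
 * May a GUP-style write proceed through this entry without taking
 * another fault? "anon_exclusive" stands for "COW was already broken
 * and the page belongs exclusively to this process".
 */
static bool sk_can_follow_write(bool pte_write, bool anon_exclusive,
				unsigned int vm_flags, unsigned int foll_flags)
{
	/* A writable entry can always be written through. */
	if (pte_write)
		return true;
	/* Without FOLL_FORCE, a non-writable entry needs a write fault. */
	if (!(foll_flags & SK_FOLL_FORCE))
		return false;
	/* FOLL_FORCE has no effect on shared mappings ... */
	if (vm_flags & SK_VM_SHARED)
		return false;
	/* ... nor on private mappings that can never become writable ... */
	if (!(vm_flags & SK_VM_MAYWRITE))
		return false;
	/* ... and a VM_WRITE VMA should just take an ordinary write fault. */
	if (vm_flags & SK_VM_WRITE)
		return false;
	/* Forced case: only fine once COW has been broken. */
	return anon_exclusive;
}
```

The first forced write to a read-only entry fails this check, which triggers a write fault that breaks COW without making the entry writable; the retry then passes because the page is exclusive.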
You'll have to follow the exact same model in hugetlb
(can_follow_write_pmd(), hugetlb_maybe_mkwrite(), ...).

--
Cheers,
David / dhildenb