[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c9dc3f22-4a72-9b9d-7a74-ad77fe4f3b6e@redhat.com>
Date: Thu, 1 Sep 2022 18:46:13 +0200
From: David Hildenbrand <david@...hat.com>
To: Peter Xu <peterx@...hat.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Sasha Levin <sasha.levin@...cle.com>,
"Aneesh Kumar K . V" <aneesh.kumar@...ux.vnet.ibm.com>,
Vlastimil Babka <vbabka@...e.cz>,
Jerome Marchand <jmarchan@...hat.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Hugh Dickins <hughd@...gle.com>,
Jason Gunthorpe <jgg@...dia.com>,
John Hubbard <jhubbard@...dia.com>,
Yang Shi <shy828301@...il.com>
Subject: Re: [PATCH v1] mm/gup: adjust stale comment for RCU GUP-fast
On 01.09.22 18:40, Peter Xu wrote:
> On Thu, Sep 01, 2022 at 06:34:41PM +0200, David Hildenbrand wrote:
>> On 01.09.22 18:28, Peter Xu wrote:
>>> On Thu, Sep 01, 2022 at 09:21:19AM +0200, David Hildenbrand wrote:
>>>> commit 4b471e8898c3 ("mm, thp: remove infrastructure for handling splitting
>>>> PMDs") didn't remove all details about the THP split requirements for
>>>> RCU GUP-fast.
>>>>
>>>> IPI broeadcasts on THP split are no longer required.
>>>>
>>>> Cc: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
>>>> Cc: Sasha Levin <sasha.levin@...cle.com>
>>>> Cc: Aneesh Kumar K.V <aneesh.kumar@...ux.vnet.ibm.com>
>>>> Cc: Vlastimil Babka <vbabka@...e.cz>
>>>> Cc: Jerome Marchand <jmarchan@...hat.com>
>>>> Cc: Andrea Arcangeli <aarcange@...hat.com>
>>>> Cc: Hugh Dickins <hughd@...gle.com>
>>>> Cc: Jason Gunthorpe <jgg@...dia.com>
>>>> Cc: John Hubbard <jhubbard@...dia.com>
>>>> Cc: Peter Xu <peterx@...hat.com>
>>>> Cc: Yang Shi <shy828301@...il.com>
>>>> Signed-off-by: David Hildenbrand <david@...hat.com>
>>>> ---
>>>> mm/gup.c | 5 ++---
>>>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/mm/gup.c b/mm/gup.c
>>>> index 5abdaf487460..cfe71f422787 100644
>>>> --- a/mm/gup.c
>>>> +++ b/mm/gup.c
>>>> @@ -2309,9 +2309,8 @@ EXPORT_SYMBOL(get_user_pages_unlocked);
>>>> *
>>>> * Another way to achieve this is to batch up page table containing pages
>>>> * belonging to more than one mm_user, then rcu_sched a callback to free those
>>>> - * pages. Disabling interrupts will allow the fast_gup walker to both block
>>>> - * the rcu_sched callback, and an IPI that we broadcast for splitting THPs
>>>> - * (which is a relatively rare event). The code below adopts this strategy.
>>>> + * pages. Disabling interrupts will allow the fast_gup walker to block the
>>>> + * rcu_sched callback.
>>>
>>> This is the comment for fast-gup in general but not only for thp split.
>>
>> "an IPI that we broadcast for splitting THP" is about splitting THP.
>
> Ah OK. Shall we still keep some "IPI broadcast" information here if we're
> modifying it? Otherwise it gives a feeling that none needs the IPIs.
I guess that's the end goal -- and we forgot about the PMD collapse case.
Are we aware of any other case that needs an IPI? I'd rather avoid
documenting something that's no longer true.
>
> It can be dropped later if you want to rework the thp collapse side and
> finally remove IPI dependency on fast-gup, but so far it seems to me it's
> still needed. Or just drop this patch until that rework happens?
The doc as is is obviously stale, why drop this patch?
We should see a fix for the THP collapse issue very soon I guess. Most
probably this patch will go upstream after that fix.
>
>>
>>>
>>> I can understand that we don't need IPI for thp split, but isn't the IPIs
>>> still needed for thp collapse (aka pmdp_collapse_flush)?
>>
>> That was, unfortunately, never documented -- and as discussed in the
>> other thread, arm64 doesn't do that IPI before collapse and might need
>> fixing. We'll most probably end up getting rid of that
>> (undocumented/forgotten) IPI requirement and fix it in GUP-fast by
>> re-rechecking if the PMD changed.
>
> Yeah from an initial thought that looks valid to me. It'll also allow
> pmdp_collapse_flush() to be dropped too, am I right?
I think the magic about pmdp_collapse_flush() is not only the IPIs, but
that we don't perform an ordinary PMD flush but we logically flush "all
PTEs in that range".
Apparently, that's a difference on some architectures.
--
Thanks,
David / dhildenb
Powered by blists - more mailing lists