[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <45154D49-0EE6-439D-B169-35864F8EAF5A@nvidia.com>
Date: Wed, 21 May 2025 08:18:41 -0400
From: Zi Yan <ziy@...dia.com>
To: David Hildenbrand <david@...hat.com>
Cc: Oscar Salvador <osalvador@...e.de>, Johannes Weiner <hannes@...xchg.org>,
linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Mel Gorman <mgorman@...hsingularity.net>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Brendan Jackman <jackmanb@...gle.com>, Richard Chang <richardycc@...gle.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 1/4] mm/page_isolation: make page isolation a
standalone bit.
On 21 May 2025, at 8:11, David Hildenbrand wrote:
> On 21.05.25 14:00, Zi Yan wrote:
>> On 21 May 2025, at 7:57, David Hildenbrand wrote:
>>
>>> On 21.05.25 13:16, Zi Yan wrote:
>>>> On 19 May 2025, at 12:42, David Hildenbrand wrote:
>>>>
>>>>>>>> +#ifdef CONFIG_MEMORY_ISOLATION
>>>>>>>> + if (flags & PB_migrate_isolate_bit)
>>>>>>>> + return MIGRATE_ISOLATE;
>>>>>>>> +#endif
>>>>>>>
>>>>>>> If you call get_pfnblock_flags_mask() with MIGRATETYPE_MASK, how could you ever get PB_migrate_isolate_bit?
>>>>>>
>>>>>> MIGRATETYPE_MASK is ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit),
>>>>>> so it gets PB_migrate_isolate_bit.
>>>>>>
>>>>>
>>>>> Oh ... that's confusing.
>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I think what we should do is
>>>>>>>
>>>>>>> 1) Rename get_pfnblock_flags_mask() to get_pfnblock_flags()
>>>>>>>
>>>>>>> 2) Remove the mask parameter
>>>>>>>
>>>>>>> 3) Perform the masking in all callers.
>>>>>>
>>>>>> get_pfnblock_flags_mask() is also used by get_pageblock_skip() to
>>>>>> get PB_migrate_skip. I do not think we want to include PB_migrate_skip
>>>>>> in the mask to confuse readers.
>>>>>
>>>>> The masking will be handled in the caller.
>>>>>
>>>>> So get_pageblock_skip() would essentially do a
>>>>>
>>>>> return get_pfnblock_flags() & PB_migrate_skip_bit;
>>>>>
>>>>> etc.
>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Maybe, we should convert set_pfnblock_flags_mask() to
>>>>>>>
>>>>>>> void set_clear_pfnblock_flags(struct page *page, unsigned long
>>>>>>> set_flags, unsigned long clear_flags);
>>>>>>>
>>>>>>> And better, splitting it up (or providing helpers)
>>>>>>>
>>>>>>> set_pfnblock_flags(struct page *page, unsigned long flags);
>>>>>>> clear_pfnblock_flags(struct page *page, unsigned long flags);
>>>>>>>
>>>>>>>
>>>>>>> This implies some more code cleanups first that make the code easier to extend.
>>>>>>>
>>>>>>
>>>>>> The same due to PB_migrate_skip.
>>>>>>
>>>>>> Based on your suggestion, we could make {set,get}_pfnblock_flags_mask()
>>>>>> internal APIs by prepending "__". They are only used by the new
>>>>>> {get, set, clear}_pfnblock_flags() and {get, set, clear}_pageblock_{skip, isolate}().
>>>>>> Then use {get, set, clear}_pfnblock_flags() for all migratetype operations.
>>>>>>
>>>>>> WDYT?
>>>>>
>>>>> In general, lgtm. I just hope we can avoid the "_mask" part and just handle it in these functions directly?
>>>>
>>>> After implementing {get, set, clear}_pfnblock_flags(), I find that
>>>> get_pfnblock_flags() is easy like you wrote above, but set and clear are not,
>>>> since migratetype and skip/isolate bits are in the same word, meaning
>>>> I will need to first read them out, change the field, then write them back.
>>>
>>> Like existing set_pfnblock_flags_mask() I guess, with the try_cmpxchg() loop.
>>
>> Are you saying I duplicate the code in set_pfnblock_flags_mask() to implement
>> set_pfnblock_flags()? Or just replace set_pfnblock_flags_mask() entirely?
>
> The latter as possible.
>
>>
>>>
>>>> But it will cause inconsistency if there is a parallel writer to the same
>>>> word. So for set and clear, mask is required.
>>>>
>>>> I can try to implement {get, set, clear}_pfnblock_bits(page,pfn, bits) to
>>>> only handle standalone bits by using the given @bits as the mask and
>>>> {set,get}_pageblock_migratetype() still use the mask.
>>>
>>> We'd still have to do the try_cmpxchg() when dealing with multiple bits, right?
>>>
>>> For single bits, we could just use set_bit() etc.
>>
>> Mel moved from set_bit() to try_cmpxchg() a word for performance reason. I am
>> not sure we want to move back.
>
> In e58469bafd05 we moved from multiple set_bit etc to a cmpxchange.
>
> - for (; start_bitidx <= end_bitidx; start_bitidx++, value <<= 1)
> - if (flags & value)
> - __set_bit(bitidx + start_bitidx, bitmap);
> - else
> - __clear_bit(bitidx + start_bitidx, bitmap);
>
>
> However, when only setting/clearing a single bit (e.g., isolated), set_bit etc should be much cheaper.
>
> For multiple bits, the existing try_cmpxchg should be kept IMHO.
Yes, I was thinking about that too. Let me do that as a standalone cleanup series
first, then resend this one afterwards.
--
Best Regards,
Yan, Zi
Powered by blists - more mailing lists