Message-ID: <3c26aa4e-fe11-09d2-c2fb-63546ba80893@arm.com>
Date: Fri, 28 Jul 2023 10:00:21 +0100
From: Ryan Roberts <ryan.roberts@....com>
To: Yu Zhao <yuzhao@...gle.com>
Cc: Matthew Wilcox <willy@...radead.org>,
"Huang, Ying" <ying.huang@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Yin Fengwei <fengwei.yin@...el.com>,
David Hildenbrand <david@...hat.com>,
Yang Shi <shy828301@...il.com>, Zi Yan <ziy@...dia.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v3 2/3] mm: Implement folio_remove_rmap_range()
On 27/07/2023 17:38, Yu Zhao wrote:
> On Thu, Jul 27, 2023 at 1:26 AM Ryan Roberts <ryan.roberts@....com> wrote:
>>
>> On 27/07/2023 03:35, Matthew Wilcox wrote:
>>> On Thu, Jul 27, 2023 at 09:29:24AM +0800, Huang, Ying wrote:
>>>> Matthew Wilcox <willy@...radead.org> writes:
>>>>> I think that can make sense. Because we limit to a single page table,
>>>>> specifying 'nr = 1 << PMD_ORDER' is the same as 'compound = true'.
>>>>> Just make it folio, page, nr, vma. I'd actually prefer it as (vma,
>>>>> folio, page, nr), but that isn't the convention we've had in rmap up
>>>>> until now.
>>>>
>>>> IIUC, even if 'nr = 1 << PMD_ORDER', we may remove one PMD 'compound'
>>>> mapping, or 'nr' PTE mappings. So, we will still need 'compound' (or
>>>> some better name) as a parameter.
>>>
>>> Oh, this is removing ... so you're concerned with the case where we've
>>> split the PMD into PTEs, but all the PTEs are still present in a single
>>> page table? OK, I don't have a good answer to that. Maybe that torpedoes
>>> the whole idea; I'll think about it.
>>
>> This is exactly why I think the approach I've already taken is the correct one;
>> a 'range' makes no sense when you are dealing with 'compound' pages because you
>> are accounting the entire folio. So surely it's better to reflect that by only
>> accounting small pages in the range version of the API.
>
> If the argument is that the compound case is a separate one, then why not a
> separate API for it?
>
> I don't really care about whether we think 'range' makes sense for
> 'compound' or not. What I'm saying is:
> 1. if they are considered one general case, then one API with the
> compound parameter.
> 2. if they are considered two specific cases, there should be two APIs.
> This common design pattern is cleaner IMO.
Option 2 definitely makes sense to me and I agree that it would be cleaner to
have 2 separate APIs, one for small-page accounting (which can accept a range
within a folio) and one for large-page accounting (i.e. compound=true in today's
API).
But...
1) That's not how the rest of the rmap API does it
2) This would be a much bigger change since I'm removing an existing API and
replacing it with a completely new one (there are ~20 call sites to fix up). I
was trying to keep the change small and manageable by maintaining the current
API but moving all the small-page logic to the new API, so the old API is a
wrapper in that case.
3) You would also need an API for the hugetlb case, which page_remove_rmap()
handles today. Perhaps that could also be done by the new API that handles the
compound case. But then you are mixing and matching your API styles - one caters
for 1 specific case, and the other caters for 2 cases and figures out which one.
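The wrapper arrangement from point 2 can be sketched roughly as below. This is a
toy userspace model, not the kernel code: toy_folio, toy_remove_rmap() and
toy_remove_rmap_range() are hypothetical stand-ins for struct folio,
page_remove_rmap() and folio_remove_rmap_range(), and the vma argument and the
rest of the real accounting are left out.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_PAGES 16

/*
 * Hypothetical stand-in for a folio: one map count per small page,
 * plus one count for mappings of the folio as a whole.
 */
struct toy_folio {
	int nr_pages;
	int pte_mapcount[TOY_MAX_PAGES];
	int entire_mapcount;
};

/* New-style API: small-page accounting for pages [first, first + nr). */
void toy_remove_rmap_range(struct toy_folio *folio, int first, int nr)
{
	assert(first >= 0 && nr > 0 && first + nr <= folio->nr_pages);

	for (int i = first; i < first + nr; i++) {
		assert(folio->pte_mapcount[i] > 0);
		folio->pte_mapcount[i]--;
	}
}

/*
 * Old-style API, kept so existing callers need no changes. The
 * whole-folio ("compound") case is handled here; the small-page case
 * is just the range API with nr = 1.
 */
void toy_remove_rmap(struct toy_folio *folio, int page, bool compound)
{
	if (compound) {
		assert(folio->entire_mapcount > 0);
		folio->entire_mapcount--;
		return;
	}

	toy_remove_rmap_range(folio, page, 1);
}
```

In this model, toy_remove_rmap(folio, i, false) and
toy_remove_rmap_range(folio, i, 1) do exactly the same thing, which is the
overlap being discussed in this thread.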
>
> Right now we have an overlap (redundancy) -- people would have to do
> two code searches: one for page_remove_rmap() and the other for
> folio_remove_rmap_range(nr=1), and this IMO is a bad design pattern.
I'm open to doing the work to remove this redundancy, but I'd like to hear
consensus on this thread that it's the right approach first. Although personally
I don't see a problem with what I've already done: if you want to operate on a
page (including the old concept of a "compound page" and a hugetlb page), call
the old one. If you want to operate on a range of pages in a folio, call the new
one.
Thanks,
Ryan