[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <51f57fa1-a2ca-4e3e-82cd-b0733d4f12a7@arm.com>
Date: Tue, 28 Nov 2023 11:15:19 +0000
From: Ryan Roberts <ryan.roberts@....com>
To: Barry Song <21cnbao@...il.com>
Cc: akpm@...ux-foundation.org, andreyknvl@...il.com,
anshuman.khandual@....com, ardb@...nel.org,
catalin.marinas@....com, david@...hat.com, dvyukov@...gle.com,
glider@...gle.com, james.morse@....com, jhubbard@...dia.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, mark.rutland@....com, maz@...nel.org,
oliver.upton@...ux.dev, ryabinin.a.a@...il.com,
suzuki.poulose@....com, vincenzo.frascino@....com,
wangkefeng.wang@...wei.com, will@...nel.org, willy@...radead.org,
yuzenghui@...wei.com, yuzhao@...gle.com, ziy@...dia.com
Subject: Re: [PATCH v2 14/14] arm64/mm: Add ptep_get_and_clear_full() to
optimize process teardown
On 28/11/2023 07:32, Barry Song wrote:
>> +#define __HAVE_ARCH_PTEP_GET_AND_CLEAR_FULL
>> +static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
>> + unsigned long addr, pte_t *ptep, int full)
>> +{
>> + pte_t orig_pte = __ptep_get(ptep);
>> +
>> + if (!pte_valid_cont(orig_pte) || !full) {
>> + contpte_try_unfold(mm, addr, ptep, orig_pte);
>> + return __ptep_get_and_clear(mm, addr, ptep);
>> + } else
>> + return contpte_ptep_get_and_clear_full(mm, addr, ptep);
>> +}
>> +
>
> Hi Ryan,
>
> I feel quite hard to understand the code. when !pte_valid_cont(orig_pte),
> we will call contpte_try_unfold(mm, addr, ptep, orig_pte);
>
> but in contpte_try_unfold(), we call unfold only if pte_valid_cont()
> is true:
> static inline void contpte_try_unfold(struct mm_struct *mm, unsigned long addr,
> pte_t *ptep, pte_t pte)
> {
> if (contpte_is_enabled(mm) && pte_valid_cont(pte))
> __contpte_try_unfold(mm, addr, ptep, pte);
> }
>
> so do you mean the below?
>
> if (!pte_valid_cont(orig_pte))
> return __ptep_get_and_clear(mm, addr, ptep);
>
> if (!full) {
> contpte_try_unfold(mm, addr, ptep, orig_pte);
> return __ptep_get_and_clear(mm, addr, ptep);
> } else {
> return contpte_ptep_get_and_clear_full(mm, addr, ptep);
> }
Yes, this is equivalent. In general, I was trying not to spray `if
(pte_valid_cont(orig_pte))` checks everywhere to guard contpte_try_unfold() and
instead put the checks into contpte_try_unfold() (hence the 'try'). I figured
just calling it unconditionally and letting the compiler optimize as it sees fit
was the cleanest approach.
But in this instance I can see this is confusing. I'll modify as you suggest.
Thanks!
>
> Thanks
> Barry
>
>
Powered by blists - more mailing lists