Message-ID: <87msyscz77.fsf@mail.lhotse>
Date: Tue, 15 Aug 2023 21:38:52 +1000
From: Michael Ellerman <mpe@...erman.id.au>
To: Christophe Leroy <christophe.leroy@...roup.eu>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>
Cc: Christophe Leroy <christophe.leroy@...roup.eu>,
linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH] powerpc/32s: Cleanup the mess in __set_pte_at()

Christophe Leroy <christophe.leroy@...roup.eu> writes:
> __set_pte_at() handles 3 main cases with #ifdefs plus the 'percpu'
> subcase which leads to code duplication.
>
> Rewrite the function using IS_ENABLED() to minimise the total number
> of cases and remove duplicated code.

I think the code change is good, but the comment becomes completely
misleading, because it talks about "the first case" etc. while the code
is now structured differently.
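
Maybe something like the below would be clearer - just a sketch, with
the existing wording rearranged to follow the order of the new if/else
blocks, no other changes intended:

/* This low level function performs the actual PTE insertion
 * Setting the PTE depends on the MMU type and other factors.
 *
 * Non-SMP 32-bit PTEs, or a per-CPU PTE such as a kmap_atomic: do a
 * simple update preserving the hash bits, ie. keep _PAGE_HASHPTE set,
 * because we may not have invalidated the previous translation in the
 * hash yet (done in a subsequent flush_tlb_xxx()) and we need to keep
 * track that this PTE needs invalidating.
 *
 * 32-bit with 64-bit PTEs: we can just store as long as we do the two
 * halves in the right order with a barrier in between. This is
 * possible because we take care, in the hash code, to pre-invalidate
 * if the PTE was already hashed, which synchronizes us with any
 * concurrent invalidation.
 *
 * Otherwise, 32-bit Hash MMU in SMP mode with 32-bit PTEs: use the
 * helper pte_update() which does an atomic update, because a
 * concurrent invalidation can clear _PAGE_HASHPTE.
 */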
cheers

> diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
> index 40041ac713d9..2a0ca1f9a1ff 100644
> --- a/arch/powerpc/include/asm/book3s/32/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
> @@ -534,58 +534,43 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
>
>
> /* This low level function performs the actual PTE insertion
> - * Setting the PTE depends on the MMU type and other factors. It's
> - * an horrible mess that I'm not going to try to clean up now but
> - * I'm keeping it in one place rather than spread around
> + * Setting the PTE depends on the MMU type and other factors.
> + *
> + * First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
> + * helper pte_update() which does an atomic update. We need to do that
> + * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
> + * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
> + * the hash bits instead (ie, same as the non-SMP case)
> + *
> + * Second case is 32-bit with 64-bit PTE. In this case, we
> + * can just store as long as we do the two halves in the right order
> + * with a barrier in between. This is possible because we take care,
> + * in the hash code, to pre-invalidate if the PTE was already hashed,
> + * which synchronizes us with any concurrent invalidation.
> + * In the percpu case, we also fallback to the simple update preserving
> + * the hash bits
> + *
> + * Third case is 32-bit hash table in UP mode, we need to preserve
> + * the _PAGE_HASHPTE bit since we may not have invalidated the previous
> + * translation in the hash yet (done in a subsequent flush_tlb_xxx())
> + * and so we need to keep track that this PTE needs invalidating
> */
> static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
> pte_t *ptep, pte_t pte, int percpu)
> {
> -#if defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
> - /* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
> - * helper pte_update() which does an atomic update. We need to do that
> - * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
> - * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
> - * the hash bits instead (ie, same as the non-SMP case)
> - */
> - if (percpu)
> - *ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
> - | (pte_val(pte) & ~_PAGE_HASHPTE));
> - else
> - pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, pte_val(pte), 0);
> + if ((!IS_ENABLED(CONFIG_SMP) && !IS_ENABLED(CONFIG_PTE_64BIT)) || percpu) {
> + *ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE) |
> + (pte_val(pte) & ~_PAGE_HASHPTE));
> + } else if (IS_ENABLED(CONFIG_PTE_64BIT)) {
> + if (pte_val(*ptep) & _PAGE_HASHPTE)
> + flush_hash_entry(mm, ptep, addr);
>
> -#elif defined(CONFIG_PTE_64BIT)
> - /* Second case is 32-bit with 64-bit PTE. In this case, we
> - * can just store as long as we do the two halves in the right order
> - * with a barrier in between. This is possible because we take care,
> - * in the hash code, to pre-invalidate if the PTE was already hashed,
> - * which synchronizes us with any concurrent invalidation.
> - * In the percpu case, we also fallback to the simple update preserving
> - * the hash bits
> - */
> - if (percpu) {
> - *ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
> - | (pte_val(pte) & ~_PAGE_HASHPTE));
> - return;
> + asm volatile("stw%X0 %2,%0; eieio; stw%X1 %L2,%1" :
> + "=m" (*ptep), "=m" (*((unsigned char *)ptep+4)) :
> + "r" (pte) : "memory");
> + } else {
> + pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, pte_val(pte), 0);
> }
> - if (pte_val(*ptep) & _PAGE_HASHPTE)
> - flush_hash_entry(mm, ptep, addr);
> - __asm__ __volatile__("\
> - stw%X0 %2,%0\n\
> - eieio\n\
> - stw%X1 %L2,%1"
> - : "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
> - : "r" (pte) : "memory");
> -
> -#else
> - /* Third case is 32-bit hash table in UP mode, we need to preserve
> - * the _PAGE_HASHPTE bit since we may not have invalidated the previous
> - * translation in the hash yet (done in a subsequent flush_tlb_xxx())
> - * and see we need to keep track that this PTE needs invalidating
> - */
> - *ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
> - | (pte_val(pte) & ~_PAGE_HASHPTE));
> -#endif
> }
>
> /*
> --
> 2.36.1