[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87bfcd33-9741-4d6c-8b7a-1d1ee2dce61b@ghiti.fr>
Date: Fri, 21 Jul 2023 22:00:52 +0200
From: Alexandre Ghiti <alex@...ti.fr>
To: Guo Ren <guoren@...nel.org>
Cc: palmer@...osinc.com, paul.walmsley@...ive.com, falcon@...ylab.org,
bjorn@...nel.org, conor.dooley@...rochip.com,
linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-riscv@...ts.infradead.org, Guo Ren <guoren@...ux.alibaba.com>
Subject: Re: [PATCH] riscv: mm: Fixup spurious fault of kernel vaddr
On 21/07/2023 18:08, Guo Ren wrote:
> On Fri, Jul 21, 2023 at 11:19 PM Alexandre Ghiti <alex@...ti.fr> wrote:
>>
>> On 21/07/2023 16:51, guoren@...nel.org wrote:
>>> From: Guo Ren <guoren@...ux.alibaba.com>
>>>
>>> RISC-V specification permits the caching of PTEs whose V (Valid)
>>> bit is clear. Operating systems must be written to cope with this
>>> possibility, but implementers are reminded that eagerly caching
>>> invalid PTEs will reduce performance by causing additional page
>>> faults.
>>>
>>> So we must keep vmalloc_fault for the spurious page faults of kernel
>>> virtual address from an OoO machine.
>>>
>>> Signed-off-by: Guo Ren <guoren@...ux.alibaba.com>
>>> Signed-off-by: Guo Ren <guoren@...nel.org>
>>> ---
>>> arch/riscv/mm/fault.c | 3 +--
>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>>> index 85165fe438d8..f662c9eae7d4 100644
>>> --- a/arch/riscv/mm/fault.c
>>> +++ b/arch/riscv/mm/fault.c
>>> @@ -258,8 +258,7 @@ void handle_page_fault(struct pt_regs *regs)
>>> * only copy the information from the master page table,
>>> * nothing more.
>>> */
>>> - if ((!IS_ENABLED(CONFIG_MMU) || !IS_ENABLED(CONFIG_64BIT)) &&
>>> - unlikely(addr >= VMALLOC_START && addr < VMALLOC_END)) {
>>> + if (unlikely(addr >= TASK_SIZE)) {
>>> vmalloc_fault(regs, code, addr);
>>> return;
>>> }
>>
>> Can you share what you are trying to fix here?
> We met a spurious page fault panic on an OoO machine.
>
> 1. The processor speculative execution brings the V=0 entries into the
> TLB in the kernel virtual address.
> 2. Linux kernel installs the kernel virtual address with the page, and V=1
> 3. When kernel code access the kernel virtual address, it would raise
> a page fault as the V=0 entry in the tlb.
> 4. No vmalloc_fault, then panic.
>
>> I have a fix (that's currently running our CI) for commit 7d3332be011e
>> ("riscv: mm: Pre-allocate PGD entries for vmalloc/modules area") that
>> implements flush_cache_vmap() since we lost the vmalloc_fault.
> Could you share that patch?
Here we go:
Author: Alexandre Ghiti <alexghiti@...osinc.com>
Date: Fri Jul 21 08:43:44 2023 +0000
riscv: Implement flush_cache_vmap()
The RISC-V kernel needs a sfence.vma after a page table
modification: we
used to rely on the vmalloc fault handling to emit an sfence.vma, but
commit 7d3332be011e ("riscv: mm: Pre-allocate PGD entries for
vmalloc/modules area") got rid of this path for 64-bit kernels, so
now we
need to explicitly emit a sfence.vma in flush_cache_vmap().
Note that we don't need to implement flush_cache_vunmap() as the
generic
code should emit a flush tlb after unmapping a vmalloc region.
Fixes: 7d3332be011e ("riscv: mm: Pre-allocate PGD entries for
vmalloc/modules area")
Signed-off-by: Alexandre Ghiti <alexghiti@...osinc.com>
diff --git a/arch/riscv/include/asm/cacheflush.h
b/arch/riscv/include/asm/cacheflush.h
index 8091b8bf4883..b93ffddf8a61 100644
--- a/arch/riscv/include/asm/cacheflush.h
+++ b/arch/riscv/include/asm/cacheflush.h
@@ -37,6 +37,10 @@ static inline void flush_dcache_page(struct page *page)
#define flush_icache_user_page(vma, pg, addr, len) \
flush_icache_mm(vma->vm_mm, 0)
+#ifdef CONFIG_64BIT
+#define flush_cache_vmap(start, end) flush_tlb_kernel_range(start, end)
+#endif
+
#ifndef CONFIG_SMP
#define flush_icache_all() local_flush_icache_all()
Let me know if that works for you!
>
>
Powered by blists - more mailing lists