Date:   Tue, 16 Mar 2021 06:56:58 +0000
From:   Jiuyang Liu <liu@...yang.me>
To:     Anup Patel <anup@...infault.org>
Cc:     Alexandre Ghiti <alex@...ti.fr>,
        Andrew Waterman <waterman@...s.berkeley.edu>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Albert Ou <aou@...s.berkeley.edu>,
        Atish Patra <atish.patra@....com>,
        Anup Patel <anup.patel@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mike Rapoport <rppt@...nel.org>,
        Kefeng Wang <wangkefeng.wang@...wei.com>,
        Zong Li <zong.li@...ive.com>,
        Greentime Hu <greentime.hu@...ive.com>,
        linux-riscv <linux-riscv@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org List" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] Insert SFENCE.VMA in function set_pte_at for RISCV

> As per my understanding, we don't need to explicitly invalidate local TLB
> in set_pte() or set_pte_at() because generic Linux page table management
> (<linux>/mm/*) will call the appropriate flush_tlb_xyz() function after page
> table updates.

I observed this bug on our microarchitecture: the store issued by set_pte
is still in the store buffer, and no function in the call stack below
executes SFENCE.VMA, so the TLB cannot observe the modification.
Here is my call stack:
set_pte
set_pte_at
map_vm_area
__vmalloc_area_node
__vmalloc_node_range
__vmalloc_node
__vmalloc_node_flags
vzalloc
n_tty_open

I think this is architecture-specific code, so <linux>/mm/* should
not be modified.
The spec also requires an SFENCE.VMA after each modification to a
PTE, so I added the code here.
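
As a rough illustration of the spec rule quoted in the patch description
(leaf vs. non-leaf PTE, G bit set vs. clear), here is a minimal user-space
sketch in C. The names choose_flush() and enum flush_kind are hypothetical
and are not in the kernel. The sketch also simplifies two things: it only
inspects the G bit of the PTE being written, not of every PTE along the
traversal path, and it treats "V bit set" as "present". Note that the
posted patch calls local_flush_tlb_page(addr) for every leaf PTE; the
sketch separates the per-ASID case only to mirror the spec wording.

```c
#include <stdint.h>

/* RISC-V PTE flag bits as laid out in the privileged spec. */
#define PTE_V (1ULL << 0)   /* valid */
#define PTE_R (1ULL << 1)   /* readable */
#define PTE_W (1ULL << 2)   /* writable */
#define PTE_X (1ULL << 3)   /* executable */
#define PTE_G (1ULL << 5)   /* global mapping */

/* Hypothetical labels for the four SFENCE.VMA operand combinations. */
enum flush_kind {
    FLUSH_NONE,             /* PTE not valid: nothing chosen here */
    FLUSH_PAGE_ALL_ASIDS,   /* sfence.vma addr, x0 */
    FLUSH_PAGE_ONE_ASID,    /* sfence.vma addr, asid */
    FLUSH_ALL,              /* sfence.vma x0, x0 */
    FLUSH_ASID              /* sfence.vma x0, asid */
};

/* A PTE is a leaf if any of R, W or X is set. */
static inline int pte_is_leaf(uint64_t pte)
{
    return (pte & (PTE_R | PTE_W | PTE_X)) != 0;
}

/* Pick the SFENCE.VMA form for a PTE that was just written. */
static inline enum flush_kind choose_flush(uint64_t pte)
{
    int global = (pte & PTE_G) != 0;

    if (!(pte & PTE_V))
        return FLUSH_NONE;
    if (pte_is_leaf(pte))
        return global ? FLUSH_PAGE_ALL_ASIDS : FLUSH_PAGE_ONE_ASID;
    return global ? FLUSH_ALL : FLUSH_ASID;
}
```

For example, choose_flush(PTE_V | PTE_R | PTE_G) selects the per-page
flush with rs2 = x0, while choose_flush(PTE_V) (a non-leaf, non-global
entry) selects the per-ASID flush with rs1 = x0.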

> Also, just local TLB flush is generally not sufficient because
> a lot of page tables will be used across multiple HARTs.

Yes, this is the biggest issue. The RISC-V Volume 2, Privileged Spec v.
20190608, page 67, gives a solution:
Consequently, other harts must be notified separately when the
memory-management data structures have been modified. One approach is
to use
1) a local data fence to ensure local writes are visible globally,
then 2) an interprocessor interrupt to the other thread,
then 3) a local SFENCE.VMA in the interrupt handler of the remote thread,
and finally 4) signal back to originating thread that operation is
complete. This is, of course, the RISC-V analog to a TLB shootdown.
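
The four steps above can be sketched as a toy, single-threaded model in C.
Everything here is an assumption made for illustration: the struct hart
layout, the one-entry "TLB", and the function names are hypothetical, the
"IPI" is just a flag, and the remote handlers are run synchronously by the
initiator. It is not how the kernel or the SBI implements remote fences; it
only shows why a remote hart keeps a stale translation until it runs its
own local fence.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_HARTS 4

/* Toy hart: a one-entry TLB plus IPI and acknowledgement flags. */
struct hart {
    uint64_t tlb_entry;   /* cached copy of the shared PTE */
    bool tlb_valid;
    bool ipi_pending;     /* set by the initiator in step 2 */
    bool acked;           /* set by the remote hart in step 4 */
};

static uint64_t shared_pte;            /* the "page table" in memory */
static struct hart harts[NR_HARTS];

/* Translate through the TLB, refilling from memory only on a miss. */
static uint64_t hart_translate(int id)
{
    struct hart *h = &harts[id];

    if (!h->tlb_valid) {
        h->tlb_entry = shared_pte;
        h->tlb_valid = true;
    }
    return h->tlb_entry;
}

/* Stand-in for a local SFENCE.VMA: drop the cached translation. */
static void local_sfence_vma(int id)
{
    harts[id].tlb_valid = false;
}

/* Step 3 and step 4, as run on the remote hart. */
static void handle_ipi(int id)
{
    struct hart *h = &harts[id];

    if (!h->ipi_pending)
        return;
    local_sfence_vma(id);   /* step 3: local fence on the remote hart */
    h->ipi_pending = false;
    h->acked = true;        /* step 4: signal completion */
}

/* Update the PTE from hart `self`; returns true once every hart acked. */
static bool shootdown_update(int self, uint64_t new_pte)
{
    bool all_acked = true;

    shared_pte = new_pte;
    atomic_thread_fence(memory_order_seq_cst);   /* step 1: data fence */

    for (int i = 0; i < NR_HARTS; i++) {
        if (i == self)
            continue;
        harts[i].acked = false;
        harts[i].ipi_pending = true;             /* step 2: send "IPI" */
    }
    local_sfence_vma(self);                      /* initiator's own TLB */

    for (int i = 0; i < NR_HARTS; i++) {
        if (i == self)
            continue;
        handle_ipi(i);          /* the model runs handlers synchronously */
        all_acked = all_acked && harts[i].acked;
    }
    return all_acked;
}
```

In this model, writing shared_pte directly leaves every hart that already
cached the old value returning it from hart_translate(), whereas going
through shootdown_update() makes all harts refill from memory on their
next translation.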

In general, this patch does not handle the G bit in the PTE; the kernel
traps to sbi_remote_sfence_vma for it. Do you think I should use
flush_tlb_all?

Jiuyang




arch/arm/mm/mmu.c
void set_pte_at(struct mm_struct *mm, unsigned long addr,
                              pte_t *ptep, pte_t pteval)
{
        unsigned long ext = 0;

        if (addr < TASK_SIZE && pte_valid_user(pteval)) {
                if (!pte_special(pteval))
                        __sync_icache_dcache(pteval);
                ext |= PTE_EXT_NG;
        }

        set_pte_ext(ptep, pteval, ext);
}

arch/mips/include/asm/pgtable.h
static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
                              pte_t *ptep, pte_t pteval)
{

        if (!pte_present(pteval))
                goto cache_sync_done;

        if (pte_present(*ptep) && (pte_pfn(*ptep) == pte_pfn(pteval)))
                goto cache_sync_done;

        __update_cache(addr, pteval);
cache_sync_done:
        set_pte(ptep, pteval);
}




On Tue, Mar 16, 2021 at 5:05 AM Anup Patel <anup@...infault.org> wrote:
>
> +Alex
>
> On Tue, Mar 16, 2021 at 9:20 AM Jiuyang Liu <liu@...yang.me> wrote:
> >
> > This patch inserts SFENCE.VMA after modifying PTE based on RISC-V
> > specification.
> >
> > arch/riscv/include/asm/pgtable.h:
> > 1. implement pte_user, pte_global and pte_leaf to check the
> > corresponding attribute of a pte_t.
>
> Adding pte_user(), pte_global(), and pte_leaf() is fine.
>
> >
> > 2. insert SFENCE.VMA in set_pte_at based on RISC-V Volume 2, Privileged
> > Spec v. 20190608 page 66 and 67:
> > If software modifies a non-leaf PTE, it should execute SFENCE.VMA with
> > rs1=x0. If any PTE along the traversal path had its G bit set, rs2 must
> > be x0; otherwise, rs2 should be set to the ASID for which the
> > translation is being modified.
> > If software modifies a leaf PTE, it should execute SFENCE.VMA with rs1
> > set to a virtual address within the page. If any PTE along the traversal
> > path had its G bit set, rs2 must be x0; otherwise, rs2 should be set to
> > the ASID for which the translation is being modified.
> >
> > arch/riscv/include/asm/tlbflush.h:
> > 1. implement get_current_asid to get current program asid.
> > 2. implement local_flush_tlb_asid to flush tlb with asid.
>
> As per my understanding, we don't need to explicitly invalidate local TLB
> in set_pte() or set_pte_at() because generic Linux page table management
> (<linux>/mm/*) will call the appropriate flush_tlb_xyz() function after page
> table updates. Also, just local TLB flush is generally not sufficient because
> a lot of page tables will be used across multiple HARTs.
>
> >
> > Signed-off-by: Jiuyang Liu <liu@...yang.me>
> > ---
> >  arch/riscv/include/asm/pgtable.h  | 27 +++++++++++++++++++++++++++
> >  arch/riscv/include/asm/tlbflush.h | 12 ++++++++++++
> >  2 files changed, 39 insertions(+)
> >
> > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> > index ebf817c1bdf4..5a47c60372c1 100644
> > --- a/arch/riscv/include/asm/pgtable.h
> > +++ b/arch/riscv/include/asm/pgtable.h
> > @@ -222,6 +222,16 @@ static inline int pte_write(pte_t pte)
> >         return pte_val(pte) & _PAGE_WRITE;
> >  }
> >
> > +static inline int pte_user(pte_t pte)
> > +{
> > +       return pte_val(pte) & _PAGE_USER;
> > +}
> > +
> > +static inline int pte_global(pte_t pte)
> > +{
> > +       return pte_val(pte) & _PAGE_GLOBAL;
> > +}
> > +
> >  static inline int pte_exec(pte_t pte)
> >  {
> >         return pte_val(pte) & _PAGE_EXEC;
> > @@ -248,6 +258,11 @@ static inline int pte_special(pte_t pte)
> >         return pte_val(pte) & _PAGE_SPECIAL;
> >  }
> >
> > +static inline int pte_leaf(pte_t pte)
> > +{
> > +       return pte_val(pte) & (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC);
> > +}
> > +
> >  /* static inline pte_t pte_rdprotect(pte_t pte) */
> >
> >  static inline pte_t pte_wrprotect(pte_t pte)
> > @@ -358,6 +373,18 @@ static inline void set_pte_at(struct mm_struct *mm,
> >                 flush_icache_pte(pteval);
> >
> >         set_pte(ptep, pteval);
> > +
> > +       if (pte_present(pteval)) {
> > +               if (pte_leaf(pteval)) {
> > +                       local_flush_tlb_page(addr);
> > +               } else {
> > +                       if (pte_global(pteval))
> > +                               local_flush_tlb_all();
> > +                       else
> > +                               local_flush_tlb_asid();
> > +
> > +               }
> > +       }
> >  }
> >
> >  static inline void pte_clear(struct mm_struct *mm,
> > diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
> > index 394cfbccdcd9..1f9b62b3670b 100644
> > --- a/arch/riscv/include/asm/tlbflush.h
> > +++ b/arch/riscv/include/asm/tlbflush.h
> > @@ -21,6 +21,18 @@ static inline void local_flush_tlb_page(unsigned long addr)
> >  {
> >         __asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory");
> >  }
> > +
> > +static inline unsigned long get_current_asid(void)
> > +{
> > +       return (csr_read(CSR_SATP) >> SATP_ASID_SHIFT) & SATP_ASID_MASK;
> > +}
> > +
> > +static inline void local_flush_tlb_asid(void)
> > +{
> > +       unsigned long asid = get_current_asid();
> > +       __asm__ __volatile__ ("sfence.vma x0, %0" : : "r" (asid) : "memory");
> > +}
> > +
> >  #else /* CONFIG_MMU */
> >  #define local_flush_tlb_all()                  do { } while (0)
> >  #define local_flush_tlb_page(addr)             do { } while (0)
> > --
> > 2.30.2
> >
> >
>
> Regards,
> Anup
