linux-kernel - Re: [External] [PATCH v5 08/13] riscv: Avoid TLB flush loops when affected by SiFive CIP-1200

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <100d5414-11fa-4c47-9c35-51f5fad2d6e6@sifive.com>
Date: Mon, 11 Mar 2024 19:35:49 -0500
From: Samuel Holland <samuel.holland@...ive.com>
To: yunhui cui <cuiyunhui@...edance.com>
Cc: Palmer Dabbelt <palmer@...belt.com>, linux-riscv@...ts.infradead.org,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 Alexandre Ghiti <alexghiti@...osinc.com>, Jisheng Zhang <jszhang@...nel.org>
Subject: Re: [External] [PATCH v5 08/13] riscv: Avoid TLB flush loops when
 affected by SiFive CIP-1200

Hi Yunhui,

On 2024-02-29 8:48 PM, yunhui cui wrote:
> Hi Samuel,
> 
> On Fri, Mar 1, 2024 at 7:22 AM Samuel Holland <samuel.holland@...ive.com> wrote:
>>
>> Since implementations affected by SiFive errata CIP-1200 always use the
>> global variant of the sfence.vma instruction, they only need to execute
>> the instruction once. The range-based loop only hurts performance.
>>
>> Signed-off-by: Samuel Holland <samuel.holland@...ive.com>
>> ---
>>
>> (no changes since v4)
>>
>> Changes in v4:
>>  - Only set tlb_flush_all_threshold when CONFIG_MMU=y.
>>
>> Changes in v3:
>>  - New patch for v3
>>
>>  arch/riscv/errata/sifive/errata.c | 5 +++++
>>  arch/riscv/include/asm/tlbflush.h | 2 ++
>>  arch/riscv/mm/tlbflush.c          | 2 +-
>>  3 files changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/riscv/errata/sifive/errata.c b/arch/riscv/errata/sifive/errata.c
>> index 3d9a32d791f7..716cfedad3a2 100644
>> --- a/arch/riscv/errata/sifive/errata.c
>> +++ b/arch/riscv/errata/sifive/errata.c
>> @@ -42,6 +42,11 @@ static bool errata_cip_1200_check_func(unsigned long  arch_id, unsigned long imp
>>                 return false;
>>         if ((impid & 0xffffff) > 0x200630 || impid == 0x1200626)
>>                 return false;
>> +
>> +#ifdef CONFIG_MMU
>> +       tlb_flush_all_threshold = 0;
>> +#endif
>> +
>>         return true;
>>  }
>>
>> diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
>> index 463b615d7728..8e329721375b 100644
>> --- a/arch/riscv/include/asm/tlbflush.h
>> +++ b/arch/riscv/include/asm/tlbflush.h
>> @@ -66,6 +66,8 @@ void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
>>                                unsigned long uaddr);
>>  void arch_flush_tlb_batched_pending(struct mm_struct *mm);
>>  void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
>> +
>> +extern unsigned long tlb_flush_all_threshold;
>>  #else /* CONFIG_MMU */
>>  #define local_flush_tlb_all()                  do { } while (0)
>>  #endif /* CONFIG_MMU */
>> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
>> index 365e0a0e4725..22870f213188 100644
>> --- a/arch/riscv/mm/tlbflush.c
>> +++ b/arch/riscv/mm/tlbflush.c
>> @@ -11,7 +11,7 @@
>>   * Flush entire TLB if number of entries to be flushed is greater
>>   * than the threshold below.
>>   */
>> -static unsigned long tlb_flush_all_threshold __read_mostly = 64;
>> +unsigned long tlb_flush_all_threshold __read_mostly = 64;
>>
>>  static void local_flush_tlb_range_threshold_asid(unsigned long start,
>>                                                  unsigned long size,
>> --
>> 2.43.1
>>
> 
> If local_flush_tlb_all_asid() is used every time, more PTWs will be
> generated. Will such modifications definitely improve the overall
> performance?

This change in this commit specifically applies to older SiFive SoCs with a bug
making single-page sfence.vma instructions unsafe to use. In this case, a single
call to local_flush_tlb_all_asid() is optimal, yes.

> Hi Alex, Samuel,
> The relationship between flush_xx_range_asid() and nr_ptes is
> basically linear growth (y=kx +b), while flush_all_asid() has nothing
> to do with nr_ptes (y=c).
> Some TLBs may do some optimization. The operation of flush all itself
> requires very few cycles, but there is a certain delay between
> consecutive flush all.
> The intersection of the two straight lines is the optimal solution of
> tlb_flush_all_threshold. In actual situations, continuous
> flush_all_asid will not occur. One problem caused by flush_all_asid()
> is that multiple flush entries require PTW, which causes greater
> latency.
> Therefore, the value of tlb_flush_all_threshold needs to be considered
> or quantified. Maybe doing local_flush_tlb_page_asid() based on the
> actual nr_ptes_in_range would give better overall performance.
> What do you think?

Yes, this was something Alex brought up when adding this threshold, that it
should be tuned for various scenarios. That still needs to be done. This patch
just covers one specific case where we know the optimal answer due to an erratum.

Regards,
Samuel