Message-ID: <20251127141117.87420-1-luxu.kernel@bytedance.com>
Date: Thu, 27 Nov 2025 22:11:08 +0800
From: Xu Lu <luxu.kernel@...edance.com>
To: pjw@...nel.org,
	palmer@...belt.com,
	aou@...s.berkeley.edu,
	alex@...ti.fr,
	kees@...nel.org,
	mingo@...hat.com,
	peterz@...radead.org,
	juri.lelli@...hat.com,
	vincent.guittot@...aro.org,
	akpm@...ux-foundation.org,
	david@...hat.com,
	apatel@...tanamicro.com,
	guoren@...nel.org
Cc: linux-riscv@...ts.infradead.org,
	linux-kernel@...r.kernel.org,
	linux-mm@...ck.org,
	Xu Lu <luxu.kernel@...edance.com>
Subject: [RFC PATCH v2 0/9] riscv: mm: Introduce lazy tlb flush

This patch series introduces a lazy TLB flush mechanism for riscv. The
mechanism is based on two insights:

1) Since each CPU has a limited number of TLB entries, only a limited
number of ASIDs can be active in a CPU's TLB at any given time. Once an
mm has not been used for long enough (or, equivalently, after enough
switch_mm invocations), we can assume all of its TLB entries have been
evicted. We can then clear the current CPU from the mm's mm_cpumask, so
that the next time the memory mapping of this mm is modified, no IPI is
sent to the current CPU (see the first sketch after this list).

2) When the memory mapping of an mm is modified, instead of sending an
IPI to every CPU recorded in its mm_cpumask, we check whether each
target CPU is actually running this mm right now. If not, we simply
store the TLB flush information in the target CPU's percpu buffer,
avoiding the IPI. The next time the target CPU switches to this mm in
switch_mm, it checks the percpu buffer and performs the TLB flush
itself, again with no IPI involved (see the second sketch after this
list).
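
To make insight 1 concrete, here is a minimal sketch of how a per-CPU
switch counter could retire an idle mm from its mm_cpumask. Every name
in it (switch_mm_gen, LAZY_ASID_THRESHOLD, lazy_tlb_account_switch,
maybe_retire_mm, the last_gen snapshot) is a hypothetical illustration,
not an identifier from this series:

/*
 * Sketch for insight 1 (hypothetical names, not this series' code):
 * count context switches per CPU and, once an mm has sat idle across
 * enough of them, assume its TLB entries have been evicted and drop
 * this CPU from the mm's cpumask so later flushes skip the IPI.
 */
#include <linux/cpumask.h>
#include <linux/mm_types.h>
#include <linux/percpu.h>
#include <linux/smp.h>

#define LAZY_ASID_THRESHOLD	64	/* hypothetical eviction horizon */

static DEFINE_PER_CPU(unsigned long, switch_mm_gen);

/* Called from the arch's switch_mm() path (hypothetical hook). */
static void lazy_tlb_account_switch(void)
{
	this_cpu_inc(switch_mm_gen);
}

/*
 * @last_gen is a per-mm, per-CPU snapshot of switch_mm_gen taken the
 * last time this CPU ran @mm (the series presumably keeps such state
 * in mm->context).
 */
static void maybe_retire_mm(struct mm_struct *mm, unsigned long last_gen)
{
	if (this_cpu_read(switch_mm_gen) - last_gen > LAZY_ASID_THRESHOLD)
		cpumask_clear_cpu(smp_processor_id(), mm_cpumask(mm));
}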
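
And a matching sketch for insight 2: a fixed-size percpu queue absorbs
flush requests aimed at CPUs that are not currently running the mm, and
the target CPU replays the queue on its next switch_mm. Again, all
identifiers (flush_queues, queue_deferred_flush, drain_deferred_flushes,
FLUSH_QUEUE_DEPTH) are hypothetical illustrations of the idea, not the
code in the patches; the overflow fallback to local_flush_tlb_all() is
likewise an assumption:

/*
 * Sketch for insight 2 (hypothetical names): instead of an IPI, record
 * the flush in the target CPU's percpu queue; the target CPU replays
 * the queue the next time it goes through switch_mm().
 */
#include <linux/mm_types.h>
#include <linux/percpu.h>
#include <linux/spinlock.h>
#include <asm/page.h>
#include <asm/tlbflush.h>

#define FLUSH_QUEUE_DEPTH	16	/* hypothetical queue depth */

struct deferred_flush {
	struct mm_struct *mm;		/* kept for bookkeeping */
	unsigned long start;
	unsigned long size;
};

struct flush_queue {
	raw_spinlock_t lock;		/* assumed initialized at boot */
	unsigned int nr;		/* > DEPTH means "overflowed" */
	struct deferred_flush ent[FLUSH_QUEUE_DEPTH];
};

static DEFINE_PER_CPU(struct flush_queue, flush_queues);

/* Flushing CPU: @cpu is in mm_cpumask(@mm) but not running @mm now. */
static void queue_deferred_flush(int cpu, struct mm_struct *mm,
				 unsigned long start, unsigned long size)
{
	struct flush_queue *q = &per_cpu(flush_queues, cpu);
	unsigned long flags;

	raw_spin_lock_irqsave(&q->lock, flags);
	if (q->nr < FLUSH_QUEUE_DEPTH)
		q->ent[q->nr++] = (struct deferred_flush){ mm, start, size };
	else
		q->nr = FLUSH_QUEUE_DEPTH + 1;	/* overflow marker */
	raw_spin_unlock_irqrestore(&q->lock, flags);
}

/* Target CPU: called from switch_mm() before installing the next mm. */
static void drain_deferred_flushes(void)
{
	struct flush_queue *q = this_cpu_ptr(&flush_queues);
	unsigned long flags, addr;
	unsigned int i;

	raw_spin_lock_irqsave(&q->lock, flags);
	if (q->nr > FLUSH_QUEUE_DEPTH) {
		local_flush_tlb_all();	/* overflow: flush everything */
	} else {
		for (i = 0; i < q->nr; i++)
			for (addr = q->ent[i].start;
			     addr < q->ent[i].start + q->ent[i].size;
			     addr += PAGE_SIZE)
				local_flush_tlb_page(addr);
	}
	q->nr = 0;
	raw_spin_unlock_irqrestore(&q->lock, flags);
}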

With this mechanism, the number of IPIs caused by TLB flushes drops
significantly:

* ltp - mmapstress01
Before: ~108k
After: ~17k

* ltp - hackbench
Before: ~385k
After: ~2k

Thanks to Guo Ren for his advice on testing memory access latency with
lmbench. We are unable to run that test right now for lack of real
hardware; we will supply the results and adjust the mechanism
accordingly as soon as possible.

Xu Lu (9):
  riscv: Introduce RISCV_LAZY_TLB_FLUSH config
  riscv: mm: Apply a threshold to the number of active ASIDs on each CPU
  riscv: mm: Grab mm_count to avoid mm getting released
  fork: Add arch override for do_shoot_lazy_tlb()
  riscv: mm: Introduce arch_do_shoot_lazy_tlb
  riscv: mm: Introduce percpu TLB Flush queue
  riscv: mm: Defer the TLB Flush to switch_mm
  riscv: mm: Clear mm_cpumask during local_flush_tlb_all_asid()
  riscv: mm: Clear mm_cpumask during local_flush_tlb_all()

 arch/riscv/Kconfig                   |  12 ++
 arch/riscv/include/asm/mmu.h         |   4 +
 arch/riscv/include/asm/mmu_context.h |   5 +
 arch/riscv/include/asm/tlbflush.h    |  63 ++++++
 arch/riscv/mm/context.c              |  24 ++-
 arch/riscv/mm/tlbflush.c             | 302 +++++++++++++++++++++++++--
 kernel/fork.c                        |   6 +-
 7 files changed, 394 insertions(+), 22 deletions(-)

-- 
2.20.1

