Message-Id: <173108456812.1559945.17269799494713828811.b4-ty@arm.com>
Date: Fri,  8 Nov 2024 16:49:53 +0000
From: Catalin Marinas <catalin.marinas@....com>
To: mhiramat@...nel.org,
	oleg@...hat.com,
	peterz@...radead.org,
	will@...nel.org,
	mark.rutland@....com,
	Liao Chang <liaochang1@...wei.com>
Cc: linux-kernel@...r.kernel.org,
	linux-trace-kernel@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH] arm64: uprobes: Optimize cache flushes for xol slot

On Thu, 19 Sep 2024 12:17:19 +0000, Liao Chang wrote:
> Profiling the single-threaded selftests bench reveals a bottleneck in
> caches_clean_inval_pou() on ARM64. On my local testing machine, this
> function takes approximately 34% of CPU cycles for trig-uprobe-nop and
> trig-uprobe-push.
> 
> This patch adds a check to avoid an unnecessary cache flush when writing
> an instruction to the xol slot. If the instruction is the same as the
> instruction already in the slot, there is no need to synchronize the D/I
> caches. Xol slot allocation and updates occur on the hot path of uprobe
> handling, so on the upstream kernel running on Kunpeng916 (Hi1616), 4 NUMA
> nodes, 64 cores @ 2.4GHz, this optimization shows an obvious gain for the
> nop and push test cases.
> 
> [...]
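
For readers following the thread, the check described above can be sketched in user space as follows. This is only an illustration of the idea, not the applied patch: `xol_slot_write()`, `sync_icache_stub()` and `cache_sync_count` are hypothetical names, and the stub merely counts calls where a real kernel would perform D/I cache maintenance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counter standing in for the cost of a real D/I cache sync. */
static unsigned long cache_sync_count;

/* Hypothetical stand-in for the architecture's cache maintenance routine. */
static void sync_icache_stub(void *addr, size_t len)
{
	(void)addr;
	(void)len;
	cache_sync_count++;
}

/*
 * Copy an instruction into an xol slot, syncing caches only when the slot
 * contents actually change. Returns true if a sync was performed.
 */
static bool xol_slot_write(void *slot, const void *insn, size_t len)
{
	assert(slot != NULL && insn != NULL);

	/* Same bytes already in the slot: nothing to write, nothing to sync. */
	if (memcmp(slot, insn, len) == 0)
		return false;

	memcpy(slot, insn, len);
	sync_icache_stub(slot, len);
	return true;
}
```

With this shape, a probe that repeatedly single-steps the same instruction out of the same slot pays for the cache sync only on the first write; later hits reduce to a short `memcmp()`.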

Applied to arm64 (for-next/misc), thanks!

[1/1] arm64: uprobes: Optimize cache flushes for xol slot
      https://git.kernel.org/arm64/c/bdf94836c22a

-- 
Catalin

