Message-ID: <20250314175309.2263997-1-herton@redhat.com>
Date: Fri, 14 Mar 2025 14:53:08 -0300
From: "Herton R. Krzesinski" <herton@...hat.com>
To: x86@...nel.org
Cc: tglx@...utronix.de,
	mingo@...hat.com,
	bp@...en8.de,
	dave.hansen@...ux.intel.com,
	hpa@...or.com,
	linux-kernel@...r.kernel.org,
	torvalds@...ux-foundation.org,
	olichtne@...hat.com,
	atomasov@...hat.com,
	aokuliar@...hat.com
Subject: Performance issues in copy_user_generic() in x86_64

Hello,

Recently I received two reports of a performance loss in copy_user_generic()
on x86_64 after the updates to the user copy functions, seen when benchmarking
with iperf3. I believe that aligning writes to 8 bytes, as the old
ALIGN_DESTINATION macro did, was helping in some cases, and that the
performance drop became noticeable once the macro was removed. Some performance
testing I did appears to corroborate this theory.
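
To illustrate what "aligning the destination to 8 bytes" means arithmetically,
here is a minimal sketch in bash. It is not the kernel code: the function name
align_head_bytes and the sample addresses are hypothetical, and it only computes
how many leading bytes would have to be copied one at a time before the
destination address reaches an 8-byte boundary.

```shell
#!/bin/bash
# Hypothetical helper (illustration only, not the kernel implementation):
# print how many leading bytes must be copied individually so that the
# destination address becomes 8-byte aligned.
align_head_bytes() {
	local addr=$1 len=$2
	# (8 - (addr mod 8)) mod 8: 0 when already aligned, otherwise 1..7.
	local head=$(( (8 - (addr & 7)) & 7 ))
	# Never copy more than the total length.
	if (( head > len )); then
		head=$len
	fi
	echo "$head"
}

align_head_bytes 0x1003 64   # prints 5: 0x1003 + 5 = 0x1008
align_head_bytes 0x1000 64   # prints 0: already aligned
```

After those leading bytes, the remaining stores all start on an 8-byte
boundary, which is the property suspected above to help in some cases.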

Please take a look at the patch in the following email and check that
everything is sane. I have already done some testing, as explained in the patch
changelog. I used the scripts below to run the tests; I wrote them only to get
the job done and collect some results, so there is nothing fancy about them.

---- bench.sh
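#!/bin/bash
# Usage: ./bench.sh <output-dir>
# Runs the iperf3 client pinned to CPU 17 against a local iperf3 server that
# is pinned to CPU 19, 21 or 23, or left unpinned ("none").

dir=$1
mkdir -p "$dir"

for cpu in 19 21 23 none; do
	# Flush dirty pages and drop caches before each run.
	sync
	echo 3 > /proc/sys/vm/drop_caches
	cpu_opt=""
	if [ "$cpu" != "none" ]; then
		cpu_opt="taskset -c $cpu"
	fi
	# $cpu_opt is left unquoted on purpose so that it splits into words.
	$cpu_opt iperf3 -D -s -B 127.0.0.1 -p 12000
	# Client output (stdout and stderr) goes to stat-$cpu.txt; perf stat writes
	# to a temporary file that is appended to it afterwards.
	perf stat -o "$dir/stat.$cpu.txt" taskset -c 17 iperf3 -c 127.0.0.1 -b 0/1000 -V -n 50G --repeating-payload -l 16384 -p 12000 --cport 12001 > "$dir/stat-$cpu.txt" 2>&1
	cat "$dir/stat.$cpu.txt" >> "$dir/stat-$cpu.txt"
	rm -f "$dir/stat.$cpu.txt"
	killall iperf3
done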
----

---- stat.sh
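#!/bin/bash
# Usage: ./stat.sh <output-dir>
# Summarizes the results that bench.sh wrote to <output-dir>.

dir=$1
printf "            %4s  %13s %12s %12s %11s\n" "CPU" "RATE     " "SYS     " "TIME    " "sender-receiver"

for cpu in 19 21 23 none; do
	# Elapsed and system time come from the perf stat output.
	time=$(grep 'seconds time elapsed' "$dir/stat-$cpu.txt" | awk '{ print $1 }')
	sys=$(grep 'seconds sys' "$dir/stat-$cpu.txt" | awk '{ print $1 }')
	# Rate and CPU utilization come from the iperf3 client output.
	rate=$(grep ' sender' "$dir/stat-$cpu.txt" | awk '{ print $7 $8 }')
	cpuu=$(grep 'CPU Utilization' "$dir/stat-$cpu.txt" | awk '{ printf "%s-%s\n", $4, $7 }')

	# Pass the values as arguments rather than embedding them in the format.
	printf "Server bind %4s: %s %s %s %s\n" "$cpu" "$rate" "$sys" "$time" "$cpuu"
done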
----

Example of a test run:
nice -n -20 ./bench.sh align
./stat.sh align

