Date:   Mon, 13 Jun 2022 06:22:47 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     "chenxiang (M)" <chenxiang66@...ilicon.com>
Cc:     linux-kernel@...r.kernel.org, qemu-devel@...gnu.org,
        linuxarm@...wei.com
Subject: Re: [Bug] Take more 150s to boot qemu on ARM64

On Mon, Jun 13, 2022 at 08:26:34PM +0800, chenxiang (M) wrote:
> Hi all,
> 
> I encounter an issue with kernel 5.19-rc1 on an ARM64 board: it takes about
> 150s between starting the qemu command and the kernel beginning to boot
> ("EFI stub: Booting Linux Kernel...").
> 
> But with kernel 5.18-rc4, it only takes about 5s. I git-bisected the kernel
> and the bisection points to c2445d387850 ("srcu: Add contention check to
> call_srcu() srcu_data ->lock acquisition").
> 
> The qemu command I run (qemu version 6.2.92) is:
> 
> ./qemu-system-aarch64 -m 4G,slots=4,maxmem=8g \
> --trace "kvm*" \
> -cpu host \
> -machine virt,accel=kvm,gic-version=3  \
> -machine smp.cpus=2,smp.sockets=2 \
> -no-reboot \
> -nographic \
> -monitor unix:/home/cx/qmp-test,server,nowait \
> -bios /home/cx/boot/QEMU_EFI.fd \
> -kernel /home/cx/boot/Image  \
> -device pcie-root-port,port=0x8,chassis=1,id=net1,bus=pcie.0,multifunction=on,addr=0x1
> \
> -device vfio-pci,host=7d:01.3,id=net0 \
> -device virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=4  \
> -drive file=/home/cx/boot/boot_ubuntu.img,if=none,id=drive0 \
> -append "rdinit=init console=ttyAMA0 root=/dev/vda rootfstype=ext4 rw " \
> -net none \
> -D /home/cx/qemu_log.txt
> 
> I am not familiar with the RCU code, so I don't know how this change causes
> the issue. Do you have any idea about it?

Please see the discussion here:

https://lore.kernel.org/all/20615615-0013-5adc-584f-2b1d5c03ebfc@linaro.org/

Though that report requires ACPI to be forced on to get the
delay, which results in more than 9,000 back-to-back calls to
synchronize_srcu_expedited().  I cannot reproduce this on my setup, even
with an artificial tight loop invoking synchronize_srcu_expedited(),
but then again I don't have ARM hardware.
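
Such an artificial tight loop might look roughly like the out-of-tree test
module sketched below.  This is only a sketch: the module name, iteration
count, and timing printout are illustrative, not an actual in-tree test.

// Sketch: invoke synchronize_srcu_expedited() back to back and time it.
// Illustrative out-of-tree module, not part of any in-tree selftest.
#include <linux/module.h>
#include <linux/srcu.h>
#include <linux/ktime.h>

DEFINE_STATIC_SRCU(test_srcu);

static int __init srcu_exp_loop_init(void)
{
	int i;
	ktime_t start = ktime_get();

	/* Roughly matches the >9,000 back-to-back calls seen in that report. */
	for (i = 0; i < 9000; i++)
		synchronize_srcu_expedited(&test_srcu);
	pr_info("srcu_exp_loop: 9000 expedited SRCU GPs took %lld ms\n",
		ktime_ms_delta(ktime_get(), start));
	return 0;
}

static void __exit srcu_exp_loop_exit(void)
{
	/* Statically defined SRCU needs no cleanup here. */
}

module_init(srcu_exp_loop_init);
module_exit(srcu_exp_loop_exit);
MODULE_LICENSE("GPL");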

My current guess is that the following patch is what is needed, but with
larger values for SRCU_MAX_NODELAY_PHASE.  Here "larger" might well be up
in the hundreds, or perhaps even more.

If you get a chance to experiment with this, could you please reply
to the discussion at the above URL?  (Or let me know, and I can CC
you on the next message in that thread.)

						Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 50ba70f019dea..0db7873f4e95b 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -513,7 +513,7 @@ static bool srcu_readers_active(struct srcu_struct *ssp)
 
 #define SRCU_INTERVAL		1	// Base delay if no expedited GPs pending.
 #define SRCU_MAX_INTERVAL	10	// Maximum incremental delay from slow readers.
-#define SRCU_MAX_NODELAY_PHASE	1	// Maximum per-GP-phase consecutive no-delay instances.
+#define SRCU_MAX_NODELAY_PHASE	3	// Maximum per-GP-phase consecutive no-delay instances.
 #define SRCU_MAX_NODELAY	100	// Maximum consecutive no-delay instances.
 
 /*
@@ -522,16 +522,22 @@ static bool srcu_readers_active(struct srcu_struct *ssp)
  */
 static unsigned long srcu_get_delay(struct srcu_struct *ssp)
 {
+	unsigned long gpstart;
+	unsigned long j;
 	unsigned long jbase = SRCU_INTERVAL;
 
 	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
 		jbase = 0;
-	if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)))
-		jbase += jiffies - READ_ONCE(ssp->srcu_gp_start);
-	if (!jbase) {
-		WRITE_ONCE(ssp->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_n_exp_nodelay) + 1);
-		if (READ_ONCE(ssp->srcu_n_exp_nodelay) > SRCU_MAX_NODELAY_PHASE)
-			jbase = 1;
+	if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq))) {
+		j = jiffies - 1;
+		gpstart = READ_ONCE(ssp->srcu_gp_start);
+		if (time_after(j, gpstart))
+			jbase += j - gpstart;
+		if (!jbase) {
+			WRITE_ONCE(ssp->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_n_exp_nodelay) + 1);
+			if (READ_ONCE(ssp->srcu_n_exp_nodelay) > SRCU_MAX_NODELAY_PHASE)
+				jbase = 1;
+		}
 	}
 	return jbase > SRCU_MAX_INTERVAL ? SRCU_MAX_INTERVAL : jbase;
 }
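
If you do experiment with values up in the hundreds, the only further change
needed on top of this patch is the single line below (300 is just an
illustrative starting point, not a recommendation):

-#define SRCU_MAX_NODELAY_PHASE	3	// Maximum per-GP-phase consecutive no-delay instances.
+#define SRCU_MAX_NODELAY_PHASE	300	// Maximum per-GP-phase consecutive no-delay instances.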
