lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Yd7EmC7HlxTg/fIl@Red>
Date:   Wed, 12 Jan 2022 13:07:52 +0100
From:   Corentin Labbe <clabbe.montjoie@...il.com>
To:     herbert@...dor.apana.org.au, davem@...emloft.net,
        linux-arm-kernel@...ts.infradead.org, netdev@...r.kernel.org,
        linux-crypto@...r.kernel.org, steffen.klassert@...unet.com
Cc:     linux-kernel@...r.kernel.org
Subject: Re: crypto: BUG: spinlock recursion when doing iperf over ipsec with
 crypto hardware device

Le Tue, Jan 11, 2022 at 10:47:12AM +0100, Corentin Labbe a écrit :

I forgot to add some mailing list (linux-crypto) and I add ipsec maintainer as spinlock is from ipsec code.

And I have now identified the spin_lock which is stuck:
It is the one from net/xfrm/xfrm_input.c:649

This is the patch used to identify it:
diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index a732d62dbe35..63d388fe0363 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -354,6 +354,12 @@ static __always_inline void spin_lock(spinlock_t *lock)
 	raw_spin_lock(&lock->rlock);
 }
 
+static __always_inline void spin_lock2(spinlock_t *lock, int vid)
+{
+	raw_spin_lock(&lock->rlock);
+	atomic64_set(&lock->rlock.id, vid);	\
+}
+
 static __always_inline void spin_lock_bh(spinlock_t *lock)
 {
 	raw_spin_lock_bh(&lock->rlock);
diff --git a/include/linux/spinlock_types_raw.h b/include/linux/spinlock_types_raw.h
index 91cb36b65a17..247ed688e9e7 100644
--- a/include/linux/spinlock_types_raw.h
+++ b/include/linux/spinlock_types_raw.h
@@ -2,6 +2,7 @@
 #define __LINUX_SPINLOCK_TYPES_RAW_H
 
 #include <linux/types.h>
+#include <linux/atomic.h>
 
 #if defined(CONFIG_SMP)
 # include <asm/spinlock_types.h>
@@ -16,6 +17,7 @@ typedef struct raw_spinlock {
 #ifdef CONFIG_DEBUG_SPINLOCK
 	unsigned int magic, owner_cpu;
 	void *owner;
+	atomic64_t id;
 #endif
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map dep_map;
diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
index 14235671a1a7..685981594c54 100644
--- a/kernel/locking/spinlock_debug.c
+++ b/kernel/locking/spinlock_debug.c
@@ -27,6 +27,7 @@ void __raw_spin_lock_init(raw_spinlock_t *lock, const char *name,
 	lock->magic = SPINLOCK_MAGIC;
 	lock->owner = SPINLOCK_OWNER_INIT;
 	lock->owner_cpu = -1;
+	atomic64_set(&lock->id, 666);	\
 }
 
 EXPORT_SYMBOL(__raw_spin_lock_init);
@@ -61,11 +62,12 @@ static void spin_dump(raw_spinlock_t *lock, const char *msg)
 		msg, raw_smp_processor_id(),
 		current->comm, task_pid_nr(current));
 	printk(KERN_EMERG " lock: %pS, .magic: %08x, .owner: %s/%d, "
-			".owner_cpu: %d\n",
+			".owner_cpu: %d ID=%lu\n",
 		lock, READ_ONCE(lock->magic),
 		owner ? owner->comm : "<none>",
 		owner ? task_pid_nr(owner) : -1,
-		READ_ONCE(lock->owner_cpu));
+		READ_ONCE(lock->owner_cpu),
+		atomic64_read(&lock->id));
 	dump_stack();
 }
 
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 144238a50f3d..f874c4af9220 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -596,7 +596,7 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
 		}
 
 lock:
-		spin_lock(&x->lock);
+		spin_lock2(&x->lock, 15);
 
 		if (unlikely(x->km.state != XFRM_STATE_VALID)) {
 			if (x->km.state == XFRM_STATE_ACQ)
@@ -646,7 +646,7 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
 resume:
 		dev_put(skb->dev);
 
-		spin_lock(&x->lock);
+		spin_lock2(&x->lock, 14);
 		if (nexthdr < 0) {
 			if (nexthdr == -EBADMSG) {
 				xfrm_audit_state_icvfail(x, skb,
And I got:
[   49.170676] BUG: spinlock recursion on CPU#0, 1c15000.crypto-/58
[   49.176747]  lock: 0xc43642a8, .magic: dead4ead, .owner: 1c15000.crypto-/58, .owner_cpu: 0 ID=14

Anyway, I believed that spinlock was from crypto/crypto_engine.c but the stack trace said the problem is from a basic spin_lock() and crypto_engine use only spin_lock_irq()

Regards

> Hello
> 
> I work on adding crypto/engine support to my crypto driver sun4i-ss.
> Selftests are passed but when doing iperf over ipsec I hit a spinlock recursion BUG.
> 
> This bug happen also against my sun8i-ce driver.
> But this bug happen only on ARM (SoCs H3) and not on arm64 (SoCs H6, a64 tested) for it.
> 
> This is the full trace:
> iperf 3.10.1
> Linux buildroot 5.15.13-00003-g14f78eca130d #199 SMP Fri Jan 7 08:35:20 CET 2022 armv7l
> Control connection MSS 1386
> Time: Fri, 07 Jan 2022 08:52:27 GMT
> Connecting to host ipsec.lava.local, port 5201
>       Cookie: ylt5ilt5n4jnw72qu2lik5njq63rfalkshz6
>       TCP MSS: 1386 (default)
> [  5] local 192.168.1.204 port 33828 connected to 192.168.1.40 port 5201
> Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 10 second test, tos 0
> [   44.412526] BUG: spinlock recursion on CPU#0, 1c15000.crypto-/111
> [   44.418674]  lock: 0xc3f40268, .magic: dead4ead, .owner: 1c15000.crypto-/111, .owner_cpu: 0
> [   44.427040] CPU: 0 PID: 111 Comm: 1c15000.crypto- Not tainted 5.15.13-00003-g14f78eca130d #199
> [   44.435647] Hardware name: Allwinner sun8i Family
> [   44.440354] [<c010d7a0>] (unwind_backtrace) from [<c0109fa0>] (show_stack+0x10/0x14)
> [   44.448110] [<c0109fa0>] (show_stack) from [<c0857fc4>] (dump_stack_lvl+0x40/0x4c)
> [   44.455685] [<c0857fc4>] (dump_stack_lvl) from [<c016c5dc>] (do_raw_spin_lock+0x11c/0x120)
> [   44.463953] [<c016c5dc>] (do_raw_spin_lock) from [<c0809034>] (xfrm_input+0x17c/0x1304)
> [   44.471961] [<c0809034>] (xfrm_input) from [<c07f9804>] (xfrm4_esp_rcv+0x44/0x84)
> [   44.479451] [<c07f9804>] (xfrm4_esp_rcv) from [<c0794c58>] (ip_protocol_deliver_rcu+0x2c/0x2b8)
> [   44.488156] [<c0794c58>] (ip_protocol_deliver_rcu) from [<c0794f48>] (ip_local_deliver_finish+0x64/0x80)
> [   44.497636] [<c0794f48>] (ip_local_deliver_finish) from [<c0793f58>] (ip_sublist_rcv_finish+0x3c/0x50)
> [   44.506944] [<c0793f58>] (ip_sublist_rcv_finish) from [<c0794af8>] (ip_sublist_rcv+0x180/0x1a0)
> [   44.515645] [<c0794af8>] (ip_sublist_rcv) from [<c07952d8>] (ip_list_rcv+0xf0/0x110)
> [   44.523391] [<c07952d8>] (ip_list_rcv) from [<c0735f5c>] (__netif_receive_skb_list_core+0x194/0x1fc)
> [   44.532527] [<c0735f5c>] (__netif_receive_skb_list_core) from [<c07361b0>] (netif_receive_skb_list_internal+0x1ec/0x304)
> [   44.543394] [<c07361b0>] (netif_receive_skb_list_internal) from [<c0736ef0>] (napi_complete_done+0x130/0x1c8)
> [   44.553306] [<c0736ef0>] (napi_complete_done) from [<c05ab408>] (stmmac_napi_poll_rx+0xa7c/0xbd8)
> [   44.562182] [<c05ab408>] (stmmac_napi_poll_rx) from [<c0736fb0>] (__napi_poll+0x28/0x170)
> [   44.570359] [<c0736fb0>] (__napi_poll) from [<c0737364>] (net_rx_action+0xf0/0x278)
> [   44.578016] [<c0737364>] (net_rx_action) from [<c010126c>] (__do_softirq+0x104/0x29c)
> [   44.585848] [<c010126c>] (__do_softirq) from [<c01231ac>] (irq_exit+0xbc/0x100)
> [   44.593162] [<c01231ac>] (irq_exit) from [<c0174300>] (handle_domain_irq+0x60/0x78)
> [   44.600824] [<c0174300>] (handle_domain_irq) from [<c044000c>] (gic_handle_irq+0x7c/0x90)
> [   44.609007] [<c044000c>] (gic_handle_irq) from [<c0100afc>] (__irq_svc+0x5c/0x78)
> [   44.616491] Exception stack(0xc1b79e38 to 0xc1b79e80)
> [   44.621541] 9e20:                                                       c1c13c00 c0de5100
> [   44.629712] 9e40: c1b78000 00000002 c3f40240 ffffe000 ec010000 c1b78000 00000002 00000000
> [   44.637883] 9e60: c3f40268 c0de5100 c3c369c0 c1b79e88 c0809388 c080b9d4 a0000113 ffffffff
> [   44.646050] [<c0100afc>] (__irq_svc) from [<c080b9d4>] (xfrm_replay_advance+0x11c/0x3dc)
> [   44.654143] [<c080b9d4>] (xfrm_replay_advance) from [<c0809388>] (xfrm_input+0x4d0/0x1304)
> [   44.662408] [<c0809388>] (xfrm_input) from [<c03a3d88>] (crypto_finalize_request+0x5c/0xc4)
> [   44.670766] [<c03a3d88>] (crypto_finalize_request) from [<c06a0888>] (sun8i_ce_cipher_run+0x34/0x3c)
> [   44.679900] [<c06a0888>] (sun8i_ce_cipher_run) from [<c03a4264>] (crypto_pump_work+0x1a8/0x330)
> [   44.688600] [<c03a4264>] (crypto_pump_work) from [<c013e448>] (kthread_worker_fn+0xd8/0x220)
> [   44.697045] [<c013e448>] (kthread_worker_fn) from [<c013f708>] (kthread+0x15c/0x180)
> [   44.704793] [<c013f708>] (kthread) from [<c0100130>] (ret_from_fork+0x14/0x24)
> [   44.712017] Exception stack(0xc1b79fb0 to 0xc1b79ff8)
> [   44.717066] 9fa0:                                     00000000 00000000 00000000 00000000
> [   44.725237] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [   44.733406] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.11   sec  1.11 MBytes  8.39 Mbits/sec    1   1.35 KBytes       
> [  5]   1.11-2.00   sec  0.00 Bytes  0.00 bits/sec    0   1.35 KBytes       
> [   46.438402] sched: RT throttling activated
> [   46.438402] sched: RT throttling activated
> [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    0   1.35 KBytes       
> [  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   1.35 KBytes       
> [  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    0   1.35 KBytes       
> [  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    0   1.35 KBytes       
> [  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    0   1.35 KBytes       
> [  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    0   1.35 KBytes       
> [  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   1.35 KBytes       
> [   65.398401] rcu: INFO: rcu_sched self-detected stall on CPU
> [   65.403987] rcu: 	0-....: (2068 ticks this GP) idle=afd/1/0x40000004 softirq=3056/3057 fqs=1050 
> [   65.412774] 	(t=2102 jiffies g=1437 q=434)
> [   65.416873] NMI backtrace for cpu 0
> [   65.420361] CPU: 0 PID: 111 Comm: 1c15000.crypto- Not tainted 5.15.13-00003-g14f78eca130d #199
> [   65.428965] Hardware name: Allwinner sun8i Family
> [   65.433669] [<c010d7a0>] (unwind_backtrace) from [<c0109fa0>] (show_stack+0x10/0x14)
> [   65.441424] [<c0109fa0>] (show_stack) from [<c0857fc4>] (dump_stack_lvl+0x40/0x4c)
> [   65.448998] [<c0857fc4>] (dump_stack_lvl) from [<c042dc1c>] (nmi_cpu_backtrace+0xdc/0x110)
> [   65.457267] [<c042dc1c>] (nmi_cpu_backtrace) from [<c042dd58>] (nmi_trigger_cpumask_backtrace+0x108/0x168)
> [   65.466918] [<c042dd58>] (nmi_trigger_cpumask_backtrace) from [<c0856bf8>] (rcu_dump_cpu_stacks+0x128/0x15c)
> [   65.476745] [<c0856bf8>] (rcu_dump_cpu_stacks) from [<c0188224>] (rcu_sched_clock_irq+0x798/0x950)
> [   65.485707] [<c0188224>] (rcu_sched_clock_irq) from [<c0193608>] (update_process_times+0x9c/0xd0)
> [   65.494581] [<c0193608>] (update_process_times) from [<c01a5b14>] (tick_sched_timer+0x7c/0xf0)
> [   65.503197] [<c01a5b14>] (tick_sched_timer) from [<c0193e3c>] (__hrtimer_run_queues+0x15c/0x218)
> [   65.511984] [<c0193e3c>] (__hrtimer_run_queues) from [<c0194efc>] (hrtimer_interrupt+0x124/0x2b0)
> [   65.520857] [<c0194efc>] (hrtimer_interrupt) from [<c06a552c>] (arch_timer_handler_phys+0x28/0x30)
> [   65.529820] [<c06a552c>] (arch_timer_handler_phys) from [<c017a654>] (handle_percpu_devid_irq+0x78/0x13c)
> [   65.539388] [<c017a654>] (handle_percpu_devid_irq) from [<c01742fc>] (handle_domain_irq+0x5c/0x78)
> [   65.548352] [<c01742fc>] (handle_domain_irq) from [<c044000c>] (gic_handle_irq+0x7c/0x90)
> [   65.556533] [<c044000c>] (gic_handle_irq) from [<c0100afc>] (__irq_svc+0x5c/0x78)
> [   65.564017] Exception stack(0xc1b79a30 to 0xc1b79a78)
> [   65.569067] 9a20:                                     00000000 00000000 000003dc 000003da
> [   65.577238] 9a40: c3f40268 ffffe000 00000000 c1b78000 00000002 00000000 c3f40268 c0de5100
> [   65.585408] 9a60: c0b496d0 c1b79a80 c016c5b0 c016c538 80000113 ffffffff
> [   65.592015] [<c0100afc>] (__irq_svc) from [<c016c538>] (do_raw_spin_lock+0x78/0x120)
> [   65.599759] [<c016c538>] (do_raw_spin_lock) from [<c0809034>] (xfrm_input+0x17c/0x1304)
> [   65.607765] [<c0809034>] (xfrm_input) from [<c07f9804>] (xfrm4_esp_rcv+0x44/0x84)
> [   65.615255] [<c07f9804>] (xfrm4_esp_rcv) from [<c0794c58>] (ip_protocol_deliver_rcu+0x2c/0x2b8)
> [   65.623960] [<c0794c58>] (ip_protocol_deliver_rcu) from [<c0794f48>] (ip_local_deliver_finish+0x64/0x80)
> [   65.633440] [<c0794f48>] (ip_local_deliver_finish) from [<c0793f58>] (ip_sublist_rcv_finish+0x3c/0x50)
> [   65.642747] [<c0793f58>] (ip_sublist_rcv_finish) from [<c0794af8>] (ip_sublist_rcv+0x180/0x1a0)
> [   65.651447] [<c0794af8>] (ip_sublist_rcv) from [<c07952d8>] (ip_list_rcv+0xf0/0x110)
> [   65.659193] [<c07952d8>] (ip_list_rcv) from [<c0735f5c>] (__netif_receive_skb_list_core+0x194/0x1fc)
> [   65.668329] [<c0735f5c>] (__netif_receive_skb_list_core) from [<c07361b0>] (netif_receive_skb_list_internal+0x1ec/0x304)
> [   65.679196] [<c07361b0>] (netif_receive_skb_list_internal) from [<c0736ef0>] (napi_complete_done+0x130/0x1c8)
> [   65.689108] [<c0736ef0>] (napi_complete_done) from [<c05ab408>] (stmmac_napi_poll_rx+0xa7c/0xbd8)
> [   65.697982] [<c05ab408>] (stmmac_napi_poll_rx) from [<c0736fb0>] (__napi_poll+0x28/0x170)
> [   65.706159] [<c0736fb0>] (__napi_poll) from [<c0737364>] (net_rx_action+0xf0/0x278)
> [   65.713816] [<c0737364>] (net_rx_action) from [<c010126c>] (__do_softirq+0x104/0x29c)
> [   65.721647] [<c010126c>] (__do_softirq) from [<c01231ac>] (irq_exit+0xbc/0x100)
> [   65.728960] [<c01231ac>] (irq_exit) from [<c0174300>] (handle_domain_irq+0x60/0x78)
> [   65.736620] [<c0174300>] (handle_domain_irq) from [<c044000c>] (gic_handle_irq+0x7c/0x90)
> [   65.744799] [<c044000c>] (gic_handle_irq) from [<c0100afc>] (__irq_svc+0x5c/0x78)
> [   65.752281] Exception stack(0xc1b79e38 to 0xc1b79e80)
> [   65.757329] 9e20:                                                       c1c13c00 c0de5100
> [   65.765500] 9e40: c1b78000 00000002 c3f40240 ffffe000 ec010000 c1b78000 00000002 00000000
> [   65.773671] 9e60: c3f40268 c0de5100 c3c369c0 c1b79e88 c0809388 c080b9d4 a0000113 ffffffff
> [   65.781837] [<c0100afc>] (__irq_svc) from [<c080b9d4>] (xfrm_replay_advance+0x11c/0x3dc)
> [   65.789929] [<c080b9d4>] (xfrm_replay_advance) from [<c0809388>] (xfrm_input+0x4d0/0x1304)
> [   65.798196] [<c0809388>] (xfrm_input) from [<c03a3d88>] (crypto_finalize_request+0x5c/0xc4)
> [   65.806552] [<c03a3d88>] (crypto_finalize_request) from [<c06a0888>] (sun8i_ce_cipher_run+0x34/0x3c)
> [   65.815686] [<c06a0888>] (sun8i_ce_cipher_run) from [<c03a4264>] (crypto_pump_work+0x1a8/0x330)
> [   65.824384] [<c03a4264>] (crypto_pump_work) from [<c013e448>] (kthread_worker_fn+0xd8/0x220)
> [   65.832827] [<c013e448>] (kthread_worker_fn) from [<c013f708>] (kthread+0x15c/0x180)
> [   65.840574] [<c013f708>] (kthread) from [<c0100130>] (ret_from_fork+0x14/0x24)
> [   65.847799] Exception stack(0xc1b79fb0 to 0xc1b79ff8)
> [   65.852847] 9fa0:                                     00000000 00000000 00000000 00000000
> [   65.861017] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [   65.869186] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [  128.458402] rcu: INFO: rcu_sched self-detected stall on CPU
> [  128.463984] rcu: 	0-....: (8328 ticks this GP) idle=afd/1/0x40000004 softirq=3056/3057 fqs=3872 
> [  128.472769] 	(t=8408 jiffies g=1437 q=435)
> [  128.476867] NMI backtrace for cpu 0
> [  128.480354] CPU: 0 PID: 111 Comm: 1c15000.crypto- Not tainted 5.15.13-00003-g14f78eca130d #199
> [  128.488958] Hardware name: Allwinner sun8i Family
> [  128.493660] [<c010d7a0>] (unwind_backtrace) from [<c0109fa0>] (show_stack+0x10/0x14)
> [  128.501413] [<c0109fa0>] (show_stack) from [<c0857fc4>] (dump_stack_lvl+0x40/0x4c)
> [  128.508985] [<c0857fc4>] (dump_stack_lvl) from [<c042dc1c>] (nmi_cpu_backtrace+0xdc/0x110)
> [  128.517252] [<c042dc1c>] (nmi_cpu_backtrace) from [<c042dd58>] (nmi_trigger_cpumask_backtrace+0x108/0x168)
> [  128.526904] [<c042dd58>] (nmi_trigger_cpumask_backtrace) from [<c0856bf8>] (rcu_dump_cpu_stacks+0x128/0x15c)
> [  128.536730] [<c0856bf8>] (rcu_dump_cpu_stacks) from [<c0188224>] (rcu_sched_clock_irq+0x798/0x950)
> [  128.545691] [<c0188224>] (rcu_sched_clock_irq) from [<c0193608>] (update_process_times+0x9c/0xd0)
> [  128.554564] [<c0193608>] (update_process_times) from [<c01a5b14>] (tick_sched_timer+0x7c/0xf0)
> [  128.563179] [<c01a5b14>] (tick_sched_timer) from [<c0193e3c>] (__hrtimer_run_queues+0x15c/0x218)
> [  128.571966] [<c0193e3c>] (__hrtimer_run_queues) from [<c0194efc>] (hrtimer_interrupt+0x124/0x2b0)
> [  128.580838] [<c0194efc>] (hrtimer_interrupt) from [<c06a552c>] (arch_timer_handler_phys+0x28/0x30)
> [  128.589800] [<c06a552c>] (arch_timer_handler_phys) from [<c017a654>] (handle_percpu_devid_irq+0x78/0x13c)
> [  128.599369] [<c017a654>] (handle_percpu_devid_irq) from [<c01742fc>] (handle_domain_irq+0x5c/0x78)
> [  128.608332] [<c01742fc>] (handle_domain_irq) from [<c044000c>] (gic_handle_irq+0x7c/0x90)
> [  128.616514] [<c044000c>] (gic_handle_irq) from [<c0100afc>] (__irq_svc+0x5c/0x78)
> [  128.623997] Exception stack(0xc1b79a30 to 0xc1b79a78)
> [  128.629047] 9a20:                                     00000000 00000000 000003dc 000003da
> [  128.637218] 9a40: c3f40268 ffffe000 00000000 c1b78000 00000002 00000000 c3f40268 c0de5100
> [  128.645388] 9a60: c0b496d0 c1b79a80 c016c5b0 c016c538 80000113 ffffffff
> [  128.651995] [<c0100afc>] (__irq_svc) from [<c016c538>] (do_raw_spin_lock+0x78/0x120)
> [  128.659738] [<c016c538>] (do_raw_spin_lock) from [<c0809034>] (xfrm_input+0x17c/0x1304)
> [  128.667744] [<c0809034>] (xfrm_input) from [<c07f9804>] (xfrm4_esp_rcv+0x44/0x84)
> [  128.675233] [<c07f9804>] (xfrm4_esp_rcv) from [<c0794c58>] (ip_protocol_deliver_rcu+0x2c/0x2b8)
> [  128.683936] [<c0794c58>] (ip_protocol_deliver_rcu) from [<c0794f48>] (ip_local_deliver_finish+0x64/0x80)
> [  128.693408] [<c0794f48>] (ip_local_deliver_finish) from [<c0793f58>] (ip_sublist_rcv_finish+0x3c/0x50)
> [  128.702716] [<c0793f58>] (ip_sublist_rcv_finish) from [<c0794af8>] (ip_sublist_rcv+0x180/0x1a0)
> [  128.711417] [<c0794af8>] (ip_sublist_rcv) from [<c07952d8>] (ip_list_rcv+0xf0/0x110)
> [  128.719163] [<c07952d8>] (ip_list_rcv) from [<c0735f5c>] (__netif_receive_skb_list_core+0x194/0x1fc)
> [  128.728299] [<c0735f5c>] (__netif_receive_skb_list_core) from [<c07361b0>] (netif_receive_skb_list_internal+0x1ec/0x304)
> [  128.739165] [<c07361b0>] (netif_receive_skb_list_internal) from [<c0736ef0>] (napi_complete_done+0x130/0x1c8)
> [  128.749077] [<c0736ef0>] (napi_complete_done) from [<c05ab408>] (stmmac_napi_poll_rx+0xa7c/0xbd8)
> [  128.757951] [<c05ab408>] (stmmac_napi_poll_rx) from [<c0736fb0>] (__napi_poll+0x28/0x170)
> [  128.766129] [<c0736fb0>] (__napi_poll) from [<c0737364>] (net_rx_action+0xf0/0x278)
> [  128.773786] [<c0737364>] (net_rx_action) from [<c010126c>] (__do_softirq+0x104/0x29c)
> [  128.781617] [<c010126c>] (__do_softirq) from [<c01231ac>] (irq_exit+0xbc/0x100)
> [  128.788930] [<c01231ac>] (irq_exit) from [<c0174300>] (handle_domain_irq+0x60/0x78)
> [  128.796589] [<c0174300>] (handle_domain_irq) from [<c044000c>] (gic_handle_irq+0x7c/0x90)
> [  128.804768] [<c044000c>] (gic_handle_irq) from [<c0100afc>] (__irq_svc+0x5c/0x78)
> [  128.812250] Exception stack(0xc1b79e38 to 0xc1b79e80)
> [  128.817298] 9e20:                                                       c1c13c00 c0de5100
> [  128.825469] 9e40: c1b78000 00000002 c3f40240 ffffe000 ec010000 c1b78000 00000002 00000000
> [  128.833640] 9e60: c3f40268 c0de5100 c3c369c0 c1b79e88 c0809388 c080b9d4 a0000113 ffffffff
> [  128.841806] [<c0100afc>] (__irq_svc) from [<c080b9d4>] (xfrm_replay_advance+0x11c/0x3dc)
> [  128.849898] [<c080b9d4>] (xfrm_replay_advance) from [<c0809388>] (xfrm_input+0x4d0/0x1304)
> [  128.858163] [<c0809388>] (xfrm_input) from [<c03a3d88>] (crypto_finalize_request+0x5c/0xc4)
> [  128.866518] [<c03a3d88>] (crypto_finalize_request) from [<c06a0888>] (sun8i_ce_cipher_run+0x34/0x3c)
> [  128.875651] [<c06a0888>] (sun8i_ce_cipher_run) from [<c03a4264>] (crypto_pump_work+0x1a8/0x330)
> [  128.884349] [<c03a4264>] (crypto_pump_work) from [<c013e448>] (kthread_worker_fn+0xd8/0x220)
> [  128.892790] [<c013e448>] (kthread_worker_fn) from [<c013f708>] (kthread+0x15c/0x180)
> [  128.900537] [<c013f708>] (kthread) from [<c0100130>] (ret_from_fork+0x14/0x24)
> [  128.907760] Exception stack(0xc1b79fb0 to 0xc1b79ff8)
> [  128.912809] 9fa0:                                     00000000 00000000 00000000 00000000
> [  128.920978] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [  128.929147] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> 
> None on my drivers are using spinlock so the only one which remains is crypto/engine one (queue_lock)
> 
> I have started to bisect this, but this lead to unrelated commits. (I tried twice with different starting point)
> 
> Known broken version are
> 5.13 5.14 5.15.13 next-20220105 5.10
> I randomly hit also "NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!".
> On 5.6 and 5.7 I got: NOHZ: local_softirq_pending 08
> 
> Working version are 5.4.170 and 5.5
> 
> The normal iperf performance is at least 30Mbit/s, when bug is present, iperf never end.
> But When the bug is present with "NOHZ: local_softirq_pending 08", iperf give 1Mbit/s.
> 
> I started also to test other crypto driver which use the crypto/crypto_engine
> On my sun8i-ss, no kernel trace are given, but iperf performance decrease is present.
> I also tested omap-aes and it seems un-impacted.
> 
> Any help to debug that is welcome.
> Thanks in advance
> 
> Regards
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ