Date: Thu, 01 Feb 2024 19:49:23 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: Eric Dumazet <edumazet@...gle.com>, "David S . Miller"
 <davem@...emloft.net>,  Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org, eric.dumazet@...il.com, syzbot
	 <syzkaller@...glegroups.com>, Jiri Pirko <jiri@...dia.com>
Subject: Re: [PATCH net] netdevsim: avoid potential loop in
 nsim_dev_trap_report_work()

On Thu, 2024-02-01 at 17:53 +0000, Eric Dumazet wrote:
> Many syzbot reports include the following trace [1]
> 
> If nsim_dev_trap_report_work() cannot grab the mutex,
> it should rearm itself at least one jiffy later.
> 
> [1]
> Sending NMI from CPU 1 to CPUs 0:
> NMI backtrace for cpu 0
> CPU: 0 PID: 32383 Comm: kworker/0:2 Not tainted 6.8.0-rc2-syzkaller-00031-g861c0981648f #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
> Workqueue: events nsim_dev_trap_report_work
>  RIP: 0010:bytes_is_nonzero mm/kasan/generic.c:89 [inline]
>  RIP: 0010:memory_is_nonzero mm/kasan/generic.c:104 [inline]
>  RIP: 0010:memory_is_poisoned_n mm/kasan/generic.c:129 [inline]
>  RIP: 0010:memory_is_poisoned mm/kasan/generic.c:161 [inline]
>  RIP: 0010:check_region_inline mm/kasan/generic.c:180 [inline]
>  RIP: 0010:kasan_check_range+0x101/0x190 mm/kasan/generic.c:189
> Code: 07 49 39 d1 75 0a 45 3a 11 b8 01 00 00 00 7c 0b 44 89 c2 e8 21 ed ff ff 83 f0 01 5b 5d 41 5c c3 48 85 d2 74 4f 48 01 ea eb 09 <48> 83 c0 01 48 39 d0 74 41 80 38 00 74 f2 eb b6 41 bc 08 00 00 00
> RSP: 0018:ffffc90012dcf998 EFLAGS: 00000046
> RAX: fffffbfff258af1e RBX: fffffbfff258af1f RCX: ffffffff8168eda3
> RDX: fffffbfff258af1f RSI: 0000000000000004 RDI: ffffffff92c578f0
> RBP: fffffbfff258af1e R08: 0000000000000000 R09: fffffbfff258af1e
> R10: ffffffff92c578f3 R11: ffffffff8acbcbc0 R12: 0000000000000002
> R13: ffff88806db38400 R14: 1ffff920025b9f42 R15: ffffffff92c578e8
> FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000c00994e078 CR3: 000000002c250000 CR4: 00000000003506f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <NMI>
>  </NMI>
>  <TASK>
>   instrument_atomic_read include/linux/instrumented.h:68 [inline]
>   atomic_read include/linux/atomic/atomic-instrumented.h:32 [inline]
>   queued_spin_is_locked include/asm-generic/qspinlock.h:57 [inline]
>   debug_spin_unlock kernel/locking/spinlock_debug.c:101 [inline]
>   do_raw_spin_unlock+0x53/0x230 kernel/locking/spinlock_debug.c:141
>   __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:150 [inline]
>   _raw_spin_unlock_irqrestore+0x22/0x70 kernel/locking/spinlock.c:194
>   debug_object_activate+0x349/0x540 lib/debugobjects.c:726
>   debug_work_activate kernel/workqueue.c:578 [inline]
>   insert_work+0x30/0x230 kernel/workqueue.c:1650
>   __queue_work+0x62e/0x11d0 kernel/workqueue.c:1802
>   __queue_delayed_work+0x1bf/0x270 kernel/workqueue.c:1953
>   queue_delayed_work_on+0x106/0x130 kernel/workqueue.c:1989
>   queue_delayed_work include/linux/workqueue.h:563 [inline]
>   schedule_delayed_work include/linux/workqueue.h:677 [inline]
>   nsim_dev_trap_report_work+0x9c0/0xc80 drivers/net/netdevsim/dev.c:842
>   process_one_work+0x886/0x15d0 kernel/workqueue.c:2633
>   process_scheduled_works kernel/workqueue.c:2706 [inline]
>   worker_thread+0x8b9/0x1290 kernel/workqueue.c:2787
>   kthread+0x2c6/0x3a0 kernel/kthread.c:388
>   ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
>   ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
>  </TASK>
> 
> Fixes: 012ec02ae441 ("netdevsim: convert driver to use unlocked devlink API during init/fini")
> Reported-by: syzbot <syzkaller@...glegroups.com>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Cc: Jiri Pirko <jiri@...dia.com>
> ---
>  drivers/net/netdevsim/dev.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
> index b4d3b9cde8bd685202f135cf9c845d1be76ef428..92a7a36b93ac0cc1b02a551b974fb390254ac484 100644
> --- a/drivers/net/netdevsim/dev.c
> +++ b/drivers/net/netdevsim/dev.c
> @@ -835,14 +835,14 @@ static void nsim_dev_trap_report_work(struct work_struct *work)
>  				      trap_report_dw.work);
>  	nsim_dev = nsim_trap_data->nsim_dev;
>  
> -	/* For each running port and enabled packet trap, generate a UDP
> -	 * packet with a random 5-tuple and report it.
> -	 */
>  	if (!devl_trylock(priv_to_devlink(nsim_dev))) {
> -		schedule_delayed_work(&nsim_dev->trap_data->trap_report_dw, 0);
> +		schedule_delayed_work(&nsim_dev->trap_data->trap_report_dw, 1);

The patch LGTM, thanks!

I'm wondering if we have a similar problem in
devlink_rel_nested_in_notify_work():

	if (!devl_trylock(devlink)) {
		devlink_put(devlink);
		goto reschedule_work;
	}

	//...
reschedule_work:
	schedule_work(&rel->nested_in.notify_work);

Could adding a 1-jiffy delay there, as this patch does, be problematic?
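
To make the difference between rearming with a delay of 0 and a delay of 1
concrete, here is a rough user-space model. It is not kernel code: the
sim_* names, the tick counter, and the retry cap are all hypothetical, and
it assumes the lock holder only makes progress while the work item is idle,
which is a deliberate simplification of the situation in the syzbot trace.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a self-rearming work item; not kernel code. */
struct sim {
	unsigned long now;          /* current tick, standing in for jiffies */
	unsigned long lock_release; /* tick at which the lock becomes free */
	unsigned long run_at;       /* tick at which the work item is due */
	bool pending;               /* work item is queued */
	unsigned long runs;         /* number of handler invocations */
};

/* Stand-in for devl_trylock(): succeeds once the holder has released. */
static bool sim_trylock(const struct sim *s)
{
	return s->now >= s->lock_release;
}

/* Stand-in for the work handler: rearm itself if the lock is busy. */
static void sim_work(struct sim *s, unsigned long delay)
{
	s->runs++;
	if (!sim_trylock(s)) {
		s->pending = true;
		s->run_at = s->now + delay;
	}
}

/*
 * Run the handler until it gets the lock or until 'cap' invocations.
 * Assumption: time only advances while the work item is waiting for its
 * due tick, i.e. the lock holder is starved while the handler spins.
 */
static unsigned long sim_run(unsigned long hold_ticks, unsigned long delay,
			     unsigned long cap)
{
	struct sim s = {
		.now = 0,
		.lock_release = hold_ticks,
		.run_at = 0,
		.pending = true,
		.runs = 0,
	};

	assert(cap > 0);
	while (s.pending && s.runs < cap) {
		if (s.run_at > s.now)
			s.now = s.run_at; /* idle until the work is due */
		s.pending = false;
		sim_work(&s, delay);
	}
	return s.runs;
}
```

In this model, sim_run(5, 1, 1000) stops after a handful of invocations,
one per tick, while sim_run(5, 0, 1000) only stops because it hits the cap.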

Cheers,

Paolo

