Message-ID: <20250515164731.48991-1-kuniyu@amazon.com>
Date: Thu, 15 May 2025 09:46:13 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <pabeni@...hat.com>
CC: <davem@...emloft.net>, <dsahern@...nel.org>, <edumazet@...gle.com>,
	<horms@...nel.org>, <kuba@...nel.org>, <kuni1840@...il.com>,
	<kuniyu@...zon.com>, <netdev@...r.kernel.org>
Subject: Re: [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series.

From: Paolo Abeni <pabeni@...hat.com>
Date: Thu, 15 May 2025 11:02:34 +0200
> On 5/15/25 4:05 AM, Kuniyuki Iwashima wrote:
> > From: Jakub Kicinski <kuba@...nel.org>
> > Date: Wed, 14 May 2025 18:45:02 -0700
> >> On Wed, 14 May 2025 13:18:53 -0700 Kuniyuki Iwashima wrote:
> >>> Patch 1 removes rcu_read_lock() in fib6_get_table().
> >>> Patch 2 removes rtnl_is_held arg for lwtunnel_valid_encap_type(), which
> >>>  was short-term fix and is no longer used.
> >>> Patch 3 fixes RCU vs GFP_KERNEL report by syzkaller.
> >>> Patch 4~7 reverts GFP_ATOMIC uses to GFP_KERNEL.
> >>
> >> Hi! Something in the following set of patches is making our CI time out.
> >> The problem seems to be:
> >>
> >> [    0.751266] virtme-init: waiting for udev to settle
> >> Timed out for waiting the udev queue being empty.
> >> [  120.826428] virtme-init: udev is done
> >>
> >> +team: grab team lock during team_change_rx_flags
> >> +net: mana: Add handler for hardware servicing events
> >> +ipv6: Revert two per-cpu var allocation for RTM_NEWROUTE.
> >> +ipv6: Pass gfp_flags down to ip6_route_info_create_nh().
> >> +Revert "ipv6: Factorise ip6_route_multipath_add()."
> >> +Revert "ipv6: sr: switch to GFP_ATOMIC flag to allocate memory during seg6local LWT setup"
> >> +ipv6: Narrow down RCU critical section in inet6_rtm_newroute().
> >> +inet: Remove rtnl_is_held arg of lwtunnel_valid_encap_type(_attr)?().
> >> +ipv6: Remove rcu_read_lock() in fib6_get_table().
> >> +net/mlx5e: Reuse per-RQ XDP buffer to avoid stack zeroing overhead
> >>  amd-xgbe: read link status twice to avoid inconsistencies
> >> +net: phy: fixed_phy: remove fixed_phy_register_with_gpiod
> >>  drivers: net: mvpp2: attempt to refill rx before allocating skb
> >> +selftest: af_unix: Test SO_PASSRIGHTS.
> >> +af_unix: Introduce SO_PASSRIGHTS.
> >> +af_unix: Inherit sk_flags at connect().
> >> +af_unix: Move SOCK_PASS{CRED,PIDFD,SEC} to struct sock.
> >> +net: Restrict SO_PASS{CRED,PIDFD,SEC} to AF_{UNIX,NETLINK,BLUETOOTH}.
> >> +tcp: Restrict SO_TXREHASH to TCP socket.
> >> +scm: Move scm_recv() from scm.h to scm.c.
> >> +af_unix: Don't pass struct socket to maybe_add_creds().
> >> +af_unix: Factorise test_bit() for SOCK_PASSCRED and SOCK_PASSPIDFD.
> >>
> >> I haven't dug into it, gotta review / apply other patches :(
> >> Maybe you can try to repro? 
> > 
> > I think I was able to reproduce it with SO_PASSRIGHTS series
> > with virtme-ng (but not with normal qemu with AL2023 rootfs).
> > 
> > After 2min, virtme-ng showed the console.
> > 
> > [    1.461450] virtme-ng-init: triggering udev coldplug
> > [    1.533147] virtme-ng-init: waiting for udev to settle
> > [  121.588624] virtme-ng-init: Timed out for waiting the udev queue being empty.
> > [  121.588710] virtme-ng-init: udev is done
> > [  121.593214] virtme-ng-init: initialization done
> >           _      _
> >    __   _(_)_ __| |_ _ __ ___   ___       _ __   __ _
> >    \ \ / / |  __| __|  _   _ \ / _ \_____|  _ \ / _  |
> >     \ V /| | |  | |_| | | | | |  __/_____| | | | (_| |
> >      \_/ |_|_|   \__|_| |_| |_|\___|     |_| |_|\__  |
> >                                                 |___/
> >    kernel version: 6.15.0-rc4-virtme-00071-gceba111cf5e7 x86_64
> >    (CTRL+d to exit)
> > 
> > 
> > Will investigate the cause.
> > 
> > Sorry, but please drop the series and kick the CI again.
> 
> FTR I think some CI iterations survived the boot and hit the following,
> in several forwarding tests (i.e. router-multipath-sh)

Oh thanks!

I learnt that "make TARGETS=net run_tests" doesn't run the forwarding tests.

Will fix in v2.


> 
> [  922.307796][ T6194] =============================
> [  922.308069][ T6194] WARNING: suspicious RCU usage
> [  922.308339][ T6194] 6.15.0-rc5-virtme #1 Not tainted
> [  922.308596][ T6194] -----------------------------
> [  922.308860][ T6194] ./include/net/addrconf.h:347 suspicious rcu_dereference_check() usage!
> [  922.309352][ T6194]
> [  922.309352][ T6194] other info that might help us debug this:
> [  922.309352][ T6194]
> [  922.310105][ T6194]
> [  922.310105][ T6194] rcu_scheduler_active = 2, debug_locks = 1
> [  922.310501][ T6194] 1 lock held by ip/6194:
> [  922.310704][ T6194]  #0: ffff888012942630 (&tb->tb6_lock){+...}-{3:3}, at: ip6_route_multipath_add+0x743/0x1450
> [  922.311255][ T6194]
> [  922.311255][ T6194] stack backtrace:
> [  922.311577][ T6194] CPU: 1 UID: 0 PID: 6194 Comm: ip Not tainted 6.15.0-rc5-virtme #1 PREEMPT(full)
> [  922.311583][ T6194] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [  922.311585][ T6194] Call Trace:
> [  922.311589][ T6194]  <TASK>
> [  922.311591][ T6194]  dump_stack_lvl+0xb0/0xd0
> [  922.311605][ T6194]  lockdep_rcu_suspicious+0x166/0x270
> [  922.311619][ T6194]  rt6_multipath_rebalance.part.0+0x70c/0x8a0
> [  922.311628][ T6194]  fib6_add_rt2node+0xa36/0x2c00
> [  922.311668][ T6194]  fib6_add+0x38d/0xec0
> [  922.311699][ T6194]  ip6_route_multipath_add+0x75b/0x1450
> [  922.311753][ T6194]  inet6_rtm_newroute+0xb2/0x120
> [  922.311795][ T6194]  rtnetlink_rcv_msg+0x710/0xc00
> [  922.311819][ T6194]  netlink_rcv_skb+0x12f/0x360
> [  922.311869][ T6194]  netlink_unicast+0x449/0x710
> [  922.311891][ T6194]  netlink_sendmsg+0x721/0xbe0
> [  922.311922][ T6194]  ____sys_sendmsg+0x7aa/0xa10
> [  922.311954][ T6194]  ___sys_sendmsg+0xed/0x170
> [  922.312031][ T6194]  __sys_sendmsg+0x108/0x1a0
> [  922.312061][ T6194]  do_syscall_64+0xc1/0x1d0
> [  922.312069][ T6194]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> [  922.312074][ T6194] RIP: 0033:0x7f8e77c649a7
> [  922.312078][ T6194] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
> [  922.312081][ T6194] RSP: 002b:00007ffd73480708 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> [  922.312086][ T6194] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f8e77c649a7
> [  922.312088][ T6194] RDX: 0000000000000000 RSI: 00007ffd73480770 RDI: 0000000000000005
> [  922.312090][ T6194] RBP: 00007ffd73480abc R08: 0000000000000038 R09: 0000000000000000
> [  922.312092][ T6194] R10: 000000000b9c6910 R11: 0000000000000246 R12: 00007ffd73481a80
> [  922.312094][ T6194] R13: 00000000682562aa R14: 0000000000498600 R15: 00007ffd7348499b
> [  922.312108][ T6194]  </TASK>
> 
> see:
> 
> https://netdev.bots.linux.dev/contest.html?branch=net-next-2025-05-15--03-00&executor=vmksft-forwarding-dbg&pw-n=0&pass=0
> 
> Thanks,
> 
> Paolo
