[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250319155904.6616-1-kuniyu@amazon.com>
Date: Wed, 19 Mar 2025 08:57:46 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <pabeni@...hat.com>
CC: <davem@...emloft.net>, <dsahern@...nel.org>, <edumazet@...gle.com>,
<horms@...nel.org>, <kuba@...nel.org>, <kuni1840@...il.com>,
<kuniyu@...zon.com>, <netdev@...r.kernel.org>
Subject: Re: [PATCH v1 net-next 0/7] nexthop: Convert RTM_{NEW,DEL}NEXTHOP to per-netns RTNL.
From: Paolo Abeni <pabeni@...hat.com>
Date: Wed, 19 Mar 2025 08:57:52 +0100
> Hi,
>
> On 3/19/25 12:31 AM, Kuniyuki Iwashima wrote:
> > Patch 1 - 5 move some validation for RTM_NEWNEXTHOP so that it can be
> > done without RTNL.
> >
> > Patch 6 & 7 converts RTM_NEWNEXTHOP and RTM_DELNEXTHOP to per-netns RTNL.
> >
> > Note that RTM_GETNEXTHOP and RTM_GETNEXTHOPBUCKET are not touched in
> > this series.
> >
> > rtm_get_nexthop() can be easily converted to RCU, but rtm_dump_nexthop()
> > needs more work due to the left-to-right rbtree walk, which looks prone
> > to node deletion and tree rotation without a retry mechanism.
> >
> >
> > Kuniyuki Iwashima (7):
> > nexthop: Move nlmsg_parse() in rtm_to_nh_config() to
> > rtm_new_nexthop().
> > nexthop: Split nh_check_attr_group().
> > nexthop: Move NHA_OIF validation to rtm_to_nh_config_rtnl().
> > nexthop: Check NLM_F_REPLACE and NHA_ID in rtm_new_nexthop().
> > nexthop: Remove redundant group len check in nexthop_create_group().
> > nexthop: Convert RTM_NEWNEXTHOP to per-netns RTNL.
> > nexthop: Convert RTM_DELNEXTHOP to per-netns RTNL.
> >
> > net/ipv4/nexthop.c | 183 +++++++++++++++++++++++++++------------------
> > 1 file changed, 112 insertions(+), 71 deletions(-)
>
> This series is apparently causing NULL ptr deref in the nexthop.sh
> netdevsim selftests. Unfortunately, due to a transient nipa infra
> outage, a lot of stuff landed into the same batch, so I'm not 110% this
> series is the real curprit but looks like a reasonable suspect.
>
> Kuniyuki, could you please have a look?
>
> ---
> [ 1.653896] BUG: kernel NULL pointer dereference, address:
> 0000000000000068
> [ 1.653963] #PF: supervisor read access in kernel mode
> [ 1.654003] #PF: error_code(0x0000) - not-present page
> [ 1.654037] PGD 7828067 P4D 7828067 PUD 782a067 PMD 0
> [ 1.654077] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
> [ 1.654119] CPU: 0 UID: 0 PID: 303 Comm: ip Not tainted
> 6.14.0-rc6-virtme #1
> [ 1.654176] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [ 1.654219] RIP: 0010:rtm_new_nexthop+0x645/0x2260
Sorry, I failed to resolve conflict during the last minute rebase,
and the normal test bailed out here...
---8<---
@@ -3245,7 +3248,7 @@ static int rtm_new_nexthop(struct sk_buff *skb, struct nlmsghdr *nlh,
goto out;
err = rtm_to_nh_config_rtnl(net, tb, extack);
- if (!err)
+ if (err)
goto out;
nh = nexthop_add(net, &cfg, extack);
---8<---
The failed test case created a nexthop group with an invalid ID,
and nexthop_get() for nh by nexthop_find_by_id() assumes nh is not
NULL because it's checked in advance.
Will squash the diff above in v2.
Thanks!
Powered by blists - more mailing lists