Message-ID: <aPZ4fLKBiCCIGr9e@stanley.mountain>
Date: Mon, 20 Oct 2025 20:59:24 +0300
From: Dan Carpenter <dan.carpenter@...aro.org>
To: Lizhi Xu <lizhi.xu@...driver.com>
Cc: davem@...emloft.net, edumazet@...gle.com, horms@...nel.org,
	kuba@...nel.org, linux-hams@...r.kernel.org,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	pabeni@...hat.com,
	syzbot+2860e75836a08b172755@...kaller.appspotmail.com,
	syzkaller-bugs@...glegroups.com
Subject: Re: [PATCH V2] netrom: Prevent race conditions between multiple add
 route

On Mon, Oct 20, 2025 at 09:49:12PM +0800, Lizhi Xu wrote:
> On Mon, 20 Oct 2025 21:34:56 +0800, Lizhi Xu wrote:
> > > Task0					Task1						Task2
> > > =====					=====						=====
> > > [97] nr_add_node()
> > > [113] nr_neigh_get_dev()		[97] nr_add_node()
> > > 					[214] nr_node_lock()
> > > 					[245] nr_node->routes[2].neighbour->count--
> > > 					[246] nr_neigh_put(nr_node->routes[2].neighbour);
> > > 					[248] nr_remove_neigh(nr_node->routes[2].neighbour)
> > > 					[283] nr_node_unlock()
> > > [214] nr_node_lock()
> > > [253] nr_node->routes[2].neighbour = nr_neigh
> > > [254] nr_neigh_hold(nr_neigh);							[97] nr_add_node()
> > > 											[XXX] nr_neigh_put()
> > >                                                                                         ^^^^^^^^^^^^^^^^^^^^
> > > 
> > > These charts are supposed to be chronological so [XXX] is wrong because the
> > > use after free happens on line [248].  Do we really need three threads to
> > > make this race work?
> > The UAF problem occurs in Task2. Task1 sets the refcount of nr_neigh to 1,
> > then Task0 adds it to routes[2]. Task2 releases routes[2].neighbour after
> > executing [XXX]nr_neigh_put().
> Execution Order:
> 1 -> Task0
> [113] nr_neigh_get_dev() // After execution, the refcount value is 3
> 
> 2 -> Task1
> [246] nr_neigh_put(nr_node->routes[2].neighbour);   // After execution, the refcount value is 2
> [248] nr_remove_neigh(nr_node->routes[2].neighbour) // After execution, the refcount value is 1
> 
> 3 -> Task0
> [253] nr_node->routes[2].neighbour = nr_neigh       // nr_neigh's refcount value is 1 and add it to routes[2]
> 
> 4 -> Task2
> [XXX] nr_neigh_put(nr_node->routes[2].neighbour)    // After execution, neighhour is freed
> if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)  // Uaf occurs this line when accessing neighbour->count

Let's step back a bit and look at the bigger-picture design (which is
completely undocumented, so we're just guessing).

When we put nr_neigh into nr_node->routes[] we take a reference with
nr_neigh_hold() and do nr_neigh->count++, then when we remove it from
->routes[] we drop the reference and do nr_neigh->count--.

If that was the last ->routes[] user (->count drops to zero) and ->locked
is not set, then we remove it from the &nr_neigh_list and drop the
reference count again, which can free it.  So one removal can drop the
reference count twice.  This is a complicated design with three
variables: the nr_neigh_hold() refcount, nr_neigh->count and ->locked.
Why can it not just be one counter, the nr_neigh_hold() refcount?  So
instead of setting locked = true we would just take an extra reference?
The nr_neigh->count++ would be replaced with nr_neigh_hold() as well.
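
To make the single-counter idea concrete, here is a minimal userspace
sketch.  It is not kernel code and not a proposed patch: struct one_neigh
and every one_*() helper are hypothetical stand-ins, and it assumes that
the last put is what unlinks the entry from the list (the real code would
need the list lock for that).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, NOT the kernel's struct nr_neigh. */
struct one_neigh {
	int refs;	/* the only counter: stands in for the nr_neigh_hold() refcount */
	bool on_list;	/* still linked on the neighbour list? */
	bool freed;	/* set instead of calling free() so callers can inspect it */
};

static void one_hold(struct one_neigh *n)
{
	n->refs++;
}

static void one_put(struct one_neigh *n)
{
	assert(n->refs > 0);
	if (--n->refs == 0) {
		/* Assumption: the last put unlinks and frees in one place. */
		n->on_list = false;
		n->freed = true;
	}
}

/* A routes[] slot is just one more reference; there is no separate ->count. */
static void one_route_add(struct one_neigh *n) { one_hold(n); }
static void one_route_del(struct one_neigh *n) { one_put(n); }

/* "locked" becomes an extra reference instead of a flag. */
static void one_lock(struct one_neigh *n)   { one_hold(n); }
static void one_unlock(struct one_neigh *n) { one_put(n); }
```

With this shape every operation drops at most one reference, so a caller
that took its own reference keeps the object alive until its own put.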

Because that's fundamentally the problem, right?  We call
nr_neigh_get_dev() so we think we're holding a reference and we're
safe, but we don't realize that calling neighbour->count-- can
result in dropping two references.
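
As a rough illustration of that double drop, here is a userspace toy
model of the scheme as I understand it.  The struct, the toy_*() helpers
and the starting refcount are all assumptions made for illustration; the
ordering in toy_route_del() follows lines [245], [246] and [248] of the
trace quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, NOT the kernel's struct nr_neigh. */
struct toy_neigh {
	int refs;	/* stands in for the nr_neigh_hold()/nr_neigh_put() refcount */
	int count;	/* stands in for nr_neigh->count */
	bool locked;	/* stands in for nr_neigh->locked */
	bool on_list;	/* still linked on the neighbour list? */
	bool freed;	/* set instead of calling free() so callers can inspect it */
};

static void toy_hold(struct toy_neigh *n)
{
	n->refs++;
}

static void toy_put(struct toy_neigh *n)
{
	assert(n->refs > 0);
	if (--n->refs == 0)
		n->freed = true;
}

/* Unlink from the list and drop the list's reference. */
static void toy_remove(struct toy_neigh *n)
{
	n->on_list = false;
	toy_put(n);
}

/* Remove from routes[]: drops one reference, or two if count hits zero. */
static void toy_route_del(struct toy_neigh *n)
{
	n->count--;				/* like [245] */
	toy_put(n);				/* like [246] */
	if (n->count == 0 && !n->locked)
		toy_remove(n);			/* like [248] */
}
```

Starting from refs == 2 (assumed: one for the list, one for routes[]), a
caller's toy_hold() makes it 3 and a single toy_route_del() takes it
straight down to 1, matching the 3, 2, 1 sequence in the quoted execution
order.  Only the caller's own reference is left, even though the caller
saw just one removal.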

regards,
dan carpenter

