lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20251214182903.jliq46n6ulbrwoha@skbuf>
Date: Sun, 14 Dec 2025 20:29:03 +0200
From: Vladimir Oltean <olteanv@...il.com>
To: Jonas Gorski <jonas.gorski@...il.com>
Cc: Ma Ke <make24@...as.ac.cn>, andrew@...n.ch, davem@...emloft.net,
	edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
	horms@...nel.org, tobias@...dekranz.com, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
	stable@...r.kernel.org
Subject: Re: [PATCH v2] net: dsa: Fix error handling in dsa_port_parse_of

On Sun, Dec 14, 2025 at 05:14:04PM +0100, Jonas Gorski wrote:
> On Sun, Dec 14, 2025 at 5:10 PM Vladimir Oltean <olteanv@...il.com> wrote:
> >
> > Hi Jonas, Ma Ke,
> >
> > On Sun, Dec 14, 2025 at 05:02:33PM +0100, Jonas Gorski wrote:
> > > Hi,
> > >
> > > On 12/14/25 14:12, Ma Ke wrote:
> > > > When of_find_net_device_by_node() successfully acquires a reference to
> > >
> > > Your subject is missing the () of dsa_port_parse_of()
> > >
> > > > a network device but the subsequent call to dsa_port_parse_cpu()
> > > > fails, dsa_port_parse_of() returns without releasing the reference
> > > > count on the network device.
> > > >
> > > > of_find_net_device_by_node() increments the reference count of the
> > > > returned structure, which should be balanced with a corresponding
> > > > put_device() when the reference is no longer needed.
> > > >
> > > > Found by code review.
> > >
> > > I agree with the reference not being properly released on failure,
> > > but I don't think this fix is complete.
> > >
> > > I was trying to figure out where the put_device() would happen in
> > > the success case (or on removal), and I failed to find it.
> > >
> > > Also if the (indirect) top caller of dsa_port_parse_of(),
> > > dsa_switch_probe(), fails at a later place the reference won't be
> > > released either.
> > >
> > > The only explicit put_device() that happens is in
> > > dsa_dev_to_net_device(), which seems to convert a device
> > > reference to a netdev reference via dev_hold().
> > >
> > > But the only caller of that, dsa_port_parse() immediately
> > > calls dev_put() on it, essentially dropping all references, and
> > > then continuing using it.
> > >
> > > dsa_switch_shutdown() talks about dropping references taken via
> > > netdev_upper_dev_link(), but AFAICT this happens only after
> > > dsa_port_parse{,_of}() setup the conduit, so it looks like there
> > > could be a window without any reference held onto the conduit.
> > >
> > > So AFAICT the current state is:
> > >
> > > dsa_port_parse_of() keeps the device reference.
> > > dsa_port_parse() drops the device reference, and shortly has a
> > > dev_hold(), but it does not extend beyond the function.
> > >
> > > Therefore if my analysis is correct (which it may very well not
> > > be), the correct fix(es) here could be:
> > >
> > > dsa_port_parse{,_of}() should keep a reference via e.g. dev_hold()
> > > on success to the conduit.
> > >
> > > Or maybe they should unconditionally drop if *after* calling
> > > dsa_port_parse_cpu(), and dsa_port_parse_cpu() should take one
> > > when assigning dsa_port::conduit.
> > >
> > > Regardless, the end result should be that there is a reference on
> > > the conduit stored in dsa_port::conduit.
> > >
> > > dsa_switch_release_ports() should drop the references, as this
> > > seems to be called in all error paths of dsa_port_parse{,of} as
> > > well by dsa_switch_remove().
> > >
> > > And maybe dsa_switch_shutdown() then also needs to drop the
> > > reference? Though it may need to then retake the reference on
> > > resume, and I don't know where that exactly should happen. Maybe
> > > it should also lookup the conduit(s) again to be correct.
> > >
> > > But here I'm more doing educated guesses then actually knowing
> > > what's correct.
> > >
> > > The alternative/quick "fix" would be to just drop the
> > > reference unconditionally, which would align the behaviour
> > > to that of dsa_port_parse(). Not sure if it should mirror the
> > > dev_hold() / dev_put() spiel as well.
> > >
> > > Not that I think this would be the correct behaviour though.
> > >
> > > Sorry for the lengthy review/train of thought.
> > >
> > > Best regards,
> > > Jonas
> >
> > Thank you for your thoughts on this topic. Indeed there is a problem,
> > for which I managed to find a few hours today to investigate. I was
> > going to just submit a patch directly and refer Ma Ke to it directly,
> > but since you started looking into the situation as well, I just thought
> > I'd reply "please standby". It's currently undergoing testing.
> 
> A patch already, that's even better! I'll gladly stand by :)
> 
> Best regards,
> Jonas

Patch location (for tracking purposes):
https://lore.kernel.org/netdev/20251214182449.3900190-1-vladimir.oltean@nxp.com/

Ma Ke's patch unfortunately does not compile even in v2, so:

pw-bot: cr

(although "changes requested" is a figure of speech here, in the sense
that I don't expect a follow-up patch from Ma Ke)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ