Message-ID: <20251214161023.5qcyyifscu73b47u@skbuf>
Date: Sun, 14 Dec 2025 18:10:23 +0200
From: Vladimir Oltean <olteanv@...il.com>
To: Jonas Gorski <jonas.gorski@...il.com>
Cc: Ma Ke <make24@...as.ac.cn>, andrew@...n.ch, davem@...emloft.net,
	edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
	horms@...nel.org, tobias@...dekranz.com, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
	stable@...r.kernel.org
Subject: Re: [PATCH v2] net: dsa: Fix error handling in dsa_port_parse_of

Hi Jonas, Ma Ke,

On Sun, Dec 14, 2025 at 05:02:33PM +0100, Jonas Gorski wrote:
> Hi,
> 
> On 12/14/25 14:12, Ma Ke wrote:
> > When of_find_net_device_by_node() successfully acquires a reference to
> 
> Your subject is missing the () of dsa_port_parse_of()
> 
> > a network device but the subsequent call to dsa_port_parse_cpu()
> > fails, dsa_port_parse_of() returns without releasing the reference
> > count on the network device.
> > 
> > of_find_net_device_by_node() increments the reference count of the
> > returned structure, which should be balanced with a corresponding
> > put_device() when the reference is no longer needed.
> > 
> > Found by code review.
> 
> I agree with the reference not being properly released on failure,
> but I don't think this fix is complete.
> 
> I was trying to figure out where the put_device() would happen in
> the success case (or on removal), and I failed to find it.
> 
> Also if the (indirect) top caller of dsa_port_parse_of(),
> dsa_switch_probe(), fails at a later place the reference won't be
> released either.
> 
> The only explicit put_device() that happens is in
> dsa_dev_to_net_device(), which seems to convert a device
> reference to a netdev reference via dev_hold().
> 
> But the only caller of that, dsa_port_parse() immediately
> calls dev_put() on it, essentially dropping all references, and
> then continuing using it.
> 
> dsa_switch_shutdown() talks about dropping references taken via
> netdev_upper_dev_link(), but AFAICT this happens only after
> dsa_port_parse{,_of}() set up the conduit, so it looks like there
> could be a window without any reference held onto the conduit.
> 
> So AFAICT the current state is:
> 
> dsa_port_parse_of() keeps the device reference.
> dsa_port_parse() drops the device reference, and shortly has a
> dev_hold(), but it does not extend beyond the function.
> 
> Therefore if my analysis is correct (which it may very well not
> be), the correct fix(es) here could be:
> 
> dsa_port_parse{,_of}() should keep a reference via e.g. dev_hold()
> on success to the conduit.
> 
> Or maybe they should unconditionally drop it *after* calling
> dsa_port_parse_cpu(), and dsa_port_parse_cpu() should take one
> when assigning dsa_port::conduit.
> 
> Regardless, the end result should be that there is a reference on
> the conduit stored in dsa_port::conduit.
> 
> dsa_switch_release_ports() should drop the references, as this
> seems to be called in all error paths of dsa_port_parse{,_of}() as
> well as by dsa_switch_remove().
> 
> And maybe dsa_switch_shutdown() then also needs to drop the
> reference? Though it may need to then retake the reference on
> resume, and I don't know where that exactly should happen. Maybe
> it should also lookup the conduit(s) again to be correct.
> 
> But here I'm more making educated guesses than actually knowing
> what's correct.
> 
> The alternative/quick "fix" would be to just drop the
> reference unconditionally, which would align the behaviour
> to that of dsa_port_parse(). Not sure if it should mirror the
> dev_hold() / dev_put() spiel as well.
> 
> Not that I think this would be the correct behaviour though.
> 
> Sorry for the lengthy review/train of thought.
> 
> Best regards,
> Jonas

Thank you for your thoughts on this topic. Indeed there is a problem,
which I managed to find a few hours today to investigate. I was going
to just submit a patch and refer Ma Ke to it directly, but since you
started looking into the situation as well, I thought I'd reply with
"please stand by". The patch is currently undergoing testing.
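
For readers following the thread, the reference balance Jonas proposes
(drop the lookup reference unconditionally after the parse step, let the
port take its own reference when it stores the conduit, and drop that
one on release) can be modelled in userspace roughly as below. This is
only a toy sketch, not kernel code and not the patch under test: every
toy_* name is hypothetical, and a plain integer stands in for the real
device reference count.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a refcounted device; NOT the kernel's struct net_device. */
struct toy_dev {
	int refcount;
};

/* Toy stand-in for a port; only the stored conduit pointer is modelled. */
struct toy_port {
	struct toy_dev *conduit;
};

static struct toy_dev *toy_get(struct toy_dev *dev)
{
	dev->refcount++;
	return dev;
}

static void toy_put(struct toy_dev *dev)
{
	assert(dev->refcount > 0);
	dev->refcount--;
}

/* Models a lookup that returns the device with an extra reference held. */
static struct toy_dev *toy_find(struct toy_dev *dev)
{
	return toy_get(dev);
}

/* Models the parse step: on success the port takes its own reference. */
static int toy_parse_cpu(struct toy_port *port, struct toy_dev *dev, int fail)
{
	if (fail)
		return -1;
	port->conduit = toy_get(dev);
	return 0;
}

/* The lookup reference is dropped on both the success and failure paths. */
static int toy_port_parse(struct toy_port *port, struct toy_dev *dev, int fail)
{
	struct toy_dev *conduit = toy_find(dev);
	int err = toy_parse_cpu(port, conduit, fail);

	toy_put(conduit);
	return err;
}

/* Models teardown: drop the reference held through the stored pointer. */
static void toy_release_port(struct toy_port *port)
{
	if (port->conduit) {
		toy_put(port->conduit);
		port->conduit = NULL;
	}
}
```

In this model a failed parse leaves the count where it started, a
successful parse leaves exactly one extra reference owned by the port,
and the release step returns the count to its starting value. Whether
the real fix takes this shape is not stated in this thread.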
