[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251214161023.5qcyyifscu73b47u@skbuf>
Date: Sun, 14 Dec 2025 18:10:23 +0200
From: Vladimir Oltean <olteanv@...il.com>
To: Jonas Gorski <jonas.gorski@...il.com>
Cc: Ma Ke <make24@...as.ac.cn>, andrew@...n.ch, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
horms@...nel.org, tobias@...dekranz.com, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
stable@...r.kernel.org
Subject: Re: [PATCH v2] net: dsa: Fix error handling in dsa_port_parse_of
Hi Jonas, Ma Ke,
On Sun, Dec 14, 2025 at 05:02:33PM +0100, Jonas Gorski wrote:
> Hi,
>
> On 12/14/25 14:12, Ma Ke wrote:
> > When of_find_net_device_by_node() successfully acquires a reference to
>
> Your subject is missing the () of dsa_port_parse_of()
>
> > a network device but the subsequent call to dsa_port_parse_cpu()
> > fails, dsa_port_parse_of() returns without releasing the reference
> > count on the network device.
> >
> > of_find_net_device_by_node() increments the reference count of the
> > returned structure, which should be balanced with a corresponding
> > put_device() when the reference is no longer needed.
> >
> > Found by code review.
>
> I agree with the reference not being properly released on failure,
> but I don't think this fix is complete.
>
> I was trying to figure out where the put_device() would happen in
> the success case (or on removal), and I failed to find it.
>
> Also if the (indirect) top caller of dsa_port_parse_of(),
> dsa_switch_probe(), fails at a later place the reference won't be
> released either.
>
> The only explicit put_device() that happens is in
> dsa_dev_to_net_device(), which seems to convert a device
> reference to a netdev reference via dev_hold().
>
> But the only caller of that, dsa_port_parse() immediately
> calls dev_put() on it, essentially dropping all references, and
> then continuing using it.
>
> dsa_switch_shutdown() talks about dropping references taken via
> netdev_upper_dev_link(), but AFAICT this happens only after
> dsa_port_parse{,_of}() setup the conduit, so it looks like there
> could be a window without any reference held onto the conduit.
>
> So AFAICT the current state is:
>
> dsa_port_parse_of() keeps the device reference.
> dsa_port_parse() drops the device reference, and shortly has a
> dev_hold(), but it does not extend beyond the function.
>
> Therefore if my analysis is correct (which it may very well not
> be), the correct fix(es) here could be:
>
> dsa_port_parse{,_of}() should keep a reference via e.g. dev_hold()
> on success to the conduit.
>
> Or maybe they should unconditionally drop if *after* calling
> dsa_port_parse_cpu(), and dsa_port_parse_cpu() should take one
> when assigning dsa_port::conduit.
>
> Regardless, the end result should be that there is a reference on
> the conduit stored in dsa_port::conduit.
>
> dsa_switch_release_ports() should drop the references, as this
> seems to be called in all error paths of dsa_port_parse{,of} as
> well by dsa_switch_remove().
>
> And maybe dsa_switch_shutdown() then also needs to drop the
> reference? Though it may need to then retake the reference on
> resume, and I don't know where that exactly should happen. Maybe
> it should also lookup the conduit(s) again to be correct.
>
> But here I'm more doing educated guesses then actually knowing
> what's correct.
>
> The alternative/quick "fix" would be to just drop the
> reference unconditionally, which would align the behaviour
> to that of dsa_port_parse(). Not sure if it should mirror the
> dev_hold() / dev_put() spiel as well.
>
> Not that I think this would be the correct behaviour though.
>
> Sorry for the lengthy review/train of thought.
>
> Best regards,
> Jonas
Thank you for your thoughts on this topic. Indeed there is a problem,
for which I managed to find a few hours today to investigate. I was
going to just submit a patch directly and refer Ma Ke to it directly,
but since you started looking into the situation as well, I just thought
I'd reply "please standby". It's currently undergoing testing.
Powered by blists - more mailing lists