Message-ID: <9187e9a3-fb93-4927-b02f-7f41176f844d@sigma-star.at>
Date: Tue, 1 Apr 2025 10:12:38 +0200
From: David Oberhollenzer <david.oberhollenzer@...ma-star.at>
To: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org, andrew@...n.ch, Julian.FRIEDRICH@...quentis.com,
f.fainelli@...il.com, olteanv@...il.com, davem@...emloft.net,
edumazet@...gle.com, pabeni@...hat.com, linux-kernel@...r.kernel.org,
upstream+netdev@...ma-star.at
Subject: Re: [PATCH v3] net: dsa: mv88e6xxx: propperly shutdown PPU re-enable
timer on destroy
Hi,
I did some further re-testing of the fix, regarding the similar race
in remove() as well as the earlier question about the locking and
cancellation order. v3 already expands on this, and the point still stands:
the nested timer+queue+trylock mechanism is somewhat tricky, and I can still
hit the race window with just cancel_work_sync() and no lock, or with a
different order for the teardown.
On 1/15/25 12:27 AM, Jakub Kicinski wrote:
> On Mon, 13 Jan 2025 09:49:12 +0100 David Oberhollenzer wrote:
>> @@ -7323,6 +7323,8 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
>> mv88e6xxx_g1_irq_free(chip);
>> else
>> mv88e6xxx_irq_poll_free(chip);
>> +out_phy:
>> + mv88e6xxx_phy_destroy(chip);
>> out:
>> if (pdata)
>> dev_put(pdata->netdev);
>
> If this is the right ordering the order in mv88e6xxx_remove()
> looks suspicious. We call mv88e6xxx_phy_destroy() pretty early
> and then unregister from DSA. Isn't there a window where DSA
> callbacks can reschedule the timer?
Yes, this does look suspicious. mv88e6xxx_phy_destroy() should be called
after the switch is unregistered; otherwise it should logically cause
the same issue.
However, I did not manage to trigger this during testing, and this also
did not fix the original issue I saw. I will still fix the order in a
follow-up v4 patch.
Greetings,
David