[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f8b5beb4-ac3d-417c-810e-d96901f688db@bp.renesas.com>
Date: Wed, 22 Jan 2025 14:03:37 +0000
From: Paul Barker <paul.barker.ct@...renesas.com>
To: Kory Maincent <kory.maincent@...tlin.com>
Cc: Jakub Kicinski <kuba@...nel.org>, "David S. Miller"
<davem@...emloft.net>, netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
Claudiu Beznea <claudiu.beznea.uj@...renesas.com>,
thomas.petazzoni@...tlin.com, Andrew Lunn <andrew@...n.ch>,
Heiner Kallweit <hkallweit1@...il.com>, Russell King
<linux@...linux.org.uk>, Eric Dumazet <edumazet@...gle.com>,
Paolo Abeni <pabeni@...hat.com>,
Niklas Söderlund <niklas.soderlund@...natech.se>,
Sergey Shtylyov <s.shtylyov@....ru>
Subject: Re: [PATCH net-next v3] net: phy: Fix suspicious rcu_dereference
usage
On 21/01/2025 16:11, Kory Maincent wrote:
> On Tue, 21 Jan 2025 15:44:34 +0000
> Paul Barker <paul.barker.ct@...renesas.com> wrote:
>
>> On 21/01/2025 13:01, Kory Maincent wrote:
>>
>>> On Tue, 21 Jan 2025 11:34:48 +0000
>>> Paul Barker <paul.barker.ct@...renesas.com> wrote:
>>>
>>>> Why do we need to hold the rtnl mutex across the calls to
>>>> netif_device_detach() and ravb_wol_setup()?
>>>>
>>>> My reading of Documentation/networking/netdevices.rst is that the rtnl
>>>> mutex is held when the net subsystem calls the driver's ndo_stop method,
>>>> which in our case is ravb_close(). So, we should take the rtnl mutex
>>>> when we call ravb_close() directly, in both ravb_suspend() and
>>>> ravb_wol_restore(). That would ensure that we do not call
>>>> phy_disconnect() without holding the rtnl mutex and should fix this
>>>> issue.
>>>
>>> Not sure about it. For example ravb_ptp_stop() called in ravb_wol_setup()
>>> won't be protected by the rtnl lock.
>>
>> ravb_ptp_stop() modifies a couple of device registers and calls
>> ptp_clock_unregister(). I don't see anything to suggest that this
>> requires the rtnl lock to be held, unless I am missing something.
>
> What happens if two ptp_clock_unregister() with the same ptp_clock pointer are
> called simultaneously? From ravb_suspend and ravb_set_ringparam for example. It
> may cause some errors.
> For example the ptp->kworker pointer could be used after a kfree.
> https://elixir.bootlin.com/linux/v6.12.6/source/drivers/ptp/ptp_clock.c#L416
I've dug into this some more today and looked at the suspend/resume
paths of other Ethernet drivers for comparison.
netif_device_detach() and netif_device_attach() seem to be safe to call
without holding the rtnl lock, e.g. the stmmac driver does this.
In the suspend path, we should hold the rtnl lock across the calls to
ravb_wol_setup() and ravb_close().
In the resume path, we should hold the rtnl lock across the calls to
ravb_wol_restore() and ravb_open(). I don't think there is any harm in
holding the rtnl lock while we call pm_runtime_force_resume(), so we can
take the lock before checking priv->wol_enabled and release after
calling ravb_open().
How does that sound?
Thanks,
--
Paul Barker
Download attachment "OpenPGP_0x27F4B3459F002257.asc" of type "application/pgp-keys" (3521 bytes)
Download attachment "OpenPGP_signature.asc" of type "application/pgp-signature" (237 bytes)
Powered by blists - more mailing lists