Message-ID: <b6e0d1e5-bd50-464a-9eae-05ecd11de4ee@linux.dev>
Date: Tue, 16 Sep 2025 17:20:19 +0100
From: Vadim Fedorenko <vadim.fedorenko@...ux.dev>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Richard Cochran <richardcochran@...il.com>,
Ajay Kaher <ajay.kaher@...adcom.com>,
Alexey Makhalov <alexey.makhalov@...adcom.com>,
Andrew Lunn <andrew+netdev@...n.ch>,
Broadcom internal kernel review list
<bcm-kernel-feedback-list@...adcom.com>, Clark Wang <xiaoning.wang@....com>,
"David S. Miller" <davem@...emloft.net>,
David Woodhouse <dwmw2@...radead.org>, Eric Dumazet <edumazet@...gle.com>,
imx@...ts.linux.dev, Jakub Kicinski <kuba@...nel.org>,
Jonathan Lemon <jonathan.lemon@...il.com>, netdev@...r.kernel.org,
Nick Shi <nick.shi@...adcom.com>, Paolo Abeni <pabeni@...hat.com>,
Sven Schnelle <svens@...ux.ibm.com>,
Vladimir Oltean <vladimir.oltean@....com>, Wei Fang <wei.fang@....com>,
Yangbo Lu <yangbo.lu@....com>
Subject: Re: [PATCH net-next 2/2] ptp: rework ptp_clock_unregister() to
disable events

On 16/09/2025 16:45, Russell King (Oracle) wrote:
> On Tue, Sep 16, 2025 at 02:02:56PM +0100, Vadim Fedorenko wrote:
>> On 15/09/2025 15:42, Russell King (Oracle) wrote:
>>> The ordering of ptp_clock_unregister() is not ideal, as the chardev
>>> remains published while state is being torn down. There is also no
>>> cleanup of enabled pin settings, which means enabled events can
>>> still forward into the core.
>>>
>>> Rework the ordering of cleanup in ptp_clock_unregister() so that we
>>> unpublish the posix clock (and user chardev), disable any pins that
>>> have events enabled, and then clean up the aux work and PPS source.
>>>
>>> This avoids potential use-after-free and races in PTP clock driver
>>> teardown.
>>>
>>> Signed-off-by: Russell King (Oracle) <rmk+kernel@...linux.org.uk>
>>> ---
>>> drivers/ptp/ptp_chardev.c | 13 +++++++++++++
>>> drivers/ptp/ptp_clock.c | 17 ++++++++++++++++-
>>> drivers/ptp/ptp_private.h | 2 ++
>>> 3 files changed, 31 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c
>>> index eb4f6d1b1460..640a98f17739 100644
>>> --- a/drivers/ptp/ptp_chardev.c
>>> +++ b/drivers/ptp/ptp_chardev.c
>>> @@ -47,6 +47,19 @@ static int ptp_disable_pinfunc(struct ptp_clock_info *ops,
>>>  	return err;
>>>  }
>>> +void ptp_disable_all_pins(struct ptp_clock *ptp)
>>> +{
>>> +	struct ptp_clock_info *info = ptp->info;
>>> +	unsigned int i;
>>> +
>>> +	mutex_lock(&ptp->pincfg_mux);
>>> +	for (i = 0; i < info->n_pins; i++)
>>> +		if (info->pin_config[i].func != PTP_PF_NONE)
>>> +			ptp_disable_pinfunc(info, info->pin_config[i].func,
>>> +					    info->pin_config[i].chan);
>>> +	mutex_unlock(&ptp->pincfg_mux);
>>> +}
>>> +
>>
>> This part is questionable. We do have devices which have PPS out enabled
>> by default. The driver reads pins configuration from the HW during PTP
>> init phase and sets up proper function for pin in ptp_info::pin_config.
>>
>> With this patch applied these pins have PEROUT function disabled on
>> shutdown and in case of kexec'ing into a new kernel the PPS out feature
>> needs to be manually enabled, and it breaks expected behavior.
>
> Does kexec go to the trouble of unregistering PTP clocks? I don't see
> any driver in drivers/ptp/ that has the .shutdown method populated.
>
> That doesn't mean there aren't - it isn't obvious where they are or
> if it does happen.

Unregistering the PTP clock on shutdown is part of mlx5 and at least
Intel's igc and igb drivers.

> The question about whether one wants to leave the other features in
> place when the driver is removed is questionable - without the driver
> (or indeed without something disciplining the clock) it's going to
> drift, so the accuracy of e.g. the PPS signal is going to be as good
> as the clock source clocking the TAI.

In our use case the PPS output feeds external monitoring, and we want
to see the PPS signal drift if anything goes wrong.

> Having used NTP with a PPS sourced from a GPS, I'd personally want
> the PPS to stop if the GPS is no longer synchronised, so NTP knows
> that a fault has occurred, rather than PPS continuing but being
> undisciplined and thus of unknown accuracy.
>
> I'd suggest that whether PPS continues to be generated should be a
> matter of user policy. I would suggest that policy should include
> whether or not userspace is disciplining the clock - in other words,
> whether the /dev/ptp* device is open or not.

Deducing this from the number of references to the PTP device is not
quite correct. Another option is to introduce a separate flag and use
it as a signal to remove the function on error, shutdown, etc.
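
To make the flag idea concrete, here is a minimal userspace sketch. It is
not kernel code and not part of the patch. Everything in it is an
assumption made for illustration: struct pin_desc, the PF_* values, the
PIN_DISABLE_ON_UNREGISTER flag and the function name are hypothetical
stand-ins, and clearing func stands in for calling the driver's disable
hook. Only pins that carry the flag are torn down, so a PPS output that
is enabled by default keeps running across driver removal or kexec:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's pin function values. */
enum pin_func { PF_NONE, PF_EXTTS, PF_PEROUT };

/* Hypothetical flag: remove this pin's function when the clock goes away. */
#define PIN_DISABLE_ON_UNREGISTER (1u << 0)

/* Hypothetical, reduced model of one pin's configuration. */
struct pin_desc {
	enum pin_func func;
	unsigned int chan;
	unsigned int flags;
};

/*
 * Walk all pins and disable only those that are active AND flagged.
 * Unflagged pins are left untouched, so their output keeps running.
 * Returns the number of pins that were disabled.
 */
static size_t disable_flagged_pins(struct pin_desc *pins, size_t n_pins)
{
	size_t disabled = 0;

	assert(pins != NULL || n_pins == 0);

	for (size_t i = 0; i < n_pins; i++) {
		if (pins[i].func == PF_NONE)
			continue;	/* nothing enabled on this pin */
		if (!(pins[i].flags & PIN_DISABLE_ON_UNREGISTER))
			continue;	/* policy: leave this one running */
		pins[i].func = PF_NONE;	/* models the driver disable call */
		disabled++;
	}
	return disabled;
}
```

With a default-enabled PEROUT pin left unflagged and an EXTTS pin
flagged, only the EXTTS pin is disabled. Who sets the flag (the driver
or userspace) is left open here.
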
> Consider the case where the userspace daemons get OOM-killed and
> that isn't realised. The PPS signal continues to be generated but
> is now unsynchronised and drifts. Yet PPS users continue to
> believe it's accurate.

And again, there is another use case which actually needs this
unsynchronised signal.