netdev - RE: [PATCH v1] ptp: add ToD device driver for Intel FPGA cards

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BN9PR11MB54832FB2E5006135710D344EE3BF9@BN9PR11MB5483.namprd11.prod.outlook.com>
Date:   Wed, 15 Mar 2023 02:59:28 +0000
From:   "Zhang, Tianfei" <tianfei.zhang@...el.com>
To:     "Pagani, Marco" <marpagan@...hat.com>
CC:     "ilpo.jarvinen@...ux.intel.com" <ilpo.jarvinen@...ux.intel.com>,
        "andriy.shevchenko@...ux.intel.com" 
        <andriy.shevchenko@...ux.intel.com>,
        "Weight, Russell H" <russell.h.weight@...el.com>,
        "matthew.gerlach@...ux.intel.com" <matthew.gerlach@...ux.intel.com>,
        "pierre-louis.bossart@...ux.intel.com" 
        <pierre-louis.bossart@...ux.intel.com>,
        "Gomes, Vinicius" <vinicius.gomes@...el.com>,
        "Khadatare, RaghavendraX Anand" 
        <raghavendrax.anand.khadatare@...el.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-fpga@...r.kernel.org" <linux-fpga@...r.kernel.org>,
        "richardcochran@...il.com" <richardcochran@...il.com>
Subject: RE: [PATCH v1] ptp: add ToD device driver for Intel FPGA cards



> -----Original Message-----
> From: Marco Pagani <marpagan@...hat.com>
> Sent: Tuesday, March 14, 2023 1:50 AM
> To: Zhang, Tianfei <tianfei.zhang@...el.com>
> Cc: ilpo.jarvinen@...ux.intel.com; andriy.shevchenko@...ux.intel.com; Weight,
> Russell H <russell.h.weight@...el.com>; matthew.gerlach@...ux.intel.com; pierre-
> louis.bossart@...ux.intel.com; Gomes, Vinicius <vinicius.gomes@...el.com>;
> Khadatare, RaghavendraX Anand <raghavendrax.anand.khadatare@...el.com>;
> netdev@...r.kernel.org; linux-fpga@...r.kernel.org; richardcochran@...il.com
> Subject: Re: [PATCH v1] ptp: add ToD device driver for Intel FPGA cards
> 
> 
> 
> On 2023-03-13 04:02, Tianfei Zhang wrote:
> > Adding a DFL (Device Feature List) device driver of ToD device for
> > Intel FPGA cards.
> >
> > The Intel FPGA Time of Day(ToD) IP within the FPGA DFL bus is exposed
> > as PTP Hardware clock(PHC) device to the Linux PTP stack to
> > synchronize the system clock to its ToD information using phc2sys
> > utility of the Linux PTP stack. The DFL is a hardware List within
> > FPGA, which defines a linked list of feature headers within the device
> > MMIO space to provide an extensible way of adding subdevice features.
> >
> > Signed-off-by: Raghavendra Khadatare
> > <raghavendrax.anand.khadatare@...el.com>
> > Signed-off-by: Tianfei Zhang <tianfei.zhang@...el.com>
> > ---
> >  MAINTAINERS               |   7 +
> >  drivers/ptp/Kconfig       |  13 ++
> >  drivers/ptp/Makefile      |   1 +
> >  drivers/ptp/ptp_dfl_tod.c | 334
> > ++++++++++++++++++++++++++++++++++++++
> >  4 files changed, 355 insertions(+)
> >  create mode 100644 drivers/ptp/ptp_dfl_tod.c
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS index
> > ec57c42ed544..584979abbd92 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -15623,6 +15623,13 @@ L:	netdev@...r.kernel.org
> >  S:	Maintained
> >  F:	drivers/ptp/ptp_ocp.c
> >
> > +INTEL PTP DFL ToD DRIVER
> > +M:	Tianfei Zhang <tianfei.zhang@...el.com>
> > +L:	linux-fpga@...r.kernel.org
> > +L:	netdev@...r.kernel.org
> > +S:	Maintained
> > +F:	drivers/ptp/ptp_dfl_tod.c
> > +
> >  OPENCORES I2C BUS DRIVER
> >  M:	Peter Korsgaard <peter@...sgaard.com>
> >  M:	Andrew Lunn <andrew@...n.ch>
> > diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig index
> > fe4971b65c64..e0d6f136ee46 100644
> > --- a/drivers/ptp/Kconfig
> > +++ b/drivers/ptp/Kconfig
> > @@ -186,4 +186,17 @@ config PTP_1588_CLOCK_OCP
> >
> >  	  More information is available at http://www.timingcard.com/
> >
> > +config PTP_DFL_TOD
> > +	tristate "FPGA DFL ToD Driver"
> > +	depends on FPGA_DFL
> > +	help
> > +	  The DFL (Device Feature List) device driver for the Intel ToD
> > +	  (Time-of-Day) device in FPGA card. The ToD IP within the FPGA
> > +	  is exposed as PTP Hardware Clock (PHC) device to the Linux PTP
> > +	  stack to synchronize the system clock to its ToD information
> > +	  using phc2sys utility of the Linux PTP stack.
> > +
> > +	  To compile this driver as a module, choose M here: the module
> > +	  will be called ptp_dfl_tod.
> > +
> >  endmenu
> > diff --git a/drivers/ptp/Makefile b/drivers/ptp/Makefile index
> > 28a6fe342d3e..553f18bf3c83 100644
> > --- a/drivers/ptp/Makefile
> > +++ b/drivers/ptp/Makefile
> > @@ -18,3 +18,4 @@ obj-$(CONFIG_PTP_1588_CLOCK_IDTCM)	+=
> ptp_clockmatrix.o
> >  obj-$(CONFIG_PTP_1588_CLOCK_IDT82P33)	+= ptp_idt82p33.o
> >  obj-$(CONFIG_PTP_1588_CLOCK_VMW)	+= ptp_vmw.o
> >  obj-$(CONFIG_PTP_1588_CLOCK_OCP)	+= ptp_ocp.o
> > +obj-$(CONFIG_PTP_DFL_TOD)		+= ptp_dfl_tod.o
> > diff --git a/drivers/ptp/ptp_dfl_tod.c b/drivers/ptp/ptp_dfl_tod.c new
> > file mode 100644 index 000000000000..d9fbdfc53bd8
> > --- /dev/null
> > +++ b/drivers/ptp/ptp_dfl_tod.c
> > @@ -0,0 +1,334 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * DFL device driver for Time-of-Day (ToD) private feature
> > + *
> > + * Copyright (C) 2023 Intel Corporation  */
> > +
> > +#include <linux/bitfield.h>
> > +#include <linux/delay.h>
> > +#include <linux/dfl.h>
> > +#include <linux/gcd.h>
> > +#include <linux/iopoll.h>
> > +#include <linux/module.h>
> > +#include <linux/ptp_clock_kernel.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/units.h>
> > +
> > +#define FME_FEATURE_ID_TOD		0x22
> > +
> > +/* ToD clock register space. */
> > +#define TOD_CLK_FREQ			0x038
> > +
> > +/*
> > + * The read sequence of ToD timestamp registers: TOD_NANOSEC,
> > +TOD_SECONDSL and
> > + * TOD_SECONDSH, because there is a hardware snapshot whenever the
> > +TOD_NANOSEC
> > + * register is read.
> > + *
> > + * The ToD IP requires writing registers in the reverse order to the read sequence.
> > + * The timestamp is corrected when the TOD_NANOSEC register is
> > +written, so the
> > + * sequence of write TOD registers: TOD_SECONDSH, TOD_SECONDSL and
> TOD_NANOSEC.
> > + */
> > +#define TOD_SECONDSH			0x100
> > +#define TOD_SECONDSL			0x104
> > +#define TOD_NANOSEC			0x108
> > +#define TOD_PERIOD			0x110
> > +#define TOD_ADJUST_PERIOD		0x114
> > +#define TOD_ADJUST_COUNT		0x118
> > +#define TOD_DRIFT_ADJUST		0x11c
> > +#define TOD_DRIFT_ADJUST_RATE		0x120
> > +#define PERIOD_FRAC_OFFSET		16
> > +#define SECONDS_MSB			GENMASK_ULL(47, 32)
> > +#define SECONDS_LSB			GENMASK_ULL(31, 0)
> > +#define TOD_SECONDSH_SEC_MSB		GENMASK_ULL(15, 0)
> > +
> > +#define CAL_SECONDS(m, l)		((FIELD_GET(TOD_SECONDSH_SEC_MSB,
> (m)) << 32) | (l))
> > +
> > +#define TOD_PERIOD_MASK		GENMASK_ULL(19, 0)
> > +#define TOD_PERIOD_MAX			FIELD_MAX(TOD_PERIOD_MASK)
> > +#define TOD_PERIOD_MIN			0
> > +#define TOD_DRIFT_ADJUST_MASK		GENMASK_ULL(15, 0)
> > +#define TOD_DRIFT_ADJUST_FNS_MAX
> 	FIELD_MAX(TOD_DRIFT_ADJUST_MASK)
> > +#define TOD_DRIFT_ADJUST_RATE_MAX	TOD_DRIFT_ADJUST_FNS_MAX
> > +#define TOD_ADJUST_COUNT_MASK		GENMASK_ULL(19, 0)
> > +#define TOD_ADJUST_COUNT_MAX
> 	FIELD_MAX(TOD_ADJUST_COUNT_MASK)
> > +#define TOD_ADJUST_INTERVAL_US		1000
> > +#define TOD_ADJUST_MS			\
> > +		(((TOD_PERIOD_MAX >> 16) + 1) * (TOD_ADJUST_COUNT_MAX + 1))
> > +#define TOD_ADJUST_MS_MAX		(TOD_ADJUST_MS / MICRO)
> > +#define TOD_ADJUST_MAX_US		(TOD_ADJUST_MS_MAX *
> USEC_PER_MSEC)
> > +#define TOD_MAX_ADJ			(500 * MEGA)
> > +
> > +struct dfl_tod {
> > +	struct ptp_clock_info ptp_clock_ops;
> > +	struct device *dev;
> > +	struct ptp_clock *ptp_clock;
> > +
> > +	/* ToD Clock address space */
> > +	void __iomem *tod_ctrl;
> > +
> > +	/* ToD clock registers protection */
> > +	spinlock_t tod_lock;
> > +};
> > +
> > +/*
> > + * A fine ToD HW clock offset adjustment. To perform the fine offset
> > +adjustment, the
> > + * adjust_period and adjust_count argument are used to update the
> > +TOD_ADJUST_PERIOD
> > + * and TOD_ADJUST_COUNT register for in hardware. The dt->tod_lock
> > +spinlock must be
> > + * held when calling this function.
> > + */
> 
> Calling this function while holding the dt->tod_lock spinlock might cause an error
> since read_poll_timeout() can sleep. If it is required to use a spinlock, there is the
> readl_poll_timeout_atomic() function, which can be called from atomic context.
> However, in this case, it is probably better to use a mutex instead of a spinlock since
> the delay appears to be 1 ms.

To program the TOD registers needs to be done without interruption to ensure the correct values are programmed,
So I think using readl_poll_timeout_atomic() function here is better. This delay should be very faster, the 10 us interval delay
will be better.

> 
> > +static int fine_adjust_tod_clock(struct dfl_tod *dt, u32 adjust_period,
> > +				 u32 adjust_count)
> > +{
> > +	void __iomem *base = dt->tod_ctrl;
> > +	u32 val;
> > +
> > +	writel(adjust_period, base + TOD_ADJUST_PERIOD);
> > +	writel(adjust_count, base + TOD_ADJUST_COUNT);
> > +
> > +	/* Wait for present offset adjustment update to complete */
> > +	return readl_poll_timeout(base + TOD_ADJUST_COUNT, val, !val,
> TOD_ADJUST_INTERVAL_US,
> > +				  TOD_ADJUST_MAX_US);
> > +}
> > +
> > +/*
> > + * A coarse ToD HW clock offset adjustment.
> > + * The coarse time adjustment performs by adding or subtracting the
> > +delta value
> > + * from the current ToD HW clock time.
> > + */
> > +static int coarse_adjust_tod_clock(struct dfl_tod *dt, s64 delta) {
> > +	u32 seconds_msb, seconds_lsb, nanosec;
> > +	void __iomem *base = dt->tod_ctrl;
> > +	u64 seconds, now;
> > +
> > +	if (delta == 0)
> > +		return 0;
> > +
> > +	nanosec = readl(base + TOD_NANOSEC);
> > +	seconds_lsb = readl(base + TOD_SECONDSL);
> > +	seconds_msb = readl(base + TOD_SECONDSH);
> > +
> > +	/* Calculate new time */
> > +	seconds = CAL_SECONDS(seconds_msb, seconds_lsb);
> > +	now = seconds * NSEC_PER_SEC + nanosec + delta;
> > +
> > +	seconds = div_u64_rem(now, NSEC_PER_SEC, &nanosec);
> > +	seconds_msb = FIELD_GET(SECONDS_MSB, seconds);
> > +	seconds_lsb = FIELD_GET(SECONDS_LSB, seconds);
> > +
> > +	writel(seconds_msb, base + TOD_SECONDSH);
> > +	writel(seconds_lsb, base + TOD_SECONDSL);
> > +	writel(nanosec, base + TOD_NANOSEC);
> > +
> > +	return 0;
> > +}
> > +
> > +static int dfl_tod_adjust_fine(struct ptp_clock_info *ptp, long
> > +scaled_ppm) {
> > +	struct dfl_tod *dt = container_of(ptp, struct dfl_tod, ptp_clock_ops);
> > +	u32 tod_period, tod_rem, tod_drift_adjust_fns, tod_drift_adjust_rate;
> > +	void __iomem *base = dt->tod_ctrl;
> > +	unsigned long flags, rate;
> > +	u64 ppb;
> > +
> > +	/* Get the clock rate from clock frequency register offset */
> > +	rate = readl(base + TOD_CLK_FREQ);
> > +
> > +	/* add GIGA as nominal ppb */
> > +	ppb = scaled_ppm_to_ppb(scaled_ppm) + GIGA;
> > +
> > +	tod_period = div_u64_rem(ppb << PERIOD_FRAC_OFFSET, rate, &tod_rem);
> > +	if (tod_period > TOD_PERIOD_MAX)
> > +		return -ERANGE;
> > +
> > +	/*
> > +	 * The drift of ToD adjusted periodically by adding a drift_adjust_fns
> > +	 * correction value every drift_adjust_rate count of clock cycles.
> > +	 */
> > +	tod_drift_adjust_fns = tod_rem / gcd(tod_rem, rate);
> > +	tod_drift_adjust_rate = rate / gcd(tod_rem, rate);
> > +
> > +	while ((tod_drift_adjust_fns > TOD_DRIFT_ADJUST_FNS_MAX) ||
> > +	       (tod_drift_adjust_rate > TOD_DRIFT_ADJUST_RATE_MAX)) {
> > +		tod_drift_adjust_fns >>= 1;
> > +		tod_drift_adjust_rate >>= 1;
> > +	}
> 
> Why both tod_drift_adjust_fns and tod_drift_adjust_rate are iteratively divided if
> one of the two exceeds the maximum value? Wouldn't it be more accurate to set
> each of them to the max if they exceeded their respective maximum value?

Thanks your point out, calculate the tod_drift_adjust_fns and tod_drift_adjust_rate separately is better.

> 
> > +
> > +	if (tod_drift_adjust_fns == 0)
> > +		tod_drift_adjust_rate = 0;
> > +
> > +	spin_lock_irqsave(&dt->tod_lock, flags);
> > +	writel(tod_period, base + TOD_PERIOD);
> > +	writel(0, base + TOD_ADJUST_PERIOD);
> > +	writel(0, base + TOD_ADJUST_COUNT);
> > +	writel(tod_drift_adjust_fns, base + TOD_DRIFT_ADJUST);
> > +	writel(tod_drift_adjust_rate, base + TOD_DRIFT_ADJUST_RATE);
> > +	spin_unlock_irqrestore(&dt->tod_lock, flags);
> > +
> > +	return 0;
> > +}
> > +
> > +static int dfl_tod_adjust_time(struct ptp_clock_info *ptp, s64 delta)
> > +{
> > +	struct dfl_tod *dt = container_of(ptp, struct dfl_tod, ptp_clock_ops);
> > +	u32 period, diff, rem, rem_period, adj_period;
> > +	void __iomem *base = dt->tod_ctrl;
> > +	unsigned long flags;
> > +	bool neg_adj;
> > +	u64 count;
> > +	int ret;
> > +
> > +	neg_adj = delta < 0;
> > +	if (neg_adj)
> > +		delta = -delta;
> > +
> > +	spin_lock_irqsave(&dt->tod_lock, flags);
> > +
> > +	/*
> > +	 * Get the maximum possible value of the Period register offset
> > +	 * adjustment in nanoseconds scale. This depends on the current
> > +	 * Period register setting and the maximum and minimum possible
> > +	 * values of the Period register.
> > +	 */
> > +	period = readl(base + TOD_PERIOD);
> > +
> > +	if (neg_adj) {
> > +		diff = (period - TOD_PERIOD_MIN) >> PERIOD_FRAC_OFFSET;
> > +		adj_period = period - (diff << PERIOD_FRAC_OFFSET);
> > +		rem_period = period - (rem << PERIOD_FRAC_OFFSET);
> 
> rem seems to be uninitialized here.

Yes, the rem should be calculated  in div_u64_rem(), I will change to code like this:

        period = readl(base + TOD_PERIOD);

        if (neg_adj) {
                diff = (period - TOD_PERIOD_MIN) >> PERIOD_FRAC_OFFSET;
                adj_period = period - (diff << PERIOD_FRAC_OFFSET);
                count = div_u64_rem(delta, diff, &rem);
                rem_period = period - (rem << PERIOD_FRAC_OFFSET);
        } else {
                diff = (TOD_PERIOD_MAX - period) >> PERIOD_FRAC_OFFSET;
                adj_period = period + (diff << PERIOD_FRAC_OFFSET);
                count = div_u64_rem(delta, diff, &rem);
                rem_period = period + (rem << PERIOD_FRAC_OFFSET);
        }

        ret = 0;

        if (count > TOD_ADJUST_COUNT_MAX) {
                     ....