[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BN9PR11MB54832FB2E5006135710D344EE3BF9@BN9PR11MB5483.namprd11.prod.outlook.com>
Date: Wed, 15 Mar 2023 02:59:28 +0000
From: "Zhang, Tianfei" <tianfei.zhang@...el.com>
To: "Pagani, Marco" <marpagan@...hat.com>
CC: "ilpo.jarvinen@...ux.intel.com" <ilpo.jarvinen@...ux.intel.com>,
"andriy.shevchenko@...ux.intel.com"
<andriy.shevchenko@...ux.intel.com>,
"Weight, Russell H" <russell.h.weight@...el.com>,
"matthew.gerlach@...ux.intel.com" <matthew.gerlach@...ux.intel.com>,
"pierre-louis.bossart@...ux.intel.com"
<pierre-louis.bossart@...ux.intel.com>,
"Gomes, Vinicius" <vinicius.gomes@...el.com>,
"Khadatare, RaghavendraX Anand"
<raghavendrax.anand.khadatare@...el.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-fpga@...r.kernel.org" <linux-fpga@...r.kernel.org>,
"richardcochran@...il.com" <richardcochran@...il.com>
Subject: RE: [PATCH v1] ptp: add ToD device driver for Intel FPGA cards
> -----Original Message-----
> From: Marco Pagani <marpagan@...hat.com>
> Sent: Tuesday, March 14, 2023 1:50 AM
> To: Zhang, Tianfei <tianfei.zhang@...el.com>
> Cc: ilpo.jarvinen@...ux.intel.com; andriy.shevchenko@...ux.intel.com; Weight,
> Russell H <russell.h.weight@...el.com>; matthew.gerlach@...ux.intel.com; pierre-
> louis.bossart@...ux.intel.com; Gomes, Vinicius <vinicius.gomes@...el.com>;
> Khadatare, RaghavendraX Anand <raghavendrax.anand.khadatare@...el.com>;
> netdev@...r.kernel.org; linux-fpga@...r.kernel.org; richardcochran@...il.com
> Subject: Re: [PATCH v1] ptp: add ToD device driver for Intel FPGA cards
>
>
>
> On 2023-03-13 04:02, Tianfei Zhang wrote:
> > Adding a DFL (Device Feature List) device driver of ToD device for
> > Intel FPGA cards.
> >
> > The Intel FPGA Time of Day(ToD) IP within the FPGA DFL bus is exposed
> > as PTP Hardware clock(PHC) device to the Linux PTP stack to
> > synchronize the system clock to its ToD information using phc2sys
> > utility of the Linux PTP stack. The DFL is a hardware List within
> > FPGA, which defines a linked list of feature headers within the device
> > MMIO space to provide an extensible way of adding subdevice features.
> >
> > Signed-off-by: Raghavendra Khadatare
> > <raghavendrax.anand.khadatare@...el.com>
> > Signed-off-by: Tianfei Zhang <tianfei.zhang@...el.com>
> > ---
> > MAINTAINERS | 7 +
> > drivers/ptp/Kconfig | 13 ++
> > drivers/ptp/Makefile | 1 +
> > drivers/ptp/ptp_dfl_tod.c | 334
> > ++++++++++++++++++++++++++++++++++++++
> > 4 files changed, 355 insertions(+)
> > create mode 100644 drivers/ptp/ptp_dfl_tod.c
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS index
> > ec57c42ed544..584979abbd92 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -15623,6 +15623,13 @@ L: netdev@...r.kernel.org
> > S: Maintained
> > F: drivers/ptp/ptp_ocp.c
> >
> > +INTEL PTP DFL ToD DRIVER
> > +M: Tianfei Zhang <tianfei.zhang@...el.com>
> > +L: linux-fpga@...r.kernel.org
> > +L: netdev@...r.kernel.org
> > +S: Maintained
> > +F: drivers/ptp/ptp_dfl_tod.c
> > +
> > OPENCORES I2C BUS DRIVER
> > M: Peter Korsgaard <peter@...sgaard.com>
> > M: Andrew Lunn <andrew@...n.ch>
> > diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig index
> > fe4971b65c64..e0d6f136ee46 100644
> > --- a/drivers/ptp/Kconfig
> > +++ b/drivers/ptp/Kconfig
> > @@ -186,4 +186,17 @@ config PTP_1588_CLOCK_OCP
> >
> > More information is available at http://www.timingcard.com/
> >
> > +config PTP_DFL_TOD
> > + tristate "FPGA DFL ToD Driver"
> > + depends on FPGA_DFL
> > + help
> > + The DFL (Device Feature List) device driver for the Intel ToD
> > + (Time-of-Day) device in FPGA card. The ToD IP within the FPGA
> > + is exposed as PTP Hardware Clock (PHC) device to the Linux PTP
> > + stack to synchronize the system clock to its ToD information
> > + using phc2sys utility of the Linux PTP stack.
> > +
> > + To compile this driver as a module, choose M here: the module
> > + will be called ptp_dfl_tod.
> > +
> > endmenu
> > diff --git a/drivers/ptp/Makefile b/drivers/ptp/Makefile index
> > 28a6fe342d3e..553f18bf3c83 100644
> > --- a/drivers/ptp/Makefile
> > +++ b/drivers/ptp/Makefile
> > @@ -18,3 +18,4 @@ obj-$(CONFIG_PTP_1588_CLOCK_IDTCM) +=
> ptp_clockmatrix.o
> > obj-$(CONFIG_PTP_1588_CLOCK_IDT82P33) += ptp_idt82p33.o
> > obj-$(CONFIG_PTP_1588_CLOCK_VMW) += ptp_vmw.o
> > obj-$(CONFIG_PTP_1588_CLOCK_OCP) += ptp_ocp.o
> > +obj-$(CONFIG_PTP_DFL_TOD) += ptp_dfl_tod.o
> > diff --git a/drivers/ptp/ptp_dfl_tod.c b/drivers/ptp/ptp_dfl_tod.c new
> > file mode 100644 index 000000000000..d9fbdfc53bd8
> > --- /dev/null
> > +++ b/drivers/ptp/ptp_dfl_tod.c
> > @@ -0,0 +1,334 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * DFL device driver for Time-of-Day (ToD) private feature
> > + *
> > + * Copyright (C) 2023 Intel Corporation */
> > +
> > +#include <linux/bitfield.h>
> > +#include <linux/delay.h>
> > +#include <linux/dfl.h>
> > +#include <linux/gcd.h>
> > +#include <linux/iopoll.h>
> > +#include <linux/module.h>
> > +#include <linux/ptp_clock_kernel.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/units.h>
> > +
> > +#define FME_FEATURE_ID_TOD 0x22
> > +
> > +/* ToD clock register space. */
> > +#define TOD_CLK_FREQ 0x038
> > +
> > +/*
> > + * The read sequence of ToD timestamp registers: TOD_NANOSEC,
> > +TOD_SECONDSL and
> > + * TOD_SECONDSH, because there is a hardware snapshot whenever the
> > +TOD_NANOSEC
> > + * register is read.
> > + *
> > + * The ToD IP requires writing registers in the reverse order to the read sequence.
> > + * The timestamp is corrected when the TOD_NANOSEC register is
> > +written, so the
> > + * sequence of write TOD registers: TOD_SECONDSH, TOD_SECONDSL and
> TOD_NANOSEC.
> > + */
> > +#define TOD_SECONDSH 0x100
> > +#define TOD_SECONDSL 0x104
> > +#define TOD_NANOSEC 0x108
> > +#define TOD_PERIOD 0x110
> > +#define TOD_ADJUST_PERIOD 0x114
> > +#define TOD_ADJUST_COUNT 0x118
> > +#define TOD_DRIFT_ADJUST 0x11c
> > +#define TOD_DRIFT_ADJUST_RATE 0x120
> > +#define PERIOD_FRAC_OFFSET 16
> > +#define SECONDS_MSB GENMASK_ULL(47, 32)
> > +#define SECONDS_LSB GENMASK_ULL(31, 0)
> > +#define TOD_SECONDSH_SEC_MSB GENMASK_ULL(15, 0)
> > +
> > +#define CAL_SECONDS(m, l) ((FIELD_GET(TOD_SECONDSH_SEC_MSB,
> (m)) << 32) | (l))
> > +
> > +#define TOD_PERIOD_MASK GENMASK_ULL(19, 0)
> > +#define TOD_PERIOD_MAX FIELD_MAX(TOD_PERIOD_MASK)
> > +#define TOD_PERIOD_MIN 0
> > +#define TOD_DRIFT_ADJUST_MASK GENMASK_ULL(15, 0)
> > +#define TOD_DRIFT_ADJUST_FNS_MAX
> FIELD_MAX(TOD_DRIFT_ADJUST_MASK)
> > +#define TOD_DRIFT_ADJUST_RATE_MAX TOD_DRIFT_ADJUST_FNS_MAX
> > +#define TOD_ADJUST_COUNT_MASK GENMASK_ULL(19, 0)
> > +#define TOD_ADJUST_COUNT_MAX
> FIELD_MAX(TOD_ADJUST_COUNT_MASK)
> > +#define TOD_ADJUST_INTERVAL_US 1000
> > +#define TOD_ADJUST_MS \
> > + (((TOD_PERIOD_MAX >> 16) + 1) * (TOD_ADJUST_COUNT_MAX + 1))
> > +#define TOD_ADJUST_MS_MAX (TOD_ADJUST_MS / MICRO)
> > +#define TOD_ADJUST_MAX_US (TOD_ADJUST_MS_MAX *
> USEC_PER_MSEC)
> > +#define TOD_MAX_ADJ (500 * MEGA)
> > +
> > +struct dfl_tod {
> > + struct ptp_clock_info ptp_clock_ops;
> > + struct device *dev;
> > + struct ptp_clock *ptp_clock;
> > +
> > + /* ToD Clock address space */
> > + void __iomem *tod_ctrl;
> > +
> > + /* ToD clock registers protection */
> > + spinlock_t tod_lock;
> > +};
> > +
> > +/*
> > + * A fine ToD HW clock offset adjustment. To perform the fine offset
> > +adjustment, the
> > + * adjust_period and adjust_count argument are used to update the
> > +TOD_ADJUST_PERIOD
> > + * and TOD_ADJUST_COUNT register for in hardware. The dt->tod_lock
> > +spinlock must be
> > + * held when calling this function.
> > + */
>
> Calling this function while holding the dt->tod_lock spinlock might cause an error
> since read_poll_timeout() can sleep. If it is required to use a spinlock, there is the
> readl_poll_timeout_atomic() function, which can be called from atomic context.
> However, in this case, it is probably better to use a mutex instead of a spinlock since
> the delay appears to be 1 ms.
To program the TOD registers needs to be done without interruption to ensure the correct values are programmed,
So I think using readl_poll_timeout_atomic() function here is better. This delay should be very faster, the 10 us interval delay
will be better.
>
> > +static int fine_adjust_tod_clock(struct dfl_tod *dt, u32 adjust_period,
> > + u32 adjust_count)
> > +{
> > + void __iomem *base = dt->tod_ctrl;
> > + u32 val;
> > +
> > + writel(adjust_period, base + TOD_ADJUST_PERIOD);
> > + writel(adjust_count, base + TOD_ADJUST_COUNT);
> > +
> > + /* Wait for present offset adjustment update to complete */
> > + return readl_poll_timeout(base + TOD_ADJUST_COUNT, val, !val,
> TOD_ADJUST_INTERVAL_US,
> > + TOD_ADJUST_MAX_US);
> > +}
> > +
> > +/*
> > + * A coarse ToD HW clock offset adjustment.
> > + * The coarse time adjustment performs by adding or subtracting the
> > +delta value
> > + * from the current ToD HW clock time.
> > + */
> > +static int coarse_adjust_tod_clock(struct dfl_tod *dt, s64 delta) {
> > + u32 seconds_msb, seconds_lsb, nanosec;
> > + void __iomem *base = dt->tod_ctrl;
> > + u64 seconds, now;
> > +
> > + if (delta == 0)
> > + return 0;
> > +
> > + nanosec = readl(base + TOD_NANOSEC);
> > + seconds_lsb = readl(base + TOD_SECONDSL);
> > + seconds_msb = readl(base + TOD_SECONDSH);
> > +
> > + /* Calculate new time */
> > + seconds = CAL_SECONDS(seconds_msb, seconds_lsb);
> > + now = seconds * NSEC_PER_SEC + nanosec + delta;
> > +
> > + seconds = div_u64_rem(now, NSEC_PER_SEC, &nanosec);
> > + seconds_msb = FIELD_GET(SECONDS_MSB, seconds);
> > + seconds_lsb = FIELD_GET(SECONDS_LSB, seconds);
> > +
> > + writel(seconds_msb, base + TOD_SECONDSH);
> > + writel(seconds_lsb, base + TOD_SECONDSL);
> > + writel(nanosec, base + TOD_NANOSEC);
> > +
> > + return 0;
> > +}
> > +
> > +static int dfl_tod_adjust_fine(struct ptp_clock_info *ptp, long
> > +scaled_ppm) {
> > + struct dfl_tod *dt = container_of(ptp, struct dfl_tod, ptp_clock_ops);
> > + u32 tod_period, tod_rem, tod_drift_adjust_fns, tod_drift_adjust_rate;
> > + void __iomem *base = dt->tod_ctrl;
> > + unsigned long flags, rate;
> > + u64 ppb;
> > +
> > + /* Get the clock rate from clock frequency register offset */
> > + rate = readl(base + TOD_CLK_FREQ);
> > +
> > + /* add GIGA as nominal ppb */
> > + ppb = scaled_ppm_to_ppb(scaled_ppm) + GIGA;
> > +
> > + tod_period = div_u64_rem(ppb << PERIOD_FRAC_OFFSET, rate, &tod_rem);
> > + if (tod_period > TOD_PERIOD_MAX)
> > + return -ERANGE;
> > +
> > + /*
> > + * The drift of ToD adjusted periodically by adding a drift_adjust_fns
> > + * correction value every drift_adjust_rate count of clock cycles.
> > + */
> > + tod_drift_adjust_fns = tod_rem / gcd(tod_rem, rate);
> > + tod_drift_adjust_rate = rate / gcd(tod_rem, rate);
> > +
> > + while ((tod_drift_adjust_fns > TOD_DRIFT_ADJUST_FNS_MAX) ||
> > + (tod_drift_adjust_rate > TOD_DRIFT_ADJUST_RATE_MAX)) {
> > + tod_drift_adjust_fns >>= 1;
> > + tod_drift_adjust_rate >>= 1;
> > + }
>
> Why both tod_drift_adjust_fns and tod_drift_adjust_rate are iteratively divided if
> one of the two exceeds the maximum value? Wouldn't it be more accurate to set
> each of them to the max if they exceeded their respective maximum value?
Thanks your point out, calculate the tod_drift_adjust_fns and tod_drift_adjust_rate separately is better.
>
> > +
> > + if (tod_drift_adjust_fns == 0)
> > + tod_drift_adjust_rate = 0;
> > +
> > + spin_lock_irqsave(&dt->tod_lock, flags);
> > + writel(tod_period, base + TOD_PERIOD);
> > + writel(0, base + TOD_ADJUST_PERIOD);
> > + writel(0, base + TOD_ADJUST_COUNT);
> > + writel(tod_drift_adjust_fns, base + TOD_DRIFT_ADJUST);
> > + writel(tod_drift_adjust_rate, base + TOD_DRIFT_ADJUST_RATE);
> > + spin_unlock_irqrestore(&dt->tod_lock, flags);
> > +
> > + return 0;
> > +}
> > +
> > +static int dfl_tod_adjust_time(struct ptp_clock_info *ptp, s64 delta)
> > +{
> > + struct dfl_tod *dt = container_of(ptp, struct dfl_tod, ptp_clock_ops);
> > + u32 period, diff, rem, rem_period, adj_period;
> > + void __iomem *base = dt->tod_ctrl;
> > + unsigned long flags;
> > + bool neg_adj;
> > + u64 count;
> > + int ret;
> > +
> > + neg_adj = delta < 0;
> > + if (neg_adj)
> > + delta = -delta;
> > +
> > + spin_lock_irqsave(&dt->tod_lock, flags);
> > +
> > + /*
> > + * Get the maximum possible value of the Period register offset
> > + * adjustment in nanoseconds scale. This depends on the current
> > + * Period register setting and the maximum and minimum possible
> > + * values of the Period register.
> > + */
> > + period = readl(base + TOD_PERIOD);
> > +
> > + if (neg_adj) {
> > + diff = (period - TOD_PERIOD_MIN) >> PERIOD_FRAC_OFFSET;
> > + adj_period = period - (diff << PERIOD_FRAC_OFFSET);
> > + rem_period = period - (rem << PERIOD_FRAC_OFFSET);
>
> rem seems to be uninitialized here.
Yes, the rem should be calculated in div_u64_rem(), I will change to code like this:
period = readl(base + TOD_PERIOD);
if (neg_adj) {
diff = (period - TOD_PERIOD_MIN) >> PERIOD_FRAC_OFFSET;
adj_period = period - (diff << PERIOD_FRAC_OFFSET);
count = div_u64_rem(delta, diff, &rem);
rem_period = period - (rem << PERIOD_FRAC_OFFSET);
} else {
diff = (TOD_PERIOD_MAX - period) >> PERIOD_FRAC_OFFSET;
adj_period = period + (diff << PERIOD_FRAC_OFFSET);
count = div_u64_rem(delta, diff, &rem);
rem_period = period + (rem << PERIOD_FRAC_OFFSET);
}
ret = 0;
if (count > TOD_ADJUST_COUNT_MAX) {
....
Powered by blists - more mailing lists