[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMuE1bF0Hho4VwO6w3f+9z3j5TtscYzuAjj10MFt2mZXG2P8dQ@mail.gmail.com>
Date: Sun, 31 Dec 2023 18:00:17 +0200
From: Sagi Maimon <maimon.sagi@...il.com>
To: Arnd Bergmann <arnd@...db.de>
Cc: Richard Cochran <richardcochran@...il.com>, Andy Lutomirski <luto@...nel.org>, datglx@...utronix.de,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, Geert Uytterhoeven <geert@...ux-m68k.org>,
Peter Zijlstra <peterz@...radead.org>, Johannes Weiner <hannes@...xchg.org>,
Sohil Mehta <sohil.mehta@...el.com>, Rick Edgecombe <rick.p.edgecombe@...el.com>,
Nhat Pham <nphamcs@...il.com>, Palmer Dabbelt <palmer@...ive.com>, Kees Cook <keescook@...omium.org>,
Alexey Gladkov <legion@...nel.org>, Mark Rutland <mark.rutland@....com>, linux-kernel@...r.kernel.org,
linux-api@...r.kernel.org, Linux-Arch <linux-arch@...r.kernel.org>,
Netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH v3] posix-timers: add multi_clock_gettime system call
On Fri, Dec 29, 2023 at 5:27 PM Arnd Bergmann <arnd@...db.de> wrote:
>
> On Thu, Dec 28, 2023, at 13:24, Sagi Maimon wrote:
> > Some user space applications need to read some clocks.
> > Each read requires moving from user space to kernel space.
> > The syscall overhead causes unpredictable delay between N clocks reads
> > Removing this delay causes better synchronization between N clocks.
> >
> > Introduce a new system call multi_clock_gettime, which can be used to measure
> > the offset between multiple clocks, from variety of types: PHC, virtual PHC
> > and various system clocks (CLOCK_REALTIME, CLOCK_MONOTONIC, etc).
> > The offset includes the total time that the driver needs to read the clock
> > timestamp.
> >
> > New system call allows the reading of a list of clocks - up to PTP_MAX_CLOCKS.
> > Supported clocks IDs: PHC, virtual PHC and various system clocks.
> > Up to PTP_MAX_SAMPLES times (per clock) in a single system call read.
> > The system call returns n_clocks timestamps for each measurement:
> > - clock 0 timestamp
> > - ...
> > - clock n timestamp
> >
> > Signed-off-by: Sagi Maimon <maimon.sagi@...il.com>
>
> Hi Sagi,
>
> Exposing an interface to read multiple clocks makes sense to me,
> but I wonder if the interface you use is too inflexible.
>
Arnd - thanks alot for your notes.
> > --- a/include/uapi/asm-generic/unistd.h
> > +++ b/include/uapi/asm-generic/unistd.h
> > @@ -828,9 +828,11 @@ __SYSCALL(__NR_futex_wake, sys_futex_wake)
> > __SYSCALL(__NR_futex_wait, sys_futex_wait)
> > #define __NR_futex_requeue 456
> > __SYSCALL(__NR_futex_requeue, sys_futex_requeue)
> > +#define __NR_multi_clock_gettime 457
> > +__SYSCALL(__NR_multi_clock_gettime, sys_multi_clock_gettime)
> >
> > #undef __NR_syscalls
> > -#define __NR_syscalls 457
> > +#define __NR_syscalls 458
>
> Site note: hooking it up only here is sufficient for the
> code review but not for inclusion: once we have an agreement
> on the API, this should be added to all architectures at once.
>
> > +#define MULTI_PTP_MAX_CLOCKS 5 /* Max number of clocks */
> > +#define MULTI_PTP_MAX_SAMPLES 10 /* Max allowed offset measurement samples. */
> > +
> > +struct __ptp_multi_clock_get {
> > + unsigned int n_clocks; /* Desired number of clocks. */
> > + unsigned int n_samples; /* Desired number of measurements per clock. */
> > + clockid_t clkid_arr[MULTI_PTP_MAX_CLOCKS]; /* list of clock IDs */
> > + /*
> > + * Array of list of n_clocks clocks time samples n_samples times.
> > + */
> > + struct __kernel_timespec ts[MULTI_PTP_MAX_SAMPLES][MULTI_PTP_MAX_CLOCKS];
> > +};
>
> The fixed size arrays here seem to be an unnecessary limitation,
> both MULTI_PTP_MAX_SAMPLES and MULTI_PTP_MAX_CLOCKS are small
> enough that one can come up with scenarios where you would want
> a higher number, but at the same time the structure is already
> 808 bytes long, which is more than you'd normally want to put
> on the kernel stack, and which may take a significant time to
> copy to and from userspace.
>
> Since n_clocks and n_samples are always inputs to the syscall,
> you can just pass them as register arguments and use a dynamically
> sized array instead.
>
Both MULTI_PTP_MAX_SAMPLES and MULTI_PTP_MAX_CLOCKS are enough of any
usage we can think of,
But I think you are right, it is better to use a dynamically sized
array for future use, plus to use less stack memory.
On patch v4 a dynamically sized array will be used .
I leaving both MULTI_PTP_MAX_SAMPLES and MULTI_PTP_MAX_CLOCKS but
increasing their values, since there should be some limitation.
> It's not clear to me what you gain from having the n_samples
> argument over just calling the syscall repeatedly. Does
> this offer a benefit for accuracy or is this just meant to
> avoid syscall overhead.
It is mainly to avoid syscall overhead which also slightly improve the accuracy.
> > +SYSCALL_DEFINE1(multi_clock_gettime, struct __ptp_multi_clock_get
> > __user *, ptp_multi_clk_get)
> > +{
> > + const struct k_clock *kc;
> > + struct timespec64 kernel_tp;
> > + struct __ptp_multi_clock_get multi_clk_get;
> > + unsigned int i, j;
> > + int error;
> > +
> > + if (copy_from_user(&multi_clk_get, ptp_multi_clk_get,
> > sizeof(multi_clk_get)))
> > + return -EFAULT;
>
> Here you copy the entire structure from userspace, but
> I don't actually see the .ts[] array on the stack being
> accessed later as you just copy to the user pointer
> directly.
>
you are right, will be fixed on patch v4.
> > + for (i = 0; i < multi_clk_get.n_clocks; i++) {
> > + kc = clockid_to_kclock(multi_clk_get.clkid_arr[i]);
> > + if (!kc)
> > + return -EINVAL;
> > + error = kc->clock_get_timespec(multi_clk_get.clkid_arr[i],
> > &kernel_tp);
> > + if (!error && put_timespec64(&kernel_tp, (struct __kernel_timespec
> > __user *)
> > + &ptp_multi_clk_get->ts[j][i]))
> > + error = -EFAULT;
> > + }
>
> The put_timespec64() and possibly the clockid_to_kclock() have
> some overhead that may introduce jitter, so it may be better to
> pull that out of the loop and have a fixed-size array
> of timespec64 values on the stack and then copy them
> at the end.
>
This overhead may introduce marginal jitter in my opinion,
still I like your idea since it is better to remove any overhead.
This will be fixed in patch v4.
I don't intend to use "a fixed-size array of timespec64 values on the
stack" since it will cause stack memory overflow,
Instead I will use a dynamically sized array (according to your
previews advice).
> On the other hand, this will still give less accuracy than the
> getcrosststamp() callback with ioctl(PTP_SYS_OFFSET_PRECISE),
> so either the last bit of accuracy isn't all that important,
> or you need to refine the interface to actually be an
> improvement over the chardev.
>
I don't understand this comment, please explain.
The ioctl(PTP_SYS_OFFSET_PRECISE) is one specific case that can be
done by multi_clock_gettime syscall (which cover many more cases)
Plus the ioctl(PTP_SYS_OFFSET_PRECISE) works only on drivers that
support this feature.
> Arnd
Powered by blists - more mailing lists