lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANDhNCof7+q+-XzQoP=w0pcrS_-ifH9pmAmtq8H++tbognBv1A@mail.gmail.com>
Date: Fri, 13 Sep 2024 11:59:56 -0700
From: John Stultz <jstultz@...gle.com>
To: Jeff Layton <jlayton@...nel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>, Stephen Boyd <sboyd@...nel.org>, 
	Alexander Viro <viro@...iv.linux.org.uk>, Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>, 
	Steven Rostedt <rostedt@...dmis.org>, Masami Hiramatsu <mhiramat@...nel.org>, 
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Jonathan Corbet <corbet@....net>, 
	Chandan Babu R <chandan.babu@...cle.com>, "Darrick J. Wong" <djwong@...nel.org>, 
	"Theodore Ts'o" <tytso@....edu>, Andreas Dilger <adilger.kernel@...ger.ca>, Chris Mason <clm@...com>, 
	Josef Bacik <josef@...icpanda.com>, David Sterba <dsterba@...e.com>, Hugh Dickins <hughd@...gle.com>, 
	Andrew Morton <akpm@...ux-foundation.org>, Chuck Lever <chuck.lever@...cle.com>, 
	Vadim Fedorenko <vadim.fedorenko@...ux.dev>, linux-kernel@...r.kernel.org, 
	linux-fsdevel@...r.kernel.org, linux-trace-kernel@...r.kernel.org, 
	linux-doc@...r.kernel.org, linux-xfs@...r.kernel.org, 
	linux-ext4@...r.kernel.org, linux-btrfs@...r.kernel.org, 
	linux-nfs@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v7 01/11] timekeeping: move multigrain timestamp floor
 handling into timekeeper

On Fri, Sep 13, 2024 at 6:54 AM Jeff Layton <jlayton@...nel.org> wrote:
>
> For multigrain timestamps, we must keep track of the latest timestamp
> that has ever been handed out, and never hand out a coarse time below
> that value.
>
> Add a static singleton atomic64_t into timekeeper.c that we can use to
> keep track of the latest fine-grained time ever handed out. This is

Maybe drop "ever" and  add "handed out through a specific interface",
as timestamps can be accessed in a lot of ways that don't keep track
of what was returned.


> tracked as a monotonic ktime_t value to ensure that it isn't affected by
> clock jumps.
>
> Add two new public interfaces:
>
> - ktime_get_coarse_real_ts64_mg() fills a timespec64 with the later of the
>   coarse-grained clock and the floor time
>
> - ktime_get_real_ts64_mg() gets the fine-grained clock value, and tries
>   to swap it into the floor. A timespec64 is filled with the result.
>
> Since the floor is global, we take great pains to avoid updating it
> unless it's absolutely necessary. If we do the cmpxchg and find that the
> value has been updated since we fetched it, then we discard the
> fine-grained time that was fetched in favor of the recent update.
>
> To maximize the window of this occurring when multiple tasks are racing
> to update the floor, ktime_get_coarse_real_ts64_mg returns a cookie
> value that represents the state of the floor tracking word, and
> ktime_get_real_ts64_mg accepts a cookie value that it uses as the "old"
> value when calling cmpxchg().
>
> Signed-off-by: Jeff Layton <jlayton@...nel.org>
> ---
>  include/linux/timekeeping.h |  4 +++
>  kernel/time/timekeeping.c   | 81 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 85 insertions(+)
>
> diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
> index fc12a9ba2c88..cf2293158c65 100644
> --- a/include/linux/timekeeping.h
> +++ b/include/linux/timekeeping.h
> @@ -45,6 +45,10 @@ extern void ktime_get_real_ts64(struct timespec64 *tv);
>  extern void ktime_get_coarse_ts64(struct timespec64 *ts);
>  extern void ktime_get_coarse_real_ts64(struct timespec64 *ts);
>
> +/* Multigrain timestamp interfaces */
> +extern u64 ktime_get_coarse_real_ts64_mg(struct timespec64 *ts);
> +extern void ktime_get_real_ts64_mg(struct timespec64 *ts, u64 cookie);
> +
>  void getboottime64(struct timespec64 *ts);
>
>  /*
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 5391e4167d60..ee11006a224f 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -114,6 +114,13 @@ static struct tk_fast tk_fast_raw  ____cacheline_aligned = {
>         .base[1] = FAST_TK_INIT,
>  };
>
> +/*
> + * This represents the latest fine-grained time that we have handed out as a
> + * timestamp on the system. Tracked as a monotonic ktime_t, and converted to the
> + * realtime clock on an as-needed basis.
> + */
> +static __cacheline_aligned_in_smp atomic64_t mg_floor;
> +
>  static inline void tk_normalize_xtime(struct timekeeper *tk)
>  {
>         while (tk->tkr_mono.xtime_nsec >= ((u64)NSEC_PER_SEC << tk->tkr_mono.shift)) {
> @@ -2394,6 +2401,80 @@ void ktime_get_coarse_real_ts64(struct timespec64 *ts)
>  }
>  EXPORT_SYMBOL(ktime_get_coarse_real_ts64);
>
> +/**
> + * ktime_get_coarse_real_ts64_mg - get later of coarse grained time or floor
> + * @ts: timespec64 to be filled
> + *
> + * Adjust floor to realtime and compare it to the coarse time. Fill
> + * @ts with the latest one. Returns opaque cookie suitable for passing
> + * to ktime_get_real_ts64_mg().
> + */
> +u64 ktime_get_coarse_real_ts64_mg(struct timespec64 *ts)
> +{
> +       struct timekeeper *tk = &tk_core.timekeeper;
> +       u64 floor = atomic64_read(&mg_floor);
> +       ktime_t f_real, offset, coarse;
> +       unsigned int seq;
> +
> +       WARN_ON(timekeeping_suspended);
> +
> +       do {
> +               seq = read_seqcount_begin(&tk_core.seq);
> +               *ts = tk_xtime(tk);
> +               offset = *offsets[TK_OFFS_REAL];
> +       } while (read_seqcount_retry(&tk_core.seq, seq));
> +
> +       coarse = timespec64_to_ktime(*ts);
> +       f_real = ktime_add(floor, offset);
> +       if (ktime_after(f_real, coarse))
> +               *ts = ktime_to_timespec64(f_real);
> +       return floor;
> +}
> +EXPORT_SYMBOL_GPL(ktime_get_coarse_real_ts64_mg);
> +
> +/**
> + * ktime_get_real_ts64_mg - attempt to update floor value and return result
> + * @ts:                pointer to the timespec to be set
> + * @cookie:    opaque cookie from earlier call to ktime_get_coarse_real_ts64_mg()
> + *
> + * Get a current monotonic fine-grained time value and attempt to swap
> + * it into the floor using @cookie as the "old" value. @ts will be
> + * filled with the resulting floor value, regardless of the outcome of
> + * the swap.

I'd add more detail here to clarify that this can return a coarse
floor value if the cookie is stale.

> +void ktime_get_real_ts64_mg(struct timespec64 *ts, u64 cookie)
> +{
> +       struct timekeeper *tk = &tk_core.timekeeper;
> +       ktime_t offset, mono, old = (ktime_t)cookie;
> +       unsigned int seq;
> +       u64 nsecs;
> +
> +       WARN_ON(timekeeping_suspended);
> +
> +       do {
> +               seq = read_seqcount_begin(&tk_core.seq);
> +
> +               ts->tv_sec = tk->xtime_sec;
> +               mono = tk->tkr_mono.base;
> +               nsecs = timekeeping_get_ns(&tk->tkr_mono);
> +               offset = *offsets[TK_OFFS_REAL];
> +       } while (read_seqcount_retry(&tk_core.seq, seq));
> +
> +       mono = ktime_add_ns(mono, nsecs);
> +
> +       if (atomic64_try_cmpxchg(&mg_floor, &old, mono)) {
> +               ts->tv_nsec = 0;
> +               timespec64_add_ns(ts, nsecs);
> +       } else {
> +               /*
> +                * Something has changed mg_floor since "old" was
> +                * fetched. That value is just as valid, so accept it.
> +                */

Mostly because I embarrassingly tripped over this in front of
everyone, I might suggest:
/*
 * mg_floor was updated since the cookie was fetched, so the
 * the try_cmpxchg failed. However try_cmpxchg updated old
 * with the current mg_floor, so use that to return the current
 * coarse floor value
 */

:)

thanks
-john

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ