lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a273a8e6bb6785c5666bd2b8b83e3a437ebdeae4.camel@kernel.org>
Date: Sat, 14 Sep 2024 19:14:06 -0400
From: Jeff Layton <jlayton@...nel.org>
To: John Stultz <jstultz@...gle.com>
Cc: Thomas Gleixner <tglx@...utronix.de>, Stephen Boyd <sboyd@...nel.org>, 
 Alexander Viro <viro@...iv.linux.org.uk>, Christian Brauner
 <brauner@...nel.org>, Jan Kara <jack@...e.cz>,  Steven Rostedt
 <rostedt@...dmis.org>, Masami Hiramatsu <mhiramat@...nel.org>, Mathieu
 Desnoyers <mathieu.desnoyers@...icios.com>, Jonathan Corbet
 <corbet@....net>, Chandan Babu R <chandan.babu@...cle.com>, "Darrick J.
 Wong" <djwong@...nel.org>, Theodore Ts'o <tytso@....edu>, Andreas Dilger
 <adilger.kernel@...ger.ca>, Chris Mason <clm@...com>, Josef Bacik
 <josef@...icpanda.com>, David Sterba <dsterba@...e.com>,  Hugh Dickins
 <hughd@...gle.com>, Andrew Morton <akpm@...ux-foundation.org>, Chuck Lever
 <chuck.lever@...cle.com>, Vadim Fedorenko <vadim.fedorenko@...ux.dev>,
 Randy Dunlap <rdunlap@...radead.org>, linux-kernel@...r.kernel.org,
 linux-fsdevel@...r.kernel.org,  linux-trace-kernel@...r.kernel.org,
 linux-doc@...r.kernel.org,  linux-xfs@...r.kernel.org,
 linux-ext4@...r.kernel.org,  linux-btrfs@...r.kernel.org,
 linux-nfs@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v8 01/11] timekeeping: move multigrain timestamp floor
 handling into timekeeper

On Sat, 2024-09-14 at 13:10 -0700, John Stultz wrote:
> On Sat, Sep 14, 2024 at 10:07 AM Jeff Layton <jlayton@...nel.org> wrote:
> > 
> > For multigrain timestamps, we must keep track of the latest timestamp
> > that has ever been handed out, and never hand out a coarse time below
> > that value.
> > 
> > Add a static singleton atomic64_t into timekeeper.c that we can use to
> > keep track of the latest fine-grained time ever handed out. This is
> > tracked as a monotonic ktime_t value to ensure that it isn't affected by
> > clock jumps.
> > 
> > Add two new public interfaces:
> > 
> > - ktime_get_coarse_real_ts64_mg() fills a timespec64 with the later of the
> >   coarse-grained clock and the floor time
> > 
> > - ktime_get_real_ts64_mg() gets the fine-grained clock value, and tries
> >   to swap it into the floor. A timespec64 is filled with the result.
> > 
> > Since the floor is global, we take great pains to avoid updating it
> > unless it's absolutely necessary. If we do the cmpxchg and find that the
> > value has been updated since we fetched it, then we discard the
> > fine-grained time that was fetched in favor of the recent update.
> > 
> > To maximize the window of this occurring when multiple tasks are racing
> > to update the floor, ktime_get_coarse_real_ts64_mg returns a cookie
> > value that represents the state of the floor tracking word, and
> > ktime_get_real_ts64_mg accepts a cookie value that it uses as the "old"
> > value when calling cmpxchg().
> 
> This last bit seems out of date.
> 

Thanks. Dropped the last paragraph.

> > ---
> >  include/linux/timekeeping.h |  4 +++
> >  kernel/time/timekeeping.c   | 82 +++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 86 insertions(+)
> > 
> > diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
> > index fc12a9ba2c88..7aa85246c183 100644
> > --- a/include/linux/timekeeping.h
> > +++ b/include/linux/timekeeping.h
> > @@ -45,6 +45,10 @@ extern void ktime_get_real_ts64(struct timespec64 *tv);
> >  extern void ktime_get_coarse_ts64(struct timespec64 *ts);
> >  extern void ktime_get_coarse_real_ts64(struct timespec64 *ts);
> > 
> > +/* Multigrain timestamp interfaces */
> > +extern void ktime_get_coarse_real_ts64_mg(struct timespec64 *ts);
> > +extern void ktime_get_real_ts64_mg(struct timespec64 *ts);
> > +
> >  void getboottime64(struct timespec64 *ts);
> > 
> >  /*
> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> > index 5391e4167d60..16937242b904 100644
> > --- a/kernel/time/timekeeping.c
> > +++ b/kernel/time/timekeeping.c
> > @@ -114,6 +114,13 @@ static struct tk_fast tk_fast_raw  ____cacheline_aligned = {
> >         .base[1] = FAST_TK_INIT,
> >  };
> > 
> > +/*
> > + * This represents the latest fine-grained time that we have handed out as a
> > + * timestamp on the system. Tracked as a monotonic ktime_t, and converted to the
> > + * realtime clock on an as-needed basis.
> > + */
> > +static __cacheline_aligned_in_smp atomic64_t mg_floor;
> > +
> >  static inline void tk_normalize_xtime(struct timekeeper *tk)
> >  {
> >         while (tk->tkr_mono.xtime_nsec >= ((u64)NSEC_PER_SEC << tk->tkr_mono.shift)) {
> > @@ -2394,6 +2401,81 @@ void ktime_get_coarse_real_ts64(struct timespec64 *ts)
> >  }
> >  EXPORT_SYMBOL(ktime_get_coarse_real_ts64);
> > 
> > +/**
> > + * ktime_get_coarse_real_ts64_mg - get later of coarse grained time or floor
> > + * @ts: timespec64 to be filled
> > + *
> > + * Adjust floor to realtime and compare it to the coarse time. Fill
> > + * @ts with the latest one. Note that this is a filesystem-specific
> > + * interface and should be avoided outside of that context.
> > + */
> > +void ktime_get_coarse_real_ts64_mg(struct timespec64 *ts)
> > +{
> > +       struct timekeeper *tk = &tk_core.timekeeper;
> > +       u64 floor = atomic64_read(&mg_floor);
> > +       ktime_t f_real, offset, coarse;
> > +       unsigned int seq;
> > +
> > +       WARN_ON(timekeeping_suspended);
> > +
> > +       do {
> > +               seq = read_seqcount_begin(&tk_core.seq);
> > +               *ts = tk_xtime(tk);
> > +               offset = *offsets[TK_OFFS_REAL];
> > +       } while (read_seqcount_retry(&tk_core.seq, seq));
> > +
> > +       coarse = timespec64_to_ktime(*ts);
> > +       f_real = ktime_add(floor, offset);
> > +       if (ktime_after(f_real, coarse))
> > +               *ts = ktime_to_timespec64(f_real);
> > +}
> > +EXPORT_SYMBOL_GPL(ktime_get_coarse_real_ts64_mg);
> > +
> > +/**
> > + * ktime_get_real_ts64_mg - attempt to update floor value and return result
> > + * @ts:                pointer to the timespec to be set
> > + *
> > + * Get a current monotonic fine-grained time value and attempt to swap
> > + * it into the floor. @ts will be filled with the resulting floor value,
> > + * regardless of the outcome of the swap. Note that this is a filesystem
> > + * specific interface and should be avoided outside of that context.
> > + */
> > +void ktime_get_real_ts64_mg(struct timespec64 *ts, u64 cookie)
> 
> Still passing a cookie. It doesn't match the header definition, so I'm
> surprised this builds.
> 

Yeah, I didn't see a warning when I built it. Luckily the extra
parameter is ignored anyway, so it no harm. I've fixed that up in my
tree.

> > +{
> > +       struct timekeeper *tk = &tk_core.timekeeper;
> > +       ktime_t old = atomic64_read(&mg_floor);
> > +       ktime_t offset, mono;
> > +       unsigned int seq;
> > +       u64 nsecs;
> > +
> > +       WARN_ON(timekeeping_suspended);
> > +
> > +       do {
> > +               seq = read_seqcount_begin(&tk_core.seq);
> > +
> > +               ts->tv_sec = tk->xtime_sec;
> > +               mono = tk->tkr_mono.base;
> > +               nsecs = timekeeping_get_ns(&tk->tkr_mono);
> > +               offset = *offsets[TK_OFFS_REAL];
> > +       } while (read_seqcount_retry(&tk_core.seq, seq));
> > +
> > +       mono = ktime_add_ns(mono, nsecs);
> > +
> > +       if (atomic64_try_cmpxchg(&mg_floor, &old, mono)) {
> > +               ts->tv_nsec = 0;
> > +               timespec64_add_ns(ts, nsecs);
> > +       } else {
> > +               /*
> > +                * Something has changed mg_floor since "old" was
> > +                * fetched. "old" has now been updated with the
> > +                * current value of mg_floor, so use that to return
> > +                * the current coarse floor value.
> > +                */
> > +               *ts = ktime_to_timespec64(ktime_add(old, offset));
> > +       }
> > +}
> > +EXPORT_SYMBOL_GPL(ktime_get_real_ts64_mg);
> 
> Other than those issues, I'm ok with it. Thanks again for working
> through my concerns!
> 
> Since I'm traveling for LPC soon, to save the next cycle, once the
> fixes above are sorted:
>  Acked-by: John Stultz <jstultz@...gle.com>
> 

Thanks for the review!
-- 
Jeff Layton <jlayton@...nel.org>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ