linux-kernel - Re: Question about ktime_get_mono_fast

Message-ID: <CANDhNCrkceUi=+S8xzcBzf8=uUpD4namcm5U-MoACTSVEpcMrg@mail.gmail.com>
Date:   Thu, 13 Oct 2022 21:13:22 -0700
From:   John Stultz <jstultz@...gle.com>
To:     Yosry Ahmed <yosryahmed@...gle.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Stephen Boyd <sboyd@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>, Hao Luo <haoluo@...gle.com>,
        Stanislav Fomichev <sdf@...gle.com>
Subject: Re: Question about ktime_get_mono_fast_ns() non-monotonic behavior

On Thu, Oct 13, 2022 at 8:47 PM Yosry Ahmed <yosryahmed@...gle.com> wrote:
>
> On Thu, Oct 13, 2022 at 8:42 PM John Stultz <jstultz@...gle.com> wrote:
> >
> > On Thu, Oct 13, 2022 at 8:26 PM Yosry Ahmed <yosryahmed@...gle.com> wrote:
> > > On Thu, Oct 13, 2022 at 7:39 PM John Stultz <jstultz@...gle.com> wrote:
> > > > On Mon, Sep 26, 2022 at 2:18 PM Yosry Ahmed <yosryahmed@...gle.com> wrote:
> > > > >
> > > > > I have a question about ktime_get_mono_fast_ns(), which is used by the
> > > > > BPF helper bpf_ktime_get_ns() among other use cases. The comment above
> > > > > this function specifies that there are cases where the observed clock
> > > > > would not be monotonic.
> > > > >
> > > > > I had 2 beginner questions:
> > > >
> > > > Thinking about this a bit more, I have my own "beginner question": Why
> > > > does bpf_ktime_get_ns() need to use the ktime_get_mono_fast_ns()
> > > > accessor instead of ktime_get_ns()?
> > > >
> > > > I don't know enough about the contexts that bpf logic can run, so it's
> > > > not clear to me and it's not obviously commented either.
> > >
> > > I am not the best person to answer this question (the BPF list is
> > > CC'd, it's full of more knowledgeable people).
> > >
> > > My understanding is that because BPF programs can basically be run in
> > > any context (because they can attach to almost all functions /
> > > tracepoints in the kernel), the time accessor needs to be safe in all
> > > contexts.
> >
> > Ah. Ok, the tracepoint connection is indeed likely the case. Thanks
> > for clarifying.
> >
> > > Now that I know that ktime_get_mono_fast_ns() can drift significantly,
> > > I am wondering why we don't just read sched_clock(). Can the
> > > difference between sched_clock() on different cpus be even higher than
> > > the potential drift from ktime_get_mono_fast_ns()?
> >
> > sched_clock is also lock free and so I think it's possible to have
> > inconsistencies.
>
> Right, I am just trying to figure out which is worse,
> ktime_get_mono_fast_ns() or sched_clock(). It appears to me that both
> can be inconsistent, but at least AFAICT sched_clock() can only be
> inconsistent if read across different cpus, right? It should also be
> faster (at least in my experimentation).
>
> I am wondering if there is a bound on the inconsistency we might
> observe from sched_clock() if we read it across different cpus, and if
> there is, how does it compare to ktime_get_mono_fast_ns() in that
> regard.

Again, I think ktime_get_raw_fast_ns() (so CLOCK_MONOTONIC_RAW) is
likely to be closer to sched_clock(), as neither of them is NTP
adjusted (which also likely makes them unusable for the case where
timestamps are compared with userland CLOCK_MONOTONIC timestamps).
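
To illustrate the clockid mismatch from the userland side, here is a
minimal sketch that reads CLOCK_MONOTONIC and CLOCK_MONOTONIC_RAW via
clock_gettime() and reports the difference. It assumes Linux (where
CLOCK_MONOTONIC_RAW exists); the helper names read_clock_ns() and
mono_minus_raw_ns() are hypothetical and exist only for this example.
It is not kernel or BPF code.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* ensure clock_gettime() and clock ids are declared */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read a clock, return nanoseconds or -1 on error. */
static int64_t read_clock_ns(clockid_t id)
{
	struct timespec ts;

	if (clock_gettime(id, &ts) != 0)
		return -1;
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Hypothetical helper: store (CLOCK_MONOTONIC - CLOCK_MONOTONIC_RAW) in
 * *offset_ns. The two reads are not atomic, so the result also includes
 * the time spent between the two calls. Returns 0 on success, -1 on error.
 */
static int mono_minus_raw_ns(int64_t *offset_ns)
{
	int64_t mono, raw;

	assert(offset_ns != NULL);

	mono = read_clock_ns(CLOCK_MONOTONIC);
	raw = read_clock_ns(CLOCK_MONOTONIC_RAW);
	if (mono < 0 || raw < 0)
		return -1;

	*offset_ns = mono - raw;
	return 0;
}
```

Sampling mono_minus_raw_ns() periodically on a machine whose clock is
being disciplined by NTP would be expected to show the offset changing
over time, which is why a raw timestamp can't simply be compared
against a userland CLOCK_MONOTONIC one.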

So folks might need a new bpf interface for that.

Also, I think folks would want to avoid exporting sched_clock()
timestamps out to userland, as they aren't connected to a well-defined
clockid and may have odd behavior around suspend/resume, etc.

thanks
-john