Message-ID: <Zugm77Z47-kal5rf@google.com>
Date: Mon, 16 Sep 2024 13:39:11 +0100
From: Vincent Donnefort <vdonnefort@...gle.com>
To: John Stultz <jstultz@...gle.com>
Cc: rostedt@...dmis.org, mhiramat@...nel.org,
	linux-trace-kernel@...r.kernel.org, maz@...nel.org,
	oliver.upton@...ux.dev, kvmarm@...ts.linux.dev, will@...nel.org,
	qperret@...gle.com, kernel-team@...roid.com,
	linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
	Stephen Boyd <sboyd@...nel.org>,
	"Christopher S. Hall" <christopher.s.hall@...el.com>,
	Richard Cochran <richardcochran@...il.com>,
	Lakshmi Sowjanya D <lakshmi.sowjanya.d@...el.com>
Subject: Re: [PATCH 09/13] KVM: arm64: Add clock for hyp tracefs

On Fri, Sep 13, 2024 at 04:21:05PM -0700, 'John Stultz' via kernel-team wrote:
> On Wed, Sep 11, 2024 at 2:31 AM Vincent Donnefort <vdonnefort@...gle.com> wrote:
> >
> > Configure the hypervisor tracing clock before starting tracing. For
> > tracing purposes, the boot clock is interesting as it doesn't stop on
> > suspend. However, it is corrected on a regular basis, which implies we
> > need to re-evaluate it every once in a while.
> >
> > Cc: John Stultz <jstultz@...gle.com>
> > Cc: Thomas Gleixner <tglx@...utronix.de>
> > Cc: Stephen Boyd <sboyd@...nel.org>
> > Cc: Christopher S. Hall <christopher.s.hall@...el.com>
> > Cc: Richard Cochran <richardcochran@...il.com>
> > Cc: Lakshmi Sowjanya D <lakshmi.sowjanya.d@...el.com>
> > Signed-off-by: Vincent Donnefort <vdonnefort@...gle.com>
> >
> ...
> > +static void __hyp_clock_work(struct work_struct *work)
> > +{
> > +       struct delayed_work *dwork = to_delayed_work(work);
> > +       struct hyp_trace_buffer *hyp_buffer;
> > +       struct hyp_trace_clock *hyp_clock;
> > +       struct system_time_snapshot snap;
> > +       u64 rate, delta_cycles;
> > +       u64 boot, delta_boot;
> > +       u64 err = 0;
> > +
> > +       hyp_clock = container_of(dwork, struct hyp_trace_clock, work);
> > +       hyp_buffer = container_of(hyp_clock, struct hyp_trace_buffer, clock);
> > +
> > +       ktime_get_snapshot(&snap);
> > +       boot = ktime_to_ns(snap.boot);
> > +
> > +       delta_boot = boot - hyp_clock->boot;
> > +       delta_cycles = snap.cycles - hyp_clock->cycles;
> > +
> > +       /* Compare hyp clock with the kernel boot clock */
> > +       if (hyp_clock->mult) {
> > +               u64 cur = delta_cycles;
> > +
> > +               cur *= hyp_clock->mult;
> 
> Mult overflow protection (I see you already have a max_delta value) is
> probably needed here.

With max_delta in place, that should never really happen. But I could add a
WARN_ON() and fall back to a 128-bit computation here too?
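
A minimal userspace sketch of the guarded conversion discussed above, not the
patch code. The helper names cyc_to_ns() and hyp_max_delta() are hypothetical,
and unsigned __int128 is a GCC/Clang extension standing in for whatever 128-bit
helper the kernel side would use (mul_u64_u32_shr() from <linux/math64.h> is
one candidate). The fast path mirrors the 64-bit multiply in the patch; the
slow path only runs when delta_cycles exceeds max_delta.

```c
#include <assert.h>
#include <stdint.h>

/* Same bound as the patch: largest delta whose product with mult fits in 64 bits, halved. */
static uint64_t hyp_max_delta(uint32_t mult)
{
	assert(mult != 0);
	return (UINT64_MAX / mult) >> 1;
}

/*
 * Hypothetical helper: convert a cycle delta to nanoseconds.
 * Fast path: plain 64-bit multiply, safe while delta_cycles <= max_delta.
 * Slow path: widen to 128 bits so the intermediate product cannot wrap.
 * The final result is still truncated to 64 bits.
 */
static uint64_t cyc_to_ns(uint64_t delta_cycles, uint32_t mult,
			  uint32_t shift, uint64_t max_delta)
{
	assert(shift < 64);

	if (delta_cycles <= max_delta)
		return (delta_cycles * mult) >> shift;

	/* In the kernel this is where a WARN_ON() could sit. */
	return (uint64_t)(((unsigned __int128)delta_cycles * mult) >> shift);
}
```

With mult = 4 and shift = 2, a delta of 0x7FFFFFFFFFFFFFFF is above max_delta;
a plain 64-bit multiply would wrap, while the widened path returns the delta
unchanged.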
> 
> > +               cur >>= hyp_clock->shift;
> > +               cur += hyp_clock->boot;
> > +
> > +               err = abs_diff(cur, boot);
> > +
> > +               /* No deviation, only update epoch if necessary */
> > +               if (!err) {
> > +                       if (delta_cycles >= hyp_clock->max_delta)
> > +                               goto update_hyp;
> > +
> > +                       goto resched;
> > +               }
> > +
> > +               /* Warn if the error is above tracing precision (1us) */
> > +               if (hyp_buffer->tracing_on && err > NSEC_PER_USEC)
> > +                       pr_warn_ratelimited("hyp trace clock off by %lluus\n",
> > +                                           err / NSEC_PER_USEC);
> 
> I'm curious in practice, does this come up often? If so, does it
> converge down nicely? Have you done much disruption testing using
> adjtimex?

So far I haven't seen any error above ~100 ns on the machine I tested with.
But that's a good point: I'll check how it behaves when the boot clock is
less stable.
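
One possible way to run that kind of disruption test from userspace, as a
rough sketch only: skew the NTP frequency with adjtimex(2) and watch whether
the hyp clock warning fires. The helper names below are hypothetical. The
call needs CAP_SYS_TIME, and the kernel clamps the frequency offset to about
+/-500 ppm.

```c
#include <assert.h>
#include <string.h>
#include <sys/timex.h>

/* timex.freq is expressed in ppm with a 16-bit fractional part. */
static long ppm_to_timex_freq(long ppm)
{
	return ppm * 65536;
}

/*
 * Hypothetical helper: apply a frequency offset to the NTP-adjusted clocks.
 * Returns the clock state on success, -1 on error (e.g. missing CAP_SYS_TIME).
 */
static int set_freq_offset_ppm(long ppm)
{
	struct timex tx;

	/* Stay inside the range the kernel accepts without clamping. */
	assert(ppm >= -500 && ppm <= 500);

	memset(&tx, 0, sizeof(tx));
	tx.modes = ADJ_FREQUENCY;
	tx.freq = ppm_to_timex_freq(ppm);

	return adjtimex(&tx);
}
```

Alternating between, say, set_freq_offset_ppm(500) and
set_freq_offset_ppm(-500) while tracing is on would exercise the worst-case
drift between two runs of the delayed work.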

> 
> > +       }
> > +
> > +       if (delta_boot > U32_MAX) {
> > +               do_div(delta_boot, NSEC_PER_SEC);
> > +               rate = delta_cycles;
> > +       } else {
> > +               rate = delta_cycles * NSEC_PER_SEC;
> > +       }
> > +
> > +       do_div(rate, delta_boot);
> > +
> > +       clocks_calc_mult_shift(&hyp_clock->mult, &hyp_clock->shift,
> > +                              rate, NSEC_PER_SEC, CLOCK_MAX_CONVERSION_S);
> > +
> > +update_hyp:
> > +       hyp_clock->max_delta = (U64_MAX / hyp_clock->mult) >> 1;
> > +       hyp_clock->cycles = snap.cycles;
> > +       hyp_clock->boot = boot;
> > +       kvm_call_hyp_nvhe(__pkvm_update_clock_tracing, hyp_clock->mult,
> > +                         hyp_clock->shift, hyp_clock->boot, hyp_clock->cycles);
> > +       complete(&hyp_clock->ready);
> 
> I'm very forgetful, so maybe it's unnecessary, but for future-you or
> just others like me, it might be worth adding some extra comments to
> clarify the assumptions in these calculations.

Ack.
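
As an illustration of the assumptions in those calculations, here is a
userspace sketch, not the patch code. estimate_rate_hz() follows the two
branches of the rate computation in the patch; calc_mult_shift() is a
simplified rendition modelled on the kernel's clocks_calc_mult_shift(). Both
names, the 24 MHz counter and the 600 s conversion range used in the example
are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Estimate the counter rate in Hz from a cycle delta and a boot-clock delta.
 * Short intervals scale the cycles up to keep precision; this assumes
 * delta_cycles * NSEC_PER_SEC fits in 64 bits, which holds for intervals
 * under ~4.29 s and counters up to a few GHz.
 * Long intervals convert the divisor to whole seconds instead, so in both
 * branches the divisor fits in 32 bits, as do_div() requires in the kernel.
 */
static uint64_t estimate_rate_hz(uint64_t delta_cycles, uint64_t delta_boot_ns)
{
	uint64_t rate;

	assert(delta_boot_ns > 0);

	if (delta_boot_ns > UINT32_MAX) {
		delta_boot_ns /= NSEC_PER_SEC;
		rate = delta_cycles;
	} else {
		rate = delta_cycles * NSEC_PER_SEC;
	}

	return rate / delta_boot_ns;
}

/*
 * Pick the largest shift such that converting up to maxsec seconds worth
 * of cycles with (cycles * mult) >> shift does not overflow 64 bits.
 */
static void calc_mult_shift(uint32_t *mult, uint32_t *shift,
			    uint32_t from, uint32_t to, uint32_t maxsec)
{
	uint64_t tmp;
	uint32_t sft, sftacc = 32;

	/* Number of bits the largest cycle count needs above 32. */
	tmp = ((uint64_t)maxsec * from) >> 32;
	while (tmp) {
		tmp >>= 1;
		sftacc--;
	}

	/* Largest shift whose rounded multiplier still fits in sftacc bits. */
	for (sft = 32; sft > 0; sft--) {
		tmp = (uint64_t)to << sft;
		tmp += from / 2;
		tmp /= from;
		if ((tmp >> sftacc) == 0)
			break;
	}

	*mult = (uint32_t)tmp;
	*shift = sft;
}
```

For a 24 MHz counter and a 600 s range this gives mult = 699050667 and
shift = 24, so one second worth of cycles (24000000) converts back to
1000000000 ns.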

> 
> 
> thanks
> -john

Thanks for your time!

-- 
Vincent