lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID:
 <MW5PR13MB5632A34DAEDD321E9813F997FDDFA@MW5PR13MB5632.namprd13.prod.outlook.com>
Date: Thu, 27 Nov 2025 00:16:23 +0000
From: "Bird, Tim" <Tim.Bird@...y.com>
To: Geert Uytterhoeven <geert@...ux-m68k.org>,
        Francesco Valla
	<francesco@...la.it>
CC: Tim Bird <tim@...d.org>, "pmladek@...e.com" <pmladek@...e.com>,
        Steve
 Rostedt <rostedt@...dmis.org>,
        "john.ogness@...utronix.de"
	<john.ogness@...utronix.de>,
        "senozhatsky@...omium.org"
	<senozhatsky@...omium.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML
	<linux-kernel@...r.kernel.org>,
        Linux Embedded
	<linux-embedded@...r.kernel.org>
Subject: RE: [PATCH] printk: add early_counter_ns routine for printk blind
 spot



> -----Original Message-----
> From: Geert Uytterhoeven <geert@...ux-m68k.org>
> Hi all,
> 
> On Wed, 26 Nov 2025 at 03:24, Francesco Valla <francesco@...la.it> wrote:
> > On Mon, Nov 24, 2025 at 10:30:52PM -0700, Tim Bird wrote:
> > > From: Tim Bird <tim.bird@...y.com>
> > >
> > > During early boot, printk timestamps are reported as zero,
> > > which creates a blind spot in early boot timings.  This blind
> > > spot hinders timing and optimization efforts for code that
> > > executes before time_init(), which is when local_clock() is
> > > initialized sufficiently to start returning non-zero timestamps.
> > > This period is about 400 milliseconds for many current desktop
> > > and embedded machines running Linux.
> > >
> > > Add an early_counter_ns function that returns nanosecond
> > > timestamps based on get_cycles().  get_cycles() is operational
> > > on arm64 and x86_64 from kernel start.  Add some calibration
> > > printks to allow setting configuration variables that are used
> > > to convert cycle counts to nanoseconds (which are then used
> > > in early printks).  Add CONFIG_EARLY_COUNTER_NS, as well as
> > > some associated conversion variables, as new kernel config
> > > variables.
> > >
> > > After proper configuration, this yields non-zero timestamps for
> > > printks from the very start of kernel execution.  The timestamps
> > > are relative to the start of the architecture-specific counter
> > > used in get_cycles (e.g. the TSC on x86_64 and cntvct_el0 on arm64).
> > > This means that the time reported reflects time-from-power-on for
> > > most embedded products.  This is also a useful data point for
> > > boot-time optimization work.
> > >
> > > Note that there is a discontinuity in the timestamp sequencing
> > > when standard clocks are finally initialized in time_init().
> > > The printk timestamps are thus not monotonically increasing
> > > through the entire boot.
> >
> > This is... not going to work, IMO, and might lead to breakages in
> > userspace tools (are printk timings a userspace API?).
> 
> I think they are.
> 
> Another approach would be to defer the calibration/conversion to
> userspace, and make sure the early part stands out.
> I.e. when real timekeeping is available, kernel messages are prefixed by
> "[%5lu.%06lu]".  Early messages could be prefixed by a plain integer
> "[%12u]", containing the raw cycle counter value.
> The presence of the decimal point would make the difference obvious.

I thought about this while I was creating this.
It wouldn't require the extra configuration for MULT and SHIFT (which would be nice),
and it would be, as you say, very obvious that this was not a regular timestamp. 
This means it could be enabled on a generic kernel (making more likely it could be
enabled by default). And really only boot-time optimizers would care enough to
decode the data, so the onus would be on them to run the tool.  Everyone else
could ignore them.

I'm not sure if it would break existing printk-processing tools.  I suspect it would.

Also,  I find that post-processing tools often get overlooked.
I asked at ELC this year how many people are using show_delta, which 
has been upstream for years, and can do a few neat things with printk timestamps,
and not a single person had even heard of it.

In this scenario, you would still need to have the calibration printks in
the code so that the tool could pull them out to then convert the cycle-valued
printks into printks with regular timestamps.

I could see doing this if people object to the non-genericity of the current
solution.

> 
> > I actually have a counter-proposal: the time obtained through cycle
> > evaluation is used as an offset to be added to the printk time after
> > time_init() is called. A (working, but maybe sub-optimal) patch to
> > obtain this is attached at the end.
> 
> Oh, that's a nice idea, too!

Thanks for the feedback!
 -- Tim

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ