linux-kernel - Re: [PATCH v3 3/3] arm64: Early boot time stamps

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CA+CK2bCZHCp7d3SyiE2UpZr=K-NMPcGPVS0d-iU_7_tfcVmaGw@mail.gmail.com>
Date:   Fri, 4 Jan 2019 11:23:14 -0500
From:   Pavel Tatashin <pasha.tatashin@...een.com>
To:     Marc Zyngier <marc.zyngier@....com>
Cc:     catalin.marinas@....com, Will Deacon <will.deacon@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        rppt@...ux.vnet.ibm.com, Michal Hocko <mhocko@...e.com>,
        Ard Biesheuvel <ard.biesheuvel@...aro.org>,
        andrew.murray@....com, james.morse@....com, sboyd@...nel.org,
        linux-arm-kernel@...ts.infradead.org,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 3/3] arm64: Early boot time stamps

Hi Marc,

Thank you for taking a look at this please see my replies below.

> > I think, given that on other platforms sched_clock() is already used
> > early, it is not a good idea to invent a different clock just for time
> > stamps.
>
> Square pegs vs round holes. Mimicking other architectures isn't always
> the right thing to do when faced with a different problem. We put a
> lot of effort in working around timer errata for a good reason, and
> feeding the rest of the system bogus timing information doesn't sound
> great.
>
> > We could limit arm64 approach only for chips where cntvct_el0 is
> > working: i.e. frequency is known, and the clock is stable, meaning
> > cannot go backward. Perhaps we would start early clock a little later,
> > but at least it will be available for the sane chips. The only
> > question, where during boot time this is known.
>
> How do you propose we do that? Defective timers can be a property of
> the implementation, of the integration, or both. In any case, it
> requires firmware support (DT, ACPI). All that is only available quite
> late, and moving it earlier is not easily doable.

OK, but could we at least whitelist something early with expectation
that the future chips won't be bogus?

> > Another approach is to modify sched_clock() in
> > kernel/time/sched_clock.c to never return backward value during boot.
> >
> > 1. Rename  current implementation of sched_clock() to sched_clock_raw()
> > 2. New sched_clock() would look like this:
> >
> > u64 sched_clock(void)
> > {
> >    if (static_branch(early_unstable_clock))
> >       return sched_clock_unstable();
> >    else
> >       return sched_clock_raw();
> > }
> >
> > 3. sched_clock_unstable() would look like this:
> >
> > u64 sched_clock_unstable(void)
> > {
> > again:
> >   static u64 old_clock;
> >   u64 new_clock = sched_clock_raw();
> >   static u64 old_clock_read =   READ_ONCE(old_clock);
> >   /* It is ok if time does not progress, but don't allow to go backward */
> >   if (new_clock < old_clock_read)
> >     return old_clock_read;
> >    /* update the old_clock value */
> >    if (cmpxchg64(&old_clock, old_clock_read, new_clock) != old_clock_read)
> >       goto again;
> >    return new_clock;
> > }
>
> You now have an "unstable" clock that is only allowed to move forward,
> until you switch to the real one. And at handover time, anything can
> happen.
>
> It is one thing to allow for the time stamping to be imprecise. But
> imposing the same behaviour on other parts of the kernel that have so
> far relied on a strictly monotonic sched_clock feels like a bad idea.

sched_clock() will still be strictly monotonic. During switch over we
will guarantee to continue from where the early clock left.

>
> What I'm proposing is that we allow architectures to override the hard
> tie between local_clock/sched_clock and kernel log time stamping, with
> the default being of course what we have today. This gives a clean
> separation between the two when the architecture needs to delay the
> availability of sched_clock until implementation requirements are
> discovered. It also keep sched_clock simple and efficient.
>
> To illustrate what I'm trying to argue for, I've pushed out a couple
> of proof of concept patches here[1]. I've briefly tested them in a
> guest, and things seem to work OK.

What I am worried is that decoupling time stamps from the
sched_clock() will cause uptime and other commands that show boot time
not to correlate with timestamps in dmesg with these changes. For them
to correlate we would still have to have a switch back to
local_clock() in timestamp_clock() after we are done with early boot,
which brings us back to using a temporarily unstable clock that I
proposed above but without adding an architectural hook for it. Again,
we would need to solve the problem of time continuity during switch
over, which is not a hard problem to solve, as we do it already in
sched_clock.c, and everytime clocksource changes.

During early boot time stamps project for x86 we were extra careful to
make sure that they stay the same.

Thank you,
Pasha