Message-ID: <87k0v7kwdc.fsf@nanos.tec.linutronix.de>
Date: Sat, 31 Oct 2020 02:38:39 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Carlos O'Donell <carlos@...hat.com>,
Zack Weinberg <zackw@...ix.com>, Cyril Hrubis <chrubis@...e.cz>
Cc: Dmitry Safonov <dima@...sta.com>, Andrei Vagin <avagin@...il.com>,
GNU C Library <libc-alpha@...rceware.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [Y2038][time namespaces] Question regarding CLOCK_REALTIME support plans in Linux time namespaces
Carlos,
On Fri, Oct 30 2020 at 18:19, Carlos O'Donell wrote:
> On 10/30/20 4:06 PM, Thomas Gleixner wrote:
>> On Fri, Oct 30 2020 at 12:58, Carlos O'Donell wrote:
>>> I expect that more requests for further time isolation will happen
>>> given the utility of this in containers.
>>
>> There was a lengthy discussion about this and the only "usecase" which
>> was brought up was having different NTP servers in name spaces, i.e. the
>> leap second ones and the smearing ones.
>
> In the non-"request for ponies" category:
>
> * Running legacy 32-bit applications in containers with CLOCK_REALTIME set
> to some value below y2038.
That's broken to begin with. The same thing was tried with Y2K and it
failed miserably.
Any real application which needs CLOCK_REALTIME needs access to
something which is at least close to the actual time.
> * Testing kernel and userspace clock handling code without needing to
> run on bare-metal, VM, or other.
I grant you that, but it opens a large can of worms, as it invites
'requests for ponies' all over the place.
>> Now imagine 1000 containers, each running their own NTP. Guess what
>> the host does in each timer interrupt? It chases 1000 containers and
>> updates their notion of CLOCK_REALTIME. In the remaining 5% of CPU
>> time the 1000 containers can do their computations.
>
> How is this different than balancing any other resource that you give
> to a container/vm on a host?
>
> Can you enable 1000 containers running smbd/nmbd and expect to get
> great IO performance?
That's bogus. The kernel can control whether these daemons run at all
and how much CPU time they get, just as it can control whether any
container application runs or not.
But when it comes to time correctness, that's a different story. The
moment the kernel allows a gazillion different notions of
CLOCK_REALTIME, it has to guarantee the correctness of every one of
them, no matter what.
>> But even if you restrict it to a trivial offset without NTP
>> capabilities, what's the semantics of that offset when the host time is
>> set?
>
> Now you're talking about an implementation. This thread is simply
> "Would we implement CLOCK_REALTIME?" Is the answer "Maybe, if we solve
> all these other problems?"
Maybe, if you solved all these problems, which will take about 20 years
to sort out even at the theoretical level. As I plan to be retired by
then and Y2038 will have passed, feel free to pursue that route.
>>> If we have to use qemu today then that's where we're at, but again
>>> I expect our use case is representative of more than just glibc.
>>
>> For testing purposes it might be. For real world use cases not so
>> much. People tend to rely on the coordinated nature of CLOCK_TAI and
>> CLOCK_REALTIME.
>
> Except we have two real world use cases, at the top of this email,
> that could extend to a lot of software. We know legacy 32-bit
> applications exist that will break with CLOCK_REALTIME past
> y2038. Software exists that manipulates time and needs testing
> with specific time values e.g. month crossings, day crossings,
> leap year crossings, etc.
Again: I agree with the testing part, but the legacy application part
is wishful thinking at best. IMO it's utter nonsense.
Coming back to your test coverage argument. I really don't see a problem
with the requirement of having qemu installed in order to run 'make
check'.
If you can't ask that of your contributors, then asking me to provide
namespace magic for you is just hilarious. The contributor who refuses
to install qemu will also insist on running some last-century kernel
which does not even know about namespaces at all.
Instead of asking for ponies, your time might be better spent providing
tools which make it easy to run 'make check' with all the bells and
whistles.
Virtualization is the right answer to the testing problem, and if
people really insist on running their broken legacy apps past 2038,
then stick them into a VM and charge boatloads of money for that
service.
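FWIW, qemu already has the relevant knob for the testing part. A
minimal sketch (the kernel image and initrd names here are made up; the
exact invocation depends on your test harness):

  qemu-system-x86_64 -rtc base=2038-01-19T03:10:00 \
          -kernel bzImage -initrd test-initrd.img -append console=ttyS0

-rtc base= accepts an arbitrary start date, so crossing Y2038, month,
day or leap year boundaries is a one-line change in the test runner.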
>>> Does checkpointing work robustly when userspace APIS use
>>> CLOCK_REALTIME (directly or indirectly) in the container?
>>
>> AFAICT, yes. That was the conclusion of the lengthy discussion about
>> time namespaces and their requirements.
>
> If this is the case then have we established behaviours that
> happen when such processes are migrated to other systems with
> different CLOCK_REALTIME clocks? Would these behaviours serve
> as the basis of how CLOCK_REALTIME in a namespace would behave?
>
> That is to say that migrating a container to a system with a
> different CLOCK_REALTIME should behave similarly to what happens
> when CLOCK_REALTIME is changed locally and you have a container
> with a unique CLOCK_REALTIME?
Any application has to be able to deal with CLOCK_REALTIME changing
under its feet no matter what. So why would migrating a container from
host A to host B, which have different notions of CLOCK_REALTIME, make
any difference?
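Code which measures intervals against CLOCK_REALTIME is broken by
design anyway; intervals belong on CLOCK_MONOTONIC. A trivial
illustration:

  #include <stdio.h>
  #include <time.h>
  #include <unistd.h>

  int main(void)
  {
          struct timespec start, end;

          /* CLOCK_MONOTONIC cannot be set and never jumps, so the
           * measured interval stays correct even if CLOCK_REALTIME
           * is stepped in the middle of the sleep.
           */
          clock_gettime(CLOCK_MONOTONIC, &start);
          sleep(1);                       /* stand-in for real work */
          clock_gettime(CLOCK_MONOTONIC, &end);

          printf("elapsed: %.3f s\n",
                 (end.tv_sec - start.tv_sec) +
                 (end.tv_nsec - start.tv_nsec) / 1e9);
          return 0;
  }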
Please stop abusing container migration, which works perfectly fine now
that the real timekeeping problems (CLOCK_MONOTONIC and CLOCK_BOOTTIME
going backwards) are solved, as an argument for something which can and
should be solved entirely in user space.
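For reference, a sketch of what the time namespace already provides
(Linux 5.6+; needs CAP_SYS_ADMIN, and the offsets have to be written
before the first process enters the namespace). Only the monotonic and
boottime offsets are accepted, which is exactly the point:

  #define _GNU_SOURCE
  #include <sched.h>
  #include <stdio.h>
  #include <sys/wait.h>
  #include <time.h>
  #include <unistd.h>

  int main(void)
  {
          struct timespec ts;
          FILE *f;

          /* Create a new time namespace. The caller stays in its old
           * namespace; children enter the new one.
           */
          if (unshare(CLONE_NEWTIME)) {
                  perror("unshare");
                  return 1;
          }

          /* Only "monotonic" and "boottime" lines are accepted here.
           * Writing a "realtime" offset fails, which is what this
           * whole thread is about.
           */
          f = fopen("/proc/self/timens_offsets", "w");
          if (!f) {
                  perror("fopen");
                  return 1;
          }
          fprintf(f, "monotonic 86400 0\n");     /* +1 day */
          fclose(f);

          if (!fork()) {
                  clock_gettime(CLOCK_MONOTONIC, &ts);
                  printf("namespace monotonic: %lld s\n",
                         (long long)ts.tv_sec);
                  _exit(0);
          }
          wait(NULL);
          return 0;
  }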
1) Testing
Virtualization solves that problem. Creating tools which handle that
conveniently for your users/contributors is not rocket science.
2) Legacy applications
It does not matter at all whether you stick the application into a
container which tells the kernel that it runs in some different time
universe, or whether you start the very same application with a libc
variant which uses the Y2038-aware interfaces of the kernel and
pretends to be in the pre-Y2038 time universe when handing time down
to the application (see the sketch below).
If you have a bunch of applications which all suffer from the same
problem and are completely disconnected from the real-world notion of
CLOCK_REALTIME, then stick them into a VM and be done with it.
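Coming back to the libc variant under 2): that does not even need a
full libc. A completely hypothetical LD_PRELOAD shim (the 20 year
shift and the shim itself are made up for illustration, not a
recommendation):

  #define _GNU_SOURCE
  #include <dlfcn.h>
  #include <time.h>

  /* Shift CLOCK_REALTIME back by roughly 20 years for a legacy
   * binary. Purely illustrative; a real solution belongs in the
   * libc compat layer.
   */
  #define SHIFT ((time_t)20 * 365 * 86400)

  int clock_gettime(clockid_t clk, struct timespec *ts)
  {
          static int (*real)(clockid_t, struct timespec *);
          int ret;

          if (!real)
                  real = (int (*)(clockid_t, struct timespec *))
                          dlsym(RTLD_NEXT, "clock_gettime");

          ret = real(clk, ts);
          if (!ret && clk == CLOCK_REALTIME)
                  ts->tv_sec -= SHIFT;
          return ret;
  }

Build with 'gcc -shared -fPIC shim.c -o shim.so -ldl' and run the
legacy binary with LD_PRELOAD=./shim.so. Whether you lie to the
application this way or via a namespace, the lie is the same. Only the
place differs, and this place does not need a single kernel change.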
Just because something could be solved at the kernel level does not mean
that it is the right thing to do.
Thanks,
tglx