Date:	Thu, 11 Nov 2010 15:41:19 -0800
From:	john stultz <johnstul@...ibm.com>
To:	Kyle Moffett <kyle@...fetthome.net>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Alexander Shishkin <virtuoso@...nd.org>,
	Valdis.Kletnieks@...edu, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Kay Sievers <kay.sievers@...y.org>, Greg KH <gregkh@...e.de>,
	Chris Friesen <chris.friesen@...band.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"Kirill A. Shutemov" <kirill@...temov.name>
Subject: Re: [PATCHv6 0/7] system time changes notification

On Thu, 2010-11-11 at 18:19 -0500, Kyle Moffett wrote:
> On Thu, Nov 11, 2010 at 17:50, Thomas Gleixner <tglx@...utronix.de> wrote:
> > On Thu, 11 Nov 2010, Kyle Moffett wrote:
> >> What about maybe adding device nodes for various kinds of "clock"
> >> devices?  You could then do:
> >>
> >> #define CLOCK_FD 0x80000000
> >> fd = open("/dev/clock/realtime", O_RDWR);
> >> poll(fd);
> >> clock_gettime(CLOCK_FD|fd, &ts);
> >
> > That won't work due to the posix-cputimers occupying the negative
> > number space already.
> 
> Hmm, looks like the manpages clock_gettime(2) et al. need updating,
> they don't mention anything at all about negative clockids.  The same
> thing could still be done with, e.g.:
> 
> #define CLOCK_FD 0x40000000

Again, see Richard's patch and the discussion around it for the various
complications here (the encoding is constrained by pid_t size limits and
runs into limits on the max number of fds per process).
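
To make the clash concrete, here is a minimal sketch of how CPU-time
clock ids are packed, modelled on the MAKE_PROCESS_CPUCLOCK() macro in
include/linux/posix-timers.h as I recall it. The sketch_* names are
hypothetical, and the arithmetic form stands in for the kernel's
(~pid << 3) | which so that no negative value is shifted.

```c
#include <stdint.h>

/* Low three bits select the clock type; this value is recalled from
 * include/linux/posix-timers.h and should be treated as an assumption. */
#define SKETCH_CPUCLOCK_SCHED 2

/* Equivalent to (~pid << 3) | which for which in 0..7, written with
 * plain arithmetic to stay within portable C. */
static int32_t sketch_make_cpuclock(int32_t pid, int32_t which)
{
    return (-pid - 1) * 8 + which;
}

/* Recover the low three bits (type plus per-thread flag). */
static int32_t sketch_cpuclock_which(int32_t clk)
{
    return ((clk % 8) + 8) % 8;
}

/* Recover the pid stored in the remaining upper bits. */
static int32_t sketch_cpuclock_pid(int32_t clk)
{
    int32_t q = (clk - sketch_cpuclock_which(clk)) / 8;
    return -q - 1;
}
```

Every such id is negative for a positive pid, and only the bits above
the low three are left for the pid (or for an fd in a similar scheme).
A clockid built as 0x80000000|fd is also negative as a 32-bit int, so
with fd 3 the decoder above would read it as a CPU clock for pid
268435455.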

> > This is very similar in spirit to what's being done by Richard Cochran's
> > dynamic clock devices code: http://lwn.net/Articles/413332/
> 
> Hmm, I've just been poking around and thinking about an extension of
> this concept.  Right now we have:
> 
> /sys/devices/system/clocksource
> /sys/devices/system/clocksource/clocksource0
> /sys/devices/system/clocksource/clocksource0/current_clocksource
> /sys/devices/system/clocksource/clocksource0/available_clocksource
> 
> Could we actually register the separate clocksources (hpet, acpi_pm,
> etc) in the device model properly?
> 
> Then consider the possibility of creating "virtual clocksources" which
> are measured against an existing clocksource.  They could be
> independently slewed and adjusted relative to the parent clocksource.
> Then the "UTS namespace" feature could also affect the current
> clocksource used for CLOCK_MONOTONIC, etc.
> 
> You could perform various forms of time-sensitive software testing
> without causing problems for a "make" process running elsewhere on the
> system.  You could test the operation of various kinds of software
> across large jumps or long periods of time (at a highly accelerated
> rate) without impacting your development environment.

This can already be done by registering a bogus clocksource that returns
a counter value that has been left-shifted (<<).

That said, the entire system will then see time run faster, and since
timer irqs are triggered off of other devices, and those devices' notion
of time would not be accelerated, the irqs would seem late. At extreme
values, this would cause system issues, like instant device timeouts.
Further, it wouldn't accelerate the cpu execution time, so applications
would seem to run very slowly.
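
As a rough user-space illustration of the bogus-clocksource trick (not
actual driver code), the sketch below uses the same cycles-to-ns shape
as the kernel, (cycles * mult) >> shift, and a read routine that shifts
the real counter left. The sketch_* names and the 1 MHz parameters are
hypothetical.

```c
#include <stdint.h>

/* Hypothetical parameters for a 1 MHz counter: 1000 ns per cycle. */
#define SKETCH_SHIFT 10u
#define SKETCH_MULT  (1000u << SKETCH_SHIFT)

/* Same shape as the kernel's conversion: (cycles * mult) >> shift. */
static uint64_t sketch_cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
    return (cycles * mult) >> shift;
}

/* A "bogus" read callback: report the real counter shifted left by
 * speedup_shift, so time appears to run 2^speedup_shift times faster. */
static uint64_t sketch_bogus_read(uint64_t real_cycles, unsigned speedup_shift)
{
    return real_cycles << speedup_shift;
}
```

With these values, 1000 real cycles convert to 1000000 ns, while the
same 1000 cycles passed through sketch_bogus_read(..., 2) convert to
4000000 ns. Nothing else in the system speeds up to match, which is
where the late irqs and device timeouts come from.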

At one time I looked at doing this in the other direction (slowing down
system time to emulate what a faster cpu would be like), but there are
tons of issues around the fact that there are numerous time domains in a
system that are all very close to actual time, so lots of assumptions
are made as if there is really only one time domain. So by speeding up
the system time, you break those assumptions between devices and things
don't function properly.

Again, you might be able to get away with very minor freq adjustments,
but that can easily be done by registering a clocksource with an
incorrect freq value.
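
A small sketch of the effect, assuming mult is derived from the declared
frequency the way I remember clocksource_hz2mult() doing it (round to
nearest of (NSEC_PER_SEC << shift) / hz). The sketch_* names and the
example frequencies are hypothetical.

```c
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL

/* Derive mult from a declared frequency, rounding to nearest. */
static uint32_t sketch_hz2mult(uint32_t hz, uint32_t shift)
{
    uint64_t tmp = SKETCH_NSEC_PER_SEC << shift;
    tmp += hz / 2;
    return (uint32_t)(tmp / hz);
}

/* Nanoseconds the system believes have elapsed after real_seconds when
 * the hardware ticks at true_hz but was registered as declared_hz. */
static uint64_t sketch_perceived_ns(uint32_t true_hz, uint32_t declared_hz,
                                    uint32_t real_seconds, uint32_t shift)
{
    uint64_t cycles = (uint64_t)true_hz * real_seconds;
    return (cycles * sketch_hz2mult(declared_hz, shift)) >> shift;
}
```

For example, a 1000000 Hz counter registered as 990000 Hz with shift 20
makes one real second look like roughly 1.01 seconds, i.e. the clock
runs about 1% fast.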

> One really nice example would be testing "ntpd" itself; you could run
> a known-good "ntpd" in the base system to maintain a very stable
> clock, then simulate all kinds of terrifyingly bad clock hardware and
> kernel problems (sudden frequency changes, etc) in a container.  This
> kind of stuff can currently only be easily simulated with specialized
> hardware.

Eh, this stuff is emulated in software frequently. 

Also, what you propose could easily be done via virtualization or a
hardware emulator, where you really can manage all the different time
domains properly.


> You could also improve "container-based" virtualization, allowing
> perceived "CPU-time" to be slewed based on the cgroup.  IE: Processes
> inside of a container allocated only "33%" of one CPU might see their
> "CPU-time" accrue 3 times faster than a process outside of the
> container, as though the process was the only thing running on the
> system.  Running "top" inside of the container might show 100% CPU
> even though the hardware is at 33% utilization, or 200% CPU if the
> container is currently bursting much higher.

I just don't see the real benefit to greatly complicating the
timekeeping code to keep track of multiple fake time domains when these
things can be achieved in other ways (emulation, or virtualization with
freq adjusted clocksources).

The only cases I see where exposing time domains other than system time
is a good thing are those where you actually need to interact precisely
with a device that is adjusted separately or runs on a different crystal
(as is the case with the PTP clock Richard is working on, or the clocks
on audio hardware).

thanks
-john

