linux-kernel - Re: [RFC] HZ free ntp

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <1166037549.6425.21.camel@localhost.localdomain>
Date:	Wed, 13 Dec 2006 11:19:09 -0800
From:	john stultz <johnstul@...ibm.com>
To:	Roman Zippel <zippel@...ux-m68k.org>
Cc:	Ingo Molnar <mingo@...e.hu>, Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...l.org>, linux-kernel@...r.kernel.org
Subject: Re: [RFC] HZ free ntp

On Wed, 2006-12-13 at 14:47 +0100, Roman Zippel wrote:
> Hi,
> 
> On Tue, 12 Dec 2006, john stultz wrote:
> 
> > Basically INTERVAL_LENGTH_NSEC defines the NTP interval length that the
> > time code will use to accumulate with. In this patch I've pushed it out
> > to a full second, but it could be set via config (NSEC_PER_SEC/HZ for
> > regular systems, something larger for systems using dynticks).
> 
> Why do you want to use such an interval? This makes everything only more
> complicated.
> The largest possible interval is freq cycles (or 1 second without
> adjustments). That is the base interval and without redesigning NTP we
> can't change that. This base interval can be subdivided into smaller
> intervals for incremental updates.

Indeed, larger then 1 second intervals would require the second_overflow
code to be reworked too. However, I'm not proposing larger then 1 second
intervals at this point. I'm just allowing for non-HZ intervals.

> You cannot choose arbitrary intervals otherwise you get other problems,
> e.g. with your patch time_offset handling is broken.

I'm not seeing this yet. Any more details? 

> > +	/* calculate the length of one NTP adjusted second */
> > +	second_length = (u64)(tick_usec * NSEC_PER_USEC * USER_HZ);
> > +	second_length += (s64)CLOCK_TICK_ADJUST;
> > +	adj_length = (s64)time_freq;
> > +
> > +	/* calculate tick length @ HZ*/
> > +	tick_length = (second_length << TICK_LENGTH_SHIFT)
> > +			+ (adj_length << (TICK_LENGTH_SHIFT - SHIFT_NSEC));
> > +	do_div(tick_length, HZ);
> > +	tick_nsec = tick_length >> TICK_LENGTH_SHIFT;
> > +
> > +
> > +	/* calculate interval_length_base */
> > +	/* XXX - this is broken up to avoid 64bit overlfows */
> > +	interval_length_base = second_length * INTERVAL_LENGTH_NSEC;
> > +	interval_length_base <<= 2;
> > +	do_div(interval_length_base, NSEC_PER_SEC);
> > +	interval_length_base <<= TICK_LENGTH_SHIFT-2;
> 
> You don't have to introduce anything new, it's tick_length that changes
> and HZ that becomes a variable in this function.

So, forgive me for rehashing this, but it seems we're cross talking
again. The context here is the dynticks code. Where HZ doesn't change,
but we get interrupts at much reduced rates. The problem is, if the
interrupts slow to ~1 per second frequencies, we end up spending a ton
of time in the main update_wall_time loop, processing that 1 second in
1/HZ chunks. The current code is correct, but just wastes a lot of time.

So I'm trying to come up w/ an approach (the earlier, and broken,
exponential accumulation patch had the same intention), that allows us
to accumulate time in larger chunks. However, in doing so we have to
work w/ the ntp.c code which (as Ingo earlier mentioned) has a number of
HZ based assumptions.

This last patch tries to give NTP and the timekeeping an agreed upon
chunk (I chose "interval" as the term, since "tick" is somewhat
connected to HZ) of time upon which accumulation is done, so we can have
courser second updates, or finer grained updates, depending on config
settings.

In fairness, the patch probably going about this in a less then perfect
way, but that's why I'm asking for your feedback and suggestions. What
are your thoughts for reducing the time spent in the update_wall_time
loop in the context of dynticks?

thanks
-john

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/