[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.11.1606141045191.5839@nanos>
Date: Tue, 14 Jun 2016 10:50:03 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: George Spelvin <linux@...encehorizons.net>
cc: edumazet@...gle.com, linux-kernel@...r.kernel.org,
peterz@...radead.org, richardcochran@...il.com
Subject: Re: [patch 13/20] timer: Switch to a non cascading wheel
On Tue, 14 Jun 2016, George Spelvin wrote:
> I think I see a buglet in your level-5 cascading.
>
> Suppose a timer is requested far in the future for a time
> that is an exact multiple of 32768 jiffies.
>
> collect_expired_timers() scans level 5 after all the previous ones,
> and will cascade it to level 0, in a level-0 bucket which has already
> been scanned, and won't be scanned again for 64 jiffies.
>
> I agree that 64 jiffies is well within your allowed rounding accuracy,
> and order of timer firing is not guaranteed when they're for the same
> time, but it is a bit odd when a timer fires 32 jiffies *before* another
> timer scheduled for 32 jiffies later. That's the sort of peculiarity
> that could lead to a subtle bug.
I thought about that and when looking at those long timeout thingies I came to
the conclusion that it's simply not worth the trouble.
> Wouldn't this all be so much simpler as
>
> #define LVL_BITS 6 /* Renamed previous LVL_SHIFT */
> #define LVL_SIZE (1 << LVL_BITS)
> #define LVL_MASK (LVL_BITS - 1)
> #define LVL_OFFS(n) ((n) * LVL_SIZE)
> #define LVL_SHIFT(n) ((n) * LVL_CLK_SHIFT)
> #define LVL_GRAN(n) (1 << LVL_SHIFT(n))
Indeed.
> Ideally, you'd like all of that
>
> + if (delta < LVL1_TSTART) {
> + idx = (expires + LVL0_GRAN) & LVL_MASK;
> + } else if (delta < LVL2_TSTART) {
> + idx = calc_index(expires, LVL1_GRAN, LVL1_SHIFT, LVL1_OFFS);
> + } else if (delta < LVL3_TSTART) {
> + idx = calc_index(expires, LVL2_GRAN, LVL2_SHIFT, LVL2_OFFS);
> + } else if (delta < LVL4_TSTART) {
> + idx = calc_index(expires, LVL3_GRAN, LVL3_SHIFT, LVL3_OFFS);
> + } else if (delta < LVL5_TSTART) {
> + idx = calc_index(expires, LVL4_GRAN, LVL4_SHIFT, LVL4_OFFS);
>
> to be replaced with __builtin_clz or similar:
Except that __fls() is noticeably slower than the if chain.
> > +static inline void detach_expired_timer(struct timer_list *timer)
> > {
> > detach_timer(timer, true);
> > - if (!(timer->flags & TIMER_DEFERRABLE))
> > - base->active_timers--;
> > - base->all_timers--;
> > }
>
> Is there even a reason to have this wrapper any more? Why not
> just replace all calls to it in the source?
That just happened to stay there for no particular reason.
> > + timer = hlist_entry(head->first, struct timer_list, entry);
> > + fn = timer->function;
> > + data = timer->data;
> > +
> > + timer_stats_account_timer(timer);
> > +
> > + base->running_timer = timer;
> > + detach_expired_timer(timer);
>
> Is there some non-obvious reason that you have to fetch fn and data
> so early? It seems like a register pressure pessimization, if the
> compiler can't figure out that timer_stats code can't change them.
>
> The cache line containing this timer was already prefetched when you
> updated its entry.pprev as part of removing the previous entry from
> the list.
>
> I see why you want to fetch them with the lock held in case there's some
> freaky race, but I'd do it all after detach_timer().
That's not new code. We kept the ordering, but yes, we definitely can turn
that around. The only restriction is that we get it before releasing the lock.
Thanks,
tglx
Powered by blists - more mailing lists