linux-kernel - Re: [PATCH] clockevents: Retry programming min delta up to 10 times

Message-ID: <20160425155124.GA22522@jhogan-linux.le.imgtec.org>
Date:	Mon, 25 Apr 2016 16:51:24 +0100
From:	James Hogan <james.hogan@...tec.com>
To:	Martin Schwidefsky <schwidefsky@...ibm.com>
CC:	Thomas Gleixner <tglx@...utronix.de>,
	<linux-kernel@...r.kernel.org>,
	Daniel Lezcano <daniel.lezcano@...aro.org>
Subject: Re: [PATCH] clockevents: Retry programming min delta up to 10 times

On Mon, Apr 25, 2016 at 03:48:58PM +0200, Martin Schwidefsky wrote:
> On Fri, 22 Apr 2016 11:40:11 +0100
> James Hogan <james.hogan@...tec.com> wrote:
> 
> > Under virtualisation it is possible to get unexpected latency during a
> > clockevent device's set_next_event() callback which can make it return
> > -ETIME even for a delta based on min_delta_ns.
> 
> Do you have an example for this behavior?

The place where I've observed it is arch/mips/kernel/cevt-r4k.c, which
returns -ETIME when the delta is so short that the timer has already
passed the programmed value by the time it is read back.
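
A minimal user-space sketch of that set-then-read-back check, for
illustration only. The sim_timer structure, its latency field and the
function name are hypothetical stand-ins for the hardware registers and
the driver callback; this is not the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a free-running counter with a compare register. */
struct sim_timer {
	uint32_t count;		/* free-running counter value */
	uint32_t compare;	/* event fires when count reaches this */
	uint32_t latency;	/* ticks that elapse while compare is written */
};

/*
 * Program an event 'delta' ticks ahead, then read the counter back.
 * If the counter has already reached or passed the target, the event
 * was missed, so report -ETIME.
 */
int sim_set_next_event(struct sim_timer *t, uint32_t delta)
{
	uint32_t cnt;

	assert(t != NULL);

	cnt = t->count + delta;
	t->compare = cnt;
	t->count += t->latency;	/* models time passing during the write */

	/* The signed difference keeps the comparison valid across wrap-around. */
	return ((int32_t)(t->count - cnt) >= 0) ? -ETIME : 0;
}
```

With a small latency the call succeeds. Once the simulated latency
exceeds delta (as an unlucky hypervisor preemption might cause), the
same delta yields -ETIME.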

I've also recently (Friday afternoon) seen a report of it apparently
happening with the MIPS GIC clockevent driver too
(drivers/clocksource/mips-gic-timer.c) which has similar logic, probably
copied from cevt-r4k, and this patch appeared to help (I still need to
confirm that one). That wasn't with virtualisation, but was on a
multithreaded core being stress tested, a case where it's also hard to
find a guaranteed min delta.

> I would call that a BUG in the implementation of the clockevent
> device, no?

Several drivers seem to do that. I'm open to alternatives. Do you think
the driver should retry itself when it detects this race may have been
hit?
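
For comparison, the bounded retry described in the patch (up to 10
attempts at the minimum delta, with no adjustment of the minimum) could
be sketched in user space roughly as follows. The sim_clockevent
structure, the flaky callback and all names here are hypothetical; only
the retry limit of 10 comes from the patch description.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_DELTA_RETRIES 10	/* retry limit from the patch description */

/* Hypothetical minimal device: a callback plus its minimum delta. */
struct sim_clockevent {
	int (*set_next_event)(uint64_t delta_ticks, struct sim_clockevent *dev);
	uint64_t min_delta_ticks;
	unsigned int failures_left;	/* demo knob: fail this many calls */
	unsigned int calls;		/* number of callback invocations */
};

/* Retry programming the minimum delta, without ever increasing it. */
int sim_program_min_delta(struct sim_clockevent *dev)
{
	int i;

	assert(dev != NULL && dev->set_next_event != NULL);

	for (i = 0; i < MIN_DELTA_RETRIES; i++) {
		if (dev->set_next_event(dev->min_delta_ticks, dev) == 0)
			return 0;
	}
	return -ETIME;	/* give up after MIN_DELTA_RETRIES attempts */
}

/* Demo callback that returns -ETIME a configurable number of times. */
int sim_flaky_set_next_event(uint64_t delta_ticks, struct sim_clockevent *dev)
{
	(void)delta_ticks;
	dev->calls++;
	if (dev->failures_left > 0) {
		dev->failures_left--;
		return -ETIME;
	}
	return 0;
}
```

A device whose callback always returns 0 pays for a single loop
iteration, while an occasional latency spike costs a few extra attempts
and leaves the minimum delta unchanged.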

> 
> > The clockevents_program_min_delta() implementation for
> > CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=n doesn't handle retries when this
> > happens, nor does clockevents_program_event() or its callers when force
> > is true (for example hrtimer_reprogram()). This can result in hangs
> > until the clock event device does a full period.
> 
> Is that because some clockevent devices can not program the minimum delta
> in some corner cases?

Yes.

I think it actually ended up causing an arithmetic overflow somewhere in
ktime_get() (I'd have to dig through my notes to find specifics)
which resulted in __iter_div_u64_rem() being given an excessively large
dividend, which effectively hung the CPU.
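
To illustrate why a huge dividend is so costly: __iter_div_u64_rem()
divides by repeated subtraction, so its running time grows with the
quotient. Below is a user-space sketch of that technique. The function
name and the extra iteration counter are hypothetical additions made to
expose the cost; they are not part of the kernel helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Division by repeated subtraction. Cheap when the quotient is small,
 * but the loop body runs once per unit of quotient.
 * 'iterations' is a hypothetical out-parameter that reports the cost.
 */
uint32_t sim_iter_div_u64_rem(uint64_t dividend, uint32_t divisor,
			      uint64_t *remainder, uint64_t *iterations)
{
	uint32_t ret = 0;
	uint64_t n = 0;

	assert(divisor != 0 && remainder != NULL && iterations != NULL);

	while (dividend >= divisor) {
		dividend -= divisor;
		ret++;
		n++;
	}

	*remainder = dividend;
	*iterations = n;
	return ret;
}
```

Splitting 2.5 seconds' worth of nanoseconds by 10^9 takes two
iterations. Assuming the same 10^9 divisor, a dividend close to 2^64
would need on the order of 10^10 iterations, which is consistent with a
CPU that appears to hang.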

Thanks
James

> 
> > It isn't appropriate to use MIN_ADJUST in this case as occasional
> > hypervisor induced high latency will cause min_delta_ns to quickly
> > increase to the maximum.
> 
> I agree, the whole minimum delta adjustment is quite broken on a virtualized
> system. On s390 we have seen the rise of the min_delta_ns to the maximum
> value due to a busy hypervisor.
> 
> > Instead, borrow the retry pattern from the MIN_ADJUST case, but without
> > making adjustments. We retry up to 10 times before giving up.
> 
> That will add a few unnecessary instructions for architectures that have a
> sane set_next_event function, namely those that always return 0. Should
> not be too bad though.
> 
> -- 
> blue skies,
>    Martin.
> 
> "Reality continues to ruin my life." - Calvin.
> 
