lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100218162128.GA16827@shamino.rdu.redhat.com>
Date:	Thu, 18 Feb 2010 11:21:28 -0500
From:	Neil Horman <nhorman@...driver.com>
To:	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc:	Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...e.hu>,
	ltt-dev@...ts.casi.polymtl.ca, linux-kernel@...r.kernel.org,
	Steven Rostedt <rostedt@...dmis.org>,
	Martin Schwidefsky <schwidefsky@...ibm.com>
Subject: Re: add_timer_on: in-kernel users _all_ buggy ?

On Thu, Feb 18, 2010 at 09:28:00AM -0500, Mathieu Desnoyers wrote:
> * Mathieu Desnoyers (compudj@...stal.dyndns.org) wrote:
> > * Thomas Gleixner (tglx@...utronix.de) wrote:
> > > On Tue, 16 Feb 2010, Mathieu Desnoyers wrote:
> > > > > The function is called from an IPI. That's a LTTNG problem, not a RT one.
> > > > 
> > > > I use del_timer in IPI to delete lttng per-cpu timers on all CPUs. I
> > > > have to do this because timers created with add_timer_on are documented
> > > > to be incompatible with del_timer_sync():
> > > > 
> > > >  * Synchronization rules: Callers must prevent restarting of the timer,
> > > >  * otherwise this function is meaningless. It must not be called from
> > > >  * interrupt contexts. The caller must not hold locks which would prevent
> > > >  * completion of the timer's handler. The timer's handler must not call
> > > >  * add_timer_on(). Upon exit the timer is not queued and the handler is
> > > >  * not running on any CPU.
> > > 
> > > Errm. The documentation says: 
> > > 
> > >       "The timer's handler must not call add_timer_on()." 
> > > 
> > > It's not talking about a timer which was initialized with
> > > add_timer_on().
> > > 
> > >  And your per cpu timer handlers have no requirement to call
> > > add_timer_on() simply because add/mod_timer() is requeueing the timer
> > > on the same cpu on which the handler runs. 
> > > 
> > > So the IPI is just a solution for a non existing problem.
> > 
> > Oh, right. Thanks for the explanation. I'll look into moving LTTng to a
> > saner del_timer_sync() scheme to delete the timers.
> 
> Double-checking this:
> 
> add_timer_on() needs to be paired with mod_timer_pinned(), otherwise
> NO_HZ SMP config can rebalance the timer to a different CPU. I am fixing
> this in lttng 0.194. These per-cpu timers, of course, should usually be
> deferrable (they are in lttng).
> 
> (looking at kernel 2.6.32.4 here)
> Looking at the kernel/time/clocksource.c watchdog, I wonder how
> del_timer manages to synchronize the timer teardown. The handler,
> clocksource_watchdog(), uses add_timer_on(), which prohibits using
> del_timer_sync(). This seems rather odd. If we remove the watchdog and
> re-add it, it may still be in use while we initialize the timer
> structure.
> 
> Also, net/core/drop_monitor.c trace_drop_common usage of add_timer_on
> seems odd:
> 
> Executing (AFAIK) with preempt on, data points to a per-cpu timer:
> 
>        if (!timer_pending(&data->send_timer)) {
>                 data->send_timer.expires = jiffies + dm_delay * HZ;
>                 add_timer_on(&data->send_timer, smp_processor_id());
>        }
> 
> How is timer_pending synchronized with the target CPU timer wheel ?
> 
Hm, I think I see your point here.  You're suggesting that a call to one of the
tracepoint hooks in the drop monitor can race against a second call to the hook
from an interrupt context that pre-empted the first, leading to double add of
the timer?  I agree, in fact I think its likely worse that that, the shared data
on the skb that I modify there can get corrupted in that case as well.  I expect
a bit of refactoring paired with a local_irq_save/restore should fix that.

Thanks!
Neil

> Wait, there's more: arch/x86/kernel/cpu/mcheck/mce.c uses both
> add_timer_on in its handler and del_timer_sync (which is incorrect).
> 
> arch/x86/kernel/apic/x2apic_uv_x.c almost has it right, but maybe it
> should use del_timer_sync ?
> 
> arch/powerpc/platforms/chrp/setup.c should learn about
> mod_timer_pinned().
> 
> Which leads to the following question: is there _any_ add_timer_on()
> kernel user that's not currently buggy ? ;-) Maybe this calls for better
> documentation of this interface. From what I've learn from digging into
> cpufreq to debug its incorrect timer teardown last year, I fear there
> are lots and lots of buggy per-cpu _and_ standard timer interface users
> out there.
> 
> Maybe adding some debugging options, e.g. checking that a timer created
> with add_timer_on is always modified by mod_timer_pinned, and is always
> deferrable, and deleted by del_timer_sync could help discovering a
> couple of outlawyer.
> 
> Thanks,
> 
> Mathieu
> 
> 
> > 
> > Thanks,
> > 
> > Mathieu
> > 
> > > 
> > > Thanks,
> > > 
> > > 	tglx
> > > 
> > 
> > -- 
> > Mathieu Desnoyers
> > OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
> > 
> > _______________________________________________
> > ltt-dev mailing list
> > ltt-dev@...ts.casi.polymtl.ca
> > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> > 
> 
> -- 
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ