lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 3 Apr 2013 15:03:41 +0200
From:	Christian Ruppert <christian.ruppert@...lis.com>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	Vineet Gupta <Vineet.Gupta1@...opsys.com>,
	Pierrick Hascoet <pierrick.hascoet@...lis.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>
Subject: Re: [PATCH] timer: Fix possible issues with non serialized
 timer_pending( )

Hello Vineet,

In the following a patch for a seemingly unrelated problem (RB-trees in
hrtimer implementation) which seems to fix the original problem at the
same time. For the moment, this patch is insufficiently tested but I am
posting it here (preliminarily) since it seems to be in line with
Thomas' comment: The implementation of irqsave/restore in other
architectures (MIPS, ARM) imply memory barriers and adding this implicit
barrier to ARC code as well seems to stabilise some of our remaining
crashes.

Greetings,
  Christian

On Wed, Apr 03, 2013 at 02:36:59PM +0200, Thomas Gleixner wrote:
> Vineet,
> 
> On Fri, 29 Mar 2013, Vineet Gupta wrote:
> 
> > When stress testing ARC Linux from 3.9-rc3, we've hit a serialization
> > issue when mod_timer() races with itself. This is on a FPGA board and
> > kernel .config among others has !SMP and !PREEMPT_COUNT.
> > 
> > The issue happens in mod_timer( ) because timer_pending( ) based early
> > exit check is NOT done inside the timer base spinlock - as a networking
> > optimization.
> > 
> > The value used in there, timer->entry.next is also used further in call
> > chain (all inlines though) for actual list manipulation. However if the
> > register containing this pointer remains live across the spinlock (in a
> > UP setup with !PREEMPT_COUNT there's nothing forcing gcc to reload) then
> > a stale value of next pointer causes incorrect list manipulation,
> > observed with following sequence in our tests.
> > 
> > (0). tv1[x] <----> t1 <---> t2
> > (1). mod_timer(t1) interrupted after it calls timer_pending()
> > (2). mod_timer(t2) completes
> > (3). mod_timer(t1) resumes but messes up the list.
> > (4). __runt_timers( ) uses bogus timer_list entry / crashes in
> >      timer->function
> > 
> > The simplest fix is to NOT rely on spinlock based compiler barrier but
> > add an explicit one in timer_pending()
> 
> That's simple, but dangerous. There is other code which relies on the
> implicit barriers of spinlocks, so I think we need to add the barrier
> to the !PREEMPT_COUNT implementation of preempt_*() macros.
> 
> Thanks,
> 
> 	tglx
> 
> > FWIW, the relevant ARCompact disassembly of mod_timer which clearly
> > shows the issue due to register reuse is:
> > 
> > mod_timer:
> >     push_s blink
> >     mov_s r13,r0	# timer, timer
> > 
> > ...
> >     ###### timer_pending( )
> >     ld_s r3,[r13]       # <------ <variable>.entry.next LOADED
> >     brne r3, 0, @.L163
> > 
> > .L163:
> > ....
> >     ###### spin_lock_irq( )
> >     lr  r5, [status32]  # flags
> >     bic r4, r5, 6       # temp, flags,
> >     and.f 0, r5, 6      # flags,
> >     flag.nz r4
> > 
> >     ###### detach_if_pending( ) begins
> > 
> >     tst_s r3,r3  <--------------
> > 			# timer_pending( ) checks timer->entry.next
> >                         # r3 is NOT reloaded by gcc, using stale value
> >     beq.d @.L169
> >     mov.eq r0,0
> > 
> >     #  detach_timer( ): __list_del( )
> > 
> >     ld r4,[r13,4]    	# <variable>.entry.prev, D.31439
> >     st r4,[r3,4]     	# <variable>.prev, D.31439
> >     st r3,[r4]       	# <variable>.next, D.30246
> > 
> > Signed-off-by: Vineet Gupta <vgupta@...opsys.com>
> > Reported-by: Christian Ruppert <christian.ruppert@...lis.com>
> > Cc: Thomas Gleixner <tglx@...utronix.de>
> > Cc: Christian Ruppert <christian.ruppert@...lis.com>
> > Cc: Pierrick Hascoet <pierrick.hascoet@...lis.com>
> > Cc: linux-kernel@...r.kernel.org
> > ---
> >  include/linux/timer.h |   11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/timer.h b/include/linux/timer.h
> > index 8c5a197..1537104 100644
> > --- a/include/linux/timer.h
> > +++ b/include/linux/timer.h
> > @@ -168,7 +168,16 @@ static inline void init_timer_on_stack_key(struct timer_list *timer,
> >   */
> >  static inline int timer_pending(const struct timer_list * timer)
> >  {
> > -	return timer->entry.next != NULL;
> > +	int pending = timer->entry.next != NULL;
> > +
> > +	/*
> > +	 * The check above enables timer fast path - early exit.
> > +	 * However most of the call sites are not protected by timer->base
> > +	 * spinlock. If the caller (say mod_timer) races with itself, it
> > +	 * can use the stale "next" pointer. See commit log for details.
> > +	 */
> > +	barrier();
> > +	return pending;
> >  }
> >  
> >  extern void add_timer_on(struct timer_list *timer, int cpu);
> > -- 
> > 1.7.10.4
> > 
> > 

-- 
  Christian Ruppert              ,          <christian.ruppert@...lis.com>
                                /|
  Tel: +41/(0)22 816 19-42     //|                 3, Chemin du Pré-Fleuri
                             _// | bilis Systems   CH-1228 Plan-les-Ouates
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ