lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070827145627.GB18504@Krystal>
Date:	Mon, 27 Aug 2007 10:56:27 -0400
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	Pekka Enberg <penberg@...helsinki.fi>
Cc:	Christoph Lameter <clameter@....com>, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org, mingo@...hat.com
Subject: Re: [PATCH] SLUB: use have_arch_cmpxchg()

* Pekka Enberg (penberg@...helsinki.fi) wrote:
> Hi Mathieu,
> 
> On 8/22/07, Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca> wrote:
> > Cons:
> > - Does not help code readability, i.e.:
> >
> > if (have_arch_cmpxchg())
> >         preempt_disable();
> > else
> >         local_irq_save(flags);
> 
> Heh, that's an understatement, as now slub is starting to look a bit
> like slab... ;-)  We need to hide that if-else maze into helper
> functions for sure.
> 

Hrm, actually, I don't think such have_arch_cmpxchg() macro will be
required at all because of the small performance hit disabling
preemption will have on the slow and fast paths. Let's compare, for each
of the slow path and fast path, what locking looks like on architecture
with and without local cmpxchg:

What we actually do here is:

fast path:
with local_cmpxchg:
  preempt_disable()
  preempt_enable()
without local_cmpxchg:
  preempt_disable()
  local_irq_save
  local_irq_restore
  preempt_enable()
    (we therefore disable preemption _and_ disable interrupts for
    nothing)

slow path:
both with and without local_cmpxchg():
  preempt_disable()
  preempt_enable()
  local_irq_save()
  local_irq_restore()


Therefore, we would add preempt disable/enable to the fast path of
architectures where local_cmpxchg is emulated with irqs off. But since
preempt disable/enable is just a check counter increment/decrement with
barrier() and thread flag check, I doubt it would hurt performances
enough to justify the added complexity of disabling interrupts for the
whole fast path in this case.

Mathieu


> On 8/22/07, Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca> wrote:
> > --- slab.orig/mm/slub.c 2007-08-22 10:28:22.000000000 -0400
> > +++ slab/mm/slub.c      2007-08-22 10:51:53.000000000 -0400
> > @@ -1450,7 +1450,8 @@ static void *__slab_alloc(struct kmem_ca
> >         void **freelist = NULL;
> >         unsigned long flags;
> >
> > -       local_irq_save(flags);
> > +       if (have_arch_cmpxchg())
> > +               local_irq_save(flags);
> 
> I haven't been following on the cmpxchg_local() discussion too much so
> the obvious question is: why do we do local_irq_save() for the _has_
> cmpxchg() case here...
> 
> On 8/22/07, Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca> wrote:
> > @@ -1553,8 +1556,12 @@ static void __always_inline *slab_alloc(
> >  {
> >         void **object;
> >         struct kmem_cache_cpu *c;
> > +       unsigned long flags;
> >
> > -       preempt_disable();
> > +       if (have_arch_cmpxchg())
> > +               preempt_disable();
> > +       else
> > +               local_irq_save(flags);
> 
> ...and preempt_disable() here?
> 
> On 8/22/07, Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca> wrote:
> > +       if (have_arch_cmpxchg()) {
> > +               if (unlikely(cmpxchg_local(&c->freelist, object,
> > +                               object[c->offset]) != object))
> > +                       goto redo;
> > +               preempt_enable();
> > +       } else {
> > +               c->freelist = object[c->offset];
> > +               local_irq_restore(flags);
> > +       }
> 
> If you move the preempt_enable/local_irq_restore pair outside of the
> if-else block, you can make a static inline function slob_get_object()
> that does:
> 
>     static inline bool slob_get_object(struct kmem_cache *c, void **object)
>     {
>         if (have_arch_cmpxchg()) {
>             if (unlikely(cmpxchg_local(&c->freelist, object,
>                     object[c->offset]) != object))
>                 return false;
>         } else {
>             c->freelist = object[c->offset];
>         }
>         return true;
>     }
> 
> And let the compiler optimize out the branch for the non-cmpxchg case.
> Same for the reverse case too (i.e. slob_put_object).
> 
>                                 Pekka

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ