Date:	Thu, 3 Jan 2013 04:31:00 -0800
From:	Michel Lespinasse <walken@...gle.com>
To:	Rik van Riel <riel@...hat.com>
Cc:	linux-kernel@...r.kernel.org, aquini@...hat.com,
	eric.dumazet@...il.com, lwoodman@...hat.com, jeremy@...p.org,
	Jan Beulich <JBeulich@...ell.com>,
	Thomas Gleixner <tglx@...utronix.de>, knoel@...hat.com
Subject: Re: [RFC PATCH 3/5] x86,smp: auto tune spinlock backoff delay factor

On Wed, Jan 2, 2013 at 9:23 PM, Rik van Riel <riel@...hat.com> wrote:
> Proportional spinlock delay with a high delay factor works well
> when there is lots of contention on a lock. Likewise, a smaller
> delay factor works well when a lock is lightly contended.
>
> Making the code auto-tune the delay factor results in a system
> that performs well with both light and heavy lock contention.

I don't quite like that part of the explanation. Ideally I would like
it to explain that there are huge benefits in having the delay be at
least equivalent to the time taken by a no-load spin/release section;
that beyond that, there are small additional benefits to using a
longer delay when the spinlock hold times are larger; and why linear
increase / exponential decrease works nicely here.

I'll see if I can make a more concrete proposal and still keep it
short enough :)

> +#define MIN_SPINLOCK_DELAY 1
> +#define MAX_SPINLOCK_DELAY 16000
> +DEFINE_PER_CPU(int, spinlock_delay) = { MIN_SPINLOCK_DELAY };

unsigned would seem more natural here, though it's only a tiny detail.

>  void ticket_spin_lock_wait(arch_spinlock_t *lock, struct __raw_tickets inc)
>  {
>         __ticket_t head = inc.head, ticket = inc.tail;
>         __ticket_t waiters_ahead;
> +       int delay = __this_cpu_read(spinlock_delay);

I like that you used __this_cpu_read() - your v1 version wasn't as nice.
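
For reference, a minimal sketch of the access pattern I have in mind
(assuming, as in your patch, that the tuned value is written back once
on the exit path rather than on every loop iteration):

	int delay = __this_cpu_read(spinlock_delay);	/* cheap per-cpu read */

	/* ... spin, tuning 'delay' as we go ... */

	__this_cpu_write(spinlock_delay, delay);	/* publish the tuned value */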

> +
> +               /*
> +                * The lock is still busy; slowly increase the delay. If we
> +                * end up sleeping too long, the code below will reduce the
> +                * delay. Ideally we acquire the lock in the tight loop above.
> +                */
> +               if (!(head % 7) && delay < MAX_SPINLOCK_DELAY)
> +                       delay++;
> +
> +               loops = delay * waiters_ahead;

I don't like the head % 7 thing. I think using fixed point arithmetic
would be nicer:

if (delay < MAX_SPINLOCK_DELAY)
	delay += 256/7; /* Or whatever constant we choose */

loops = (delay * waiters_ahead) >> 8;

Also, we should probably skip the delay increment on the first loop
iteration - after all, we haven't waited yet, so we can't say that the
delay was too short.
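
Putting the two together, roughly (a sketch only: DELAY_SHIFT,
DELAY_STEP and the 'first' flag are made-up names for illustration,
and MAX_SPINLOCK_DELAY would need to be rescaled to the same
fixed-point units):

	/*
	 * Sketch: keep 'delay' in 24.8 fixed point, and skip the
	 * increment on the first iteration, since at that point we
	 * have not actually waited yet.
	 */
	#define DELAY_SHIFT	8
	#define DELAY_STEP	(256/7)	/* or whatever constant we choose */

	bool first = true;

	for (;;) {
		/* ... tight spin loop; break out once we own the lock ... */

		if (!first && delay < MAX_SPINLOCK_DELAY)
			delay += DELAY_STEP;
		first = false;

		loops = (delay * waiters_ahead) >> DELAY_SHIFT;
		/* ... spin 'loops' times before rechecking the ticket ... */
	}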

> -               if (head == ticket)
> +               if (head == ticket) {
> +                       /*
> +                        * We overslept and have no idea how long the lock
> +                        * went idle. Reduce the delay as a precaution.
> +                        */
> +                       delay -= delay/32 + 1;

There is a possibility of integer underflow here. It seems that the
window to hit it would be quite small, but it can happen (at the very
least, imagine getting an interrupt at an inopportune time which would
make it look like you overslept). I think this may be what was causing
the stability issues I noticed.
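
A simple way to close that window (just a sketch; clamping to
MIN_SPINLOCK_DELAY is one option among several):

	/*
	 * Sketch: never let 'delay' drop below MIN_SPINLOCK_DELAY, even
	 * if an ill-timed interrupt makes it look like we overslept.
	 */
	delay -= delay/32 + 1;
	if (delay < MIN_SPINLOCK_DELAY)
		delay = MIN_SPINLOCK_DELAY;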

I'll try to come up with a custom version of this patch. I do feel
that we're going in a good direction in general, though.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
