lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 30 Nov 2016 12:20:29 +0000
From:   Chris Wilson <chris@...is-wilson.co.uk>
To:     Nicolai Hähnle <nhaehnle@...il.com>
Cc:     linux-kernel@...r.kernel.org
Subject: Re: [PATCH 00/11] locking/ww_mutex: Keep sorted wait list to avoid
 stampedes

On Wed, Nov 30, 2016 at 12:52:28PM +0100, Nicolai Hähnle wrote:
> On 30.11.2016 10:40, Chris Wilson wrote:
> >On Mon, Nov 28, 2016 at 01:20:01PM +0100, Nicolai Hähnle wrote:
> >>I've included timings taken from a contention-heavy stress test to some of
> >>the patches. The stress test performs actual GPU operations which take a
> >>good chunk of the wall time, but even so, the series still manages to
> >>improve the wall time quite a bit.
> >
> >In looking at your contention scenarios, what was the average/max list
> >size? Just wondering if it makes sense to use an rbtree + first_waiter
> >instead of a sorted list from the start.
> 
> I haven't measured this with the new series; previously, while I was
> debugging the deadlock on older kernels, I occasionally saw wait
> lists of up to ~20 tasks, spit-balling the average over all the
> deadlock cases I'd say the average was not more than ~5. The average
> _without_ deadlocks should be lower, if anything.

Right, I wasn't expecting the list to be large, certainly no larger than
cores typically. On the borderline of where a more complex tree starts
to pay off.
 
> I saw that your test cases go quite a bit higher, but even the
> rather extreme load I was testing with -- which is not quite a load
> from an actual application, though it is related to one -- has 40
> threads and so a theoretical maximum of 40.

The stress loads were just values plucked out of nowhere to try and have
a reasonable stab at hitting the deadlock. Certainly if we were to wrap
that up in a microbenchmark we would want to have wider coverage (so the
graph against contention is more useful).

Do you have a branch I can pull the patches for (or what did you use as
the base)?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ