Date:	Sun, 16 May 2010 18:40:57 -0400
From:	Chris Mason <chris.mason@...cle.com>
To:	Manfred Spraul <manfred@...orfullife.com>
Cc:	Nick Piggin <npiggin@...e.de>, zach.brown@...cle.com,
	jens.axboe@...cle.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] ipc semaphores: reduce ipc_lock contention in
 semtimedop

On Sun, May 16, 2010 at 06:57:38PM +0200, Manfred Spraul wrote:
> On 04/13/2010 08:57 PM, Nick Piggin wrote:
> >On Tue, Apr 13, 2010 at 02:19:37PM -0400, Chris Mason wrote:
> >>I don't see anything in the docs about the FIFO order.  I could add an
> >>extra sort on sequence number pretty easily, but is the starvation case
> >>really that bad?
> >Yes, because it's not just a theoretical livelock, it can be basically
> >a certainty, given the right pattern of semops.
> >
> >You could have two mostly-independent groups of processes, each taking
> >and releasing a different sem, which are always contended (eg. if it is
> >being used for a producer-consumer type situation, or even just mutual
> >exclusion with high contention).
> >
> >Then you could have some overall management process for example which
> >tries to take both sems. It will never get it.
> >
> The management process won't get the sem on Linux either:
> Linux implements FIFO, but there is no protection at all against starvation.
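
To make the starvation pattern in the quoted text concrete, here is a
minimal sketch in Python. It is a toy model, not the kernel's code:
ToySemArray, simulate and the process names are hypothetical, and the
model assumes all-or-nothing operations that are retried in FIFO order
from a single pending list whenever the array changes. It also assumes
the worst-case interleaving, in which the two groups never release their
semaphores at the same instant.

```python
class ToySemArray:
    """Toy SysV-style semaphore array with a single FIFO pending list."""

    def __init__(self, initial):
        self.values = list(initial)
        self.pending = []  # blocked (name, ops) entries in arrival order

    def _try(self, ops):
        # All-or-nothing: apply ops only if no semaphore would go negative.
        new = list(self.values)
        for idx, delta in ops:
            new[idx] += delta
            if new[idx] < 0:
                return False
        self.values = new
        return True

    def semop(self, name, ops):
        """Apply ops or queue them; return the names whose ops completed."""
        if not self._try(ops):
            self.pending.append((name, ops))
            return []
        done = [name]
        rescan = True
        while rescan:  # retry sleepers in FIFO order after every change
            rescan = False
            for i, (waiting, waiting_ops) in enumerate(self.pending):
                if self._try(waiting_ops):
                    del self.pending[i]
                    done.append(waiting)
                    rescan = True
                    break
        return done


def simulate(rounds):
    """Return (handoffs inside the groups, whether the manager ever ran)."""
    arr = ToySemArray([1, 1])
    holder = {0: "A1", 1: "B1"}
    waiter = {0: "A2", 1: "B2"}
    arr.semop("A1", [(0, -1)])                # group A holds sem 0
    arr.semop("B1", [(1, -1)])                # group B holds sem 1
    arr.semop("manager", [(0, -1), (1, -1)])  # needs both, queued first
    arr.semop("A2", [(0, -1)])                # queued behind the manager
    arr.semop("B2", [(1, -1)])
    handoffs = 0
    manager_ran = False
    for _ in range(rounds):
        for sem in (0, 1):
            done = arr.semop(holder[sem], [(sem, +1)])  # holder releases
            manager_ran = manager_ran or "manager" in done
            handoffs += waiter[sem] in done
            arr.semop(holder[sem], [(sem, -1)])  # old holder queues again
            holder[sem], waiter[sem] = waiter[sem], holder[sem]
    return handoffs, manager_ran
```

In this model simulate(1000) returns (2000, False): the manager sits at
the head of the FIFO for the whole run, but each time it is retried the
other semaphore is still held, so the waiter behind it takes the freed
semaphore instead. Real timing is not this regular, so treat it as an
illustration of the pattern rather than a measurement.
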
> 
> If I understand the benchmark numbers correctly, a 4-core, 2 GHz
> Phenom is able to do ~ 2 million semaphore operations per second in
> one semaphore array.
> That's the limit - cache line thrashing on the sma structure prevents
> higher numbers.
> 
> For a NUMA system, the limit is probably lower.
> 
> Chris:
> Do you have an estimate how many semop() your app will perform in one array?

There are two different workloads at play.  The first just uses
semaphores as a lock, a traditional mutex-type operation (one holder at
a time).  This isn't a problem with the current code, aside from the
lock contention created by the second case.

The second case is batched wakeup.  One process will wake hundreds of
others (or more) at once, each of which is waiting on its own semaphore.
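
As a rough illustration of the batched wakeup case, here is a small
Python sketch. It is only a model under stated assumptions, not our
application and not the kernel code: batched_wakeup and the waiter names
are hypothetical, every waiter is assumed to be blocked on a decrement of
its own semaphore, and the waker is assumed to post all of them in one
batched operation, after which a single shared pending list is walked in
FIFO order.

```python
def batched_wakeup(nwaiters):
    """Model one waker releasing nwaiters sleepers, one semaphore each."""
    values = [0] * nwaiters

    # Each sleeper is blocked on "decrement my own semaphore by 1".
    pending = [(f"waiter{i}", [(i, -1)]) for i in range(nwaiters)]

    # The waker's single batched operation: +1 on every semaphore at once.
    for i in range(nwaiters):
        values[i] += 1

    woken = []
    still_blocked = []
    for name, ops in pending:  # one FIFO pass over the shared list
        if all(values[idx] + delta >= 0 for idx, delta in ops):
            for idx, delta in ops:
                values[idx] += delta
            woken.append(name)
        else:
            still_blocked.append((name, ops))
    return woken, values, still_blocked
```

Calling batched_wakeup(300) wakes all 300 modelled sleepers from one
operation by the waker, and every one of them was sitting on the same
shared list even though no two of them wait on the same semaphore.
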

> 
> Perhaps we should really remove the per-array list,
> sma->sem_perm.lock and sma->sem_otime.

So far, for our uses, the per-array list has been the biggest trouble.
I tried to benchmark your patches on Friday, but these are preproduction
systems and this one appears to have woken up in a bad mood.

The hardware guys are giving it some TLC and I'll be able to run on
Monday.

-chris
