[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100413181937.GM13327@think>
Date: Tue, 13 Apr 2010 14:19:37 -0400
From: Chris Mason <chris.mason@...cle.com>
To: Nick Piggin <npiggin@...e.de>
Cc: Manfred Spraul <manfred@...orfullife.com>, zach.brown@...cle.com,
jens.axboe@...cle.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] ipc semaphores: reduce ipc_lock contention in
semtimedop
On Wed, Apr 14, 2010 at 04:09:45AM +1000, Nick Piggin wrote:
> On Tue, Apr 13, 2010 at 01:39:41PM -0400, Chris Mason wrote:
> > On Tue, Apr 13, 2010 at 07:15:30PM +0200, Manfred Spraul wrote:
> > > Hi Chris,
> > >
> > >
> > > On 04/12/2010 08:49 PM, Chris Mason wrote:
> > > > /*
> > > >+ * when a semaphore is modified, we want to retry the series of operations
> > > >+ * for anyone that was blocking on that semaphore. This breaks down into
> > > >+ * a few different common operations:
> > > >+ *
> > > >+ * 1) One modification releases one or more waiters for zero.
> > > >+ * 2) Many waiters are trying to get a single lock, only one will get it.
> > > >+ * 3) Many modifications to the count will succeed.
> > > >+ *
> > > Have you thought about odd corner cases:
> > > Nick noticed the last time that it is possible to wait for arbitrary values:
> > > in one semop:
> > > - decrease semaphore 5 by 10
> > > - wait until semaphore 5 is 0
> > > - increase semaphore 5 by 10.
> >
> > Do you mean within a single sop array doing all three of these? I don't
> > know if the sort is going to leave the three operations on semaphore 5
> > in the same order (it probably won't).
> >
> > But I could change that by having it include the slot in the original
> > sop array in the sorting. That way if we have duplicate semnums in the
> > array, they will end up in the same position relative to each other in
> > the sorted result.
> >
> > (ewwww ;)
>
> I had a bit of a hack at doing per-semaphore stuff when I was looking
> at the first optimization, but it was tricky to make it work.
>
> The other thing I don't know if your patch gets right is requeueing on
> of the operations. When you requeue from one list to another, then you
> seem to lose ordering with other pending operations, so that would
> seem to break the API as well (can't remember if the API strictly
> mandates FIFO, but anyway it can open up starvation cases).
I don't see anything in the docs about the FIFO order. I could add an
extra sort on sequence number pretty easily, but is the starvation case
really that bad?
>
> I was looking at doing a sequence number to be able to sort these, but
> it ended up getting over complex (and SAP was only using simple ops so
> it didn't seem to need much better).
>
> We want to be careful not to change semantics at all. And it gets
> tricky quickly :( What about Zach's simpler wakeup API?
Yeah, that's why my patches include code to handle userland sending
duplicate semids. Zach's simpler API is cooking too, but if I can get
this done without insane complexity it helps with more than just the
post/wait oracle workload.
-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists