Message-ID: <51A11851.2010303@colorfullife.com>
Date: Sat, 25 May 2013 22:00:17 +0200
From: Manfred Spraul <manfred@...orfullife.com>
To: Davidlohr Bueso <davidlohr.bueso@...com>
CC: Rik van Riel <riel@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>, hhuang@...hat.com,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH v2] ipc/sem.c: fix lockup, restore FIFO behavior
On 05/25/2013 08:32 PM, Davidlohr Bueso wrote:
> Yep, could you please explain what benefits you see in keeping FIFO order?
a) It's user space visible.
b) It's a well-defined behavior that might even make sense for some
applications.
Right now, a two-op semop with "+1, then -2" is prioritized over
a semop with "-1", even if the "-1" was queued first.
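
To make the ordering question concrete, here is a small userspace sketch. It
is a toy model, not the code in ipc/sem.c: try_ops() and first_served() are
hypothetical helpers, and the "complex first" scan order is only an
assumption standing in for the behavior described above. Both operations are
pending on a semaphore whose value has just been raised from 0 to 1, so
either one can complete, but not both.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Toy model: apply all ops to one semaphore value atomically, or not at all.
 * Returns 1 if the whole operation completed, 0 if it would block. */
static int try_ops(int *val, const struct sembuf *ops, size_t nops)
{
	int v = *val;

	for (size_t i = 0; i < nops; i++) {
		if (ops[i].sem_op == 0) {
			if (v != 0)
				return 0;	/* wait-for-zero would block */
		} else {
			v += ops[i].sem_op;
			if (v < 0)
				return 0;	/* decrement would block */
		}
	}
	*val = v;
	return 1;
}

/* Returns 's' if the simple "-1" is served first, 'm' if the multi-op
 * "+1, then -2" is served first. The simple op is assumed to have been
 * queued first. */
static char first_served(int complex_first)
{
	const struct sembuf simple[] = {
		{ .sem_num = 0, .sem_op = -1, .sem_flg = 0 },
	};
	const struct sembuf multi[] = {
		{ .sem_num = 0, .sem_op = +1, .sem_flg = 0 },
		{ .sem_num = 0, .sem_op = -2, .sem_flg = 0 },
	};
	int val = 1;	/* value after another task incremented 0 -> 1 */

	if (complex_first) {
		if (try_ops(&val, multi, 2))
			return 'm';
		if (try_ops(&val, simple, 1))
			return 's';
	} else {
		/* FIFO: scan in arrival order */
		if (try_ops(&val, simple, 1))
			return 's';
		if (try_ops(&val, multi, 2))
			return 'm';
	}
	return '?';
}
```

In this model first_served(0) serves the "-1" that arrived first, while
first_served(1) serves the later two-op semop and leaves the "-1" blocked at
value 0. That difference is what user space can observe.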
And: It doesn't cost much:
- no impact for users that use only single-op operations.
- no impact for users that use only multi-op operations
- for users that use both types: In the worst case some linked list
splicing.
Actually, the code is probably faster because wait-for-zero ops are only
scanned when the semaphore values are 0.
>> Acked-by: Rik van Riel <riel@...hat.com>
>>
>>> - simpler check_restart logic.
>>> - Efficient handling of wait-for-zero semops, both simple and complex.
>>> - Fewer restarts in update_queue(), because pending wait-for-zero do not
>>> force a restart anymore.
>>>
>>> Other changes:
>>> - try_atomic_semop() also performs the semop. Thus rename the function.
>>>
>>> It passes tests with qemu, but not boot-tested due to EFI problems.
> I think this still needs a *lot* of testing - I don't have my Oracle
> workload available right now, but I will definitely see how this patch
> behaves on it. That said, I believe Oracle is already quite happy
> with the sem improvements.
Ah, ok.
So the change is good for one application, and the risk of breaking
other apps is considered negligible.
>
> Furthermore, this patch is way too invasive for considering it for 3.10
> - I like Rik's patch better because it simply addresses the issue and
> nothing more.
I would disagree:
My patch is testable: with it applied, linux-3.10 should behave
exactly like linux-3.9,
apart from the scalability (the new sem_lock from Rik is great).
My problem with Rik's patch is that it is untestable:
It changes the behavior and we must hope that nothing breaks.
Actually, the latest patch makes it a bit worse:
> @@ -720,16 +718,11 @@ static int update_queue(struct sem_array *sma, int semnum, struct list_head *pt)
>
> unlink_queue(sma, q);
>
> - if (error) {
> - restart = 0;
> - } else {
> - semop_completed = 1;
> - restart = check_restart(sma, q);
> - }
> + semop_completed = 1;
> + if (check_restart(sma, q))
> + *restart = 1;
>
> wake_up_sem_queue_prepare(pt, q, error);
> - if (restart)
> - goto again;
If check_restart returns "1", then the current (3.10-rc1) code
restarts immediately ("goto again").
With this change, the rest of the queue is processed completely, and
only afterwards is the queue scanned again.
This means that wait-for-zero now succeeds only if a semaphore value
stays zero.
For 3.9, it was sufficient if the value was temporarily zero.
Before the change, complex wait-for-zero would work; only simple
wait-for-zero would be starved.
Now all wait-for-zero operations are starved.
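
The difference between the two restart strategies can be sketched with a toy
model. This is not the update_queue() code: struct pending, try_op(),
scan_queue() and zero_waiter_completes() are hypothetical names, there is
only one queue and one semaphore, and the model restarts after every
completed operation instead of calling check_restart(). The queue holds a
wait-for-zero, a "-2", and a "wait-for-zero, then +1", and the scan starts
after another task has raised the value to 2.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPS 2

/* A pending operation: up to MAX_OPS steps applied atomically to one
 * semaphore. A step of 0 means wait-for-zero. */
struct pending {
	int steps[MAX_OPS];
	int nsteps;
	int done;	/* set once the operation has completed */
};

/* Returns 1 and updates *val if all steps succeed, 0 if the op would block. */
static int try_op(int *val, const struct pending *p)
{
	int v = *val;

	assert(p->nsteps <= MAX_OPS);
	for (int i = 0; i < p->nsteps; i++) {
		if (p->steps[i] == 0) {
			if (v != 0)
				return 0;
		} else {
			v += p->steps[i];
			if (v < 0)
				return 0;
		}
	}
	*val = v;
	return 1;
}

/* immediate != 0: rescan from the head as soon as one op completes.
 * immediate == 0: finish the pass, then rescan if anything completed.
 * Returns the final semaphore value. */
static int scan_queue(int val, struct pending *q, size_t n, int immediate)
{
	int restart;

	do {
		restart = 0;
		for (size_t i = 0; i < n; i++) {
			if (q[i].done || !try_op(&val, &q[i]))
				continue;
			q[i].done = 1;
			restart = 1;
			if (immediate)
				break;
		}
	} while (restart);
	return val;
}

/* Returns 1 if the plain wait-for-zero at the head of the queue completes. */
static int zero_waiter_completes(int immediate)
{
	struct pending q[] = {
		{ .steps = { 0 },     .nsteps = 1 },	/* wait-for-zero */
		{ .steps = { -2 },    .nsteps = 1 },	/* decrement by 2 */
		{ .steps = { 0, +1 }, .nsteps = 2 },	/* wait-for-zero, then +1 */
	};

	scan_queue(2, q, sizeof(q) / sizeof(q[0]), immediate);
	return q[0].done;
}
```

With the immediate restart, the "-2" brings the value to 0 and the rescan
lets the wait-for-zero at the head complete before the third operation
raises the value again. With the deferred restart, the third operation runs
in the same pass, the value is back at 1 by the time the queue is rescanned,
and the wait-for-zero stays pending.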
I've attached a test case:
./test5.sh
linux-3.9 completes all operations.
With Rik's patch, the wait-for-zero remains running.
--
Manfred
P.S.:
Btw, I found some code that uses a semop with 2 ops:
http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=%2Fapis%2Fapiexusmem.htm
View attachment "change.c" of type "text/plain" (1862 bytes)
View attachment "createary.c" of type "text/plain" (899 bytes)
View attachment "Makefile" of type "text/plain" (261 bytes)
View attachment "removeary.c" of type "text/plain" (900 bytes)
Download attachment "test5.sh" of type "application/x-shellscript" (514 bytes)