Date:   Tue, 13 Sep 2016 01:33:13 -0700
From:   Davidlohr Bueso <dave@...olabs.net>
To:     Manfred Spraul <manfred@...orfullife.com>
Cc:     akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
        Davidlohr Bueso <dbueso@...e.de>
Subject: Re: [PATCH 3/5] ipc/sem: optimize perform_atomic_semop()

On Mon, 12 Sep 2016, Manfred Spraul wrote:

>>This patch proposes still iterating the set twice, but the first
>>scan is read-only, and we perform the actual updates afterward,
>>once we know that the call will succeed. In order to not suffer
>>from the overhead of dealing with sops that act on the same sem_num,
>>such (rare) cases use perform_atomic_semop_slow(), which is exactly
>>what we have now. Duplicate detection is done before grabbing
>>sem_lock and uses a simple 64-bit variable to set the sem_num-th bit.
>>Of course, this means that semop calls with a sem_num larger than
>>64 (SEMOPM_FAST for now, even though that constant is really about
>>nsops) will take the _slow() alternative; but many real-world
>>workloads only touch a handful of semaphores in a given set, so this
>>is good enough for the common case.

>Can you create a 2nd definition, instead of reusing SEMOPM_FAST?
>SEMOPM_FAST is about nsops, to limit stack usage.
>Now you introduce a limit regarding sem_num.

Sure, I didn't really like using SEMOPM_FAST anyway (hence the 'for
now'); it was just handy at the time. I can do something like:

#define SEMNUM_FAST_MAX 64
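
In the quoted hunk that would just mean keying the bound check off the new
constant instead of SEMOPM_FAST. As a rough userspace sketch of that scan (the
helper name and the 1ULL cast are my additions here, not the actual patch):

#include <stdbool.h>
#include <sys/sem.h>		/* struct sembuf */

#define SEMNUM_FAST_MAX	64	/* sem_nums trackable in a 64-bit dup bitmap */

/*
 * Read-only scan of the user's sop array: returns true if two operations
 * act on the same sem_num, or if a sem_num is too large for the bitmap to
 * track, in which case we conservatively fall back to the _slow() variant.
 */
static bool sops_need_slow_path(const struct sembuf *sops, unsigned int nsops)
{
	unsigned long long dup = 0;
	unsigned int i;

	for (i = 0; i < nsops; i++) {
		unsigned short num = sops[i].sem_num;

		if (num >= SEMNUM_FAST_MAX)
			return true;		/* can't track this sem_num */
		if (dup & (1ULL << num))
			return true;		/* duplicate detected */
		dup |= 1ULL << num;
	}
	return false;
}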

>>+static int perform_atomic_semop(struct sem_array *sma, struct sem_queue *q)
>>+{
>Do we really have to copy the whole function? Would it be possible to 
>leave it as one function, with tests inside?

I think that having two perform_atomic_semop() variants actually keeps things
simpler, as for the common case we need not worry about the undo stuff. That said,
the tests are the same for both, so let me see how I can factor them out, maybe
using callbacks...
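
For example, the blocking/overflow tests could live in one read-only helper
that both variants call before modifying anything. A rough userspace sketch of
just those tests (names are mine, it deliberately ignores the SEM_UNDO/semadj
limits and pid bookkeeping, and it is only valid when no two ops share a
sem_num, which is exactly what the dup check guarantees for the fast path):

#include <errno.h>
#include <sys/sem.h>		/* struct sembuf */

#ifndef SEMVMX
#define SEMVMX	32767		/* maximum semaphore value */
#endif

/*
 * Would applying the whole sop[] array to the current values succeed?
 * Returns 0 if yes, -EAGAIN if some op would block, -ERANGE on overflow.
 * Never accumulates intermediate results, so it is only correct when all
 * sem_nums in the array are distinct.
 */
static int semops_would_succeed(const int *semval, const struct sembuf *sops,
				unsigned int nsops)
{
	unsigned int i;

	for (i = 0; i < nsops; i++) {
		int cur = semval[sops[i].sem_num];
		int result = cur + sops[i].sem_op;

		if (sops[i].sem_op == 0 && cur != 0)
			return -EAGAIN;		/* wait-for-zero would block */
		if (result < 0)
			return -EAGAIN;		/* would drive the count negative */
		if (result > SEMVMX)
			return -ERANGE;		/* semaphore value overflow */
	}
	return 0;
}

The slow variant still has to apply and roll back in place, since later ops on
a repeated sem_num depend on the results of earlier ones.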

>
>>@@ -1751,12 +1820,17 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsops,
>>  		if (sop->sem_num >= max)
>>  			max = sop->sem_num;
>>  		if (sop->sem_flg & SEM_UNDO)
>>-			undos = 1;
>>+			undos = true;
>>  		if (sop->sem_op != 0)
>>-			alter = 1;
>>+			alter = true;
>>+		if (sop->sem_num < SEMOPM_FAST && !dupsop) {
>>+			if (dup & (1 << sop->sem_num))
>>+				dupsop = 1;
>>+			else
>>+				dup |= 1 << sop->sem_num;
>>+		}
>>  	}
>At least for nsops=2, sops[0].sem_num != sops[1].sem_num can detect
>absence of duplicated ops regardless of the array size.
>Should we support that?

There are various individual cases like that (i.e. obviously nsops == 1, alter == 0, etc.)
where the dup detection would be unnecessary, but it seems like a stretch to
special-case them all. The above will work for the common case (assuming low sem_nums,
of course), so I'm not particularly worried about being too smart at the dup detection.

Thanks,
Davidlohr
