Date:   Mon, 12 Sep 2016 19:56:18 +0200
From:   Manfred Spraul <manfred@...orfullife.com>
To:     Davidlohr Bueso <dave@...olabs.net>, akpm@...ux-foundation.org
Cc:     linux-kernel@...r.kernel.org, Davidlohr Bueso <dbueso@...e.de>
Subject: Re: [PATCH 3/5] ipc/sem: optimize perform_atomic_semop()

Hi Davidlohr,

On 09/12/2016 01:53 PM, Davidlohr Bueso wrote:
> This is the main workhorse that deals with semop user calls: it
> determines whether the waitforzero or semval update operations on
> the set can complete or not as the sma currently stands. Currently,
> the set is iterated twice (setting semval, then backwards for
> the sempid value). Slowpaths, and particularly SEM_UNDO calls,
> must undo any altered sem when it is detected that the caller
> must block or has errored-out.
>
> With larger sets, situations can occur where this takes a lot of
> cycles and is clearly a suboptimal use of cached resources in
> shared memory. I.e., we discard the caches of CPUs that are also
> calling semop and have the sembuf cached (and could complete),
> while the current lock holder's semop will block, error out, or
> is a waitforzero operation.
>
> This patch proposes still iterating the set twice, but the first
> scan is read-only, and we perform the actual updates afterward,
> once we know that the call will succeed. In order to not suffer
> from the overhead of dealing with sops that act on the same sem_num,
> such (rare) cases use perform_atomic_semop_slow(), which is exactly
> what we have now. Duplicates are detected before grabbing sem_lock,
> using a simple 64-bit variable to enable the sem_num-th bit.
> Of course, this means that semop calls with a sem_num larger than
> 64 (SEMOPM_FAST, for now, as this is really about the nsops) will
> take the _slow() alternative; but many real-world workloads only
> work on a handful of semaphores in a given set, so this is good
> enough for the common case.
Can you create a 2nd definition, instead of reusing SEMOPM_FAST?
SEMOPM_FAST is about nsops, to limit stack usage.
Now you introduce a limit regarding sem_num.
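
Untested sketch of what I have in mind -- a separate, documented limit
for the dup bitmap instead of overloading SEMOPM_FAST (the name below
is just a placeholder, it is not in the tree):

/* width of the on-stack "dup" bitmap: one bit per sem_num */
#define SEM_DUP_BITMAP_BITS	64

		/* inside the semtimedop() scan over sops */
		if (sop->sem_num < SEM_DUP_BITMAP_BITS && !dupsop) {
			if (dup & (1ULL << sop->sem_num)) /* 1ULL: dup is 64 bit */
				dupsop = true;
			else
				dup |= 1ULL << sop->sem_num;
		}

That also makes it obvious that the 64 is the width of the bitmap word
and has nothing to do with the nsops limit for the on-stack sops array.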


> In addition, add some comments about when we expect the caller
> to block.
>
> Signed-off-by: Davidlohr Bueso <dbueso@...e.de>
> ---
>   ipc/sem.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>   1 file changed, 82 insertions(+), 7 deletions(-)
>
> diff --git a/ipc/sem.c b/ipc/sem.c
> index 86467b5b78ad..d9c743ac17ff 100644
> --- a/ipc/sem.c
> +++ b/ipc/sem.c
> @@ -115,7 +115,8 @@ struct sem_queue {
>   	struct sembuf		*sops;	 /* array of pending operations */
>   	struct sembuf		*blocking; /* the operation that blocked */
>   	int			nsops;	 /* number of operations */
> -	int			alter;	 /* does *sops alter the array? */
> +	bool			alter;   /* does *sops alter the array? */
> +	bool                    dupsop;	 /* sops on more than one sem_num */
>   };
>   
>   /* Each task has a list of undo requests. They are executed automatically
> @@ -595,7 +596,8 @@ SYSCALL_DEFINE3(semget, key_t, key, int, nsems, int, semflg)
>    * Returns 1 if the operation is impossible, the caller must sleep.
>    * Negative values are error codes.
>    */
> -static int perform_atomic_semop(struct sem_array *sma, struct sem_queue *q)
> +static int perform_atomic_semop_slow(struct sem_array *sma,
> +				     struct sem_queue *q)
>   {
>   	int result, sem_op, nsops, pid;
>   	struct sembuf *sop;
> @@ -666,6 +668,72 @@ undo:
>   	return result;
>   }
>   
> +static int perform_atomic_semop(struct sem_array *sma, struct sem_queue *q)
> +{
Do we really have to copy the whole function? Would it be possible to 
leave it as one function, with tests inside?
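
Untested skeleton of the structure I mean, i.e. one body with a dupsop
test inside instead of a second copy (the loop bodies are only hinted
at in comments):

static int perform_atomic_semop(struct sem_array *sma, struct sem_queue *q)
{
	struct sembuf *sop;
	int result = 0;

	if (!q->dupsop) {
		/*
		 * Pass 1, read-only: walk q->sops, compute the would-be
		 * semval for every op and return 1 (sleep) or a negative
		 * errno before anything has been modified.
		 */
	}

	/*
	 * Pass 2 (the only pass for dupsop): apply the updates; only the
	 * dupsop case still needs the existing undo handling on failure.
	 */
	for (sop = q->sops; sop < q->sops + q->nsops; sop++) {
		/* update semval and sempid as the current code does */
	}

	return result;
}

Then the checks are obviously identical for both cases and the diffstat
stays smaller.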

> @@ -1751,12 +1820,17 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsops,
>   		if (sop->sem_num >= max)
>   			max = sop->sem_num;
>   		if (sop->sem_flg & SEM_UNDO)
> -			undos = 1;
> +			undos = true;
>   		if (sop->sem_op != 0)
> -			alter = 1;
> +			alter = true;
> +		if (sop->sem_num < SEMOPM_FAST && !dupsop) {
> +			if (dup & (1 << sop->sem_num))
> +				dupsop = 1;
> +			else
> +				dup |= 1 << sop->sem_num;
> +		}
>   	}
At least for nsops=2, sops[0].sem_num != sops[1].sem_num can detect
the absence of duplicated ops regardless of the array size.
Should we support that?
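
Untested, but roughly what I mean -- this would sit in the same
semtimedop() scan and use the existing nsops/sops/dupsop variables:

	/*
	 * nsops == 2: the ops are duplicates iff both entries name the
	 * same semaphore; no bitmap and no 64-entry limit needed.
	 */
	if (nsops == 2)
		dupsop = (sops[0].sem_num == sops[1].sem_num);

For nsops == 1 dupsop is trivially false, so the bitmap would only be
needed for nsops > 2.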

--
     Manfred
