Message-ID: <20200420170603.GC5820@bombadil.infradead.org>
Date:   Mon, 20 Apr 2020 10:06:03 -0700
From:   Matthew Wilcox <willy@...radead.org>
To:     Manfred Spraul <manfred@...orfullife.com>
Cc:     linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] ipc: Convert ipcs_idr to XArray

On Mon, Apr 20, 2020 at 05:35:20PM +0200, Manfred Spraul wrote:
> > -		max_idx = max(ids->in_use*3/2, ipc_min_cycle);
> > -		max_idx = min(max_idx, ipc_mni);
> > -
> > -		/* allocate the idx, with a NULL struct kern_ipc_perm */
> > -		idx = idr_alloc_cyclic(&ids->ipcs_idr, NULL, 0, max_idx,
> > -					GFP_NOWAIT);
> > -
> > -		if (idx >= 0) {
> > -			/*
> > -			 * idx got allocated successfully.
> > -			 * Now calculate the sequence number and set the
> > -			 * pointer for real.
> > -			 */
> > -			if (idx <= ids->last_idx) {
> > +		min_idx = ids->next_idx;
> > +		new->seq = ids->seq;
> > +
> > +		/* Modified version of __xa_alloc */
> > +		do {
> > +			xas.xa_index = min_idx;
> > +			xas_find_marked(&xas, max_idx, XA_FREE_MARK);
> > +			if (xas.xa_node == XAS_RESTART && min_idx > 0) {
> >   				ids->seq++;
> >   				if (ids->seq >= ipcid_seq_max())
> >   					ids->seq = 0;
> > +				new->seq = ids->seq;
> > +				xas.xa_index = 0;
> > +				min_idx = 0;
> > +				xas_find_marked(&xas, max_idx, XA_FREE_MARK);
> >   			}
> 
> Is it necessary to have that many details of xarray in ipc/util?
> 
> This function is not performance critical.
> 
> The core requirement is that ipc_obtain_object_check() must scale.
> 
> Would it be possible to use something like
> 
>     xa_alloc(,entry=NULL,)
> 
>     new->seq = ...
> 
>     xa_store(,entry=new,);

Yes, that would absolutely be possible, and some users do exactly that.
I thought that creating a new message queue / semaphore set / shared
memory segment would be performance sensitive.
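
Something like this, perhaps (untested sketch; it assumes the array was set
up with xa_init_flags(&ids->ipcs, XA_FLAGS_ALLOC), borrows the names from
the patch, and leaves out the cyclic-index / sequence-wrap handling, so it's
an illustration rather than a drop-in replacement):

#include <linux/xarray.h>
#include "util.h"       /* ipc_ids, kern_ipc_perm, ipc_mni, ipcmni_seq_shift() */

static int ipc_idx_alloc_sketch(struct ipc_ids *ids, struct kern_ipc_perm *new)
{
        u32 idx;
        int err;

        /* Allocate a free index with a NULL entry; lookups still see nothing. */
        err = xa_alloc(&ids->ipcs, &idx, NULL, XA_LIMIT(0, ipc_mni - 1),
                       GFP_KERNEL);
        if (err < 0)
                return err;

        new->seq = ids->seq;
        new->id = (new->seq << ipcmni_seq_shift()) + idx;

        /* The hand-off point; xa_store contains the memory barrier. */
        xa_store(&ids->ipcs, idx, new, GFP_KERNEL);

        return idx;
}

That concentrates all the ordering in the final xa_store(), which is also
where the memory barrier discussion below comes in.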

> > -			ids->last_idx = idx;
> > -
> > -			new->seq = ids->seq;
> > -			/* no need for smp_wmb(), this is done
> > -			 * inside idr_replace, as part of
> > -			 * rcu_assign_pointer
> > -			 */
> 
> Could you leave the memory barrier comments in the code?
> 
> The rcu_assign_pointer() is the first hands-off from semget() or msgget().
> 
> Before the rcu_assign_pointer, e.g. semop() calls would return -EINVAL.
> 
> After the rcu_assign_pointer, semwhatever() must work - and e.g. the
> permission checks are lockless.

How about simply:
			/* xa_store contains a memory barrier */
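
i.e. in the loop above, directly in front of the store (the actual store in
xas_store() is an rcu_assign_pointer(), which is where the barrier comes
from), roughly:

			/* xa_store contains a memory barrier */
			xas_store(&xas, new);
			xas_clear_mark(&xas, XA_FREE_MARK);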

> > -			idr_replace(&ids->ipcs_idr, new, idx);
> > -		}
> > +			if (xas.xa_node == XAS_RESTART)
> > +				xas_set_err(&xas, -ENOSPC);
> > +			else
> > +				new->id = (new->seq << ipcmni_seq_shift()) +
> > +					xas.xa_index;
> 
> Setting new->id should remain at the end, outside any locking:
> 
> The variable has no special protection, access is only allowed after proper
> locking, thus no need to have the initialization in the middle.
> 
> What is crucial is that the final value of new->seq is visible to all cpus
> before storing the pointer.

The IPC locking is weird.  Most users spin_lock the xarray/idr/radix
tree for modifications, and on the read-side use RCU to protect the
lookup and a refcount on the object looked up in it (after which,
RCU is unlocked).  IPC seems to hold the RCU lock much longer, and it
has a per-object spinlock rather than refcount.  And it has an rwsem.
It feels like it could be much simpler, but I haven't had time to dig
into it and figure out why it's so different from all the other users.
Maybe it's just older code.
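
For reference, the usual shape is something like this (purely illustrative;
"struct obj" and the function names are made up, not the IPC code):

#include <linux/refcount.h>
#include <linux/rcupdate.h>
#include <linux/xarray.h>

struct obj {
        refcount_t ref;
        /* ... */
};

static DEFINE_XARRAY(objects);

/* Write side: xa_store() takes the xa_lock spinlock internally. */
static int obj_publish(unsigned long id, struct obj *obj)
{
        return xa_err(xa_store(&objects, id, obj, GFP_KERNEL));
}

/* Read side: RCU covers only the lookup; a refcount keeps the object alive. */
static struct obj *obj_get(unsigned long id)
{
        struct obj *obj;

        rcu_read_lock();
        obj = xa_load(&objects, id);
        if (obj && !refcount_inc_not_zero(&obj->ref))
                obj = NULL;
        rcu_read_unlock();

        return obj;
}

No rwsem, no long RCU critical sections, no per-object spinlock on the
lookup path.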

> > +			xas_store(&xas, new);
> > +			xas_clear_mark(&xas, XA_FREE_MARK);
> > +		} while (__xas_nomem(&xas, GFP_KERNEL));
> > +
> 
> Just out of curiosity:
> 
> If the xas is in an invalid state, then xas_store() will not store anything.
> Thus the loop will not store "new" multiple times; it will be stored only
> once.

Correct, although we're going to delete this loop entirely.

> > @@ -472,7 +487,7 @@ void ipc_rmid(struct ipc_ids *ids, struct kern_ipc_perm *ipcp)
> >   			idx--;
> >   			if (idx == -1)
> >   				break;
> > -		} while (!idr_find(&ids->ipcs_idr, idx));
> > +		} while (!xa_load(&ids->ipcs, idx));
> >   		ids->max_idx = idx;
> >   	}
> >   }
> 
> Is there an xa_find_last() function?
> 
> It is outside of any hot path, I have a patch that does a binary search with
> idr_get_next().
> 
> If there is no xa_find_last(), then I would rebase that patch.

There is not currently an xa_find_last().  It shouldn't be too hard
to add; start at the top of the tree and walk backwards in each node
until finding a non-NULL entry.  Of course, it'll need documentation
and test cases.
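
Untested sketch of the semantics only (a real xa_find_last() would walk the
node slots backwards rather than probing every index, as described above,
and the name is made up):

#include <linux/xarray.h>

/* Return the entry with the highest index <= max, or NULL if there is none. */
static void *xa_find_last_sketch(struct xarray *xa, unsigned long *indexp,
                                 unsigned long max)
{
        unsigned long idx = max;
        void *entry;

        do {
                entry = xa_load(xa, idx);       /* xa_load handles its own RCU */
                if (entry) {
                        *indexp = idx;
                        return entry;
                }
        } while (idx--);

        return NULL;
}

With something like that, ipc_rmid() could set ids->max_idx from the
returned index instead of the backwards xa_load() loop in the hunk above.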
