Message-ID: <d5f51c2e-daef-467d-9430-8c2d48819a56@wanadoo.fr>
Date: Tue, 14 May 2024 22:35:56 +0200
From: Christophe JAILLET <christophe.jaillet@...adoo.fr>
To: Carlos Llamas <cmllamas@...gle.com>,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 Arve Hjønnevåg <arve@...roid.com>,
 Todd Kjos <tkjos@...roid.com>, Martijn Coenen <maco@...roid.com>,
 Joel Fernandes <joel@...lfernandes.org>,
 Christian Brauner <brauner@...nel.org>,
 Suren Baghdasaryan <surenb@...gle.com>
Cc: linux-kernel@...r.kernel.org, kernel-team@...roid.com,
 Tim Murray <timmurray@...gle.com>, Alice Ryhl <aliceryhl@...gle.com>,
 John Stultz <jstultz@...gle.com>, Steven Moreland <smoreland@...gle.com>,
 Nick Chen <chenjia3@...o.com>
Subject: Re: [PATCH v2] binder: use bitmap for faster descriptor lookup

On 14/05/2024 at 18:09, Carlos Llamas wrote:
> When creating new binder references, the driver assigns a descriptor id
> that is shared with userspace. Regrettably, the driver needs to keep the
> descriptors small enough to accommodate userspace potentially using them
> as Vector indexes. Currently, the driver performs a linear search on the
> rb-tree of references to find the smallest available descriptor id. This
> approach, however, scales poorly as the number of references grows.
> 
> This patch introduces the usage of bitmaps to boost the performance of
> descriptor assignments. This optimization results in notable performance
> gains, particularly in processes with a large number of references. The
> following benchmark with 100,000 references showcases the difference in
> latency between the dbitmap implementation and the legacy approach:
> 
>    [  587.145098] get_ref_desc_olocked: 15us (dbitmap on)
>    [  602.788623] get_ref_desc_olocked: 47343us (dbitmap off)
> 
> Note the bitmap size is dynamically adjusted in line with the number of
> references, ensuring efficient memory usage. In cases where growing the
> bitmap is not possible, the driver falls back to the slow legacy method.
> 
> A previous attempt to solve this issue was proposed in [1]. However,
> such method involved adding new ioctls which isn't great, plus older
> userspace code would not have benefited from the optimizations either.
> 
> Link: https://lore.kernel.org/all/20240417191418.1341988-1-cmllamas@google.com/ [1]

..

> +static int get_ref_desc_olocked(struct binder_proc *proc,
> +				struct binder_node *node,
> +				u32 *desc)
> +{
> +	struct dbitmap *dmap = &proc->dmap;
> +	unsigned long *new, bit;
> +	unsigned int nbits;
> +
> +	/* 0 is reserved for the context manager */
> +	if (node == proc->context->binder_context_mgr_node) {
> +		*desc = 0;
> +		return 0;
> +	}
> +
> +	if (unlikely(!dbitmap_enabled(dmap))) {
> +		*desc = slow_desc_lookup_olocked(proc);
> +		return 0;
> +	}
> +
> +	if (dbitmap_get_first_zero_bit(dmap, &bit) == 0) {
> +		*desc = bit;
> +		return 0;
> +	}
> +
> +	/*
> +	 * The descriptors bitmap is full and needs to be expanded.
> +	 * The proc->outer_lock is briefly released to allocate the
> +	 * new bitmap safely.
> +	 */
> +	nbits = dbitmap_expand_nbits(dmap);
> +	binder_proc_unlock(proc);
> +	new = bitmap_zalloc(nbits, GFP_KERNEL | __GFP_ZERO);

Hi,

Nitpick: no need for __GFP_ZERO here; bitmap_zalloc() already returns zeroed memory.
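i.e., something like (untested):

```diff
-	new = bitmap_zalloc(nbits, GFP_KERNEL | __GFP_ZERO);
+	new = bitmap_zalloc(nbits, GFP_KERNEL);
```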

CJ

> +	binder_proc_lock(proc);
> +	dbitmap_expand(dmap, new, nbits);
> +
> +	return -EAGAIN;
> +}

..

> +#define NBITS_MIN	BITS_PER_TYPE(unsigned long)
> +
> +struct dbitmap {
> +	unsigned int nbits;

Should it be long (or unsigned long) to better match the bitmap API?

(not sure if it can overflow in this use case, but at least for consistency)
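Just a sketch of what I mean (untested; the helpers below would need the same type change):

```diff
 struct dbitmap {
-	unsigned int nbits;
+	unsigned long nbits;
 	unsigned long *map;
 };
```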

> +	unsigned long *map;
> +};

..

> +static inline unsigned int dbitmap_shrink_nbits(struct dbitmap *dmap)
> +{
> +	unsigned int bit;
> +
> +	if (dmap->nbits <= NBITS_MIN)
> +		return 0;
> +
> +	bit = find_last_bit(dmap->map, dmap->nbits);

find_last_bit() returns an unsigned long.

(dbitmap_get_first_zero_bit() below already uses unsigned long)
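So, untested:

```diff
-	unsigned int bit;
+	unsigned long bit;
```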

CJ

> +	if (unlikely(bit == dmap->nbits))
> +		return NBITS_MIN;
> +
> +	if (unlikely(bit <= (dmap->nbits >> 2)))
> +		return dmap->nbits >> 1;
> +
> +	return 0;
> +}

..

> +static inline int
> +dbitmap_get_first_zero_bit(struct dbitmap *dmap, unsigned long *bit)
> +{
> +	unsigned long n;
> +
> +	n = find_first_zero_bit(dmap->map, dmap->nbits);
> +	if (unlikely(n == dmap->nbits))
> +		return -ENOSPC;
> +
> +	*bit = n;
> +	set_bit(n, dmap->map);
> +
> +	return 0;
> +}

..

