Message-ID: <d5f51c2e-daef-467d-9430-8c2d48819a56@wanadoo.fr>
Date: Tue, 14 May 2024 22:35:56 +0200
From: Christophe JAILLET <christophe.jaillet@...adoo.fr>
To: Carlos Llamas <cmllamas@...gle.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Arve Hjønnevåg <arve@...roid.com>,
Todd Kjos <tkjos@...roid.com>, Martijn Coenen <maco@...roid.com>,
Joel Fernandes <joel@...lfernandes.org>,
Christian Brauner <brauner@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>
Cc: linux-kernel@...r.kernel.org, kernel-team@...roid.com,
Tim Murray <timmurray@...gle.com>, Alice Ryhl <aliceryhl@...gle.com>,
John Stultz <jstultz@...gle.com>, Steven Moreland <smoreland@...gle.com>,
Nick Chen <chenjia3@...o.com>
Subject: Re: [PATCH v2] binder: use bitmap for faster descriptor lookup
On 14/05/2024 18:09, Carlos Llamas wrote:
> When creating new binder references, the driver assigns a descriptor id
> that is shared with userspace. Regrettably, the driver needs to keep the
> descriptors small enough to accommodate userspace potentially using them
> as Vector indexes. Currently, the driver performs a linear search on the
> rb-tree of references to find the smallest available descriptor id. This
> approach, however, scales poorly as the number of references grows.
>
> This patch introduces the usage of bitmaps to boost the performance of
> descriptor assignments. This optimization results in notable performance
> gains, particularly in processes with a large number of references. The
> following benchmark with 100,000 references showcases the difference in
> latency between the dbitmap implementation and the legacy approach:
>
> [ 587.145098] get_ref_desc_olocked: 15us (dbitmap on)
> [ 602.788623] get_ref_desc_olocked: 47343us (dbitmap off)
>
> Note the bitmap size is dynamically adjusted in line with the number of
> references, ensuring efficient memory usage. In cases where growing the
> bitmap is not possible, the driver falls back to the slow legacy method.
>
> A previous attempt to solve this issue was proposed in [1]. However,
> that method involved adding new ioctls, which isn't ideal; moreover,
> older userspace code would not have benefited from the optimization.
>
> Link: https://lore.kernel.org/all/20240417191418.1341988-1-cmllamas@google.com/ [1]
..
> +static int get_ref_desc_olocked(struct binder_proc *proc,
> + struct binder_node *node,
> + u32 *desc)
> +{
> + struct dbitmap *dmap = &proc->dmap;
> + unsigned long *new, bit;
> + unsigned int nbits;
> +
> + /* 0 is reserved for the context manager */
> + if (node == proc->context->binder_context_mgr_node) {
> + *desc = 0;
> + return 0;
> + }
> +
> + if (unlikely(!dbitmap_enabled(dmap))) {
> + *desc = slow_desc_lookup_olocked(proc);
> + return 0;
> + }
> +
> + if (dbitmap_get_first_zero_bit(dmap, &bit) == 0) {
> + *desc = bit;
> + return 0;
> + }
> +
> + /*
> + * The descriptors bitmap is full and needs to be expanded.
> + * The proc->outer_lock is briefly released to allocate the
> + * new bitmap safely.
> + */
> + nbits = dbitmap_expand_nbits(dmap);
> + binder_proc_unlock(proc);
> + new = bitmap_zalloc(nbits, GFP_KERNEL | __GFP_ZERO);
Hi,
Nitpick: no need for __GFP_ZERO when using bitmap_zalloc(); the zalloc()
variant already returns zeroed memory.
CJ
> + binder_proc_lock(proc);
> + dbitmap_expand(dmap, new, nbits);
> +
> + return -EAGAIN;
> +}
..
> +#define NBITS_MIN BITS_PER_TYPE(unsigned long)
> +
> +struct dbitmap {
> + unsigned int nbits;
Should it be long (or unsigned long) to better match the bitmap API?
(not sure if it can overflow in this use case, but at least for consistency)
> + unsigned long *map;
> +};
..
> +static inline unsigned int dbitmap_shrink_nbits(struct dbitmap *dmap)
> +{
> + unsigned int bit;
> +
> + if (dmap->nbits <= NBITS_MIN)
> + return 0;
> +
> + bit = find_last_bit(dmap->map, dmap->nbits);
find_last_bit() returns an unsigned long.
(dbitmap_get_first_zero_bit() below already uses unsigned long)
CJ
> + if (unlikely(bit == dmap->nbits))
> + return NBITS_MIN;
> +
> + if (unlikely(bit <= (dmap->nbits >> 2)))
> + return dmap->nbits >> 1;
> +
> + return 0;
> +}
..
> +static inline int
> +dbitmap_get_first_zero_bit(struct dbitmap *dmap, unsigned long *bit)
> +{
> + unsigned long n;
> +
> + n = find_first_zero_bit(dmap->map, dmap->nbits);
> + if (unlikely(n == dmap->nbits))
> + return -ENOSPC;
> +
> + *bit = n;
> + set_bit(n, dmap->map);
> +
> + return 0;
> +}
..