Message-ID: <CAH5fLgjP8eozdA3wSari2LHyVUzaOMNTU12JWb2rzGgy9RRpsg@mail.gmail.com>
Date: Thu, 16 May 2024 16:10:40 +0200
From: Alice Ryhl <aliceryhl@...gle.com>
To: Carlos Llamas <cmllamas@...gle.com>
Cc: Christophe JAILLET <christophe.jaillet@...adoo.fr>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Arve Hjønnevåg <arve@...roid.com>,
Todd Kjos <tkjos@...roid.com>, Martijn Coenen <maco@...roid.com>,
Joel Fernandes <joel@...lfernandes.org>, Christian Brauner <brauner@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, linux-kernel@...r.kernel.org, kernel-team@...roid.com,
Tim Murray <timmurray@...gle.com>, John Stultz <jstultz@...gle.com>,
Steven Moreland <smoreland@...gle.com>, Nick Chen <chenjia3@...o.com>
Subject: Re: [PATCH v3] binder: use bitmap for faster descriptor lookup
On Thu, May 16, 2024 at 3:39 PM Carlos Llamas <cmllamas@...gle.com> wrote:
>
> When creating new binder references, the driver assigns a descriptor id
> that is shared with userspace. Regrettably, the driver needs to keep the
> descriptors small enough to accommodate userspace potentially using them
> as Vector indexes. Currently, the driver performs a linear search on the
> rb-tree of references to find the smallest available descriptor id. This
> approach, however, scales poorly as the number of references grows.
>
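
For readers without the driver in front of them, the legacy path being
replaced is essentially a linear walk over the refs rb-tree in
descriptor order, taking the first gap. Roughly this fragment
(paraphrased from binder.c, not verbatim):

	/* O(n): walk refs in ascending desc order until a gap appears. */
	new_ref->data.desc = (node == context->binder_context_mgr_node) ? 0 : 1;
	for (n = rb_first(&proc->refs_by_desc); n != NULL; n = rb_next(n)) {
		ref = rb_entry(n, struct binder_ref, rb_node_desc);
		if (ref->data.desc > new_ref->data.desc)
			break;
		new_ref->data.desc = ref->data.desc + 1;
	}

With 100,000 references that walk dominates, hence the numbers in the
benchmark below.
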
> This patch introduces the use of bitmaps to boost the performance of
> descriptor assignments. This optimization results in notable performance
> gains, particularly in processes with a large number of references. The
> following benchmark with 100,000 references showcases the difference in
> latency between the dbitmap implementation and the legacy approach:
>
> [ 587.145098] get_ref_desc_olocked: 15us (dbitmap on)
> [ 602.788623] get_ref_desc_olocked: 47343us (dbitmap off)
>
> Note the bitmap size is dynamically adjusted in line with the number of
> references, ensuring efficient memory usage. In cases where growing the
> bitmap is not possible, the driver falls back to the slow legacy method.
>
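
For anyone new to the scheme: the fast path boils down to a single
bitmap scan. A minimal sketch using the stock kernel bitmap API rather
than the patch's dbitmap helpers:

	unsigned long bit;

	/* The smallest free descriptor is the first zero bit. */
	bit = find_first_zero_bit(dmap->map, dmap->nbits);
	if (bit == dmap->nbits) {
		/* Bitmap full: grow it, or fall back to the legacy walk. */
		return -ENOSPC;
	}
	set_bit(bit, dmap->map);	/* claim it; `bit` is the new desc */
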
> A previous attempt to solve this issue was proposed in [1]. However,
> that method involved adding new ioctls, which isn't great, and older
> userspace code would not have benefited from the optimizations either.
>
> Link: https://lore.kernel.org/all/20240417191418.1341988-1-cmllamas@google.com/ [1]
> Cc: Tim Murray <timmurray@...gle.com>
> Cc: Arve Hjønnevåg <arve@...roid.com>
> Cc: Alice Ryhl <aliceryhl@...gle.com>
> Cc: Martijn Coenen <maco@...roid.com>
> Cc: Todd Kjos <tkjos@...roid.com>
> Cc: John Stultz <jstultz@...gle.com>
> Cc: Steven Moreland <smoreland@...gle.com>
> Suggested-by: Nick Chen <chenjia3@...o.com>
> Signed-off-by: Carlos Llamas <cmllamas@...gle.com>
LGTM. One nit below, but it's not a correctness issue.
Reviewed-by: Alice Ryhl <aliceryhl@...gle.com>
> +static inline unsigned int dbitmap_shrink_nbits(struct dbitmap *dmap)
> +{
> +	unsigned int bit;
> +
> +	if (dmap->nbits <= NBITS_MIN)
> +		return 0;
> +
> +	bit = find_last_bit(dmap->map, dmap->nbits);
> +	if (unlikely(bit == dmap->nbits))
> +		return NBITS_MIN;
> +
> +	if (unlikely(bit <= (dmap->nbits >> 2)))
> +		return dmap->nbits >> 1;
I think this is intended to say that we only shrink when all set bits
fall within the lower fourth of the bitmap, but for the condition to
actually mean that, you need `bit < (dmap->nbits >> 2)` here instead
of `<=`.
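
A concrete case, just to illustrate (my numbers, not from the patch):
with nbits == 16, the lower fourth is bits 0..3, i.e. bit < (16 >> 2).
A last set bit at index 4 still passes the `<=` check, so the bitmap
shrinks one step "early". Either way it stays safe: 4 < (16 >> 1) == 8,
so no set bit falls outside the halved map.
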
Alice