Message-ID: <aC1PQ7tmcqMSmbHc@google.com>
Date: Wed, 21 May 2025 03:57:55 +0000
From: Carlos Llamas <cmllamas@...gle.com>
To: Yury Norov <yury.norov@...il.com>
Cc: Boqun Feng <boqun.feng@...il.com>, Jann Horn <jannh@...gle.com>,
Burak Emir <bqe@...gle.com>, Kees Cook <kees@...nel.org>,
Rasmus Villemoes <linux@...musvillemoes.dk>,
Viresh Kumar <viresh.kumar@...aro.org>,
Miguel Ojeda <ojeda@...nel.org>,
Alex Gaynor <alex.gaynor@...il.com>, Gary Guo <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>,
Benno Lossin <benno.lossin@...ton.me>,
Andreas Hindborg <a.hindborg@...nel.org>,
Alice Ryhl <aliceryhl@...gle.com>, Trevor Gross <tmgross@...ch.edu>,
"Gustavo A . R . Silva" <gustavoars@...nel.org>,
rust-for-linux@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-hardening@...r.kernel.org
Subject: Re: [PATCH v8 5/5] rust: add dynamic ID pool abstraction for bitmap
On Mon, May 19, 2025 at 08:57:04PM -0400, Yury Norov wrote:
> + Carlos Llamas
>
> On Mon, May 19, 2025 at 04:56:21PM -0700, Boqun Feng wrote:
> > On Tue, May 20, 2025 at 12:51:07AM +0200, Jann Horn wrote:
> > > On Mon, May 19, 2025 at 6:20 PM Burak Emir <bqe@...gle.com> wrote:
> > > > This is a port of the Binder data structure introduced in commit
> > > > 15d9da3f818c ("binder: use bitmap for faster descriptor lookup") to
> > > > Rust.
> > >
> > > Stupid high-level side comment:
> > >
> > > That commit looks like it changed a simple linear rbtree scan (which
> > > is O(n) with slow steps) into a bitmap thing. A more elegant option
> > > might have been to use an augmented rbtree, reducing the O(n) rbtree
> > > scan to an O(log n) rbtree lookup, just like how finding a free area
> >
> > I think RBTree::cursor_lower_bound() [1] does exactly what you said
> >
> > [1]: https://rust.docs.kernel.org/kernel/rbtree/struct.RBTree.html#method.cursor_lower_bound
>
> Alice mentioned before that in many cases the whole pool of IDs will
> fit into a single machine word if represented as bitmap. If that holds,
> bitmaps will win over any other data structure that I can imagine.
>
> For very large ID pools, the algorithmic complexity will take over,
> for sure. On the other hand, the 15d9da3f818ca explicitly mentions
> that it switches implementation to bitmaps for performance reasons.
>
> Anyways, Burak and Alice, before we move forward, can you tell if you
> ran any experiments with data structures allowing logarithmic lookup,
> like rb-tree? Can you maybe measure at which point rb-tree lookup will
> win over find_bit as the size of the pool grows?
>
> Can you describe how the existing dbitmap is used now? What is the
> typical size of ID pools? Which operation is the bottleneck? Looking
> forward, are there any expectations about ID pools size in future?
>
> Carlos, can you please elaborate your motivation to switch to bitmaps?
> Have you considered rb-trees with O(logn) lookup?
Yeah, we tried rb-trees. There was even a patch that implemented the
augmented logic. See this:
https://lore.kernel.org/all/20240917030203.286-1-ebpqwerty472123@gmail.com/
IIRC, it just didn't make sense for our use case because of the extra
memory required for this solution. The performance ended up being
the same (from my local testing).
I'm not certain of this but one potential factor is that the rb nodes
are in-structure members allocated separately. This can lead to more
cache misses when traversing them. I don't know how applicable this
would be for the Rust implementation though. Take that with a grain of
salt as I didn't actually look super close while running the tests.
I would also note, this whole logic wouldn't be required if userspace
wasn't using these descriptor IDs as vector indexes. At some point this
practice will be fixed and we can remove the "dbitmap" implementation.
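
To illustrate why the bitmap is hard to beat when the pool fits in a
single machine word, here is a minimal standalone sketch. It is not the
dbitmap code nor the Rust abstraction from this series; the WordIdPool
type and its method names are made up for the example, and it assumes
the allocator always hands out the lowest free ID so that IDs stay
dense enough to be used as indexes.

```rust
/// Hypothetical single-word ID pool (illustration only, not kernel code).
/// Bit `i` set means ID `i` is currently in use.
#[derive(Default)]
struct WordIdPool {
    used: u64,
}

impl WordIdPool {
    /// Marks the lowest free ID as used and returns it,
    /// or returns `None` if all 64 IDs are taken.
    fn acquire_lowest(&mut self) -> Option<u32> {
        // The number of trailing one bits is the index of the first zero bit.
        let id = self.used.trailing_ones();
        if id == u64::BITS {
            return None;
        }
        self.used |= 1u64 << id;
        Some(id)
    }

    /// Marks `id` as free again; out-of-range IDs are ignored.
    fn release(&mut self, id: u32) {
        if id < u64::BITS {
            self.used &= !(1u64 << id);
        }
    }
}
```

The lookup is a single bit-scan on one word with no pointer chasing,
whereas a tree walk touches one separately allocated node per level.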
Cheers,
--
Carlos Llamas