[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aAuR_0om4FI5Pb_F@Mac.home>
Date: Fri, 25 Apr 2025 06:45:35 -0700
From: Boqun Feng <boqun.feng@...il.com>
To: Burak Emir <bqe@...gle.com>
Cc: Yury Norov <yury.norov@...il.com>,
Rasmus Villemoes <linux@...musvillemoes.dk>,
Viresh Kumar <viresh.kumar@...aro.org>,
Miguel Ojeda <ojeda@...nel.org>,
Alex Gaynor <alex.gaynor@...il.com>, Gary Guo <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>,
Benno Lossin <benno.lossin@...ton.me>,
Andreas Hindborg <a.hindborg@...nel.org>,
Alice Ryhl <aliceryhl@...gle.com>, Trevor Gross <tmgross@...ch.edu>,
rust-for-linux@...r.kernel.org, linux-kernel@...r.kernel.org,
Rong Xu <xur@...gle.com>
Subject: Re: [PATCH v7 4/5] rust: add find_bit_benchmark_rust module.
On Fri, Apr 25, 2025 at 02:20:13PM +0200, Burak Emir wrote:
> On Fri, Apr 25, 2025 at 12:31 AM Boqun Feng <boqun.feng@...il.com> wrote:
> >
> > On Thu, Apr 24, 2025 at 09:48:17AM -0700, Boqun Feng wrote:
> > > On Thu, Apr 24, 2025 at 06:45:33PM +0200, Burak Emir wrote:
> > > > On Wed, Apr 23, 2025 at 6:56 PM Yury Norov <yury.norov@...il.com> wrote:
> > > > > So? Can you show your numbers?
> > > >
> > > > For now, I only have numbers that may not be very interesting:
> > > >
> > > > - for find_next_bit, find_next_zero_bit and find_next_zero_bit (sparse):
> > > > 22 ns/iteration in C, 32 ns/iteration in Rust.
> > > >
> > > > - for sparse find_next_bit (sparse):
> > > > 60 ns/iteration in C, 70 ns/iteration in Rust.
> > > >
> > > > This is a VM running nested in a VM. More importantly: the C helper
> > > > method is not inlined.
> > > > So we are likely measuring the overhead (plus the extra bounds checking).
> > > >
> > > > I would like to get cross-language inlining to work with thinLTO to
> > > > have a more realistic comparison.
> > > > However, that is not something that works out of the box.
> > > > I am looking at Gary Guo's patch for this:
> > > > https://lore.kernel.org/all/20250319205141.3528424-1-gary@garyguo.net/
> > > > Currently, I get duplicate symbol errors.
> > > >
> > >
> > > You will need to add __rust_helper attribute for the new rust helpers
> > > introduce in your patches. See Gary's patch #2 for example.
> > >
> >
> > Here you go ;-)
> >
> > https://github.com/fbq/linux/tree/rust-inline-bitmap
> >
> > I rebased on the top of rust-next and applied your patches onto it. The
> > last one is the necessary bits that enables helper inlining for the new
> > APIs in your patch. There is also a "TMP" patch in-between, otherwise
> > INLINE_HELPERS won't work, we need to dig more of it. But anyway, it
> > works on my machine (TM):
> >
> > (on x86, in your test function)
> >
> > 000000000028b090 <_RNvNtNtCs3KHxpmQFgFb_6kernel6bitmap5tests40kunit_rust_wrapper_bitmap_set_clear_find>:
> > ...
> > 28b0dd: 48 0f ba 2b 11 btsq $0x11, (%rbx)
> >
> > ^ this is the "b.set_bit(17);" in bitmap_set_clear_find() test.
>
> Thanks Boqun! I had the same state and got things to build now with
> CONFIG_LTO_NONE.
> Your work helped me narrow down the possibilities.
>
Great! Is performance number any different?
> Gary's helper-inlining patch in combination of CONFIG_CLANG_LTO_THIN
> and CONFIG_FIND_BIT_BENCHMARK_RUST=y gives "duplicate symbol" build
> errors.
>
> It does not eve matter here - because _find_next_bit C functions are
> not helpers.
>
Yes, I noticed that, but in theory we should have the same performance
as C because C also make the call-jump to _find_next_bit() instead of
inlining it.
> For cross-language inlining, we would need to change the build system
> similar to Gary's helper-inlining series, but for *all* object files.
> That is what "-flto" does for clang, and "-Clinker-plugin-lto" would
> do for rustc.
> Instead of the hack of emitting a .bc file, we need all .o files to
> contain LLVM bitcode, regardless of whether they come from clang or
> rustc.
>
There are cases where we don't want a full LTO, and so having an option
to get rid of unnecessary calls to the helpers instead of a full LTO is
reasonable IMO. But of course, no reason not to make cross-language
inlining work under LTO.
Regards,
Boqun
> Thanks,
> Burak
Powered by blists - more mailing lists