Message-ID: <20230515071343.GD15871@sol.localdomain>
Date: Mon, 15 May 2023 00:13:43 -0700
From: Eric Biggers <ebiggers@...nel.org>
To: Kent Overstreet <kent.overstreet@...ux.dev>
Cc: Lorenzo Stoakes <lstoakes@...il.com>,
Christoph Hellwig <hch@...radead.org>,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-bcachefs@...r.kernel.org,
Kent Overstreet <kent.overstreet@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Uladzislau Rezki <urezki@...il.com>, linux-mm@...ck.org
Subject: Re: [PATCH 07/32] mm: Bring back vmalloc_exec
On Mon, May 15, 2023 at 02:18:14AM -0400, Kent Overstreet wrote:
> On Sun, May 14, 2023 at 11:13:46PM -0700, Eric Biggers wrote:
> > On Mon, May 15, 2023 at 01:38:51AM -0400, Kent Overstreet wrote:
> > > On Sun, May 14, 2023 at 11:43:25AM -0700, Eric Biggers wrote:
> > > > I think it would also help if the generated assembly had the handling of the
> > > > fields interleaved. To achieve that, it might be necessary to interleave the C
> > > > code.
> > >
> > > No, that has negligible effect on performance - as expected, for an
> > > out-of-order processor. < 1% improvement.
> > >
> > > It doesn't look like this approach is going to work here. Sadly.
> >
> > I'd be glad to take a look at the code you actually tried. It would be helpful
> > if you actually provided it, instead of just this "I tried it, I'm giving up
> > now" sort of thing.
>
> https://evilpiepirate.org/git/bcachefs.git/log/?h=bkey_unpack
>
> > I was also hoping you'd take the time to split this out into a userspace
> > micro-benchmark program that we could quickly try different approaches on.
>
> I don't need to, because I already have this:
> https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/perf.ktest
Sure, given that this is an optimization problem with a very small scope
(decoding 6 fields from a bitstream), I was hoping for something easier and
faster to iterate on than setting up a full kernel + bcachefs test environment
and reverse engineering 500 lines of shell script. But sure, I can look into
that when I have a chance.
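
For illustration only, a userspace harness for a problem of this size could
start from something like the sketch below. It is not the bcachefs code: the
names `field_fmt`, `put_bits`, `get_bits`, `pack_fields`, `unpack_fields` and
`selftest` are hypothetical, and the layout (six fields of 0 to 64 bits each,
packed LSB-first into a byte buffer) is an assumption made for the example.
The bit-at-a-time reader is deliberately naive; it is meant as a correctness
baseline that faster variants can be checked and timed against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NR_FIELDS 6

/* Hypothetical format: width in bits (0..64) of each packed field. */
struct field_fmt {
	uint8_t bits[NR_FIELDS];
};

/* Reference writer: sets one bit at a time, LSB-first. buf must be zeroed. */
static void put_bits(uint8_t *buf, size_t off, unsigned bits, uint64_t v)
{
	for (unsigned i = 0; i < bits; i++, off++)
		if ((v >> i) & 1)
			buf[off >> 3] |= (uint8_t)(1u << (off & 7));
}

/* Reference reader: the slow baseline that optimized versions must match. */
static uint64_t get_bits(const uint8_t *buf, size_t off, unsigned bits)
{
	uint64_t v = 0;

	for (unsigned i = 0; i < bits; i++, off++)
		v |= (uint64_t)((buf[off >> 3] >> (off & 7)) & 1) << i;
	return v;
}

static void pack_fields(const struct field_fmt *f,
			const uint64_t in[NR_FIELDS], uint8_t *buf)
{
	size_t off = 0;

	for (int i = 0; i < NR_FIELDS; i++) {
		put_bits(buf, off, f->bits[i], in[i]);
		off += f->bits[i];
	}
}

static void unpack_fields(const struct field_fmt *f,
			  const uint8_t *buf, uint64_t out[NR_FIELDS])
{
	size_t off = 0;

	for (int i = 0; i < NR_FIELDS; i++) {
		out[i] = get_bits(buf, off, f->bits[i]);
		off += f->bits[i];
	}
}

/* Round-trip check using a mix of aligned and unaligned field widths. */
static void selftest(void)
{
	const struct field_fmt f = { .bits = { 8, 13, 32, 0, 64, 5 } };
	const uint64_t in[NR_FIELDS] = {
		0xab, 0x1234, 0xdeadbeef, 0, UINT64_MAX, 0x15
	};
	uint8_t buf[16] = { 0 };	/* 122 bits fit in 16 bytes */
	uint64_t out[NR_FIELDS];

	pack_fields(&f, in, buf);
	unpack_fields(&f, buf, out);
	assert(memcmp(in, out, sizeof(in)) == 0);
}
```

A candidate fast unpacker would be dropped in next to `unpack_fields`, checked
against it on the same buffers, and then timed in a loop.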
> Your approach wasn't any faster than the existing C version.
Well, it's your implementation of what you thought was "my approach". It
doesn't quite match what I had suggested. As I mentioned in my last email, it's
also unclear that your new code is ever actually executed, since you made it
conditional on all fields being byte-aligned...
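
One way to check whether such a fast path is ever taken is to count how often
its precondition holds. The sketch below is an assumption about what "all
fields being byte-aligned" could mean (every field width is a multiple of 8
bits, with packing starting on a byte boundary); it is not the condition from
the bkey_unpack branch, and `field_fmt`, `all_fields_byte_aligned`,
`path_stats` and `count_path` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_FIELDS 6

/* Hypothetical format: width in bits (0..64) of each packed field. */
struct field_fmt {
	uint8_t bits[NR_FIELDS];
};

/*
 * Assumed precondition for a byte-aligned fast path: if packing starts on a
 * byte boundary and every width is a multiple of 8, then every field also
 * starts on a byte boundary.
 */
static bool all_fields_byte_aligned(const struct field_fmt *f)
{
	for (size_t i = 0; i < NR_FIELDS; i++)
		if (f->bits[i] % 8)
			return false;
	return true;
}

/* Counters showing which path a given set of formats would exercise. */
struct path_stats {
	unsigned long fast;
	unsigned long slow;
};

static void count_path(const struct field_fmt *f, struct path_stats *s)
{
	if (all_fields_byte_aligned(f))
		s->fast++;
	else
		s->slow++;
}
```

If `fast` stays at zero over the formats a benchmark actually uses, the
timings say nothing about the new code.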
- Eric