[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFw9=wqyA4xO1KKJoH7xsj6poWFrWTddcNBR5tkDOn8SYg@mail.gmail.com>
Date: Sun, 25 Sep 2016 12:56:17 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Cedric Blancher <cedric.blancher@...il.com>
Cc: Ross Zwisler <ross.zwisler@...ux.intel.com>,
Matthew Wilcox <mawilcox@...uxonhyperv.com>,
Konstantin Khlebnikov <koct9i@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Linux MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Matthew Wilcox <mawilcox@...rosoft.com>
Subject: Re: [PATCH 2/2] radix-tree: Fix optimisation problem
On Sun, Sep 25, 2016 at 12:04 PM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> It gets rid of
> the ad-hoc arithmetic in radix_tree_descend(), and just makes all that
> be inside the is_sibling_entry() logic instead. Which got renamed and
> made to actually return the main sibling.
Sadly, it looks like gcc generates bad code for this approach. Looks
like it ends up testing the resulting sibling pointer twice (because
we explicitly disable -fno-delete-null-pointer-checks in the kernel,
and we have no way to say "look, I know this pointer I'm returning is
non-null").
So a smaller patch that keeps the old boolean "is_sibling_entry()" but
then actually *uses* that inside radix_tree_descend() and then tries
to make the nasty cast to "void **" more legible by making it use a
temporary variable seems to be a reasonable balance.
At least I feel like I can still read the code, but admittedly by now
that may be because I've stared at those few lines so much that I feel
like I know what's going on. So maybe the code isn't actually any more
legible after all.
.. and unlike my previous patch, it actually generates better code
than the original (while still passing the fixed test-suite, of
course). The reason seems to be exactly that temporary variable,
allowing us to just do
entry = rcu_dereference_raw(*sibentry);
rather than doing
entry = rcu_dereference_raw(parent->slots[offset]);
with the re-computed offset.
So I think I'll commit this unless somebody screams.
Linus
View attachment "patch.diff" of type "text/x-patch" (771 bytes)
Powered by blists - more mailing lists