linux-kernel - Re: [PATCH] gfs2: Revert "Fix loop in gfs2_rbm

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHk-=wg87Y=39-_caKv6apLx4dfyC3ZdaoLvd3xz5zjONbw_Nw@mail.gmail.com>
Date:   Thu, 31 Jan 2019 10:40:58 -0800
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Andreas Gruenbacher <agruenba@...hat.com>
Cc:     Bob Peterson <rpeterso@...hat.com>, cluster-devel@...hat.com,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        stable@...r.kernel.org
Subject: Re: [PATCH] gfs2: Revert "Fix loop in gfs2_rbm_find"

On Wed, Jan 30, 2019 at 12:30 PM Andreas Gruenbacher
<agruenba@...hat.com> wrote:
>
> This reverts commit 2d29f6b96d8f80322ed2dd895bca590491c38d34.
>
> It turns out that the fix can lead to a ~20 percent performance regression
> in initial writes to the page cache according to iozone.  Let's revert this
> for now to have more time for a proper fix.

Ugh. I think part of the problem is that the

        n += (rbm->bii - initial_bii);

doesn't make any sense when we just set rbm->bii to 0 a couple of
lines earlier. So it's basically a really odd way to write

        n -= initial_bii;

which in turn really doesn't make any sense _either_.

So I'l all for reverting the commit because it caused a performance
regression, but the end result really is all kinds of odd.

Is the bug as simple as "we incremented the iterator counter twice"?
Because in the -E2BIG case, we first increment it by the difference in
bii, but then we *also* increment it in res_covered_end_of_rgrp()
(which we do *not* do for the "ret > 0" case, which goes to
next_iter).

So if somebody can run the performance test again, how does it look if
*instead* of the revert, we do what looks at least _slightly_ more
sensible, and move the "increment iteration count" up to the
next-bitmap case instead?

At that point, it will actually match the bii updates (because
next_bitmap also updates rbm->bii). Hmm?

Something like the whitespace-damaged thinig below. Completely
untested. But *if* this works, it would make more sense than the
revert..

Hmm?

                   Linus

--- snip snip ---
  diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
  index 831d7cb5a49c..5b1006d5344f 100644
  --- a/fs/gfs2/rgrp.c
  +++ b/fs/gfs2/rgrp.c
  @@ -1796,10 +1796,10 @@ static int gfs2_rbm_find(struct gfs2_rbm
*rbm, u8 state, u32 *minext,
                rbm->bii++;
                if (rbm->bii == rbm->rgd->rd_length)
                        rbm->bii = 0;
  +             n++;
   res_covered_end_of_rgrp:
                if ((rbm->bii == 0) && nowrap)
                        break;
  -             n++;
   next_iter:
                if (n >= iters)
                        break;