Message-ID: <CAPhsuW7GDauaO-VZC54uvFu7oCoBd=dMVBsi2xNJbMDkwfOqqA@mail.gmail.com>
Date:   Tue, 5 Sep 2023 13:49:30 -0700
From:   Song Liu <song@...nel.org>
To:     Kuan-Wei Chiu <visitorckw@...il.com>
Cc:     Roman Mamedov <rm@...anrm.net>, linux-raid@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] md/raid5: eliminate if-statements in cmp_stripe()

On Sun, Sep 3, 2023 at 1:10 PM Kuan-Wei Chiu <visitorckw@...il.com> wrote:
>
> On Sun, Sep 03, 2023 at 06:30:58PM +0500, Roman Mamedov wrote:
> > On Sun,  3 Sep 2023 17:50:59 +0800
> > Kuan-Wei Chiu <visitorckw@...il.com> wrote:
> >
> > > Replace the conditional statements in the cmp_stripe() function with a
> > > branchless version to improve code readability and potentially enhance
> > > performance.
> >
> > The new code will always do two comparisons and a subtraction (3
> > instructions in total), whereas the old version could return after just 1
> > comparison, or after 2 comparisons. So depending on the data values, it
> > performs 1.5x to 3x as many operations as before, and there is unlikely to
> > be any enhancement of performance.
> >
> > Also IMO the previous version is more easily readable.
> >
> The reason behind my proposed changes was to eliminate conditional
> branches in the code. While the original code could occasionally achieve
> early returns, many compilers, such as x86-64 gcc 13.2 with the -O2
> flag, still generate branch instructions. Processors typically have
> deep pipelines, and a branch misprediction can result in a high
> penalty. Therefore, even though early return might not be possible, the
> new branchless version of the code could still offer efficiency
> improvements.

We need more information to support the efficiency improvement here.
In this case, I would like to see some benchmark results (a
microbenchmark is fine).

If we cannot show a difference in performance, I would rather keep
the current code.

Thanks,
Song

[...]
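
For reference, the kind of microbenchmark asked for above could be
sketched in userspace roughly as follows. This is not code from the
patch or from drivers/md/raid5.c. The names cmp_branchy(),
cmp_branchless(), fill() and time_sort() are hypothetical, sector_t is
assumed to be a 64-bit unsigned value, and qsort() stands in for the
kernel's list sorting, so the numbers would only be a rough indication.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Assumption: stand-in for the sector value compared in cmp_stripe(). */
typedef uint64_t sector_t;

/* Early-return form: one or two comparisons, then a return. */
static int cmp_branchy(sector_t a, sector_t b)
{
	if (a > b)
		return 1;
	if (a < b)
		return -1;
	return 0;
}

/* Branchless form: always two comparisons and a subtraction. */
static int cmp_branchless(sector_t a, sector_t b)
{
	return (a > b) - (a < b);
}

/* qsort() adapters for the two comparators. */
static int qs_branchy(const void *x, const void *y)
{
	return cmp_branchy(*(const sector_t *)x, *(const sector_t *)y);
}

static int qs_branchless(const void *x, const void *y)
{
	return cmp_branchless(*(const sector_t *)x, *(const sector_t *)y);
}

/* Fill v[] from a fixed-seed LCG so every run sorts identical data. */
static void fill(sector_t *v, size_t n, uint64_t seed)
{
	size_t i;

	for (i = 0; i < n; i++) {
		seed = seed * 6364136223846793005ULL + 1442695040888963407ULL;
		v[i] = seed >> 16;
	}
}

/*
 * Sort n pseudo-random sectors with the given comparator and return the
 * CPU time spent in qsort() in seconds, or -1.0 if allocation fails.
 */
static double time_sort(size_t n, int (*cmp)(const void *, const void *))
{
	sector_t *v = malloc(n * sizeof(*v));
	clock_t t0, t1;
	size_t i;

	if (!v)
		return -1.0;
	fill(v, n, 42);
	t0 = clock();
	qsort(v, n, sizeof(*v), cmp);
	t1 = clock();
	/* Sanity check outside the timed region: output must be ordered. */
	for (i = 1; i < n; i++)
		assert(v[i - 1] <= v[i]);
	free(v);
	return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling time_sort(n, qs_branchy) and time_sort(n, qs_branchless) several
times for a few values of n, and with sorted as well as random input,
would give a first comparison. The indirect call through the function
pointer may hide much of the difference, so inspecting the generated
assembly alongside the timings is probably worthwhile.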
