Date:   Fri, 26 Nov 2021 18:39:52 -0600
From:   Noah Goldstein <goldstein.w.n@...il.com>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     tglx@...utronix.de, mingo@...hat.com,
        Borislav Petkov <bp@...en8.de>, dave.hansen@...ux.intel.com,
        X86 ML <x86@...nel.org>, hpa@...or.com, peterz@...radead.org,
        alexanderduyck@...com, open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1] x86/lib: Optimize 8x loop and memory clobbers in csum_partial.c

On Fri, Nov 26, 2021 at 6:15 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Fri, Nov 26, 2021 at 12:33 PM Noah Goldstein <goldstein.w.n@...il.com> wrote:
> >
> > On Fri, Nov 26, 2021 at 2:07 PM Eric Dumazet <edumazet@...gle.com> wrote:
> > >
> > > On Fri, Nov 26, 2021 at 11:50 AM Noah Goldstein <goldstein.w.n@...il.com> wrote:
> > > >
> > > > Bright :) but it will need a BMI support check.
> > >
> > > Yes, probably not worth the pain.
> >
> > Making a V2 for my patch with your optimization for the loop case. Do you think
> > 1 or 2 accum for the 32 byte case?
> >
>
> I would vote for something simpler, thus one accum, since this 32byte
> block is only run one time ?

If one-at-a-time performance is what matters most, wouldn't that argue in
favor of 2x accum, because it leads to decreased latency? Or are you saying
it's not that important, so simpler code is the priority?
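
To make the tradeoff concrete, here is a rough portable C sketch of the two
options for a single 32-byte block. It is not the code from the patch (which
uses inline asm with adc); the names add64_carry, csum32_one_accum and
csum32_two_accum are made up for illustration. With one accumulator the four
adds form a single dependency chain; with two accumulators the second pair of
words is summed independently and merged at the end, so the chain is one add
shorter at the cost of an extra merge step.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 64-bit add with end-around carry,
 * roughly what an add followed by "adc $0" does. */
static inline uint64_t add64_carry(uint64_t a, uint64_t b)
{
	uint64_t s = a + b;

	return s + (s < a);	/* fold the carry-out back in */
}

/* One accumulator: every add depends on the previous one. */
static uint64_t csum32_one_accum(const void *buf, uint64_t sum)
{
	uint64_t w[4];

	memcpy(w, buf, sizeof(w));	/* avoids unaligned-access UB */
	sum = add64_carry(sum, w[0]);
	sum = add64_carry(sum, w[1]);
	sum = add64_carry(sum, w[2]);
	sum = add64_carry(sum, w[3]);
	return sum;
}

/* Two accumulators: 'b' does not depend on 'a', so the two
 * chains can execute in parallel before the final merge. */
static uint64_t csum32_two_accum(const void *buf, uint64_t sum)
{
	uint64_t w[4], a, b;

	memcpy(w, buf, sizeof(w));
	a = add64_carry(sum, w[0]);
	a = add64_carry(a, w[1]);
	b = add64_carry(w[2], w[3]);
	return add64_carry(a, b);
}
```

Both variants return the same value, since addition with end-around carry is
associative; the only difference is the length of the dependency chain versus
the amount of code.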

>
> Thanks !
