Date:   Thu, 6 Dec 2018 16:22:27 +0000
From:   Matt Sealey <Matt.Sealey@....com>
To:     "Markus F.X.J. Oberhumer" <markus@...rhumer.com>,
        Dave Rodgman <dave.rodgman@....com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>
CC:     "herbert@...dor.apana.org.au" <herbert@...dor.apana.org.au>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "nitingupta910@...il.com" <nitingupta910@...il.com>,
        "minchan@...nel.org" <minchan@...nel.org>,
        "sergey.senozhatsky.work@...il.com" 
        <sergey.senozhatsky.work@...il.com>,
        "sonnyrao@...gle.com" <sonnyrao@...gle.com>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        nd <nd@....com>, "sfr@...b.auug.org.au" <sfr@...b.auug.org.au>
Subject: RE: [PATCH v4 0/7] lib/lzo: performance improvements

Markus,

> Request 2 - add COPY16; *NOT* acked by me
> 
>   [PATCH 2/8] lib/lzo: clean-up by introducing COPY16
> 
> is still not correct because of possible overlapping copies. I'll
> address this on the weekend.

Can you describe a failure case that shows why

{
	COPY8(op, ip);
	COPY8(op+8,ip+8);
	ip+=16;
	op+=16;
}

or

{ 
	COPY8(op, ip);
	ip+=8;
	op+=8;
	COPY8(op, ip);
	ip+=8;
	op+=8;
}

vs.

#define COPY16(dst,src) COPY8(dst,src); COPY8(dst+8,src+8)

{
	COPY16(op, ip);
	ip+=16;
	op+=16;
}

.. causes "overlapping copies"?

COPY8 was only ever used in pairs, as above, and the second form
hampers compiler optimization because it adds an artificial barrier
between the two groups. The only difference was that decompress
and compress had the pointer increments spread out. If that is what
needs fixing then that's a good reason, but your reasoning continues
to elude me.
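
For reference, the three forms quoted above can be compared directly
in user space. The following is only a minimal sketch: the COPY8
here is an assumed stand-in (an 8-byte load followed by an 8-byte
store through a temporary), not the kernel's actual definition, and
the helper names (form_a, form_b, form_c, variants_agree) are
hypothetical. It checks that all three forms leave the buffer in the
same state, including when op and ip are fewer than 8 bytes apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed stand-in for COPY8: load 8 bytes, then store 8 bytes. */
#define COPY8(dst, src) \
	do { uint64_t t_; memcpy(&t_, (src), 8); memcpy((dst), &t_, 8); } while (0)

/* COPY16 as written in the mail. */
#define COPY16(dst, src) COPY8(dst, src); COPY8(dst + 8, src + 8)

/* First form: two COPY8s, then both increments. */
static void form_a(unsigned char **opp, const unsigned char **ipp)
{
	unsigned char *op = *opp;
	const unsigned char *ip = *ipp;

	COPY8(op, ip);
	COPY8(op + 8, ip + 8);
	ip += 16;
	op += 16;
	*opp = op;
	*ipp = ip;
}

/* Second form: increments interleaved with the copies. */
static void form_b(unsigned char **opp, const unsigned char **ipp)
{
	unsigned char *op = *opp;
	const unsigned char *ip = *ipp;

	COPY8(op, ip);
	ip += 8;
	op += 8;
	COPY8(op, ip);
	ip += 8;
	op += 8;
	*opp = op;
	*ipp = ip;
}

/* Third form: COPY16, then both increments. */
static void form_c(unsigned char **opp, const unsigned char **ipp)
{
	unsigned char *op = *opp;
	const unsigned char *ip = *ipp;

	COPY16(op, ip);
	ip += 16;
	op += 16;
	*opp = op;
	*ipp = ip;
}

/*
 * Run all three forms with op = buf + dist and ip = buf.
 * Returns 1 if the buffers and the advanced pointers all match.
 */
static int variants_agree(size_t dist)
{
	unsigned char a[64], b[64], c[64];
	unsigned char *opa = a + dist, *opb = b + dist, *opc = c + dist;
	const unsigned char *ipa = a, *ipb = b, *ipc = c;
	size_t i;

	assert(dist <= 48);	/* keep op + 16 inside the buffer */
	for (i = 0; i < 64; i++)
		a[i] = b[i] = c[i] = (unsigned char)i;

	form_a(&opa, &ipa);
	form_b(&opb, &ipb);
	form_c(&opc, &ipc);

	return opa - a == opb - b && opb - b == opc - c &&
	       ipa - a == ipb - b && ipb - b == ipc - c &&
	       memcmp(a, b, sizeof(a)) == 0 &&
	       memcmp(a, c, sizeof(a)) == 0;
}
```

Calling variants_agree() with a distance of 16 (no overlap) and with
4 or 1 (overlapping) should return 1 in each case under this
stand-in, since all three forms issue the same loads and stores in
the same order. It says nothing about a COPY16 implemented as a
single 16-byte load and store, which would behave differently on
overlapping ranges.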

I can refactor the patch to align the second form with the first
so that compress and decompress get the same codegen, which is
functionally identical to the COPY16 patch, but in your opinion
that would seem to be the whole problem.

I'll see what you've got after the weekend ;D

Ta
Matt Sealey
