linux-kernel - RE: [PATCH 2/3] riscv: optimized memmove

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <2391e924440d4e59b7859b8ede8f0954@AcuMS.aculab.com>
Date: Tue, 30 Jan 2024 11:51:05 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Jisheng Zhang' <jszhang@...nel.org>
CC: Paul Walmsley <paul.walmsley@...ive.com>, Palmer Dabbelt
	<palmer@...belt.com>, Albert Ou <aou@...s.berkeley.edu>,
	"linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Matteo Croce
	<mcroce@...rosoft.com>, kernel test robot <lkp@...el.com>
Subject: RE: [PATCH 2/3] riscv: optimized memmove

From: Jisheng Zhang
> Sent: 30 January 2024 11:31
> 
> On Sun, Jan 28, 2024 at 12:47:00PM +0000, David Laight wrote:
> > From: Jisheng Zhang
> > > Sent: 28 January 2024 11:10
> > >
> > > When the destination buffer is before the source one, or when the
> > > buffers doesn't overlap, it's safe to use memcpy() instead, which is
> > > optimized to use a bigger data size possible.
> > >
> > ...
> > > + * Simply check if the buffer overlaps an call memcpy() in case,
> > > + * otherwise do a simple one byte at time backward copy.
> >
> > I'd at least do a 64bit copy loop if the addresses are aligned.
> >
> > Thinks a bit more....
> >
> > Put the copy 64 bytes code (the body of the memcpy() loop)
> > into it an inline function and call it with increasing addresses
> > in memcpy() are decrementing addresses in memmove.
> 
> Hi David,
> 
> Besides the 64 bytes copy, there's another optimization in __memcpy:
> word-by-word copy even if s and d are not aligned.
> So if we make the two optimizd copy as inline functions and call them
> in memmove(), we almost duplicate the __memcpy code, so I think
> directly calling __memcpy is a bit better.

If a forwards copy is valid call memcpy() - which I think you do.
If not you can still use the same 'copy 8 register' code
that memcpy() uses - just with a decrementing block address.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)