Message-ID: <88850c316ed14c7b8391cea05d875406@AcuMS.aculab.com>
Date: Mon, 13 Jul 2020 09:32:32 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Linus Torvalds' <torvalds@...ux-foundation.org>
CC: Al Viro <viro@...iv.linux.org.uk>,
Michael Ellerman <mpe@...erman.id.au>,
Christophe Leroy <christophe.leroy@....fr>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
"the arch/x86 maintainers" <x86@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: RE: objtool clac/stac handling change..
From: Linus Torvalds
> Sent: 10 July 2020 23:37
> On Tue, Jul 7, 2020 at 5:35 AM David Laight <David.Laight@...lab.com> wrote:
> >
> >
> > So separate copy and checksum passes should easily exceed 4 bytes/clock,
> > but I suspect that doing them together never does.
> > (Unless the buffer is too big for the L1 cache.)
>
> It's the "touch the caches twice" that is the problem.
>
> And it's not the "buffer is too big for L1", it's "the source, the
> destination and any incidentals are too big for L1" with the
> additional noise from replacement policies etc.
That's really what I meant.
L1D is actually (probably) only 32kB.
I guess that gives you 8k for the buffer.
It is a shame you can't use the AVX instructions in the kernel.
(Although saving them probably costs more than the gain.)
Then you could use something based on:
10:	load ymm,src+idx	// 32 bytes
	store ymm,tgt+idx
	addq sum0,ymm		// eight 32bit adds
	rotate ymm,16		// Pretty sure there is an instruction for this!
	addq sum1,ymm
	add idx,32
	jnz 10b
It is then possible to determine the correct result from sum0/sum1.
On very recent Intel CPUs that might even run at 1 iteration/clock!
(Probably needs an unroll and explicit interleave.)
At one iteration every 2 clocks it matches the ADDX[OC] loop
but includes the write.
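
To show how the result could be recovered from sum0/sum1, here is a minimal
scalar sketch in C that models a single 32-bit lane of the loop above. It is
an illustration rather than kernel code: the function names are hypothetical,
the rotate is done with plain shifts, and it assumes fewer than 65536 words
are summed per lane so that each carry count fits in 16 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit word by 16 bits, i.e. swap its two 16-bit halves. */
static uint32_t rot16(uint32_t w)
{
	return (w << 16) | (w >> 16);
}

/* Fold a wide sum down to a 16-bit ones' complement sum. */
static uint16_t fold16(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Reference: add every 16-bit half in a wide accumulator, then fold. */
static uint16_t csum_ref(const uint32_t *buf, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += (buf[i] & 0xffff) + (buf[i] >> 16);
	return fold16(sum);
}

/* One-lane model of the loop: two wrapping 32-bit accumulators. */
static uint16_t csum_two_sums(const uint32_t *buf, size_t n)
{
	uint32_t sum0 = 0, sum1 = 0;

	assert(n < 65536);	/* assumption: carry counts must fit in 16 bits */

	for (size_t i = 0; i < n; i++) {
		sum0 += buf[i];		/* plain 32-bit add */
		sum1 += rot16(buf[i]);	/* add with the halves swapped */
	}

	/*
	 * Let L be the sum of the low halves and H the sum of the high halves.
	 * Then sum0 = (H << 16) + L and sum1 = (L << 16) + H, both mod 2^32.
	 * The high half of sum0 is H plus the carries out of L, and the low
	 * half of sum1 is H alone, so their difference is the carry count
	 * that the 32-bit add pushed into the wrong half (likewise for H).
	 */
	uint32_t carry_lo = ((sum0 >> 16) - (sum1 & 0xffff)) & 0xffff;
	uint32_t carry_hi = ((sum1 >> 16) - (sum0 & 0xffff)) & 0xffff;

	/* 2^16 == 1 (mod 0xffff), so the carries are simply added back in. */
	return fold16((uint64_t)(sum0 & 0xffff) + (sum1 & 0xffff) +
		      carry_lo + carry_hi);
}
```

A vector version would apply the same recovery to each of the eight lanes
and then add the lane results together before the final fold.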
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)