Message-ID: <CALCETrU6YwHHf=45vz4wbcs+XMCGWqu7wWTE_tOEzN7D3yXqEA@mail.gmail.com>
Date: Mon, 3 Mar 2014 18:23:32 -0800
From: Andy Lutomirski <luto@...capital.net>
To: discussions <discussions@...sword-hashing.net>
Subject: Re: [PHC] wider integer multiply on 32-bit x86
On Mon, Mar 3, 2014 at 6:13 PM, Solar Designer <solar@...nwall.com> wrote:
> I think 4 instructions, including the loads and stores, for a 63x63->63
> multiply is rather good. Without this trick, it'd take 4 _multiplies_
> to implement the equivalent via 32x32->32 (or perhaps 3 multiplies if we
> also use the 32x32->64). Some bigint library could use this trick,
> perhaps for some nice speedup on those older CPUs/builds (does any use
> it already?)
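For reference, the integer fallback mentioned in the quote can be sketched in C roughly as below. This is not code from the thread: the helper name mul64_lo is hypothetical, and it assumes the quoted "3 multiplies" count refers to the usual schoolbook split that yields the low 64 bits of a product from 32-bit halves (one widening 32x32->64 multiply plus two truncating 32x32->32 multiplies). It does not reproduce the i387 sequence itself.

```c
#include <stdint.h>

/* Hypothetical helper: low 64 bits of a * b, built from 32-bit halves.
 * Assumes a platform where uint32_t arithmetic wraps modulo 2^32
 * (i.e. int is no wider than 32 bits, as on x86). */
static uint64_t mul64_lo(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    /* One widening 32x32->64 multiply for the two low halves. */
    uint64_t low = (uint64_t)a_lo * b_lo;

    /* Two truncating 32x32->32 multiplies for the cross terms.
     * Only their low 32 bits can land in the 64-bit result. */
    uint32_t cross = a_lo * b_hi + a_hi * b_lo;

    /* a_hi * b_hi would start at bit 64, so it is dropped entirely. */
    return low + ((uint64_t)cross << 32);
}
```

On a 32-bit x86 build a compiler will typically emit something close to this on its own for a plain uint64_t multiply, so the sketch is mainly useful for counting the multiplies involved.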
I think that Poly1305 and related things use a similar trick, at least
on some architectures.
Silly question, though: why are i387 instructions better than SSE2 here?
--Andy