Message-ID: <20140528230147.3263.qmail@ns.horizon.com>
Date:	28 May 2014 19:01:47 -0400
From:	"George Spelvin" <linux@...izon.com>
To:	linux@...izon.com, tim.c.chen@...ux.intel.com
Cc:	herbert@...dor.apana.org.au, james.guilford@...el.com,
	JBeulich@...e.com, linux-kernel@...r.kernel.org, sandyw@...tter.com
Subject: Re: [RFC PATCH] crypto: crc32c-pclmul - Use pmovzxdq to shrink K_table

Thanks for the reply!

> Changing from the aligned move (movdqa) to unaligned move and zeroing
> (pmovzxdq), is going to make things slower.  If the table is aligned
> on 8 byte boundary, some of the table can span 2 cache lines, which
> can slow things further.

Um, two notes:
1) This load is performed once per 3072-byte block, which
   is a minimum of 128 cycles just for the crc32q instructions,
   never mind all the pclmulqdq folderol.

   Is the penalty really more than 2 cycles?  Heck, does it cost
   *any* overall time, given that the load is preceded by a stretch
   of 384 instructions that it's not data-dependent on?

   I'll do some benchmarking to find out.

2) The shrunk table entries are 8 bytes long, and so can't
   span a cache line.  Is there any benefit to using a
   larger alignment, other than the very small issue of the
   full table needing 1 more cache line to be fully cached?
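
To make the cache-line arithmetic in point 2 concrete, here is a minimal
sketch in C.  The 64-byte line size, the 128-entry table length, and the
helper names are assumptions chosen for illustration; none of them come
from the patch itself.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u  /* assumed line size, typical on x86 */

/* Returns 1 if an object of `size` bytes at byte offset `off`
 * crosses a cache-line boundary, 0 otherwise. */
static int straddles_line(uintptr_t off, size_t size)
{
    return (off / CACHE_LINE) != ((off + size - 1) / CACHE_LINE);
}

/* Counts how many of `n` consecutive `size`-byte entries,
 * starting at byte offset `base`, cross a line boundary. */
static unsigned count_straddlers(uintptr_t base, size_t size, unsigned n)
{
    unsigned i, count = 0;

    for (i = 0; i < n; i++)
        count += straddles_line(base + (uintptr_t)i * size, size);
    return count;
}

/* Number of cache lines touched by `total` bytes starting at `base`. */
static unsigned lines_touched(uintptr_t base, size_t total)
{
    return (unsigned)((base + total - 1) / CACHE_LINE - base / CACHE_LINE + 1);
}
```

With a hypothetical 128-entry table that is only 8-byte aligned (base
offset 8), count_straddlers(8, 8, 128) finds no 8-byte entry crossing a
line, whereas count_straddlers(8, 16, 128) finds that one in four 16-byte
entries would.  lines_touched(8, 128 * 8) comes to one line more than
lines_touched(0, 128 * 8), which is the "1 more cache line" cost
mentioned above.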
   
> We are trading speed for only 4096 bytes of memory save,
> which is likely not a good trade for most systems except for 
> those really constrained of memory.  For this kind of non-performance
> critical system, it may as well use the generic crc32c algorithm and
> compile out this module.

I hadn't intended to cause any speed penalty at all.
Do you really think there will be one?
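
For reference, a rough model in portable C of what the two loads
compute.  This is a sketch, not the kernel code: the struct and function
names are made up, and it assumes each table constant fits in 32 bits,
which is what shrinking the table by zero-extension relies on.

```c
#include <stdint.h>

/* Two 64-bit lanes standing in for an XMM register (illustrative). */
struct lanes {
    uint64_t lo;
    uint64_t hi;
};

/* Models movdqa: a 16-byte load of two constants that are
 * already stored as 64-bit values. */
static struct lanes load_wide(const uint64_t entry[2])
{
    struct lanes r = { entry[0], entry[1] };
    return r;
}

/* Models pmovzxdq xmm, m64: an 8-byte load of two 32-bit values,
 * each zero-extended into its own 64-bit lane. */
static struct lanes load_packed(const uint32_t entry[2])
{
    struct lanes r = { (uint64_t)entry[0], (uint64_t)entry[1] };
    return r;
}
```

As long as the upper 32 bits of every wide constant are zero, both
functions yield the same lanes, so the packed table carries the same
information in half the space.  Whether the narrower load costs any time
is the open question, and only measurement will answer it.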